Allow for Compile-Time Checked Intervals for Parameters Expecting Literal Values

Introduction

Certainly methods are written to primarily accept literal values. However, when using types such as Int or Double there is no way to check at compile time that the supplied arguments are valid for a given interval/range. We must rely on run time checks that could cause for adverse side effects which may be difficult to handle.

Motivation

Commonly method parameters are written to only allow for certain valid literal values. With finite options an enum can be used to limit and constraint these options. However, enums are not always a great solution. Imagine the method below:

/// Set the height of a building.
/// Only numbers 1 through 10 (inclusive) are valid.
///
/// - Parameter height: The new building height with interval [1, 10]
func setHeight(_ height: Int) {
    guard height > 0, height <= 10 else {
        // What do we do here?
        //
        // Maybe return false?
        // Maybe throw an error?
        // Maybe change the return type to Optional<Bool>?
    }

    // ...
}

While an enum could be created to allow for all of the values (1 through 10), if the method allowed for a height between 1 and 1,000 that quickly becomes unwieldy and an abuse of enums. The code as written will not catch any errors at compile time, only run time, even though we are looking for a specific set of valid literals. So there is no reasonable way to guard against a programming error in this regard.

Proposed Solution

NOTE: This solution is only useful for methods that take literal parameters. Of course if a non-literal variable is expected then that can only be checked at run time (for example, if height was supplied by the user then there is no way to check for the interval at compile time of course).

This solution adds a way to construct a method to take literal values within an interval. The method displayed before can now be written as:

/// Set the height of a building.
/// Only numbers 1 through 10 (inclusive) are valid.
///
/// - Parameter height: The new building height with interval [1, 10]
func setHeight(_ height: Int[1...10]) -> Bool {
    // We can check for the height at compile time now instead of run time
}

The above method will require that height be a literal or constant value; a variable cannot be passed in (a compile time error would have to occur in this case).

Regular Swift range syntax would also apply here: [1..<5] would allow for the numbers 1, 2, 3, and 4 (since 5 in this case is not inclusive). And for more than just Ints: [0.0...5.0] would allow for anything between 0.0 and 5.0 (inclusive).

4 Likes

I think what you're pitching is a much reduced form of Dependent Types. This is also a feature I would love to see in swift and find myself reaching for all the time.

There has been some previous discussion on the topic, see:

Syntactically I think it would make more sense to write the value constraints using existing notation. Something like this comes to mind:

typealias Height = Int where self > 0, self <= 10

This is also related to previous discussion on newtype functionality, see here.

3 Likes

Better validation at compile time would benefit many types, not just those expressible by integer literals. The groundwork is being laid for this with compile-time constant expressions.

Once built out, one imagines that a @compilerEvaluable precondition would get you the diagnostics you want.

See also:

13 Likes

+1 to @xwu's point. IMO this is best modeled by having the compiler evaluable work trace through normal asserts, producing a static compile time error when it can prove that a call will always fail. This would also be useful for a number of metaprogramming related things.

15 Likes

So this would be the newtype idea, except the method that filters out illegal values needs to be constexpr whenever possible?

Dependent Types in some form or fashion is something I've been interested in being added to Swift for a while, so glad to see some more interest/movement on this!

Having more comprehensive compile-time checks for things like 0 division, passing percentages, or passing degrees seems like the ideal way to prevent those types of user errors.

In addition to the literal checking, it would be great if the compiler could check whether a variable has been verified as being valid before the function was called. e.g. (pseudo code)

setHeight(height) // error because the variable wasn't checked
if [1...10].contains(height) {
    setHeight(height) // no error because the variable was checked.
}

I think it makes sense to start with a more generic compile-time-check API like @xwu mentioned, and then later add some syntactic sugar once there's a good idea what types of things are often being checked.
1 Like

a good place to start would be defining a wrapper type Digit around Int that can only be initialized from literals in the range 1 ... 10. Right now you can do this with a run-time check though a compile-time check would be better. Then you should implement +, -, *, /, %, etc to trap if they would produce an out-of-range result. This is heavy and unwieldy, but the only sound implementation. I do think though that we could automate this at least partially,, for example we could synthesize such a wrapper using @rauhul ’s `syntax

typealias Height = Int where self > 0, self <= 10

and let it get implicitly upcast to its backing type instead of needing to drill down into a .value member. in fact, this is sounding suspiciously like a property wrapper, and could probably be implemented as such, if the variable you’re trying to constrain is bound to a type scope.

It shouldn't use "typealias." The reason is in its name; the facility does not create a new type that is a 100% copy of another, it just references an existing type. (AIUI, creating a functions with signatures that differ by switching a type with an alias would be an illegal redeclaration, not an overload.) But adding constraints to an existing type must fundamentally create a new distinct (albeit related) type. So using the "typealias" word for this capability would be confusing (a.k.a. lying) to the user.

The syntax reminds me of Ada's subtype syntax. I've been looking at a manual for research, but Ada has a different type philosophy that makes the subtype syntax work for it but would be a bad fit for us. (Ada has many built-in types, with sets of operations, and subtyping built in. Your first type declaration declares a type class and its initial no-constraint subtype. We moved those types totally within the Standard library, enabled with the struct/enum/class facilities, so we need to handle subtypes/dependent-type/reduced-state-types differently.)

1 Like

TL;DR

Protocols express the requirements to be something. Types express the notion that a thing is something. Constraints (constexpr or otherwise) express the requirements to be used for something. Depending on what you want in any specific situation, you choose the appropriate of those three mechanisms. Examples like the aforementioned ‘Height’ one tend indeed to want for [constexpr] constraints.

Reasons

It’s interesting (and complicating) that what this is trying to express is not just typy nor just protocolly. In some sense you’re just trying to say “I want an Int that meets these requirements” - which sounds protocolly, but then when you think about it, does it make sense to have a Height protocol in the traditional sense, i.e. where technically it can be any type, not just an Int? In a pure Protocol world that could make sense, since Int would merely be a protocol that expresses the idea of a thing which has the set of properties & capabilities that distinguish an integer. But that’s nowhere near what Swift is today, and seems obviously not a place Swift will ever go.

If you nonetheless try pragmatically to just add the ability to require specific typeness to protocols, e.g.:

protocol Height: Int where self > 0, self < 10 {}

Or:

protocol Height where Self is Int, self > 0, self < 10 {}

There’s all sorts of awkward things about that. It blurs the lines between types & protocols, for one. You haven’t actually conformed Int to the protocol either, yet - you’d have to add the extension to Int as well, which is a lot of redundancy & boilerplate. And in principle, since it’s a protocol, you can put all the normal protocol stuff between the curly braces, which… well, probably that’s actually useful in some cases, but it doesn’t negate the prior concerns and might amount to feature creep.

So that suggests this should be a bit more typy, e.g.:

subtype Height is Int where self > 0, self < 10

That seems much cleaner at the outset (even though it still blurs typiness & protocolness a bit).

A new keyword & associated concept would be required, as opposed to repurposing typealias or similar, because of the need to support any type, not just classes, as well as the fact that this is not merely an alias (as @CTMacUser pointed out). I think it’s also best to be distinct, syntactically, vs subclassing - i.e. not go down the route of allowing:

struct Foo {
    ...
}

struct SubFoo: Foo where self > 0

It’s too close to subclassing, and thus people will forever be misguided into wanting support for all the rest of subclassing’s capabilities (new properties, new functions, overrides, etc), which don’t make sense by definition for a constraining subtype (nor value types in general in the presence of the extension keyword & mechanism).

Alternatively there is a ‘wrapper’ approach, where you declare e.g. ‘Height’ as a new type which stores an Int internally and applies the conformances through constexpr constraints specifically on its init method(s), but that’s an impractical amount of boilerplate for any non-trivial base type, e.g.::

struct Height {
    private var actualValue

    init(_ value: Int) where value > 0, value < 10 {
        self.actualValue = value
    }
}

// And now a bajillion extensions to make Height conform to all the protocols
// expected of an Int, assuming there even are protocols to fully cover what
// an Int is (hint: there aren’t, today)...

// ...and you still have to do “myHeight.asInt” or somesuch everywhere you need
// to use virtually any existing code, since `Int` is a type, not a protocol.  This
// approach is very hostile to the extensions paradigm.

So, given all that a subtype keyword et al seems like the best approach (at least vs protocols or wrappers).

Then, several obvious questions are:

Do subtypes manifest as new Metatypes?

My intuition is that they should, for several reasons:

  • Code that calls type(of: ...) and the like, and certainly anything that conveys the type of the thing to a human, would be more intuitive to call it a Height than an Int. It would also be a valuable distinction in reasoning about code using generics or compound types, since while technically an e.g. array of Heights and an array of Ints would be equivalent at runtime, semantically they are distinct.

  • Being able to runtime conditionalise behaviour based on compile-time ensured and stated compliance to a subtype could be useful just as it can be useful to downcast class instances. e.g.:

    subtype EvenInt is Int where self % 2 == 0
    
    func halve(_ number: Int) -> Int {
        if number is EvenInt {
            // Fast path.
            return number >> 1
        } else {
            // Some other, more complicated code path to handle Ints that might not divide evenly.
        }
    }
    

    That example is deliberately “contrived” on first glance, to illustrate a subpoint - yes, the ‘slow path’ still has to handle all the cases anyway, and you could get this specific example’s same behaviour in principle by just writing if number % 2 == 0 directly (or maybe with generics and specialisation that way), but having it presented via typing:

    • Makes the code’s intent clearer just as using types & protocols in general do.
    • Is a convenient shorthand for what amounts to a constexpr that can be applied rigorously and consistently across a codebase (though I guess we’ll see how constexprs ultimately get implemented, w.r.t. whether they end up with an equivalently convenient stamping mechanism anyway).
    • Is more flexible or at least concise than generic specialisation, in many cases (e.g. where it’s a two hundred line method of which only one line changes depending on the true type of the argument(s)).

    However, this does again highlight how from a lot of angles this feels very protocolly.

  • You can then more sensibly require explicit upcasting (i.e. requiring someHeight as Int whenever you pass the value to a parameter expecting an Int), because the true type would be preserved under the covers, allowing for downcasting back to the leaf type at any later stage (if needed).

    Iff, of course, it’s actually possible to track all that at runtime (my presumption is that value types don’t have any kind of ‘isa’ attribute under the covers, and class types do, which is presumably how Swift actually does downcasting today…?). And it’s not a given that such functionality is required - though IMO it’s very nice, in a type system, for it to dissuade ‘escapes’ (the careless or unintentional loss of typing information, especially in a way that propagates that loss forwards, as implicit conversions do).

    It’s not a logical requirement that implicit upcasting be disallowed, but it would be consistent with the premise that these subtypes are semantically distinct enough to warrant dedicated syntax for them. It also fits well enough with (IMO) Swift’s intent that functionality be explicitly attached to intended types, not ‘floating’ - since an extensions to Int would still apply to Height.

    That also leads to the next obvious big question...

What does <some Height> + <some Int> do?

The resulting type cannot unconditionally meet the requirements, since the addition could produce a value that’s outside the requirements of Height. A reasonable default would therefore be to pick the least specific type, i.e. Int in this case, but that means we’ve added (back) into Swift implicit type conversion, essentially (granted it’s an upcast, not a lateral cast, but still).

But furthermore consider, instead of an operator, what about a method on Int that returns Int - what does that do when called on an Int that is specifically a Height? Does it still return an Int? It would have to - we can’t unilaterally guarantee that the result of all methods on Int that return Int (or Self if expressed that way) would actually return a value conforming to Height. And it would be a bit absurd to have to do (myHeight as Int).someMethodOrProperty for all such methods - you should just use the ‘wrapper’ pattern at that point, without any new language support needed.

And any mechanism that would require Height to override every such method in order to ensure the returned values are also Heights is impractical if not impossible - for example, how would it handle a negated method sanely when the result is naturally not a valid Height, but the method on Int is not declared with throws nor returns Optional or the like. Etc.

So logically any method on the parent type:

  • Cannot implicitly return the subtype (not even via Self or any other such genericness).
  • Must nonetheless be supported on instances of the subtype without any extra boilerplate.

So at that point, what’s the point? Anything you do to a Height will turn it “back” into a raw Int, and that tiny remaining functionality - of semantic distinction in passing values around unmodified - can easily be achieved using a wrapper type.

Alternatively, if the compiler were willing to sort of ‘dynamically’ change the method signatures - e.g. automatically make throwable or the return value Optional any methods which aren’t provably going to return a a valid Height - then I guess that could work… but that’d be a pretty complicated task for the compiler (and might require constexpr and similar advancements first, anyway)...

Conclusion (?)

...all of which seems to lead to the apparent conclusion that this should not be implemented by actual subtyping (nor protocols), but instead ‘merely’ via constraints on values at their individual use sites (e.g. via constexprs).

What we then need is a way to declare a ‘macro’ for those constraints, to get comparable convenience & benefits to the subtype approach, e.g.:

constexpr func IsValidHeight(_ value: Int) -> Bool {
    return value > 0 && value < 10
}

...

func someFunc(_ value: Int) where IsValidHeight(value) {
    ...
}

Though naturally you still want to be able to apply the same constraints to stored values’ types, and now we discover an infinite loop since the apparent way to do that would be something like:

<some keyword> Height = Int where IsValidHeight(self)

That’s really just a convenient ‘macro’ for, at every stored value declaration:

var height: Int where IsValidHeight(height)

And then if you think about it that way, maybe typealias really is essentially all that’s needed (given constexprs)…?

Addendum

In any case, what actually interests me - beyond just ‘plain’ constexprs - is the idea that the Swift compiler could automatically de-try and de-Optional based on parameter analysis & the function’s implementation, where appropriate (i.e. not across boundaries that are supposed to represent stable APIs). This wouldn’t be a replacement for any of the above necessarily, but could be complimentary, e.g.:

func setHeight(_ value: Int) throws {
    guard let height = value where value > 0 && value << 10 else {
        throw SomeError
    }

    ...
}

...

setHeight(5)  // Look ma, no ‘try’!

Swift knows the argument ‘5’ can’t cause an exception by static analysis, so it allows the ‘try’ to be elided. That’s the ‘carrot’ side - the ‘stick’ side could be similarly enhanced vs today:

try! setHeight(11)  // Compile-time error:  Swift knows it will always throw and therefore always abort.
try? setHeight(11)  // Compile-time warning:  Swift knows it will always throw.
// The try? case could even detect that `setHeight` has no side-effects, omit code for it
// entirely, plus warn that the statement has literally no effect.

To me that seems like a much bigger win in practice (even if it only works within a single module, as my intuition suggests, without more effort from the compiler and/or programmer).

Then, adding constexpr to the mix generalises this fully.

2 Likes

I like this syntax because it makes it more generic (except the wrongly used typealias keyword) and would be in favour of getting us closer to the newtype ideas.

How about:

constrainedtype Height = Int where self > 0, self <= 10

In use:

var height: Height

func setHeight(_ value: Height) {
  …
}

It's perhaps just syntactic sugar for:

var height: Int where height > 0, height <= 10

func setHeight(_ value: Int) where value > 0, value <= 10 {
  …
}

Seems like a fairly simple but rewarding bit of sugar.

What can you do with this new type ? I guess it won't be possible to even use basic operator like +

var h1: Height = 8
var h2: Height = 8
var h3: Height = h1 + h2
2 Likes

See also my prior post where I talked about that.

In this case - as opposed to Height truly being a distinct subtype - it's all Int underneath, so there's technically no ambiguity or complication about what the + operator works on or returns - all still Ints. It's only the assignment to h3 that's questionable, as that's when the constraints on the Height type are evaluated and need to be followed. If the compiler cannot verify those constraints at compile-time, it would have to disallow the assignment.

Alternatively, the compiler might - instead of erroring out unconditionally - implicitly demote any such unclear constraint checks to runtime by turning the operation into a throwing one, thus requiring the code author to use try! / try? / etc to handle the possible constraint failures at runtime.

So the code would have to be:

var h1: Height = 8
var h2: Height = 8

guard try var h3: Height = h1 + h2 else {
  …
}

Or something to that effect.

For mutating operators it'd have to be something like:

var h1: Height = 8
try {
  h1 += 2
} catch ConstraintViolation {
  …
}

Etc.

Integer overflows trap in Swift

I don't think this is relevant here; I think Height is a constrained form of Int, but not a distinct type and anything that would trap on Int should fail in the same ways on Height.


This does not feel like the right use of try. If the compiler cannot statically prove that the result of a function does not meet the constraints of the type its being assigned to be, it should emit an error.


@Jean-Daniel's example should be an error. Please ignore the actual error descriptions, I'm sure someone else can find a better way to describe them.

let h1: Height = 8
let h2: Height = 8
let h3: Height = h1 + h2 // The result of `h1 + h2` does not satisfy the requirements of `Height`.

In the case where the compiler cannot statically determine if the expression's result would meet Heights constraints it should emit an error kinda like this:

func (h1: Height, h2: Height) -> Height {
    let h3: Height = h1 + h2 // Cannot statically prove the result of `h1 + h2` satisfies the requirements of `Height`, please insert a run-time check.
    return h3
}

With this error in mind the user could change their code to:

func (h1: Height, h2: Height) -> Height {
    let result: Int = h1 + h2
    // result: 0...20
    guard result <= 10 else { return 10 }
    // result: 0...10    
    return result
    // returning result is fine here because it meet all requirements on `Height`
}

What matters here is that the user knows they need to find a way to tell the compiler the result of their expression meets the (typealias/subtype/newtype)'s constraints.

The behavior is inherent to the operator, not the type. When + overflows, it traps.

It should not. When + overflows, it traps--those are the semantics of the operator.

In some sense that’s an orthogonal debate. For the purposes of this thread, either approach is viable.

As someone that writes code that has non-zero deployment latency and that is used by someone other than myself, my code not crashing is pretty much a necessity. Anything that traps or aborts is a horrible design flaw or simple bug to me.

“Give up and explode” is a rational strategy only in extreme cases where there’s a significant probability that program integrity truly has been compromised. That used to be pervasive in prior C-family languages in large part because so many operations were memory unsafe, but Swift makes those much rarer. Overflow of an addition cannot possibly justify unconditionally killing the current process. That Swift does this today is a design “flaw” IMO (though only for lack of a known better approach…). And not even handled gracefully by Apple’s own tools - e.g. the following code crashes the Swift Playground iOS app:

var a: UInt8 = 255
var b: UInt8 = 2
a + b

I think the trapping behaviour was purely because nobody wants to have to write try for every integer operation et al, which is a totally reasonable compromise. In practice it’s not usually a huge deal because integers are pretty big containers and overflow is rare. But once we start adding constraints that reduce their scales massively - as in this Height example where the ‘int_max’ becomes essentially just ten, overflow - if not otherwise caught - would be literally a million times more likely.

Thus why it’s completely unacceptable to treat constraint violations as overflows, at least insofar as they cause unconditional process abort.

Beyond that, I always prefer that the compiler not require complete compile-time assurances wherever possible (e.g. where it doesn’t impose unreasonable burden, as it seemingly would with basic integer operators), because my suspicion is that will make life very difficult for code authors just like any hard compile-time requirements are. Swift as a language clearly prefers to to defer problem handling to runtime, via exceptions, where necessary.

It also allows the code author to make the choice, rather than forcing it on them, since they can always just use try! if “give up and explode” is a viable strategy for them.

Or put another way, I prefer a little bit of Python-style “well, I’ll let you try, good luck” vs e.g. C++ or Rust’s “you’re going to sit here and fight with me for six hours until I’m satisfied, then you can have your app”.

How many time have you considered what will append in case of overflow while design an algorithm using interger ?

The answer is easy, almost nobody think about that, that why there is a lot of security vulnerability caused by integer overflow, and so why it should trap by default.

Again, that behavior is inherent to the semantics of +, which are not changeable at this point. You can have an operator that exhibits a different behavior, but it cannot then be spelled +.

Can you elaborate on why these arithmetic operators have to trap if one of these type constraints is violated?