[Pre-Pitch] Runtime Type Guarantees

I don't follow. The OP wants both code reuse and performance. If you go with option 1, code reuse is affected. If you go with option 2, performance is affected (the proposal wants static checking instead of runtime checking).

I mean the runtime check for emptiness is going to have to happen somewhere (assuming the array contents aren't known at compile time), this just moves it from the object use side to the object creation side with the hope that you will create/modify objects less than you use them (often true). Unless you know of a way to guarantee that an array stays nonempty on modification without checking if it's nonempty.
I think in general the goal of attaching preconditions to objects is to put more work on the side of object creation in return for less work when using the objects, which pays off when you use objects multiple times, but I don't think it's possible to reduce the cost of modification. Someone has to check whether that integer is nonzero whether it's the person making it or the person using it.
As for the lvalue option, it prevents possibly surprising people when they crash from making modifications that could break preconditions, forcing them to acknowledge it by recreating a new NonEmpty or whatever from the result and force unwrapping it to indicate their intent. The minus is that it will seem horribly unnecessary for other mutating methods that will always succeed, like .append.
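This creation-side vs. use-side trade-off can be sketched in today's Swift with a wrapper type (the name `NonEmpty` and its API are illustrative, not an existing library):

```swift
// Hypothetical NonEmpty wrapper: the emptiness check runs once, at creation,
// instead of on every use.
struct NonEmpty<Element> {
    private var storage: [Element]

    // The single runtime check lives on the creation side.
    init?(_ array: [Element]) {
        guard !array.isEmpty else { return nil }
        storage = array
    }

    // Every use can now skip the check: the invariant guarantees an element.
    var first: Element { storage[0] }
    var count: Int { storage.count }
}
```

Creating a `NonEmpty` pays the check once; any number of later reads of `first` skip it, which pays off exactly when values are used more often than they are created or modified.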

2 Likes

What I think can be done in this case is to allow mutating methods which have been defined specifically for that modified (refinement) type. So you would be able to use .append right on the variable, but you would not be allowed to use .remove without putting it in a variable of the appropriate type.

I believe this can all still be done at compile time, with a few explicit runtime checks where necessary as before.

The problem is that with the runtime check on modification scheme, every modification is penalized! This is bad, because we probably want to modify things quite a bit for performance (consider pushing elements into a preallocated array). The newtype approach moves the checking to creation time instead of modification time, which avoids this drawback. [Maybe I am misunderstanding your point, but from my POV, the subtyping approach seems to require checking on modification, whereas the newtype approach does checking on creation.]

As for the lvalue option, it prevents possibly surprising people when they crash from making modifications that could break preconditions, forcing them to acknowledge it by recreating a new NonEmpty or whatever from the result and force unwrapping it to indicate their intent. The minus is that it will seem horribly unnecessary for other mutating methods that will always succeed, like .append .

I don't think the lvalue option is a feasible one. I wrote the (x as Array<String>) out explicitly to make it clear that there was an upcast. Instead, if NonEmptyArray<String> was a subtype of Array<String>, the user will expect the compiler to do that upcast implicitly, no question about it.

var x: NonEmptyArray<String> = ["Hello"]
x.remove(at: 0)

If that fails to compile, how can one claim that NonEmptyArray<String> is indeed a subtype of Array<String>? It is not, because you aren't able to use a method of the supertype.

1 Like

Think of it more like a variable which has been declared with let. You can't call mutating methods on it, but it is still of the type and can be passed to variables defined with var and mutated there. (Note: In our case, we could also define some mutating functions which are explicitly allowed if we want)

Make sense?

If your newtype allows modification then it will have to check on modification. If it doesn't then it just turns modification into a creation (thus requiring the check). Think of a newtype that doesn't allow modification like the lvalue-only subtype, which would have the same properties. If you mean taking the value out, modifying it multiple times, and putting it back in, you can do that with both a newtype and subtype (for subtype, typecast out, modify multiple times, typecast back). From what I can tell, newtype vs subtype is just a matter of how much sugar you want to throw into the compiler, and has no impact on performance.

If your newtype allows modification then it will have to check on modification.

No. A newtype can choose to do checks on modification or skip them -- it very much depends on the invariants you're trying to uphold. Consider the two functions append and remove in the newtype approach.

  • append can be a mutating method on NonEmptyArray<T> that doesn't do any check that after appending the element, the new array is still non-empty.
  • remove can either be
    • a mutating method on NonemptyArray<T> which performs a check before removal, error if check failed, remove if check passed
    • a free function that takes a NonemptyArray<T> and returns an Array<T> without doing any checks

With the newtype approach, the API author is in control of where the checks should be put, since the API author knows which functions/methods can preserve invariants without doing any checks.

With the subtyping approach, in the general case, the compiler cannot know whether a function is going to violate the type's invariants, so it needs to be pessimistic and insert checks. In the append vs remove example, this means that the compiler needs to insert checks for both append and remove because it does not know which of these mutating methods will violate the type's invariants.
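The author-controlled placement of checks can be sketched like this (the `NonEmptyArray` type and the free `removing` function are illustrative names, not a proposed API):

```swift
// Hypothetical NonEmptyArray newtype: the API author decides where checks go.
struct NonEmptyArray<Element> {
    private(set) var elements: [Element]

    // Check at creation: the one place the invariant must be established.
    init?(_ array: [Element]) {
        guard !array.isEmpty else { return nil }
        elements = array
    }

    // append cannot break non-emptiness, so the author omits any check.
    mutating func append(_ element: Element) {
        elements.append(element)
    }
}

// Removal might empty the array, so it is a free function that hands back a
// plain Array, surrendering the non-emptiness guarantee with no check at all.
func removing<Element>(_ array: NonEmptyArray<Element>, at index: Int) -> [Element] {
    var copy = array.elements
    copy.remove(at: index)
    return copy
}
```

Note that neither `append` nor `removing` performs any runtime check: one provably preserves the invariant, and the other gives the guarantee up by changing the result type.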

1 Like

Please read what I am suggesting in the comments above again.

What I think can be done in this case is to allow mutable methods which have been defined specifically for that modified (refinement) type. So you would be able to use .append right on the variable, but you would not be allowed to use .remove without putting it in a variable of the appropriate type.

That is not how subtyping works. If NonemptyArray<T> is a subtype of Array<T>, then I should be able to call any and every instance method of the supertype on instances of the subtype, whether they are mutating or nonmutating. Not allowing that breaks the Liskov substitution principle (otherwise known as the L in SOLID). That is not an acceptable limitation.

I believe this can all still be done at compile time, with a few explicit runtime checks where necessary as before.

@anandabits has rightly pointed out that doing such checks at compile time requires sophisticated type system machinery. Liquid Haskell works as well as it does because Haskell as a language avoids subclassing nearly everywhere else, with the exception of function types. Moreover, the verification system heavily relies on using an SMT solver to do the heavy lifting, and the resulting type errors are often cryptic. Your proposal does not address any of the complexity that such a system would require, especially its interplay with language features like ABI stability, overloading, subclassing, untracked side-effects and the like.

Think of it more like a variable which has been declared with let . You can't call mutating methods on it, but it is still of the type [emphasis mine] and can be passed to variables defined with var and mutated there. (Note: In our case, we could also define some mutating functions which are explicitly allowed if we want)

As I said before, this, specifically the part I have highlighted, violates the Liskov substitution principle.

Please read what I am suggesting in the comments above again

The implication that I did not understand your comment is uncalled for. If you feel your point hasn't been understood, please point out where the misunderstanding is and try rephrasing your explanation and/or give examples that make things clearer.

1 Like

This kind of reminds me of Dependent Types. I wonder if what you're proposing could be accomplished with something similar.

e.g. with division by zero

extension Float {
    static func /(_ numerator: Float, _ denominator: Float) -> Float where denominator != 0 {
        // Can do division without a zero check because denominator is guaranteed not to be 0
    }
}
let num: Float = 5
var den: Float = 0
let bad = num / den // ERROR: Float./ requires denominator to not be 0

den = 2
let good = num / den // legal because the compiler can guarantee den won't be 0

den -= 2
if den != 0 {
    let result = num / den // legal because the condition was checked
}
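Current Swift can approximate the nonzero guarantee without dependent types by pushing the check into a wrapper (the `NonZero` name and overload below are a sketch, not an existing API):

```swift
// Hypothetical wrapper: certifies once, at creation, that a value is nonzero.
struct NonZero {
    let value: Float

    init?(_ value: Float) {
        guard value != 0 else { return nil }
        self.value = value
    }
}

// Division by a NonZero needs no zero check: the type carries the proof.
func / (numerator: Float, denominator: NonZero) -> Float {
    numerator / denominator.value
}
```

The difference from the dependent-type sketch is that the proof is carried by an explicit wrapper value rather than tracked by flow-sensitive compiler analysis.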

Or the aforementioned Array.first

extension Array where \.count >= 1 {
    // doesn't need a range check because it's guaranteed to not be empty
    var first: Element { self[0] }
}

let myArray = ["foo"]
let item: String = myArray.first // not Optional because the compiler can guarantee non-emptiness

var otherArray = ["foo"]
otherArray.remove(at: 0)
let other: String? = otherArray.first // Optional because otherArray may be empty

Advantages:

  • It gives compile time guarantees for runtime values
  • It allows optimization when safe while still enforcing that safety otherwise
  • It can be integrated/added to the existing Types. No new Types are required
  • Assuming compiler support, it should be possible to use it everywhere: Type, extension, variable, function, etc.
  • It's quite flexible. I can see it being used for things as simple as guaranteeing an Array's count, or as complex as guaranteeing User authentication

Disadvantages:

  • My guess is a lot of work would be needed for the compiler to support such a feature. I'm no compiler engineer but it seems like quite a complex feature that would require a lot of design and performance iteration.
  • It would add significant work for the compiler to calculate and check these guarantees, so would have a negative impact on compile time
  • It would add complexity in an, IMO, non-intuitive way. E.g. checking a variable could change the function called and the Type returned (the Array.first Example)

More broadly I'm unsure if Dependent Types really fit well in Swift, but it's a language feature I find really interesting so figured I'd bring it up :slight_smile:

7 Likes

I think this is the misunderstanding. The only runtime checks I am proposing are in the form of is and as?. They would all be explicitly written by the end programmer. None of that would be performed by the compiler.

The compiler would stop you from using any mutating functions that aren't explicitly defined for the refinement, though. That is a much simpler problem: it only has to look at what has been defined.

Does let x violate LSP? It is a constraint on the variable, not a different type. This is a subtle point, but those methods still exist, and are recognized... the compiler just constrains you from calling them from that particular variable (same as for let x). You can still shove it anywhere which has the unconstrained type without issue.

A constrained variable can always be copied into an unconstrained one (and when we have movable types, it will be movable in the same way):

let x: NonEmpty Array<String> = ["Hello"]
let y: Array<String> = x //Always allowed

To move the other way, you have to do the runtime check using as?:

let x: Array<String> = ["Hello"]
guard let y = x as? NonEmpty Array<String> else {...} 

I didn't mean to offend, nor did I mean to imply that you didn't understand... just that I wanted you to actually read what I had written, since I feel like it answers your objections, and I felt like you had written your response before reading what I had written. I was trying to point that out and phrased it badly. Apologies.

You do seem to be adding a bunch of complexity which isn't there (Maybe it needs to be there, and we can discuss that, but it isn't what I am proposing).

2 Likes

What about types with reference semantics? You cannot trivially copy them (and if your response is to allow it only for structs, then remember that structs can also have reference semantics).

1 Like

While that would work for preserving the invariants, would it result in a usable type? I guess Array.append would work, but what else? For instance, in what ways are you allowed to mutate Positive Int?

I think that by disallowing all mutations except the vetted ones, you'll end up with very crippled types unless you take a lot of time to rewrite the whole interface for each type you want the guarantee to apply to. That'd make the feature difficult to use. It's also worth noting that the more code you have to write to make it work, the more likely there'll be slip-ups that could result in breaking the guarantees.

Swift's typical approach is to handle overflow by trapping. If you have a var x: Positive Int = 1 and then do x += -2, it seems reasonable to me that it'd trap. It'd be nice if the compiler could tell you that at compile time when possible while doing constant propagation, but disallowing += outright because it might make the integer negative seems too drastic. I think the same could be said of array.remove(at: 0) (you might not be removing the last element). With your approach, whether you trap or make the function unavailable at compile time is an editorial decision that is likely to change depending on who writes the facade. Facades that change the underlying implementation also make it complicated to combine guarantees together.
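The trap-on-violation behavior described here can be sketched in current Swift with a hypothetical `PositiveInt` wrapper: `+=` stays available, and the invariant is enforced by a `precondition` that traps at runtime, mirroring how Swift handles integer overflow:

```swift
// Hypothetical wrapper following Swift's trap-on-overflow philosophy:
// mutation stays allowed, and violating the invariant traps at runtime.
struct PositiveInt {
    private(set) var value: Int {
        // Runs after every mutation made outside init; traps on violation.
        didSet { precondition(value > 0, "PositiveInt invariant violated") }
    }

    init?(_ value: Int) {
        guard value > 0 else { return nil }
        self.value = value
    }

    // Always available; traps only if the result is not positive.
    static func += (lhs: inout PositiveInt, rhs: Int) {
        lhs.value += rhs
    }
}
```

With this design the editorial decision is "trap", uniformly: `x += -2` compiles and crashes at runtime, just as overflowing a plain `Int` would.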

What I proposed earlier was a way to simply introduce refinements for the signatures of the type's methods without changing any of the implementation. This introduces no risk of making the type behave differently in a reimplementation, which in turn significantly reduces the risk and effort needed to create new kinds of guarantees. The optimizer would be in charge of leveraging those guarantees to elide redundant checks when possible. Perhaps some functions could be annotated to skip those invariant checks (when you know they'll always preserve the invariant), or to make them outright unavailable (when they always break the invariant), or to make them evaluated only in debug builds, but there would be no rush to annotate them all (and risk doing it wrong) when creating a new guarantee.


Edit: Example of what I was thinking:

decorator Positive: Int {
   invariant {
      self > 0
   }

   // introduces a refinement for when both operands are known positive integers;
   // skipinvariant skips the re-evaluation of the invariant
   refine(skipinvariant)
   static func += (lhs: inout Positive Int, rhs: Positive Int)

   // negating a known positive creates a known negative, so the invariant check can be skipped
   refine(skipinvariant)
   prefix static func - (operand: Positive Int) -> Negative Int
}

I think this is the misunderstanding. The only runtime checks I am proposing are in the form of is and as? . They would all be explicitly written by the end programmer. None of that would be performed by the compiler

Ok, understood. Thanks for clarifying that.

Does let x violate LSP? It is a constraint on the variable, not a different type. This is a subtle point, but those methods still exist, and are recognized... the compiler just constrains you from calling them from that particular variable (same as for let x ).

No, it doesn't. LSP allows one to do a like-for-like substitution. let doesn't violate that because the requirement is not that "you should be able to replace vars with lets" (which certainly will not work) as that is not a like-for-like substitution.

One phrasing of the Liskov substitution principle is that:

If S is a subtype of T, then objects of type T in a program may be replaced with objects of type S without altering any of the desirable properties of that program

If we try applying to the following example:

protocol Wellformed { func ok() -> Bool }
extension Array: Wellformed { func ok() -> Bool { return true } }
extension NonemptyArray: Wellformed { func ok() -> Bool { return self.count != 0 } }
var x: Array<Int> = [1]
x.remove(at: 0)
assert(x.ok())

then we have S = NonemptyArray<Int>, T = Array<Int>, and there are two desirable properties here:

  • that it is, indeed, a program -- if it type-checked before, replacing the object of type Array<T> with an object of NonemptyArray<T> should continue to type-check
  • that the program runs successfully -- if the program is allowed to type-check, this property is not upheld, because we will hit the assertion
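The same failure can be demonstrated with plain class subtyping in today's Swift (the `Container` types below are illustrative stand-ins, since `NonemptyArray` as a subtype of `Array` cannot actually be expressed):

```swift
// A class-based sketch of the same LSP failure: the subclass inherits a
// mutating operation that can break its own invariant.
class Container {
    var elements: [Int]
    init(_ elements: [Int]) { self.elements = elements }
    func removeFirst() { elements.removeFirst() }
    func ok() -> Bool { true }           // no invariant to break
}

final class NonEmptyContainer: Container {
    override func ok() -> Bool { !elements.isEmpty }
}

// Type-checks for both types, per subtyping, but only the supertype
// keeps its "desirable property" after the mutation.
func useAndCheck(_ c: Container) -> Bool {
    c.removeFirst()
    return c.ok()
}
```

Substituting a `NonEmptyContainer` for a `Container` here still compiles, yet `useAndCheck` returns `false` for it, which is exactly the second desirable property being lost.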

You do seem to be adding a bunch of complexity which isn't there (Maybe it needs to be there, and we can discuss that, but it isn't what I am proposing).

Maybe it is because we have different notions of what a user expects from the language :slightly_smiling_face:, and perhaps that is why I am getting ahead of myself. I think that if a user has to write

if myVar is Positive Int { .. }

they would not like that. Instead, they would want to write:

if myVar > 0 { .. }

and have that still work. Making that work is a tough challenge.

1 Like

I think this post is mixing up dependent types with refinement types (see Dependent types vs Refinement types). All the examples use a flow-sensitive type-checking which is characteristic of refinement type systems. In a dependently typed setting like Idris or Coq, NonZeroFloat would be a different type altogether and you would need to write conversion functions between Float and NonZeroFloat.

In this case, you would probably want to allow:

func += (lhs: inout Positive Int, rhs: Positive Int) {...}

but not:

func += (lhs: inout Positive Int, rhs: Int) {...}

Note that all of the normal arithmetic (e.g. +, -, *, /) is just fine because it creates a new value instead of mutating. There are very few mutating members of Int in general, so most of the API would be available by default.
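The point about nonmutating arithmetic can be sketched with a hypothetical `PositiveInt` wrapper: operations that provably preserve positivity can hand back the refined type with no runtime check, while operations that may break it return a plain `Int`:

```swift
// Hypothetical PositiveInt: nonmutating arithmetic creates new values, so the
// author can return the refined type only where it is provably safe.
struct PositiveInt {
    let value: Int

    init?(_ value: Int) {
        guard value > 0 else { return nil }
        self.value = value
    }

    // Used internally where the invariant is already proven.
    private init(unchecked value: Int) { self.value = value }

    // positive + positive is always positive (Swift traps on overflow
    // before a wrapped negative could appear): no runtime check needed.
    static func + (lhs: PositiveInt, rhs: PositiveInt) -> PositiveInt {
        PositiveInt(unchecked: lhs.value + rhs.value)
    }

    // positive - positive may be zero or negative: result is a plain Int.
    static func - (lhs: PositiveInt, rhs: PositiveInt) -> Int {
        lhs.value - rhs.value
    }
}
```

Because every operation returns a fresh value rather than mutating in place, none of these signatures needs the kind of modification-time check discussed above.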

I do get your point that for some types (e.g. Arrays), it would be annoying to go through and provide all the useful mutable functions. The only thing I can say is that that is what libraries are for. We will use the libraries/types much more often than we write them.

I'm mainly thinking of value types here, but the modification is of the variable, so it should work for both reference and value types.

There is one fairly large issue with Swift Classes though in that they don't have a concept of mutating functions... thus we would have to assume that all functions are potentially mutating. It isn't a show stopper, but it does make them much more annoying to work with, since you would basically have to provide all of the API you want to be available. Maybe we could come up with a syntax or annotation that eases this burden by marking functions as safe for the invariant?

I'm not sure what you mean by that. Maybe if I give you an example things would get clearer. Let's use a pointer, which is an archetypical reference-semantics thing.

    let a: UnsafeMutablePointer<Int> = ...
    a.pointee = 123
    guard let b = a as? Positive UnsafeMutablePointer<Int> else { fatalError() }
    print(b.pointee > 0) // true
    let c: UnsafeMutablePointer<Int> = b
    c.pointee = -456
    print(b.pointee > 0) // ???

What would happen if someone wrote code like that? Would the last line print false even though b is supposed to hold a positive Int?

Oh, I see what you are saying now. Yes, that is an issue.

Seems like a difference in implementation detail in an already obscure feature... I guess I don't really care what you call it :wink: