[Pitch] Dependent Types & Universes (Stage 1 of Proof-Driven Development?)

toph42 · March 17, 2023, 1:23pm

I've frequently seen @Clamped as an example for result builders. What is it this solves that doesn't? I suspect the answer is somewhere in the realm of compiler enforcement?

Max_Desiatov · March 17, 2023, 1:36pm

Not just numbers, but any value. Another use case for this is a type-safe state machine with state transitions validated at compile time. Translating a corresponding code sample from Idris to hypothetical Dependent Swift, it would probably look like this:

enum DoorState {
  case closed
  case open
}

enum Action<Initial: DoorState, Final: DoorState> {
  case makeOpen -> Action<.closed, .open>
  case makeClosed -> Action<.open, .closed>
}

enum ActionSequence<First: DoorState, Second: DoorState> {
  case chain<Initial, Intermediate, Final>(
    Action<Initial, Intermediate>, 
    ActionSequence<Intermediate, Final>
  ) -> ActionSequence<Initial, Final>

  // allows expressing end of the sequence in the chain
  case end<Final>() -> ActionSequence<Final, Final>
}

This way you can never construct ActionSequence that applies two .makeOpen actions in a row, by definition. Chains of invalid state transitions will just trigger a type checker error. The compiler will guarantee that makeOpen is always followed by makeClosed and vice versa. For two states this could seem trivial, but imagine any other state machine with a larger number of states, say a networking protocol. Ability to encode more of the protocol's spec in the type system is quite powerful.

Max_Desiatov · March 17, 2023, 1:47pm

Correct. @Clamped as usually pitched is a run-time check. With dependent types it could become a compile-time check.

Slava_Pestov · March 17, 2023, 1:48pm

I’d encourage you to read, eg, the Idris manual (it’s not very long) which explains things better than we could: The Idris Tutorial — Idris 1.3.3 documentation

I personally don’t want to see dependent products and sums in Swift, because of the implementation and notational complexity. But there’s a lot of interesting stuff out there that’s worth learning about for its own sake.

GreatApe · March 17, 2023, 2:12pm

This also looks like something that could be done with existing generics though, something like:

protocol DoorState { }

enum DoorStateOpen: DoorState { }
enum DoorStateClosed: DoorState { }

struct Action<Initial: DoorState, Final: DoorState> {
    static var makeOpen: Action<DoorStateClosed, DoorStateOpen> { .init() }
    static var makeClosed: Action<DoorStateOpen, DoorStateClosed> { .init() }
}

func chain<Initial: DoorState, Intermediate: DoorState, Final: DoorState>(
    action: Action<Initial, Intermediate>,
    sequence: ActionSequence<Intermediate, Final>
) -> ActionSequence<Initial, Final> {
    fatalError()
}

Seems very similar to the composition example.

To be clear, dependent types sound very interesting, just can't figure out exactly what it is that we would need to add, what kind of change it actually is.

Max_Desiatov · March 17, 2023, 2:16pm

How do you pattern match against chain in switch statements with your code if chain is a function and not a case of an enum? Also, how do you define ActionSequence? It's mentioned as a generic type in your code, but how do you express it with the existing generics system? Is it a struct? Is it an enum? How do you pattern match against all possible states in a switch statement if DoorState is a protocol and not an enum?

GreatApe · March 17, 2023, 2:18pm

I don't quite understand your example, it confuses me that the cases return things. Are they actually functions?

Max_Desiatov · March 17, 2023, 2:42pm

In Swift enum cases already return things. For many intents and purposes they do look like static functions. You just don't need to provide a body for such function. This is perfectly valid Swift 5.7:

enum Expression {
  case bool(Bool)
  case number(Int)
}

let caseFunction: (Int) -> Expression = Expression.number

print(caseFunction(42))
// prints `number(42)`

The main limitation is that cases of generic enums always return Self with an unapplied generic argument:

enum Expression<T> {
  case bool(Bool)
  case number(Int)
  case sum(Expression<Int>, Expression<Int>)
  case and(Expression<Bool>, Expression<Bool>)
}

// this makes sense
let b = Expression<Bool>.bool(true)

// fine too
let n = Expression<Int>.number(42)

// wat
let nonsense = Expression<Int>.bool(false)

// these should be caught as errors by the type checker, but nope
let absurd = Expression<Bool>.number(24)
let moreNonsense = Expression<Int>.and(.number(42), .number(24))
let ohWell = Expression<Bool>.sum(.bool(true), .bool(false))

For the type checker to catch invalid expressions we need to be able to specify return types for cases, which is impossible with Swift as it is right now:

enum Expression<T> {
  case bool(Bool) -> Expression<Bool>
  case number(Int) -> Expression<Int>
  case sum(Expression<Int>, Expression<Int>) -> Expression<Int>
  case and(Expression<Bool>, Expression<Bool>) -> Expression<Bool>
}

This feature in other languages is called Generalized Abstract Data Types (GADT) and is most probably a prerequisite for ergonomic dependent types. This is what allows both Action and ActionSequence in the state machine example above to be enums with a fixed number of cases that you can match against in switch statements, like for any other enum.

GreatApe · March 17, 2023, 3:02pm

I am aware of how enums work. But then you meant something like this:

protocol DoorState { }

enum DoorStateOpen: DoorState { }
enum DoorStateClosed: DoorState { }

struct Action<Initial: DoorState, Final: DoorState> { }

extension Action<DoorStateClosed, DoorStateOpen> {
    static func makeOpen() -> Action { fatalError() }
}

extension Action<DoorStateOpen, DoorStateClosed> {
    static func makeClosed() -> Action { fatalError() }
}

struct ActionSequence<First: DoorState, Second: DoorState> {
    static func chain<Initial, Intermediate, Final>(
        first: Action<Initial, Intermediate>,
        second: Action<Intermediate, Final>
    ) -> ActionSequence<Initial, Final> { fatalError() }


    static func end<Final>() -> ActionSequence<Final, Final> { fatalError() }
}

func test() {
    let open: Action = .makeOpen()
    let close: Action = .makeClosed()

    // fail
    let seq0: ActionSequence = .chain(first: open, second: open)
    // ok
    let seq1: ActionSequence = .chain(first: open, second: close)
}

Max_Desiatov · March 17, 2023, 3:53pm

That code snippet doesn't answer these previously posed questions that arise without GADTs and dependent types:

How do you exhaustively enumerate all states when matching against them in a switch statement? I.e. how do you get a compiler error if you forgot to specify one of the states as a switch branch?
How do you iterate over ActionSequence if it's a struct and chain/end are functions declared on it?

GreatApe · March 17, 2023, 4:03pm

I don't know, I'm not sure what your goal is with the example, so it's hard to say. But in general I'm not arguing against dependent types, I think it sounds awesome, just trying to understand what it actually means.

Are you saying that iteration is one part of it, the difference between what we have already and what dependent types are?

Max_Desiatov · March 17, 2023, 4:10pm

Being able to express these concepts as enums and getting type safety with exhaustiveness checking at compile time is the main difference. You may be able to work around the lack of GADTs and dependent types with protocols and similar hacks, but it will require significantly more code and won't be as type-safe and readable.

OTOH, finding a good way to integrate it with the existing Swift type system and not breaking anything, while not making developer experience worse at the same time is a challenge. I think GADTs on their own are much more lightweight than dependent types, and still move the language in the right direction IMO. Maybe a better approach is to get a consensus on GADTs first, and if that's successful it could get more people interested in more advanced topics like dependent types.

Joe_Groff · March 17, 2023, 5:10pm

Now that we have a generalized existential representation inside the compiler, as well as implicit opening, it should be possible to implement GADTs using those mechanisms together to represent enum cases that capture type information. @griotspeak had proposed what I think is a nice syntax for them a while back:

indirect enum Expr<T> {
  case literalInteger(Int) where T == Int
  case literalString(String) where T == String
  case add(Expr<Int>, Expr<Int>) where T == Int
  case concat(Expr<String>, Expr<String>) where T == String
  case embed<U>(swiftValue: U, eval: (U) -> Expr<T>) where U: Sendable
}

codafi · March 17, 2023, 5:36pm

No pun intended, but it depends. The simplest way would be to construct a data type whose inhabitants are all even integers by construction.

In Agda you could spell such a type over the naturals as

data isEven : ℕ → Type where
  zero : isEven 0
  plusTwo: (n : ℕ) → isEven n → isEven (succ (succ (n)))

In pseudo-Swift this would be

@dependent enum IsEven<var n: Int> {
  case zero : IsEven<0>
  case plusTwo(n: Int, IsEven<n>) : IsEven<n + 2>
}

Note the embedding of values into type bounds and the GADTs holding this whole thing together. You don't need GADTs to spell this, it just becomes significantly more annoying as you now have to write a function that constructs this proof.

But I also want to comment on the original pitch here. Yes, dependent types are very cool. I even spent some time trying to hack together a dependently-typed programming language that used Agda's syntax and Swift's runtime type metadata scheme. But in the process, I came to see the enormous complexities involved in even that system. Here are some notable open questions I would like to see answered by anybody looking to add dependent types to Swift:

How Powerful is the System You're Proposing?

There's lots of different grades of dependent types systems. How much generalization do you want to imbue the system with? Can you get away with first-order dependent types, or do you need higher-order systems? Do you actually want GADTs or do you want to truly collapse the barrier between types and values? Can you get away with something like type-level predicates instead?

What is the underlying notion of Equality?

This is the fundamental question driving type theorists and their research. You may want to read The Meaning of Meaning by Hilary Putnam if you haven't already for a crash course in the philosophical implications of intensionality and extensionality. Similarly, I highly recommend reading Thomas Streicher's seminal thesis for a thorough grounding in intensional type theory - it's really quite good and he actually takes the time to write down a lot of the proofs you see elided in the literature. I'm also personally a fan of McBride and Altenkirch's Observational Equality paper even if it has been superseded for the most part by folks shooting for proper extensional equalities.

The notion of equality you pick determines the vast majority of the features you're able to propose from the proof assistant angle. Moreover, it directly determines the implementation of the system in the compiler. From a user perspective, in systems with intensional equalities one spends an awful, awful amount of their programming time justifying the structure of well-typed ASTs to the compiler. I mean, this what it takes to prove to Agda that (x * (1 + y) = x + x * y) which is a fact I would expect any student with a basic grounding in algebra to be able to demonstrate, but which I now have to spoon-feed to the compiler. To my mind, the thing that makes that kind of exercise bearable is that the authors of these languages tend to make their tooling highly interactive, and can often automatically search for well-typed programs that inhabit user-defined types and goals. In earlier extensional systems, one would have to trade such a flexible and powerful equality with either losing sight of the computational content of their proofs or losing other desirable program properties like equality being generally decidable (in the presence of other safeguards like well-foundedness or termination checking or whatever syntactic "shrinking" notion).

Ultimately, I don't think Swift is the right medium to prove out this featureset. What I would love to see instead is if we were to reach out to the communities behind this work and pinch some of the features of their proof assistants, not necessarily their languages. Like, consider the idea of runtime type holes in Haskell. In most modern proof assistants - which have type systems significantly more powerful than vanilla Haskell - you can similarly open "holes" in the middle of your developments and you can work hand-in-hand with the compiler to interactively solve those holes. We have an strikingly similar idea in Xcode today with placeholder tokens, what we're missing is the interactivity and the compilers not stressing so hard about always having tokens filled in. Imagine if you could enter a mode where you could compile those missing program pieces in and run your code. When your program encounters the hole, it could trap to the editor and you could work with Xcode to insert the missing piece, back up, and retry the program's execution. This is the kind of workflow that researchers have been using for decades in these kinds of proof assistants, and that most programmers in industry could only dream of!

If anybody would like a brutalist introduction to Agda, I have formalized an old course taught by McBride in Agda and have answers to most of the exercises available.

Max_Desiatov · March 17, 2023, 8:41pm

Isn't this the same workflow one gets in Swift with fatalError if it's used as an implementation "hole"? Or is there an implied requirement for hot reloading just one function when its implementation hole is replaced with valid code?

codafi · March 17, 2023, 8:43pm

It's an extension of that workflow. For one, a hole doesn't necessarily have to be a hard trap. For another, holes in proof assistants are places you can place your cursor and start a conversation with your tools. In Agda, one can ask 'what type of value ought to go here?', or 'what do I have in scope so I can build a value of this type?', or 'I know what type constructor I want to use here, can you refine this hole for me?'. I can't ask Xcode to do any of those things for me with a fatalError.

wowbagger · March 18, 2023, 1:21am

For people who might be confused by some of the concepts in this thread, I think the Software Foundations series from UPenn should serve as a good introduction.

filip-sakel · March 18, 2023, 10:03am

I’m not sure if we’re discussing specific implementations yet, but if we were to implement value constraints, we’d need to consider overload resolution. Here’s an example with constraints (which aren’t practical but highlight the problem of overload resolution).

struct Array<Element, let count: Int>
  // ...
 
  subscript(position: Int) where position >= 0 { get }

  subscript(position: Int) where position < count { get }
}

let a: Array<Int, 1> = [1]
a[0] // Which overload do we choose?

sighoya · March 18, 2023, 2:06pm

Would be a general compile error for unknown values.

But overload resolution in Swift isn't perfect anyways.

sspringer · March 19, 2023, 6:57am

Just as a reference: I mentioned this topic in the topic Dreaming of a “harmless” language mode just created by me.