## Terminology

Swift, generic functions are separately compiled and type checked without knowledge of the concrete types at the call site. The interface between the caller and callee is defined by a list of generic parameters, and requirements imposed upon them.

Conformance requirement are special; if you impose a conformance requirement on a generic parameter, the associated types of the protocol become member types of the generic parameter. Those member types can have member types as well, and so on. Let's say that a "type parameter" is a "generic parameter or (recursive) member type thereof".

Inside a generic function, the type parameters are "abstract references" to the concrete types provided by the caller; since the generic function is type checked separately we don't know what those concrete types are, so the generic requirements of our function define their capabilities. The generics system implements a set of "generic signature queries" used by the rest of the type checker, to answer such questions as "does this type parameter conform to this protocol" or "are these two type parameters equivalent via same-type requirements".

At the call site, a list of concrete types are provided for each generic parameter, and they must satisfy these requirements. To satisfy a conformance requirement, the concrete type must conform to the protocol by providing a *witness* for each requirement. This includes the protocol's associated types; the "implementation" is in the form of a nested type, usually a type alias. An implementation of an associated type requirement is called a *type witness*.

If I call a generic function, like so,

```
func identity<T>(_ t: T) -> T { return t }
let x = identity(123)
```

The compiler must determine the type of `x`

. It does this via *type substitution*. Since the call site provides the concrete type `Int`

for the generic parameter `T`

, and the return type of `identity()`

is `T`

, we can compute the type of `x`

by replacing `T`

with `Int`

.

A more complex example:

```
func wrap<T>(_ t:T) -> Array<T> { return [t] }
let x = wrap(123)
```

Here, the type of `x`

is `Array<T>`

with the substitution `T := Int`

, which is `Array<Int>`

.

Things get more interesting with member types. Consider this:

```
struct Nat: IteratorProtocol {
// (can be omitted due to inference)
typealias Element = Int
var n: Int = 0
mutating func next() -> Int? {
defer { n += 1 }
return n
}
}
func second<I>(_ iter: inout I) -> Optional<I.Element> where I: IteratorProtocol {
_ = iter.next()
return iter.next()
}
var n = Nat()
let x = second(&n)
```

What is the type of `x`

? We're applying the substitution `T := Nat`

to `Optional<T.Element>`

. The member type `T.Element`

represents the type witness of `Element`

in the conformance of `T`

to `IteratorProtocol`

. In our case, `T`

is `Nat`

, and the type witness of `Element`

in this conformance is `Int`

. Thus, `T.Element`

is `Int`

, and our function's return type is `Optional<Int>`

.

You can have a more complex chain of member types where substitution has to look through several conformances to find the result. For example,

```
func f<T>(_ t: T) -> T.SubSequence.Iterator where T: Collection { ... }
let x = f("hello world")
```

The type of `x`

is `IndexingIterator<Substring>`

. We find it as follows:

- We're replacing
`T`

with`String`

.`T`

must conform to`Collection`

. -
`Collection`

defines a`SubSequence`

associated type. In our case, the type witness of`SubSequence`

in the`String: Collection`

conformance is`Substring`

. -
`Collection`

also requires that`SubSequence`

conform to`Collection`

. -
`Collection`

inherits from`Sequence`

, so`Substring`

must also conform to`Sequence`

. -
`Sequence`

defines an`Iterator`

associated type. In the`Substring: Sequence`

conformance, the type witness is`IndexingIterator<Substring>`

.

I won't do it here, but it's possible to write down a simple set of algebraic identities which explain type substitution in terms of a handful of primitive steps. You can find this in Chapters 7 and 8 of Compiling Swift Generics.

The question is, does a series of steps like the above always terminate?

In number theory there is a famous unsolved problem called the Collatz Conjecture. The problem can be stated as follows:

- Define a function
`f(n)`

, where`n`

is a positive integer. If`n`

is even,`f(n)`

is`n/2`

. If`n`

is odd,`f(n)`

is`3n + 1`

. Given an integer`n`

, we can form an infinite sequence`n, f(n), f(f(n)), f(f(f(n))), ...`

by repeatedly applying`f`

to the previous result. Does this sequence always contain the number 1?

For example, if you start with the number 3, then f(3) = 10, f(10) = 5, f(5) = 16, f(16) = 8, f(8) = 4, f(4) = 2, f(2) = 1.

You can compute this sequence with a tag system. Consider a string made up of `a`

, `b`

, and `c`

, and define an evaluation rule:

- If the string consists of the single character
`a`

, stop. - Otherwise, take the first character from the string:
- If the first character is
`a`

, append`bc`

to the end of the string. - If the first character is
`b`

, append`a`

to the end of the string. - If the first character is
`c`

, append`aaa`

to the end of the string.

- If the first character is
- Delete the first and second character from the string.
- Go back to step 1.

You can represent the input value as a string of `n`

repetitions of `a`

, so 3 is `aaa`

. If the tag system does not halt for some input string, then you've disproven the conjecture.

Now look at this protocol definition:

```
protocol P {
associatedtype A: P
associatedtype B: P
associatedtype C: P
associatedtype Pop: P
associatedtype Next
}
```

Let's define some concrete types that conform to this protocol.

```
struct Halt: P {
typealias A = Halt
typealias B = Halt
typealias C = Halt
typealias Pop = Halt
typealias Next = Halt
}
struct End: P {
typealias A = AA<End>
typealias B = BB<End>
typealias C = CC<End>
typealias Pop = Halt
typealias Next = Halt
}
struct AA<T: P>: P {
typealias A = AA<T.A>
typealias B = AA<T.B>
typealias C = AA<T.C>
typealias Pop = T
typealias Next = T.Pop.B.C.Next
}
struct BB<T: P>: P {
typealias A = BB<T.A>
typealias B = BB<T.B>
typealias C = BB<T.C>
typealias Pop = T
typealias Next = T.Pop.A.Next
}
struct CC<T: P>: P {
typealias A = CC<T.A>
typealias B = CC<T.B>
typealias C = CC<T.C>
typealias Pop = T
typealias Next = T.Pop.A.A.A.Next
}
```

Consider this code:

```
func addB<T: P>(_: T.Type) -> T.B.Type { return T.B.self }
let x = addB(End.self)
let y = addB(AA<End>.self)
let z = addB(AA<AA<End>>.self)
```

- The type of
`x`

is`End.B`

, which is`BB<End>`

. - The type of
`y`

is`AA<End>.B`

, which is`AA<End.B>`

, which is`AA<BB<End>>`

. - The type of
`z`

is`AA<AA<End>>.B`

, which is`AA<AA<End>.B>`

, which is`AA<AA<BB<End>>`

.

So if `AA<AA<AA<End>>>`

represents the string `aaa`

, then substituting `T`

with `AA<AA<AA<End>>>`

in `T.B`

produces the string `aaab`

, adding `b`

on the end. The same works with `A`

and `C`

. For example, substituting `T`

with `End`

in `T.B.C.A`

gives you `BB<CC<AA<End>>>`

.

Now, let's talk about `Pop`

. If I substitute `T`

with `BB<CC<AA<End>>>`

in `T.Pop`

, I get `CC<AA<End>>`

. So pop removes the first character of a string; `bca`

becomes `ca`

.

The `Next`

associated type is the tag system's evaluation rule. Look at the type witnesses of `Next`

in each of `AA`

, `BB`

, and `CC`

:

- In
`AA`

,`Next`

is defined as`T.Pop.B.C.Next`

. If`AA<T>`

represents the entire string, the generic parameter`T`

represents the rest of the string, not including the`a`

, so`T.Pop`

deletes the first two characters from the string. Then`T.B.C`

adds`b`

and`c`

at the end of the string. Then`T.B.C.Next`

invokes the evaluation rule again. - In
`BB`

,`Next`

is defined as`T.Pop.A.Next`

. Again, this removes the first two characters from the string, and adds`A`

at the end, then repeats the evaluation rule. - In
`CC`

,`Next`

is`T.Pop.A.A.A.Next`

, etc. - In
`End`

,`Next`

is`Halt`

.

Now, if you write something like `print(End.A.A.A.Next.self)`

, the compiler will evaluate this tag system. It will compute the substituted type and type check successfully after a few steps, and the type of this expression is just `Halt`

. But if you found a counterexample to the Collatz conjecture, then `print(End.A.A.....Next.self)`

with the right number of `A`

s would loop forever, since we would never reach the halt state.

So we cannot guarantee that type substitution terminates on all inputs, because in doing so we would have solved the Collatz conjecture---and that's the kind of result that gets your math paper written up in non-math blogs.

Perhaps that's not very compelling though. However, you can actually encode a Turing machine as a tag system, although the construction is more tedious. Why is that interesting? A Turing machine can express any computable function.

Suppose you had a function that checks if an arbitrary function terminates in a finite number of steps, without running it (ignore that Swift doesn't allow you to introspect closures like that; you can imagine the `fn`

parameter is a String or a BigInt instead):

```
func halts(_ fn: () -> ()) -> Bool { ... }
```

Then we can write a naughty function that halts if its input does not halt, and doesn't halt if its input halts:

```
func naughty(_ fn: () -> ()) {
if (halts(fn)) {
while true {}
}
}
```

Why is it naughty? Because then you can do this:

```
func absurd() {
naughty(absurd)
}
```

Does `absurd()`

halt? It calls `naughty`

with itself, and `naughty`

calls `halts(absurd)`

. If `halts(absurd)`

returns true, then `naughty`

loops forever. If `halts(absurd)`

returns false, then `naughty`

returns. In both cases, `halts`

gave us the wrong answer. The only possibility then is that `halts(absurd)`

runs forever, but this contradicts the assumption that `halts`

determines an answer in a finite number of steps.

Since any Turing machine can be encoded as a tag system, and any tag system can be encoded as a Swift type substitution, it is not possible to determine if Swift type substitution terminates.

If you're getting deja-vu, it's because three years ago I showed there is a Turing machine hidden in protocol `where`

clauses as well (Swift type checking is undecidable), and then fixed that (The "Requirement Machine", a new generics implementation based on term rewriting).

Last time though, I was looking for a new way of implementing generic signature queries and minimization to fix various known correctness and performance issues. Generic signature queries and minimization are basically solving various algebra problems in "closed form", so given the undecidability, the old formalism was just wrong to accept any arbitrary protocol definitions as input, because it pretended to give an answer every time. The new implementation solved this by restricting the range of possible programs to those whose protocol definitions can be represented with a convergent string rewrite system.

This second Turing machine though, is not as problematic, I think. Type substitution really is defined as a series of deterministic evaluation rules, and the current implementation is fine from a practical standpoint. So I don't think there's any huge reason to overhaul it to introduce a termination guarantee, since in light of the above result, this would necessarily involve rejecting some previously-valid programs as well.

It would be nice though, to eventually diagnose the simpler cycles here. Occasionally people accidentally write some variant of this and get confused:

```
protocol P {
associatedtype A: P
}
struct G<T: P>: P {
typealias A = T.A.A
}
struct X: P {
typealias A = G<X>
}
// compile-time hang/crash
print(G<X>.A.self)
```

We could impose a maximum iteration count, or catch the obvious recursion here, or something, to avoid the bad UX of a compiler crash.