Two more undecidable problems in the Swift type system

Slava_Pestov · May 4, 2023, 2:59pm

Previously:

Conditional conformances

The first construction uses SE-0143. (Rust also has conditional conformances, and here is a writeup with a similar construction, except it uses an interpreter for the Smallf%#$* programming language, in place of a tag system: Rust's type system is Turing complete. Also, here is a bug report where a user inadvertedly wrote an infinite loop with conditional conformances in Swift: [SR-6724] Swift 4.1 crash when using conditional conformance · Issue #49273 · apple/swift · GitHub. I've been thinking about this problem for a while now.)

Consider an expression involving a qualified name lookup like foo.bar. If foo conditionally conforms to some protocol Q, and Q defines a method named bar, then the result of the name lookup depends on this conditional conformance. Is name lookup a decidable problem in the Swift type system?

You start with an encoding of a tag system from the type substitution example, but without the Next associated type. In the type substitution post, the Next associated type encoded the computation; but the below construction shows that you doesn't need that and it is possible to encode the computation in other ways.

protocol P {
  associatedtype A: P
  associatedtype B: P
  associatedtype C: P
  associatedtype T: P
}

struct Halt: P {
  typealias A = Halt
  typealias B = Halt
  typealias C = Halt
  typealias T = Halt
}

struct End: P {
  typealias A = AA<End>
  typealias B = BB<End>
  typealias C = CC<End>
  typealias T = Halt
}

struct AA<T: P>: P {
  typealias A = AA<T.A>
  typealias B = AA<T.B>
  typealias C = AA<T.C>
}

struct BB<T: P>: P {
  typealias A = BB<T.A>
  typealias B = BB<T.B>
  typealias C = BB<T.C>
}

struct CC<T: P>: P {
  typealias A = CC<T.A>
  typealias B = CC<T.B>
  typealias C = CC<T.C>
}

So far, everything is decidable. Here, the protocol P is pretty simple and the conformances all have very simple type substitution rules.

But then, you add another protocol Q and declare some conditional conformances to model the tag system evaluation:

protocol Q {}

extension Halt: Q {}
extension End: Q {}
extension AA: Q where T.T.B.C: Q {}
extension BB: Q where T.T.C: Q {}
extension CC: Q where T.T.A.A.A: Q {}

func foo<T: Q>(_: T.Type) {}

foo(AA<AA<AA<End>>>.self)

Now, to check that AA<AA<AA<End>>> conforms to Q, you have to evaluate the tag system:

aaa => abc => cbc => caaa => aaaaa => aaabc => abcbc
=> cbcbc => cbcaaa => caaaaaa => aaaaaaaa => aaaaaabc
=> aaaabcbc => aabcbcbc => bcbcbcbc => bcbcbca
=> bcbcaa => bcaaa => aaaa => aabc => bcbc => bca
=> aa => bc => a => halt

This reduces to Halt: Q, which conforms unconditionally.

Interestingly, if you actually try the above example, it fails with a circular reference. There might be a bug there because the Collatz sequence for n=3 does not actually have a non-trivial cycle (if it did for any n, that would disprove the conjecture!).

SIL type lowering

As part of constructing SIL from the syntax tree, the compiler must determine calling conventions by flattening out value types into atomic leaves.

Consider these modified definitions of AA, BB and CC:

struct End: P {
  typealias A = AA<End>
  typealias B = BB<End>
  typealias C = CC<End>
  typealias T = Halt

  var t: Halt
}

struct AA<T: P>: P {
  typealias A = AA<T.A>
  typealias B = AA<T.B>
  typealias C = AA<T.C>

  var t: T.T.B.C
}

struct BB<T: P>: P {
  typealias A = BB<T.A>
  typealias B = BB<T.B>
  typealias C = BB<T.C>

  var t: T.T.A
}

struct CC<T: P>: P {
  typealias A = CC<T.A>
  typealias B = CC<T.B>
  typealias C = CC<T.C>

  var t: T.T.A.A.A
}

Now, to compute the lowering for a type like this, the compiler also has to evaluate the Collatz sequence for n=3, because you get a series of nested fields t, t.t, t.t.t, ... until eventually one of the t.t.t.t...t.t.t's is 'Halt', which is an empty struct.

struct S {
  var t: AA<AA<AA<End>>>
}

Now, the point is not that the termination of these algorithms hinges on the Collatz conjecture---any tag system can be encoded this way, and tag systems are Turing complete. A universal Turing machine would require a tag system built out of 578 concrete types, which would be unweildy, but they would be mechanically constructed in the same way as the AA/BB/CC types for the Collatz sequence.

It's interesting that all four examples I've constructed so far depend on SE-0157, basically pushing the limits of the interactions between this feature and different parts of the compiler. I wonder if there are any hidden Turing machines that do not depend on this feature.

In the fullness of time, we could convert conditional conformance checking and SIL type lowering into iterative algorithms which could impose a maximum iteration count, and instaed of crashing and blowing the stack they could emit breadcrumb diagnostics showing the sequence of substitutions which took place.

bbrk24 · May 15, 2023, 11:38pm

I believe I have found an error in your encoding. According to Wikipedia, the rules are a -> bc, b -> a, and c -> aaa. However, the extension you have written has b -> c, not b -> a. With this minor change:

extension BB: Q where T.T.A: Q {}

The code does compile.

Slava_Pestov · May 15, 2023, 11:51pm

Indeed you’re right, nice catch! Thank you.

tera · May 16, 2023, 1:09pm

What is "Smallf%#$* programming language" ?

scanon · May 16, 2023, 3:16pm

A fairly minimal Turing-complete language with a profane name: Smallf%#$