Replacing the Type Checker

KeithBauerANZ · May 7, 2025, 4:40am

Closures & method/property syntax are the obvious problem. Let's say we hand-wave some problems away:

no more ad-hoc overloads
introduce multiparameter typeclasses to handle operators
create unidirectional inference similar to Hindley-Milner

Now consider

struct MyStruct {
    var property: Int
}

func withMyStruct<R>(_ operation: (MyStruct) -> R) -> R {
    operation(MyStruct(property: 3))
}

// and a call to it
let x = withMyStruct { $0.property }

This is well-typed in current Swift — x is Int and $0 is MyStruct.

With unidirectional inference, we should start at $0.property, and we're already in trouble — we have no idea what type $0 is. If we're in Haskell, the code looks like property $0 and we're fine — property :: MyStruct -> Int is a global function, we can just look it up. With member syntax, we need to have a type declaration of the receiver before we can resolve any member references. So we're forced to rewrite:

let x = withMyStruct { (s: MyStruct) in s.property }

(Instantly wiping out all closure argument shorthands including $0 and untyped x, y in).

With that out of the way, the rest of the expression typechecks here — the closure is inferred as (MyStruct) -> Int, R is inferred as Int, and x is inferred as Int. But if we instead write:

let x: UInt64 = withMyStruct { (s: MyStruct) in 3 }

we're back in problem territory. Once again, in Haskell we're OK — the closure is inferred as Num a => MyStruct -> a, the call to withMyStruct is inferred as Num a => a and it's fine to bind that to x :: UInt64. But in Swift, we don't have the concept of the "most general type" (in fact, closures must have concrete types†), and we'd be forced to infer the closure as (MyStruct) -> Int, R as Int, and then error on the binding to x. This might be surmountable by inferring the closure expression as (MyStruct) -> some ExpressibleAsIntegerLiteral, then concretizing in a second (but still polynomial) pass back down the tree, and erroring only if we're left with a non-concrete closure type.

You could almost consider a world where the additional pass back down is the solution to the first problem too — infer { $0.property } as something like @_structural protocol π0 { associatedtype τ0; var property: τ0 { get } }; (some π0) -> π0.τ0. Then the call to withStruct would match π0 against MyStruct successfully, and substitute π0.τ0=Int, and infer the call to withMyStruct as Int. What I don't know is whether one pass "up" and another "down" wouldn't just fail in a slightly more complex situation, whether @_structural protocol could reasonably be created for the full complexity of method calls, etc.

† except the mysterious & magical _openExistential(:do:)…

awwright · May 7, 2025, 7:12pm

I wouldn't say I've run into many problems I can attribute to the type system paradigm, as opposed to bugs and missed optimizations. For example, I just made a typo with this code:

let separator: PatternType? = switch(self.op) {
	case 0x2A: nil;
	case 0x23: Matcher<Symbol>.toClosedRang(); // sic
	default: fatalError()
}

Somehow the error this produced was:

Cannot convert value of type 'ExpressibleByNilLiteral' to specified type 'PatternType'"

... There is no universe where nil is the problem. The problem is that "toClosedRang" isn't even a symbol that appears anywhere in the code, the compiler should have seen this and immediately concluded "there is nothing that this can possibly map to, flag an error and bail."

I also get this when trying to use flatMap, apparently there's a version where the closure is allowed to return nil? And it always assumes, for some reason, that's the signature I'm trying to conform to. How do I disable that variant entirely?

This sounds like an attempt at a partitioning optimization. Even if you have cases like A + A -> B, as long as the other cases don't use the same data types, like C + D -> C then there should be no issue, the two operators serve entirely separate partitions of types. By looking at the first data type in a sequence of +, you know only B is a possible result (and not C or D).

awwright · May 7, 2025, 7:26pm

And before taking a serious effort at a new type system, there's a couple features I think would be useful:

When the code is unambiguous, the ability to select a function/closure/variable and "Add type signature"
The ability to specify types in terms of other type declarations, e.g.:

let rootFoo: SomeType<Dictionary<Int, String>>
let childrenOfFoo: Array<rootFoo.Type> // Trivially resolves to Array<SomeType<Dictionary<Int, String>>>

This would strike a nice balance between annotating concrete types and brevity.

bdkjones · May 7, 2025, 8:24pm

When the code is unambiguous, the ability to select a function/closure/variable and "Add type signature"

This kinda exists. You can option-click to get a popover with the declaration and copy the type out of it. But that popover will be empty if the compiler can’t type-check the entire expression, which makes it essentially worthless.

The exact time you need to see the popover with docs and declaration is when you’re trying to remember, “What was the second parameter name, again?” And you don’t have 12 minutes to wait for Xcode’s awful code completion to kick in with the wrong answer generated via AI.

theundergroundsorcerer · May 7, 2025, 8:56pm

@Slava_Pestov
May I ask you another a few more questions?

What is the right mental model for closures? Are they to be considered as "polymorphic function literals" and that's the reason why they break so often? Or it's more than that?
How are expressions parsed? Are they converted to functions first?
For example, what's the reason that the compiler can handle a code like

let x: Double = 1 + 1 + ... + 1 + 1

with relatively sufficiently long sequence of + without breaking in a sweat (it does struggle when more operators are introduced) but crashes when trying to compile something like this?

let x: [Double] = [] + [] + [] + [] + [] + [] + []

with the following error message:

error: emit-module command failed with exit code 1 (use -v to see invocation)
main.swift:1:5: the compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions

1 | let x: [Double] = [] + [] + [] + [] + [] + [] + []
  |     `- error: the compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions
2 |

error: emit-module command failed with exit code 1 (use -v to see invocation)
main.swift:1:5: the compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions

1 | let x: [Double] = [1] + [1] + [1] + [1] + [1] + [1] + [1]
  |     `- error: the compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions
2 |

Also, why does the same error appears twice (this happens a lot during compiler errors - they get repeated too often)

Also, for example, this code still fails.

let x: [Double] = [1] as [Double] + [1] + [1] + [1] + [1]

should not it get converted to something like

Add(Add(Add([1] as [Double], 1), 1), 1)

(I've made it shorter for brevity purposes) which should be reasonable easily to resolve to [Double])?

(But if [1] gets replaced with [] it does get resolved pretty easily, no matter how long the expression is).

theundergroundsorcerer · May 7, 2025, 11:10pm

@Slava_Pestov
Are we sure there aren't some low hanging fruits to pick? The compiler produces
error: the compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions even when there is no place for type inference - such as typos, missing symbols (for example attempts to call a function that wasn't imported), conversions that don't make sense for example ([1, 2, 3] as Float), wrong number of arguments and so on. Perhaps some soundness check can be performed before inferring types - for example checking that there are no typos, making the obvious type substitutions and so on. Reporting typos or undefined symbols or obvious conversions that don't make sense as "expression was too complex" is misleading - and it is arguably a (serious) bug in the compiler, as it is clearly misleading and has nothing to do with the expression being complicated.
Is solving this problem harder than it seems? Because if it is not - it might improve the quality of life for many people. Perhaps it is something worth looking into.

PS: For some reason, even when it detects missing symbols (which happens most of the time), it still spends unreasonable time on trying to detect types instead of failing quickly and reporting missing symbols etc.

bdkjones · May 8, 2025, 1:20am

Is solving this problem harder than it seems? Because if it is not - it might improve the quality of life for many people.

This. More than any esoteric feature addition to Swift, shoring up the fundamental performance of the compiler would make the biggest (HUGE!) improvement for developers.

Swift needs a Snow Leopard release: no new features, tons of performance improvement in the SwiftUI-related tooling. (Where by “performance” I mean speed and accuracy/reliability of diagnostics.)

I performed a clean build of an app that’s about 1,500 files comprised of 50% Objective-C and 50% Swift (with zero SwiftUI) yesterday and it finished faster than an incremental build of my pure-SwiftUI app that’s maybe 50 files. I’m so used to hitting build in Xcode and then switching to Safari to read the news while I wait, the notification that the build was finished made my head spin.

I miss those days. They should make a comeback.

Lancelotbronner · May 8, 2025, 3:12pm

I have no idea what I'm taking about but I've started trying to experiment with a simplified Swift AST to experiment type-checking.

My idea is to go from branch to leaves, building constraints (sort of like Environment in SwiftUI) and binding anonymous parameters ($0, $1, ...) as we go with two distinct operations: partial-check and check, potentially repeated with different constraints.

Here's how that would play out in my head with some of the examples here:

struct MyStruct {
    var property: Int
}

func withMyStruct<R>(_ operation: (MyStruct) -> R) -> R {
    operation(MyStruct(property: 3))
}

// and a call to it
let x = withMyStruct { $0.property }

Would type-check the let statement as such:

let x = withMyStruct
// lookup in scope, find all withMyStruct overloads, there's only one so that's easy
{
// our single overload has a closure parameter so we're good, we bind $0 to the MyStruct parameter
$0
// we know from our environment that its MyStruct
.property
// we know from MyStruct that property is Int
}
// implicit return means we return the type of the expression which was Int
x
// is Int

And for the literal example making use of partial checks:

/* username, password, address: String, channel: Int */
let url = "http://" + username
// binary operation means (explicit-)arity=2
// partial-check the operands:
"http://"
// resolves to a placeholder "string-literal" type
username
// lookup in scope, found to be string
// we look for all + overloads with (explicit-)arity=2, `lhs` conforms to ExpressibleByStringLiteral and `rhs` is String
// we find a single `(String,String)->String` overload so we're certain this is a string
+ ":"
// we know that `lhs` is `String` and `rhs` conforms to ExpressibleByStringLiteral, resolve a single overload and type the expression String

// ...
+ channel
// we know that `lhs` is String and `rhs` is Int, lookup overloads for that, find none, mark as error

awwright · May 8, 2025, 4:25pm

I use this often, but it doesn't always work since it's part of the documentation generation system and not directly tied into the compiler, as far as I can tell.

Also, regardless if the expression is ambiguous or not, I should also be able to look up all of the possible alternatives for that operator, including alternatives that were decided against because they don't fit.

I thought one of the features of Swift was that it doesn't implicitly type cast? It seems like the problem is not that A+B→A exists, but that both A+A→A and A+B→A exist. (Sometimes you legitimately need some operator ⋆ where A⋆B→A, but such operations are mutually exclusive with A⋆A→A.)

David_Smith · May 8, 2025, 4:33pm

As a reminder, there is (not especially low hanging but quite significant) fruit being picked as we speak, it was linked further up this very thread: [ConstraintSystem] Return of the new disjunction favoring/selection algorithm by xedin · Pull Request #79461 · swiftlang/swift · GitHub

bdkjones · May 8, 2025, 5:05pm

Is there any data on how much of an improvement that will make in a typical SwiftUI app? I don’t see any benchmarks or A/B test results on the PR.

Lancelotbronner · May 8, 2025, 5:05pm

There isn’t implicit conversion? I wrote that it marked it as an error?

In reality I believe Swift does have some implicit conversion but only hardcoded within the compiler.

David_Smith · May 8, 2025, 5:09pm

All the numbers I've seen have been in the context of particular slow-to-check expressions, not whole apps, so not sure on that.

bdkjones · May 8, 2025, 5:37pm

Well, hopefully it’s a big impact. Guess we’ll find out in a month.

I almost wish WWDC would do “reverse workshops”, where app developers sit with the compiler team and demonstrate how poorly the tooling performs in real apps.

I suspect the engineers have a false sense of the problem because if you spend 90% of your time hacking on the compiler and 10% building little demo apps for WWDC, you probably conclude that performance is pretty okay and the developers complaining on the forums are either dumb or writing SwiftUI incorrectly.

audulus · May 26, 2025, 8:24pm

Revisiting this, I don't understand why the type checker can't proceed in a deductive way without any guessing BEFORE trying guessing (branching).

So for this example which fails to type check:

let x: [Double] = [] + [] + [] + [] + [] + [] + []

we know that x: [Double] so we can conclude that [] + [] + [] + [] + [] + [] + [] : [Double].

Now consider that long expression to be lhs + rhs : [Double] as it would be in the AST.

We can look up all the overloads for +. There will be arithmetic ones we can eliminate because they don't return an array. Presumably only one of those overloads has a return type that unifies with [Double]:

public static func + (lhs: [Element], rhs: [Element]) -> [Element]

And so you can conclude that lhs: [Double] and rhs: [Double] and proceed from there.

Now, if you were silly enough to also have other overloads of + which return arrays of things, then all bets are off. Is that the case for Swift?

Or perhaps this method is just good for these toy cases but SwiftUI is such a beast of overloads that it wouldn't really help?

I'm especially curious about this because I wrote a constraint solver for a simple language which just iterates over the constraints and makes deductive simplifications, but never guesses/branches, and just asks the user to add type annotations if it fails to solve for all the type variables (here if you're curious lyte/src/solver.rs at main · audulus/lyte · GitHub)

xwu · May 26, 2025, 8:58pm

Ah, but I'm allowed to conform any type that I want to ExpressibleByArrayLiteral and write arbitrary custom implementations of +(_:_:) that return [Double].

extension Int: @retroactive ExpressibleByArrayLiteral {
    // Retroactive conformance for clarity of exposition only; I can just
    // as well write a custom type and not use a retroactive conformance.
    public init(arrayLiteral: Never...) { self = 42 }
    public static func + (a: Self, b: Self) -> [Double] { [Double(a + b)] }
}
let x: [Double] = [] + [] + [] + [] + [] + [] + [] // [294.0]

(Yes, this compiles and actually gives you this result. Yes, it's supposed to compile and actually give you this result.)

audulus · May 26, 2025, 9:43pm

Sounds like what I said above, no?

Now, if you were silly enough to also have other overloads of + which return arrays of things, then all bets are off.

So yes, you can write silly overloads for things like your example. But do they already exist such that a simpler type checker would be thwarted in reasonable cases?

AFAICT, in your example, the no-guessing type checker would see both:

static func + (a: Int, b: Int) -> [Double]

and

static func + (lhs: [Element], rhs: [Element]) -> [Element]

And given that both return types unify with [Double] it would conclude that guessing is necessary, then give up and let the slower checker take over.

I do find it interesting that adding the conformance causes the expression which formerly took too long to type check to actually finish type checking. Why is that?

scanon · May 26, 2025, 10:09pm

The “type checker” does not only assign types to expressions. It is also responsible for function overload resolution.

audulus · May 26, 2025, 10:14pm

Why did you put type checker in quotes? Does that mean I'm misusing the term?

scanon · May 26, 2025, 10:15pm

Because the name sometimes leads people to believe that it only has to assign types to expressions.