Fixing the type system

This isn't going to make me popular but has to be said. I don't yet have an indepth knowledge of the Swift type system but it appears to me serious damage has been done changing functions so they no longer take a single argument. Although this is sustainable it is very hard to do so. In this article I'll briefly explain the problem and then show (partly) how to fix it.

Mathematical functions have one argument, and all type systems are based on mathematics. The core modelling structure used is the category, which is a collection of types and functions between them, in which, an associative binary partial operator called composition is provided. Composition is a key thing in programming. Generally several other features are provided: a cartesian product including its identity the unit tuple, and its dual sum type, sometimes called a variant, an enum in Swift, and its identity, the void type. Advanced languages also provide an exponential, written A -> B, which is a type representing function values. Note these are not the same as the functions between types, there's a difference between a function and its closure. In the type system, it is usual to also have a fixpoint (recursion) operator to simplify the typing and analysis of inductive types.

Programming is a constructive activity. Its not enough to have product types, we have to have a way to make them. This is typically done by providing a tuple constructor. If TYPE is the category we have been describing, then a pair is built with a bifunctor: pair: TYPE * TYPE -> TYPE, where TYPE * TYPE is the product of the TYPE category. Typically, languages provide a family of N-functors, for N a natural number. This allows functions to operate on multiple arguments in a two step process: first combine multiple values into a single value, so then the function can be applied. Note very carefully, the bifunction pair is NOT a function. It does not live inside the category TYPE.

It is also possible to represent every function by a combination of an N-functor and the base function taking only one argument. This is what Swift has done. Unfortunately this removes the whole advantage of using a single pair of operations: tuple formation and function specification, for no benefit. In fact, the downsides are so large it makes the language unsustainable.

Consider the trivial problem that a function with two arguments cannot be applied to the result of applying another function. Composition is lost, and with it the underlying algebra is destroyed. For example you now have to write this:

let f: (A,B)->C = ...
let g: E -> A * B = .. 

let e: E = ..
let (a,b) = g(e) // untypeable
let c : C = f (a,b) 

I wrote the type g returns as A * B meaning a tuple. Swift used the wrong notation for tuple types. The LHS of the g application, and the argument of F do not have an actual value type. Its a disaster. The effects ripple throughout the type system. Type systems must be combinatorial. Nominal typing already escapes compositional behaviour deliberately, but the structural core must support composition.

Luckily there is a way to recover sanity. This is the very highest priority for the compiler team.

For every function with multiple arguments, there is a function taking a tuple instead: in the example:

func f_pack (x:(A,B)) -> C { return f(x.0,x.1); } 

Conversely, for every function taking just a tuple, there is a function taking a list of the components instead:

fun f(a:A, b:B)->C { return f_pack ((a,b)); }

Generalising, there are two "generic" operator that can perform these operations. These operators are not needed for direct calls. As shown, the user can write them, even if its painful.

But for generics, they're absolutely essential. Stuff cannot be done at all without them. The alternative is to put families of generics in the library. Since such families are necessarily finite they can never cover all cases. If two functions are involved, a quadratic number of functions is required, if three its a cubic.

The obvious example in composition, which is linear:

operator compose<A,B,C> (f: A -> B, g:B -> C) { 
  func eta(a:A)->C { return f (g (a)); }
  return eta;
}

Its fine, for one argument, but if f has two arguments, you have to write out another function which is a bit more verbose. If its three, its even worse. And this operation only involves one case of multiple arguments where the functions compose.

But with the two operators I specified, only the given generic is required, because we can just apply the pack operator to convert f to accept a tuple.
In higher order operations labels are an impediment, so they should be stripped by the pack operator as well.

The concrete syntax is a mess. Since (A,B) is a tuple, but (A,B)->C is not a mapping from a tuple but a function of two arguments, a function taking the corresponding tuple would have to be written ((A,B))->C. The notation (A,B) for a tuple was always wrong, it should have been A * B, the correct interpretation of (A,B) is that it is a pair of types: a product type is a single type.

The actual names of the operators I called pack and unpack don't matter, but it would be useful to have a single character for them, because they will have to be used extensively to recover sanity.

Adding these operators saves reverting the change to multiple arguments. I have no idea how that happened but it is probably not possible to undo the damage with a second major change, at least for some time. Providing the pack and unpack operations recovers the ability for generics to actually be general in a finite set of specifications instead of requiring an infinite family.

I will note that, if the non-associative N-ary constructor used for N tuples is augmented by a second type term, representing the same tuple type with a right associative head and tail form, then variadic generics are not required at all, since the head/tail form of a tuple is sufficient to recursively analyse any tuple.

Estimated time to implement: one day.
Impact on existing code: none, provided the names of the operators don't cause a clash.
Utility: Without these operators further evolution of the language is permanently stalled. No sane higher order operations can be provided. A type system in which the domain of a function doesn't have a type does not admit complete finite generic specifications.

4 Likes

See the following thread, and related threads:

1 Like

To leapfrog the huge backlash that will most likely start soon ;-):
Your opinion isn't as unpopular as it may seem — actually, Swift was build on the premise that there are only functions taking a single argument, so this definitely something Core wanted.
Sadly, it didn't work out, and there was some big reconstruction (not even sure if it's 100% finished yet).
I guess the problems that are responsible for the change can be solved when you accept to pay a price for it — but I don't think that will happen in Swift.

This already works in Swift with multiple arguments / return values (though I will note that you wrote your example wrong). Here is a working version with an example:

func compose<T,U,V>(_ f: @escaping (U)->V, _ g: @escaping (T)->U) -> (T)->V {
  return { f(g($0)) }
}

func duplicate(_ x: Int) -> (Int, Int) { (x, x) }
let addToItself = compose(+, duplicate)
let six = addToItself(3)

Yes, but:

//                            Are these two the same type or not?
//                                 .--'-.     .-'--.
func identity<T, U>(_ f: @escaping (T)->U) -> (T)->U {
    return f
}

func add(_ a: Int, _ b: Int) -> Int { return a + b }

print(type(of: add))           // Prints (Int, Int) -> Int
print(type(of: identity(add))) // Prints ((Int, Int)) -> Int

print(type(of: add) == type(of: identity(add))) // false

let f: (Int, Int) -> Int = add
var g = identity(f)
// g = add // ERROR: Cannot assign value of type
//        '(Int, Int) -> Int' to type '((Int, Int)) -> Int'

// Workaround:
g = { f($0.0, $0.1) }

// Have to call it with double parens like this:
let r = g((1, 2))
print(r) // Prints 3

It will work (same type, no double parentheses etc) if we use add1 instead of add though:

func add1(_ a: Int) -> Int { return a + 1 }

It looks like (T)->U and (T)->U in the definition of identity are the same type. But they are evidently not always the same.

So how do we write the identity function in Swift?

func identity<T>(_ f: T) -> T {
    return f
}

?

I know this is the identity function, but I mean one that takes only function types, and returns the same function type, no matter the number of parameters of that function type.

Maybe overload it for each number of parameters you want to support? No, that will result in Ambiguous use of 'identity'.

You can't today.

Your identity function is a "true" identity function, but it only takes a function of one parameter and one result. The type checker has an explicit rule that a function of multiple arguments can be converted to a function of one argument taking a tuple argument. So when you pass in a function taking multiple arguments, such as +, its converted into a function taking one argument at the call site.

This also explains why type(of:) produces a different result for add vs identity(add). When you pass add as an argument to identity, the conversion takes place since the argument type is (T) -> U. So the return value of identity() has a different type than add itself, because the conversion changes its type to ((Int, Int)) -> Int.

5 Likes

The first part of this sentence explains the second.

Swift spent a very long time—years—fighting and losing the battle to squeeze argument lists into tuples. It seems like a good idea in principle, but when you dig into the details of the behavior Swift actually wants from these constructs, there are just too many mismatches. Unlike tuples, argument lists need to support:

  • Single labeled elements
  • Separate formal parameter names
  • Default arguments
  • Variadic arguments
  • Autoclosures
  • Function builders
  • Explicit annotations for escaping closures
  • Ownership modifiers
  • inout
  • Implicit conversions of various types to pointers (for C compatibility)
  • Rigid ordering rules

And probably a few more things I've forgotten. We tried to implement these behavior differences despite using tuples as argument lists, but we were constantly fighting compiler bugs where these features snuck into ordinary tuples and broke invariants left and right. And we were not willing to give up these differences in behavior—they would be enormous usability regressions that would, frankly, make Swift unsuitable for at least one of its primary use cases (interacting with C and Objective-C libraries).

So we gave up the composability you're looking for to improve everything else about the language. And, given the context and history that led us to that point, it was the right decision.

I hope that we do eventually reintroduce some limited, controlled ability to flatten tuples into argument lists. But it's far from the most urgent problem we're facing, and it will need to be done with awareness of the project's history. It certainly will take more than

27 Likes

We'll see :-) In any case the point of my post is that there is at least one way to repair the damage. I might not have got all the details right, I'm still trying to learn Swift. The language "specification" unfortunately is not very precise. The guide is actually better. So I hope my attempts to write Swift are corrected by those that know the syntax better. [I'm still grappling with the protocol syntax..]

Ah cool, an even more fundamental example than composition. The problem is simply that the domain of a function has no type.

Now I'm just beginning to think there may be a solution in which the current "domain" is STILL actually a real tuple! It's related to my previous idea that an alternate "head/tail" representation of tuples allows folding them with polymorphic recursion, and thus allows implementing some protocols for tuples in user space. Ignoring labels for the moment, perhaps we can view the "parameter list" of a function domain as a tuple in that alternate form. [the syntax could be an issue though] It's really hard to deal with because there is no type for the domain of the function, but the LHS part of the function arrow is an arbitrary length list of non-unified domain components instead. It quite clearly has much the same semantics as an actual tuple, you just can't use a single tuple as an argument.

From what I gather:

  • Function should take single argument, and return single argument. That is the only sane option.
  • The current implementation mismatch with that scenario.
  • We can mitigate the disparity using pack/unpack.
  • (A, B) should be spelled A * B

Joke bullet aside, what does sanity bring to the language? I personally never want something similar, and I want to understand the motivation better.

You are correct, I do not have in-depth knowledge of Swifts history. And I accept, interfacing to Objective C is an essential constraint. Also, thank you for taking the time to fill me in on some history!

However, most of what you list as wanted can by done still using tuples, as I have done most of it myself, with constraint that my language Felix had to interface to C and C++ (rather than ObjC).

The key to understanding how it can be done is to understand that the special requirements only need apply to direct calls, and not the type system. That's how I did it for the most part.

I can do default arguments, and named parameters in Felix. Using tuple arguments. It's also possible to make a function take a record instead of a tuple, in which case the component labels are part of the type system. In fact, any function accepting a tuple will accept a record with the labels of the record agreeing with the parameter names. The type of the function domain is still a tuple, so this would NOT work for a closure of the function, but when calling the function directly, the compiler injects a coercion to make it work. Actually in Felix, tuples are a special case of records where all the labels are "blank". In fact arrays are just tuples with all the same component types. So actually, a single structural type can be used for may apparently distinct things.

I do not really understand why you need the labels to be part of the type system, but as I said that's no problem, just use records. If you do that, you still have a type for the function domain. Of course higher order operations like generic composition then become harder.

[quote]

  • Variadic arguments
    [\quote]
    Done in Felix a few days ago .. because i finally decided I wanted to bind to variadic functions in C. Its a bit harder to write a receiver in Felix, i haven't bothered because there is no real utility.

  • Autoclosures

Unless I'm mistaken this is a syntactic issue, so it isn't relevant. Correct me if I'm wrong.

  • Function builders

Totally messed up reasoning. The whole reason for using tuples is to make it easier to do higher order operations with functions.

  • Explicit annotations for escaping closures

I don't see the relevance immediately. An escaping closure, as I understand it, has a different type to a non-escaping one, with subtyping rules to ensure safety is not violated. Again this is clearly easier to support with tuples.

  • Ownership modifiers

Again, easier to do if done with tuples. in Felix I have both uniqueness types and linear functions. Linear functions fit in seamlessly. My implementation of uniqueness typing does not work so well.

  • inout

i can do this easily, but using a different mechanism than Swift. Its one of the advantages of having pointer types. Given that swift doesn't want to give users pointers explicitly, there may be a challenge to do this, but it can surely be done because it does not provide any semantics beyond what you can do with pointers, which Swift uses underneath anyhow.

  • Implicit conversions of various types to pointers (for C compatibility)

I dont see how this is an issue. I can model all C types in Felix easily. [I have more problems with C++ OO stuff because Felix has no compiler supported OO]

  • Rigid ordering rules

I don't understand due to lack of knowledge. Perhaps you could explain?

[quote]

And probably a few more things I've forgotten. We tried to implement these behavior differences despite using tuples as argument lists, but we were constantly fighting compiler bugs where these features snuck into ordinary tuples and broke invariants left and right. And we were not willing to give up these differences in behavior—they would be enormous usability regressions that would, frankly, make Swift unsuitable for at least one of its primary use cases (interacting with C and Objective-C libraries).

[\quote]

Thanks for the background!

No, it won't. I'm not talking about introducing a massive change. I'm talking about adding TWO generic operations with entirely local impact. So seriously, it will not take long for someone who knows how the compiler works. I'm suggesting to provide what I called:

  • pack
  • unpack

which convert functions from multiple arguments to using a single argument which is a tuple, and the reverse. These functions have no impact on any other part of the language, and can be isolated easily in the compiler to a single place in the code. The operations are entirely inside the type system.

I may be wrong that providing them actually does what I think: give you composition and other similar higher operations, and in particular, it gives you the ability to write generics for single argument function alone, and not have to worry about functions with multiple arguments, because you can always convert a function or closure to that form.

With some guidance as to the structure of the compiler (which I haven't looked at yet) I can probably do it myself in a couple of days, and probably about a page of code for each operator. My guess is backed by doing this kind of thing quite a few times in my own compiler, the only difference being that mine is written in language that's really good for compiler writing, namely Ocaml.

Of course I could be wrong. As to priorities, its experience: no matter what end users think, getting the type system built correctly so it is combinatorial is a pre-requisite for generics. Swift already has a lot more than type safe parametric polymorphism, it has type classes (protocols) as well. So it is already quite advanced. And it has existentials as well as associated types (output types). There is more to be done, including type recursion which also doesn't work, but my concern is not doing it now, but making sure it CAN be done down the track.

I mean this is about "Evolution" and it won't evolve but devolve if this problem isn't solved. I've seen this happen, time and again. Most programmers only want to work at the concrete, monomorphic level, except perhaps it nice to use generic containers a bit. But as system progress the need for higher stuff becomes greater, and if it isn't provided the language simply cannot scale. So OK, iPhones aren't that big. Yet :smile: [But then, I have a cheap Android]

3 Likes

I wouldn't worry that much. I, with many other people, share your concerns about the lack of composability in the language: in my opinion, proper composability should be one of the main principles (actual principles, that is, axiomatic guidelines) in designing a programming language and its type system. Unfortunately, we've seen throughout Swift history that things are not as linear and clear cut as the mathematics behind a type system would suggest, and some "principled" solutions can cause more harm than good. Still, pragmatism is fine, but should never go in opposition to solid underlying theory, otherwise the consequence would simply be a mess, something that can be seen in many other languages.

I really, really hope that your post will kickstart an interesting discussion on the topic.

7 Likes

The notation for tuples is wrong but its only syntax. Your summary is accurate. The utility is only apparent with higher things, such as generic functions, lambdas, etc. At that level, the ability to combine and compose types and values is essential.

At the monomorphic concrete language level most code is written it isn't necessary, although it can be useful. The problem is something like this: in Felix you can write:

fun andnext (x:int) => x, x + 1;
var result = andnext 42;
fun add (x:int,y:int)=> x + y;
var a,b = result;

println$ 
  result.add,  
  42.andnext.add, 
  add (andnext 42), 
  add (42,43),
  add (a,b),
  (let q,r = result in add (q,r))
;

I've done the same thing a couple of different ways. Since "add" takes a tuple, it can be applied to tuple value, and composed directly with a function returning a tuple. Note applying a function like f x doesn't require parens, and, you can also do reverse application x.f which is exactly equivalent.

In Swift, your options are limited. You have no choice, is add takes two arguments, of unpacking the result of addnext into two variables so you can call the add function. You cannot pass the result tuple directly, nor apply add directly the application of andnext (unless there is hackery in the compiler to support it in breach of the typing rules). You only have one option:

var a, b = addnext 42
print (add (a,b))

this is really super ugly and horribly bad! Its a simple calculation that you should be able to write as an expression. If you have functions that accept multiple arguments (whatever that means) you should also have functions that can return multiple values (in the same sense as whatever the multiple argument means) so you can compose things in expressions.

Have to write compositions out in pieces is lame but doable in the concrete, non-generic level of the language. But it is completely untenable when dealing with generics, because you want to use a single type variable to cover all cases, and you cant. In particular you cannot cover all cases, ever. Only all the cases up to say 3 or 4 arguments after which you have to give up or the library gets too bloated.

[BTW: if this works in Swift its either a bug or a hack. Someone wrote a compose that works and .. well it shouldn't. If you can use a tuple in place of multiple arguments then a tuple value should work and I've tested that and it doesn't seem to. The thing is a function with a two int argument has a different type to a function taking a single tuple of two ints as an argument.]

The code example earlier in this thread works because of a deliberate design decision in the language. What you call the "pack" operation is already implemented as an implicit conversion -- while (Int, Int) -> () is a different type than ((Int, Int)) -> Int, you can pass the former as a function argument to a call where a parameter of the latter type is expected. The constraint solver inserts a conversion and code generation emits a thunk that wraps the original function.

It's important to note that not all argument lists can be tupled this way -- for example, a function with type (inout Int, String) -> () cannot be converted to a single argument function ((inout Int, String)) -> () because (inout Int, String) is not a valid type. Only the simplest argument lists, which do not have any of the special features Brent listed, can be tupled this way.

What we don't have (yet) is an "unpack" operation taking a function operating on a single tuple and producing a function taking multiple arguments.

As Brent explained, function argument lists are quite different from tuples because they allow some functionality that tuples do not (inout elements, varargs, etc) and vice versa (a tuple (a: Int, b: String) can be converted to (b: String, a: Int) -- function argument lists don't admit such "shuffle" conversions).

It is better to keep the two concepts separate in the compiler instead of concerning ourselves with tuple types that cannot exist as values (eg, a tuple with a single labeled element, or a tuple with an inout element). We used to have these, and we called them "non-materializable types". It made things more difficult because it was harder to enforce invariants.

15 Likes

Ok, that makes sense. Thanks for info!

Actually I don't accept there is any semantics that you can't do with tuple arguments. I think you're confusing the properties of component parameters with the function domain type.

Lets take "inout" as an example. When you pass a reference, and "inout" only works for references, you're passing a pointer. If its a class, fine, already a pointer. If its a value, you already have to take the address.

So there's no problem. If the parameter is value type "inout" the type in the domain is a pointer. The function secretly does the copy in/copy out, presumably just as it does now. So to be clear: you can make the function domain tuple component "pointer to T" or if you prefer "reference to T". The argument tuple then requires a pointer. Swift already requires the user to take the address.

The thing to understand is that when the argument is passed to the function it is pattern matched to get the parameter components named by the user. These are local variables. You can give them properties which need not be reflected in the interface.

The canonical example of this, from C in fact:

void f(int p, int const q) { .. }

The domain is int,int. The const property is a property of the parameter, but not the function type. Swift also already does this in pattern matches:

(var a, let b) = (1,2)

So if the tuple parameter component is marked "inout" it has two distinct effects. First, it makes the function have extra secret code to do the copying. Secondly the type of the domain component is adjusted to pointer. Which the user is already required to pass explicit (unlike C++ references).

BTW: the "implicit conversion" is a bit nasty if you have overloads.
If the tuple matches a function without unpacking, and another with unpacking, what happens?

Although this seems harmless, it may not generalise to type variables, which would make the type system "incoherent" in a technical sense. To be precise if you have a function taking a type variable argument, or, a tuple with a type variable component, or, a list of arguments one of which is a type variable, then when the type variable is replaced by a concrete type, the behaviour must be the same.

The most serious case would be if the polymorphic case type checked, but the instantiation didn't. In that case the type system would be unsound. Note, i'm talking about unconstrained type variables.

I doubt unsoundness. However incoherence seems possible. Not sure. Can you inout a class? What happens if the value type contains classes (i.e. references). Is the copy shallow or deep? If its shallow, the semantics fails to isolate the argument. If deep, I suspect all hell breaks loose. If there are constraints on the type that you can inout, then you can't use an unconstrained type variable. If there is a protocol which determines if something is "inoutable" then it could work with that constraint.

Sorry, but the documentation isn't exactly precise. Any clarification appreciated.

Yes you can:

class C {
    var addressStr: String { "\(Unmanaged.passUnretained(self).toOpaque())" }
    init() { print("initialized:   ", self.addressStr) }
    deinit { print("deinitialized: ", self.addressStr) }
}

func foo(_ c: inout C) {
    print("foo received c:", c.addressStr)
    c = C() // <-- It's inout, which means I can eg set `c` to another instance.
    print("foo assigned c:", c.addressStr)
}

func test() {
    var c = C()
    foo(&c)
}
test()

This program will print:

initialized:    0x0000000102a006c0
foo received c: 0x0000000102a006c0
initialized:    0x000000010060c2d0
deinitialized:  0x0000000102a006c0
foo assigned c: 0x000000010060c2d0
deinitialized:  0x000000010060c2d0

It's shallow:

class C {
    var addressStr: String { "\(Unmanaged.passUnretained(self).toOpaque())" }
    init() { print("initialized:   ", self.addressStr) }
    deinit { print("deinitialized: ", self.addressStr) }
}
struct S {
    var c: C
    init() { c = C() }
}

func bar(_ s: inout S) {
    print("bar got s, s.c:", s.c.addressStr)
    s.c = C()
    print("bar assigned c:", s.c.addressStr)
}

func test() {
    var s = S()
    bar(&s)
}
test()

Will print:

initialized:    0x0000000100600220
bar got s, s.c: 0x0000000100600220
initialized:    0x00000001006248b0
deinitialized:  0x0000000100600220
bar assigned c: 0x00000001006248b0
deinitialized:  0x00000001006248b0

Value types, like Array in the Standard Library, that contains reference types (for eg a private dynamically allocated storage) often implements copy on write though (see discussion section here).

1 Like

Ok, good. You can inout a class. That's consistent. Thanks.

Now my question about a value type like this:

struct X { a:Int; b:Cls;} // Cls being a class type

I would guess if you inout X, the integer a is copied along with the pointer b, you do mods, and then it is written back. So you can, following your example, change b to another instance of the class, and at the end it's written back. However you could also change the contents of what b refers to. In this case, the argument is impacted during the function lifetime, so the change has a side effect during the function call, and transparency is lost.

This is not to say inout is bad. I just want to fully understand the type system and other semantic details. In any language that has procedures (as opposed to mathematical functions), side effects are the necessary (otherwise the procedure can't do anything!) This usually means you can shoot yourself in the foot. The features don't guarantee safety, but just try to help.

Yes.

Why not just try these things out : ) ? Like this for example.
class C {
    var count = 0
    var addressStr: String { "\(Unmanaged.passUnretained(self).toOpaque())" }
    init() { print("initialized:   ", self.addressStr) }
    deinit { print("deinitialized: ", self.addressStr, "having count", count) }
}
struct S {
    var c: C
    init() { c = C() }
}

func bar(_ s: inout S) {
    print("bar got s, s.c:", s.c.addressStr)
    s.c.count += 1
    s.c = C()
    s.c.count += 123
    print("bar assigned c:", s.c.addressStr)
}

func test() {
    var s = S()
    bar(&s)
}
test()

John, I’m not sure what the purpose of this thread is, given that it’s already been made clear that your main proposal of returning to tuple-based parameters is not going to be considered. Is this going somewhere?

Terms of Service

Privacy Policy

Cookie Policy