[Re-Proposal] Type only Unions

wadetregaskis · June 28, 2024, 4:59am

The other thing to consider with any smart deductions (as opposed to explicit requirements - A vs B below, respectively) is the compiler diagnostics. It's probably going to be easier for the type checker to deduce the correct intent, and therefore to give sane error messages, if the rules are more explicit.

// A:
func foo(_ value: Array | Set) -> Bool {
    value.isEmpty // Deducible from Array & Set both being Collections.
}

// B:
func foo<T: Collection & Array | Set>(_ value: T) -> Bool {
    value.isEmpty // Trivial because T is explicitly a Collection.
}
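For comparison, here is a minimal sketch of the closest thing that compiles in current Swift (5.7 or later). It assumes you are willing to accept any Collection rather than only Array or Set; the function name isEmptyCollection is made up for this illustration.

```swift
// Accepts every Collection, not just Array or Set:
// there is no way today to close the set of accepted types.
func isEmptyCollection(_ value: some Collection) -> Bool {
    value.isEmpty
}

let fromArray = isEmptyCollection([Int]())
let fromSet = isEmptyCollection(Set(["a"]))
print(fromArray, fromSet)
```

The type checker has nothing to deduce here, because the Collection requirement is spelled out, which is the property option B is trying to keep.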

Possibly related to me spending all afternoon today debugging and trying to optimise type checker timeouts in SwiftUI code…

hishnash · June 28, 2024, 6:33am

From a syntax perspective I do not like this. I find it potentially confusing.

Is it saying Collection & (Array | Set), or is it saying (Collection & Array) | Set? If we want to let people combine | with & within the type signature, then maybe it should require parentheses. The same goes for the use of any A | B: is this any A | any B, or is this (any A) | B?

Just spitballing, but would a more explicit form be better?

func foo<T>(_ value: T) where T: Collection, T == A | B { ... }

// and for inline

typealias Result = A | B where Result: Collection
let result: Result = ...

wadetregaskis · June 28, 2024, 1:56pm

I don't think it's core to the discussion here in any case. It's no different from any other existing situation involving operator precedence.

Jumhyn · June 28, 2024, 2:28pm

I would just re-up John's post from earlier in the thread:

It is of course not impossible for previously-commonly-rejected changes to eventually find their way into the language, but I want to emphasize that there would be a substantial uphill battle, and that the task at hand is likely less 'how should the feature be designed' and more 'why are prior objections philosophically mistaken in some way'. Take for example one of the earlier efforts to revive the discussion on if expressions: a core part of the argumentation is that prior discussions have not sufficiently appreciated the user experience difficulties of existing alternatives like the ternary expression.

Slava_Pestov · June 28, 2024, 7:43pm

I'm curious where you get this impression. The evolution of generics has pretty much just been additive: opaque return types, more general existentials, parameter packs and move-only types. I think the core model with runtime metadata, protocols and where clauses, etc was there in the first public release of Swift, which was before my time, and certainly it was all in place by Swift 2.0.

miku1958 · June 29, 2024, 8:50am

I rewrote Motivation based on Jumhyn's view.

Here's a draft, which I'll still update in the main post in a few days.

Motivation

If a developer wants to match multiple types in Swift today, these are the options:

enum

enum CodablePrimitiveValue: Codable {
    case string(String)
    case int(Int64)
    ...
}
let value: CodablePrimitiveValue

switch value {
case .string(let string):
    print(string)
case .int(let value):
    print(value)
...
}

Advantages of enum:

The available types are fixed, and can be matched exhaustively with switch.
All available types can be quickly accessed via the leading-dot (.) syntax.

Disadvantages:

Need to unpack one extra time to use internal value's methods/properties

let value: CodablePrimitiveValue
switch value {
case .string(let value): encode(value)
case .int(let value): encode(value)
...
}

protocol
```
protocol CodablePrimitiveValueType: Codable {
}

extension String: CodablePrimitiveValueType {
}

extension Int: CodablePrimitiveValueType {
}
...

let value: any CodablePrimitiveValueType

if let value = value as? String {
    // ...
} else if let value = value as? Int {
    // ...
}
...
```
Advantages of protocol:
- Developers are free to expand the supported types without worrying about the stability of the API.
Disadvantages:
- If the API provider doesn't implement the shortcut static property manually, the caller needs to manually find the type that implements the protocol, and there are plenty of APIs for this in SwiftUI:
```
extension PrimitiveButtonStyle where Self == BorderlessButtonStyle {
    public static var borderless: BorderlessButtonStyle { ... }
}
```
- API providers can't restrict specific types.
- Due to the uncertainty of the type, the compiler can't optimize it either
  - e.g., if the API provider provides multiple types, but the caller only uses one of them, the compiler should be able to optimize for the API if it's inlinable.

Enum and protocol also have these disadvantages:

New data types need to be defined for the constraints, which increases binary size. I'll explain later why compile-time unions solve this problem.

When all internal values have some kind of commonality (protocol or super class), there is no way to use it directly.

enum needs to implement an additional method to return a value with this commonality

extension CodablePrimitiveValue {
    var value: Codable {
        switch self {
        case .string(let value): return value
        case .int(let value): return value
        case .uint(let value): return value
        case .bool(let value): return value
        case .double(let value): return value
        case .null: return String?.none
        }
    }
}

protocol needs to be inherited or type-restricted by where, but restricting by where only restricts one type, and cannot be extended further.
```
protocol CodablePrimitiveValue: Codable {
}
```
```
class PrimitiveValue {
}

protocol CodablePrimitiveValue where Self: PrimitiveValue {
}
```

function overloading

func encode(_ value: String) {

}
func encode(_ value: Int) {
    
}

Advantages of overloading:

Intuitive, compiler can match and optimize better

Disadvantages:

When there are a lot of parameters, it's a pain to match multiple versions for just one parameter.

func makeNetworkRequest(
    urlString: String, method: String, headers: [String: String], body: Data?, timeout: TimeInterval, cachePolicy: URLRequest.CachePolicy, allowsCellularAccess: Bool, httpShouldHandleCookies: Bool, httpShouldUsePipelining: Bool, networkServiceType: URLRequest.NetworkServiceType, completion: @escaping (Result<Data, Error>) -> Void
) {
    guard let url = URL(string: urlString) else { return } // URL(string:) is failable
    makeNetworkRequest(url: url, ...)
}

func makeNetworkRequest(
    url: URL, method: String, headers: [String: String], body: Data?, timeout: TimeInterval, cachePolicy: URLRequest.CachePolicy, allowsCellularAccess: Bool, httpShouldHandleCookies: Bool, httpShouldUsePipelining: Bool, networkServiceType: URLRequest.NetworkServiceType, completion: @escaping (Result<Data, Error>) -> Void
) {
    let urlRequest = URLRequest(url: url)
    makeNetworkRequest(urlRequest: urlRequest, ...)
}

func makeNetworkRequest(
    urlRequest: URLRequest, method: String, headers: [String: String], body: Data?, timeout: TimeInterval, cachePolicy: URLRequest.CachePolicy, allowsCellularAccess: Bool, httpShouldHandleCookies: Bool, httpShouldUsePipelining: Bool, networkServiceType: URLRequest.NetworkServiceType, completion: @escaping (Result<Data, Error>) -> Void
) { 
    ...
}

At this point the developer is forced to use a protocol or an enum, which brings us back to the previous problem.

protocol Requestable {
    var urlRequest: URLRequest { get }
}
extension String: Requestable {
    var urlRequest: URLRequest { ... }
}
extension URL: Requestable {
    var urlRequest: URLRequest { ... }
}
extension URLRequest: Requestable {
    var urlRequest: URLRequest { self }
}
func makeNetworkRequest(
    urlRequest: Requestable, method: String, headers: [String: String], body: Data?, timeout: TimeInterval, cachePolicy: URLRequest.CachePolicy, allowsCellularAccess: Bool, httpShouldHandleCookies: Bool, httpShouldUsePipelining: Bool, networkServiceType: URLRequest.NetworkServiceType, completion: @escaping (Result<Data, Error>) -> Void
) { 
    ...
}

Anyway, can Swift currently match multiple types? Yes, but it's really cumbersome and hard to use, and that's the problem this proposal is trying to solve: Swift currently lacks a syntax that works well enough to match multiple types.

shawnthroop · June 29, 2024, 11:04am

I feel similarly. I’m more of a lurker on the forums, and I’ve been confused for a while now by the desire to make Swift’s type system more complicated. Maybe it’s my inexperience showing through, but I can’t seem to map the examples in this post to real-world problems I might have.

I’m not saying the proposal won’t help some people; me being an idiot doesn’t lessen something’s usefulness. However, this seems like a lot of complicated changes for something the type system was designed to avoid?

Pampel · June 30, 2024, 10:05pm

Union types don't solve a problem, they just save you the effort of giving something an explicit name. I dislike them for the same reasons I don't like tuples, except that tuples have the saving grace of being really, really useful as intermediate types in map / reduce etc chains.

I'd only be in favour of union types if they were forbidden in method/function signatures in anonymous form and had to be named, but enum already gives us that.

ksluder · July 1, 2024, 12:45am

To be completely fair, union types differ from enums in one fundamental way: union types are structural types, while enums are nominal types. This means that two functions which return (T | U) return the same type, even if they are in completely separate modules with no common imports. Not sure that’s a strong argument for adding them, though.
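Tuples, which Swift already has, show the same structural/nominal split. The sketch below is only an illustration; parseA, parseB, ResultA and ResultB are made-up names standing in for code in two unrelated modules.

```swift
// Two unrelated functions whose return types were written independently.
func parseA() -> (Int, String) { (1, "a") }
func parseB() -> (Int, String) { (2, "b") }

// Structural: both results have the same tuple type, so they mix freely.
var pair = parseA()
pair = parseB()
print(pair)

// Nominal: identical shape, but two distinct types.
enum ResultA { case int(Int), string(String) }
enum ResultB { case int(Int), string(String) }

let a = ResultA.int(1)
// let b: ResultB = a   // does not compile: ResultA is not ResultB
print(a)
```

A structural (T | U) would behave like the tuple half of this example, not the enum half.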

Pampel · July 1, 2024, 8:42am

For me, that's a solid point against union types. Structural typing feels to me like typing by coincidence, and the 'anything that looks the same is the same' thinking gives us all the problems of primitive obsession, just with more complex types.

wadetregaskis · July 2, 2024, 10:55pm

Well, this is intrinsic to virtually every use of types, in every language.

func square(_ x: Int) -> Int

That technically works with the number of apples, how many emails to send, ages, phone numbers, etc. It is of course logical gibberish when applied to some types of integers.

(T | U) for the same T & U is literally the same thing irrespective of where it's defined, just like all Ints are Ints even if they might carry nuance in each specific application. Yet square is still by most accounts a perfectly valid and useful function.

Tuples already work this way too, and while I know you already said you don't like that fact, my point is that it demonstrates that it does in practice work just fine in Swift specifically (and you can largely choose to not use tuples, if you want - when they appear in APIs you don't control it's usually just as the return type, and you can easily compartmentalise that if you wish - which is likewise true for what's proposed here).

Pragmatically, the problem with nominal typing is coordination. Somehow, somewhere you need to define the canonical version of any given type. You then have to have everyone use that definition. e.g. depend on that exact Swift package. Which immediately gets rejected in many cases because (for better or worse) many folks don't like additional dependencies.

It works okay in limited environments (e.g. within one, unified organisation) and limited cases (e.g. the most critical, foundational types, like those in the stdlib), but it scales poorly otherwise.

Just look at how many IPV4Address definitions there are on GitHub that are just wrappers over uint32_t, and all the code that depends on them and is completely incompatible because of the nominal but not functional difference in type.
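A minimal sketch of that incompatibility, with two hypothetical packages modelled as namespaces (PackageA, PackageB and connect(to:) are invented for this illustration):

```swift
// The same wrapper, as it might be declared in two unrelated packages.
enum PackageA {
    struct IPv4Address { var rawValue: UInt32 }
}
enum PackageB {
    struct IPv4Address { var rawValue: UInt32 }
}

func connect(to address: PackageB.IPv4Address) -> String {
    "connecting to \(address.rawValue)"
}

let address = PackageA.IPv4Address(rawValue: 0x7F00_0001)
// connect(to: address)   // does not compile: the types differ in name only

// Callers have to unwrap and re-wrap by hand at every boundary.
let message = connect(to: PackageB.IPv4Address(rawValue: address.rawValue))
print(message)
```

The two structs are functionally identical, yet every caller that crosses the boundary has to write the conversion.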

ksluder · July 3, 2024, 3:12am

I think you meant to say “the problem with nominal typing is coordination.”

wadetregaskis · July 3, 2024, 3:34am

Hah, yes, sorry - I corrected my post.

ktraunmueller · July 3, 2024, 6:45am

My thoughts as an average user of the language:

I think this is an addition to the language that is not necessary -- it would only make the language bigger, but not better. A feature like this would be endorsed by some developers, while being avoided by others. Which could make it a lot harder for one camp to review code written by the other camp. I could imagine tension building up in a team of developers containing advocates of both approaches. And that tension would be completely avoidable.

Keep it simple. Simple is good.

miku1958 · July 3, 2024, 7:22am

What you describe happens with every Swift version update. By that logic, Swift should be frozen now, with no new features.

In my opinion, Union would replace the patterns I showed in Motivation and make Swift code cleaner and easier to understand; this has been proven in other languages.

But again, if you don't like a language that keeps adding new features then you should use C, which I'm sure is stable enough for you.

Gero · July 3, 2024, 7:59am

With all due respect, that is not necessarily what @ktraunmueller (and others) are saying.

While I myself have no strong opinion on this, I don't see a compelling enough reason to include this in the language.
That does not mean I am against any changes in principle. Again with all respect, please don't accuse people that don't want this specific change to be against all change and tell them to "go back to C land".

Any change or addition to the language can fracture the user base to some extent; that is always a cost that has to be considered. In some cases the community may find this outcome unlikely or think the feature is worth this cost.
Finding out the balance is what the evolution process is all about.

Here we see some people argue for union types (though there seems to be no strict consensus on what that entails) and some argue against them. Implying the latter group is blocking progress of the language is not productive.

On the matter specifically, I do see the effort enums and protocols may require in the motivation and can understand people may want to get a "shorter" way to do it. However, I fear that this may lead down a path that is too "fuzzy" or "laissez-faire", in a way. When designing my code I usually try to have a robust type structure and when I encounter a situation where I have to become overly verbose I rather question my approach in general and think of redesigning my type relations in a way that allows me an easier way to express whatever I need.

So far, Swift has always allowed me to do that, more or less. What I would need to be convinced more of the value of this proposal is a concrete example where this led to real, big problems that could not be resolved in some other way.

I do understand the examples in the motivation, ofc, but as someone who is mainly doing app development, I can't come up with a real world scenario like this. Perhaps that's a shortcoming on my part, but it's why I am not convinced.

miku1958 · July 3, 2024, 9:05am

I apologize that my wording may not have been as accommodating as it could have been for everyone. I'd welcome a more meaningful rebuttal than "this feature will make Swift more complex and bloated" or "it has no progressive value". To people making that argument, I'd still say C is better for them.

Regarding specific examples, I think the current ones that Union could improve are: typed throws, which can only handle a single Error type, so you have to wrap errors in an enum/protocol or fall back to any Error, either of which is very cumbersome; and something like PrimitiveButtonStyle in SwiftUI, where Apple needs to add a .borderless property to PrimitiveButtonStyle so that developers don't need to look up the documentation to know what they can use. Also, when overloading a method that has a lot of parameters, you have to repeat a lot of pointless code. Either way, Union is a much more efficient and simpler way to solve these problems, and I can only say that, like Tuple, it's not a must-have, but when it works, it really works!
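To make the typed throws point concrete, here is a rough sketch of the wrapper-enum workaround. It assumes a Swift 6 toolchain (typed throws, SE-0413); ParseError, NetworkError, LoadError, load(failWith:) and describe(_:) are all invented for the example.

```swift
struct ParseError: Error {}
struct NetworkError: Error {}

// throws(...) names exactly one type, so two failure modes need a wrapper.
enum LoadError: Error {
    case parse(ParseError)
    case network(NetworkError)
}

func load(failWith code: Int) throws(LoadError) -> String {
    switch code {
    case 1: throw LoadError.parse(ParseError())
    case 2: throw LoadError.network(NetworkError())
    default: return "ok"
    }
}

func describe(_ code: Int) -> String {
    do {
        return try load(failWith: code)
    } catch {
        // `error` is statically LoadError here, so no default case is needed.
        switch error {
        case .parse: return "parse failure"
        case .network: return "network failure"
        }
    }
}

print(describe(0), describe(1), describe(2))
```

With a union, the signature could in principle read throws(ParseError | NetworkError) and the LoadError wrapper would disappear; that spelling is the proposal's hypothetical syntax, not something that compiles today.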

I believe that the language is progressing in the sense that it is becoming more and more efficient for developers to use, not in the sense that developers are rethinking what they're doing wrong in these weird places.

Matt_McLaughlin · July 3, 2024, 10:50am

Broadly speaking I see declaring an enum or protocol (instead of a union type) as having more ceremony at the declaration site but greater conceptual clarity for readers of the code and greater clarity at the use site. It’s a tradeoff.

And it comes down to which you think is more important. Personally I prefer the ‘extra’ ceremony in the service of greater clarity. I think union types make the type system harder to reason about in exchange for ease of writing code. That’s a tradeoff that I don’t think is worth it.

—-

When looking at the motivating examples and listening to the ongoing conversation, it’s clear that the locus of concern is really typed errors. I wonder if there’s a narrower solution that’s specific to error types that would be able to cover 80-90% of the use cases that people care about.

Typed errors are also quite new. I worry about proposing union types to ‘fix’ the typed error experience with so little time in use. I would be much more comfortable with another year or so of typed errors in the wild to gain collective experience before tweaking the ergonomics.

miku1958 · July 3, 2024, 11:07am

I totally understand the need for ceremony, but that doesn't mean they're necessary every time. If a place that covers multiple types is good enough to understand with Union, then I'll use Union if Union is available, rather than using enum or protocol to solve a problem that doesn't fit them. It doesn't mean that with Union I can't solve multi-type problems in other ways, just as Tuple can be used to solve specific problems, not that it can only be solved with Tuple.

michelf · July 3, 2024, 11:42am

Personally, instead of adding a new kind of type, I would just introduce the concept of a closed protocol. It's not exactly the same as a structural union type, but it's so similar to normal (open) protocols that it barely adds any new complexity to the language while covering pretty well what you'd do with a union type.

closed protocol Acceptable {}
extension Int: Acceptable {}
extension String: Acceptable {}
// conformances are only allowed in the same module so the compiler
// always knows the full list of conforming types

In usage it works exactly like any other protocol, with only a bit of convenience added, like being able to exhaustively switch over the existential box:

func accept(_ value: any Acceptable) {
   switch value {
   case let i as Int:    print("accepting Int \(i)")
   case let s as String: print("accepting String \(s)")
   }
}

(We might need an @unknown default in other modules, like for enums, for new cases that could be added later.)
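For contrast, this is roughly what the same function looks like with an ordinary (open) protocol in current Swift. The closed keyword does not exist, so the compiler cannot know the full list of conforming types and a default case is mandatory. Returning a String instead of printing is a choice made for this sketch only.

```swift
protocol Acceptable {}
extension Int: Acceptable {}
extension String: Acceptable {}

func accept(_ value: any Acceptable) -> String {
    switch value {
    case let i as Int:    return "accepting Int \(i)"
    case let s as String: return "accepting String \(s)"
    // Required today: any module could add another conformance.
    default:              return "unhandled conforming type"
    }
}

print(accept(1))
print(accept("hello"))
```

The default branch is exactly what a closed protocol would let the compiler prove unreachable.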

Usage:

accept(1)
accept("hello")

var a = 1 as any Acceptable
a = 2
a = "hello"

The main difference would be that this protocol's existential box is more efficient as its underlying implementation could work like an enum. And method dispatching could also be done with a switch instead of witness tables. I suppose this would be beneficial for embedded Swift.

And if we define another protocol encompassing all the same types (open or closed), casting with as is allowed. For instance:

protocol Rejectable {}
extension Int: Rejectable {}
extension String: Rejectable {}

let r = a as any Rejectable 
// casting allowed without as? or as! 
// because all Acceptable types are also Rejectable

Here we can use as to convert from one type to another because all the types in the closed protocol are known to be compatible with the requested type. This is not an implicit conversion though: just an explicit cast that can't fail. We continue to use as? if there's a chance the value is not part of the destination type:

let i = a as? Int
let b = a as? any BinaryInteger
// casting allowed but may fail

Whether this approach is beneficial for typed throws is another question though. If you want to automatically combine all the types thrown in a do block to form a union type, you need to be able to merge types to form a new union type. This is a bit muddy with protocols.

With closed protocols you'd have to write things like this:

do throws(any AcceptableError) {
    try accept()
    try acceptAgain()
} catch {
    // ...
}

This already works with a normal (open) protocol, so it'd be nothing new... except now you can catch exhaustively all the types in the closed protocol (like with the switch above). The price is you have to choose the protocol beforehand in do throws(...).