[Pitch] Type inference from default expressions

xedin · February 24, 2022, 10:06pm

Authors: Pavel Yaskevich
Implementation: apple/swift#41436
Toolchain: macOS

Introduction

I propose to allow type inference for generic parameters from concretely-typed default parameter values (referred to as default expressions in the proposal) when the call site omits an explicit argument. Concretely-typed default expressions would still be rejected by the compiler if generic parameters associated with a defaulted parameter could be inferred at a call site from any other location in a parameter list by an implicit or explicit argument. For example, the declaration func compute<T, U>(_: T = 42, _: U) where U: Collection, U.Element == T is going to be rejected by the compiler because it's possible to infer the type of T from the second argument, but the declaration func compute<T, U>(_: T = 42, _: U = []) where U: Collection, U.Element == Int is going to be accepted because T and U are independent.
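As a rough illustration of the accepted form, here is a minimal sketch that assumes a Swift 5.7 or newer toolchain with the pitched inference available. The parameter names, the String result, and the use of `[Int]()` in place of `[]` (to keep the default's type explicit) are assumptions made for the example, not part of the pitch.

```swift
// Assumes Swift 5.7+ with type inference from default expressions.

// Accepted form: T and U are independent, so each one can be inferred
// from its own default expression when the argument is omitted.
// `[Int]()` stands in for the pitch's `[]` (illustrative choice).
func compute<T, U>(_ first: T = 42, _ second: U = [Int]()) -> String
    where U: Collection, U.Element == Int {
    // Report which types were chosen for this call.
    "\(T.self), \(U.self)"
}

// Rejected form from the text, kept as a comment because it does not compile:
// func compute<T, U>(_: T = 42, _: U) where U: Collection, U.Element == T

let defaults = compute()                      // T and U come from the defaults
let explicit = compute("label", Set([1, 2]))  // T and U come from the arguments
print(defaults)
print(explicit)
```

When arguments are passed explicitly, the defaults play no role in inference, which is why the second call can pick String and Set<Int>.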

Motivation

The interaction between generic parameters and default expressions is confusing when a default expression only works for a concrete specialization of a generic parameter. It's possible to spell this in the language (in some circumstances), but it requires boilerplate code and knowledge of the nuances of constrained extensions.

For example, let's define a Flags protocol and a container type for the default set of flags:

protocol Flags {
  ...
}

struct DefaultFlags : Flags {
  ...
}

Now, let's declare a type that accepts a set of flags to act upon during initialization.


struct Box<F: Flags> {
  init(dimensions: ..., flags: F) {
    ...
  }
}

To create a Box, the caller would have to pass an instance of a type conforming to Flags to its initializer call. If the majority of Boxes don't require any special flags, this makes for a subpar API experience, because although there is a DefaultFlags type, it's not currently possible to provide a concretely typed default value for the flags parameter, e.g. (flags: F = DefaultFlags()). Attempting to do so results in the following error:

error: default argument value of type 'DefaultFlags' cannot be converted to type 'F'

This happens because, even though DefaultFlags does conform to the Flags protocol, the default value cannot be used for every possible F that can be inferred at a call site, only when F is DefaultFlags.

To avoid having to pass flags, it's possible to "specialize" the initializer over a concrete type of F via a combination of constrained extensions and overloading.

Let’s start with a direct where clause:

struct Box<F: Flags> {
  init(dimensions: ..., flags: F = DefaultFlags()) where F == DefaultFlags {
    ...
  }
}

This init declaration results in a loss of memberwise initializers for Box.

Another possibility is a constrained extension which makes F concrete DefaultFlags like so:

extension Box where F == DefaultFlags {
  init(dimensions: ..., flags: F = DefaultFlags()) {
    ...
  }
}

Initialization of Box without flags: is now well-formed and implicit memberwise initializers are preserved, albeit with init now being overloaded, but this approach doesn’t work in situations where generic parameters belong to the member itself.
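A minimal runnable sketch of this constrained-extension workaround, which compiles with current rules. The `summary` requirement, the `FragileFlags` type, and the `[Int]` dimensions are placeholders invented for the example.

```swift
// Placeholder protocol and conforming types (hypothetical members).
protocol Flags { var summary: String { get } }
struct DefaultFlags: Flags { var summary: String { "default" } }
struct FragileFlags: Flags { var summary: String { "fragile" } }

struct Box<F: Flags> {
    var dimensions: [Int]
    var flags: F
    // No explicit init is declared here, so the implicit memberwise
    // init(dimensions:flags:) is still synthesized.
}

extension Box where F == DefaultFlags {
    // Inside this extension F is known to be DefaultFlags,
    // so a concretely typed default value is accepted.
    init(dimensions: [Int], flags: F = DefaultFlags()) {
        self.dimensions = dimensions
        self.flags = flags
    }
}

// Uses the extension initializer; F is DefaultFlags.
let plain = Box(dimensions: [1, 2, 3])
// Uses the memberwise initializer; F is FragileFlags.
let fragile = Box(dimensions: [1, 2, 3], flags: FragileFlags())
print(plain.flags.summary, fragile.flags.summary)
```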

Let’s consider that there is an operation on our Box type that requires passing a different set of flags:

extension Box {
  func ship<F: ShippingFlags>(_ flags: F) {
    ...
  }
}

The aforementioned approach that employs constrained extension doesn’t work in this case because generic parameter F is associated with the method ship instead of the Box type. There is another trick that works in this case - overloading.

The new overload would have to use a concrete type for flags:, like so:

extension Box {
  func ship(_ flags: DefaultShippingFlags = DefaultShippingFlags()) {
    ...
  }
}
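The overloading trick can be sketched as follows. To keep the example short, Box is non-generic here, the methods return a String, and `describeShipment`, `label`, and `ExpressShippingFlags` are hypothetical names. The private helper exists because a concrete overload that called `ship(flags)` directly would resolve back to itself.

```swift
// Placeholder protocol and conforming types (hypothetical members).
protocol ShippingFlags { var label: String { get } }
struct DefaultShippingFlags: ShippingFlags { var label: String { "standard" } }
struct ExpressShippingFlags: ShippingFlags { var label: String { "express" } }

struct Box {
    // Shared implementation (hypothetical helper) used by both overloads.
    private func describeShipment<F: ShippingFlags>(_ flags: F) -> String {
        "shipping with \(flags.label) flags"
    }

    // Generic entry point: callers must always pass flags.
    func ship<F: ShippingFlags>(_ flags: F) -> String {
        describeShipment(flags)
    }

    // Concrete overload that exists only to carry the default value.
    func ship(_ flags: DefaultShippingFlags = DefaultShippingFlags()) -> String {
        describeShipment(flags)
    }
}

let box = Box()
let standard = box.ship()                       // concrete overload
let express = box.ship(ExpressShippingFlags())  // generic overload
print(standard)
print(express)
```

Every additional defaulted generic parameter would need further overloads of this kind, which is the API-surface cost discussed in this section.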

This is a usability pitfall - what works for some generic parameters doesn't work for others, depending on where the parameter is declared. This inconsistency sometimes leads API authors to reach for existential types, potentially without realizing all of the consequences that might entail, because a declaration like this would be accepted by the compiler:

extension Box {
  func ship(_ flags: any ShippingFlags = DefaultShippingFlags()) {
    ...
  }
}

Also, for enum declarations there is no way to associate a default value with the flags: parameter without using existential types:

enum Box<F: Flags> {
}

extension Box where F == DefaultFlags {
  case flatRate(dimensions: ..., flags: F = DefaultFlags()) // ❌ error: enum 'case' is not allowed outside of an enum
}

To summarize, there is an expressivity limitation related to default expressions that can be mitigated, only in some circumstances, via constrained extensions. That workaround has the following issues:

It doesn't work for generic parameters associated with function, subscript, or case declarations because constrained extensions can only be declared for types, i.e. init<F: Flags>(..., flags: F = F()) where F == DefaultFlags is not allowed.
Methods have to be overloaded, which increases the API surface of Box and creates a matrix of overloads if more than one combination of parameters with default values is required, e.g. if the dimensions parameter were made generic and defaulted for some box sides.
It doesn't work for enum declarations at all because Swift does not support overloading cases or declaring them in extensions.
It requires know-how related to constrained extensions and their ability to bind generic parameters to concrete types.

Proposed solution

To address the aforementioned shortcomings of the language, I propose to support a more concise and intuitive syntax - to allow concretely typed default expressions to be associated with parameters that refer to generic parameters.

struct Box<F: Flags> {
  init(flags: F = DefaultFlags()) {
    ...
  }
}

Box() // F is inferred to be DefaultFlags
Box(flags: CustomFlags()) // F is inferred to be CustomFlags

This syntax could be achieved by amending the type-checking semantics associated with default expressions to allow type inference from them at call sites in cases where such inference doesn’t interfere with explicitly passed arguments.
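A self-contained version of the snippet above, as a sketch that assumes a Swift 5.7 or newer toolchain where this inference is available. The `names` requirement and the stored `flags` property are placeholders added so the example can run.

```swift
// Assumes Swift 5.7+ with type inference from default expressions.
protocol Flags { var names: [String] { get } }
struct DefaultFlags: Flags { var names: [String] { [] } }
struct CustomFlags: Flags { var names: [String] { ["fragile"] } }

struct Box<F: Flags> {
    let flags: F

    // The default expression has the concrete type DefaultFlags;
    // it is only used when the call site omits `flags:`.
    init(flags: F = DefaultFlags()) {
        self.flags = flags
    }
}

let defaultBox = Box()                     // F is inferred to be DefaultFlags
let customBox = Box(flags: CustomFlags())  // F is inferred to be CustomFlags
print(type(of: defaultBox), type(of: customBox))
```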

Detailed design

Type inference from default expressions would be allowed if:

The generic parameter is either the direct type of a parameter, i.e. <T>(_: T = ...), or is used in a nested position, i.e. <T>(_: [T?] = ...).
The generic parameter is used only in a single location in the parameter list. For example, <T>(_: T, _: T = ...) or <T>(_: [T]?, _: T? = ...) are not allowed because only an explicit argument is permitted to resolve a type conflict, which avoids any surprising behavior related to implicit joining of the types.
Note: A result type is allowed to reference generic parameter types inferable from default expressions to make it possible to use the feature while declaring initializers of generic types or cases of generic enums.
There are no same-type generic constraints that relate a generic parameter that could be inferred from a default expression with any other parameter that couldn't be inferred from the same expression. For example, <T: Collection, U>(_: T = [...], _: U) where T.Element == U is not allowed because U is not associated with the defaulted parameter where T is used, but <K: Collection, V>(_: [(K, V?)] = ...) where K.Element == V is permitted because both generic parameters are associated with one expression.
The default expression produces a type that satisfies all of the conformance, layout and other generic requirements placed on each generic parameter it would be used to infer at a call site.
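The direct-position and nested-position rules can be sketched as below, assuming a Swift 5.7 or newer toolchain. The function names and String results are invented for the example, and the `as [Int?]` coercion is only there to make the default's type explicit. The rejected shapes are quoted from the rules above and left as comments because they would not compile.

```swift
// Assumes Swift 5.7+ with type inference from default expressions.

// Direct position: T is the parameter type itself.
func describe<T>(_ value: T = 42) -> String {
    "\(T.self)"
}

// Nested position: T appears inside [T?].
func countSlots<T>(_ values: [T?] = [1, nil, 3] as [Int?]) -> String {
    "\(T.self) x \(values.count)"
}

// Not allowed (T is used in more than one location in the parameter list):
//   func f<T>(_: T, _: T = ...)
// Not allowed (U is tied to T by a same-type constraint but is not part of
// the defaulted parameter):
//   func g<T: Collection, U>(_: T = [...], _: U) where T.Element == U

let fromDefault = describe()             // T inferred from the default
let fromArgument = describe("hi")        // T inferred from the argument
let slotsDefault = countSlots()          // T inferred from the default
let slotsExplicit = countSlots([2.5, nil])
print(fromDefault, fromArgument, slotsDefault, slotsExplicit)
```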

With these semantic updates, both the initializer and ship method of the Box type could be expressed in a concise and easily understandable way that doesn’t require any constrained extensions or overloading:

struct Box<F: Flags> {
  init(dimensions: ..., flags: F = DefaultFlags()) {
    ...
  }
  
  func ship<F: ShippingFlags>(_ flags: F = DefaultShippingFlags()) {
    ...
  }
}

Box could also be converted to an enum without any loss of expressivity:

enum Box<D: Dimensions, F: Flags> {
  case flatRate(dimensions: D = [...], flags: F = DefaultFlags())
  case overnight(dimensions: D = [...], flags: F = DefaultFlags())
  ...
}

At the call site, if the defaulted parameter doesn’t have an argument, the type-checker will form an argument conversion constraint from the default expression type to the parameter type, which guarantees that all of the generic parameter types are always inferred.

let myBox = Box(dimensions: ...) // F is inferred as DefaultFlags

myBox.ship() // F is inferred as DefaultShippingFlags

Note that it is important to establish an association between the type of a default expression and the corresponding parameter type not just for the sake of inference, but also to guarantee that there are no generic parameter type clashes with a result type (which is allowed to mention the same generic parameters):

func compute<T: Collection>(initialValues: T = [0, 1, 2, 3]) -> T {
  // A complex computation that uses initial values
}

let result: Array<Int> = compute() ✅
// Ok both `initialValues` and result type are the same type - `Array<Int>`

let result: Array<Float> = compute() ❌
// This is an error because type of default expression is `Array<Int>` and result
// type is `Array<Float>`
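A runnable variant of the example above, assuming a Swift 5.7 or newer toolchain. The body simply returns its input, which is a stand-in for the elided computation, and the variable names are illustrative.

```swift
// Assumes Swift 5.7+ with type inference from default expressions.
func compute<T: Collection>(initialValues: T = [0, 1, 2, 3]) -> T {
    // Stand-in for "a complex computation": return the input unchanged.
    initialValues
}

// The default expression and the result agree on Array<Int>.
let ints: [Int] = compute()

// With an explicit argument the default is not used, so T can be [Float].
let floats: [Float] = compute(initialValues: [0.5, 1.5])

// Rejected, kept as a comment: the default expression is Array<Int>
// but the result would have to be Array<Float>.
// let clash: [Float] = compute()

print(ints, floats)
```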

Source compatibility

Proposed changes to default expression handling do not break source compatibility.

Effect on ABI stability

No ABI impact since this is an additive change to the type-checker.

Effect on API resilience

All of the resilience rules associated with adding and removing of default expressions are left unchanged, see swift/LibraryEvolution.rst at main · apple/swift · GitHub for more details.

Alternatives considered

The default generic arguments feature mentioned in the Generics Manifesto should not be confused with the type inference rules proposed here. Having the ability to default generic arguments alone is not enough to provide a consistent way to use default expressions when generic parameters are involved. The type inference described in this proposal would still be necessary to allow default expressions with a concrete type to be used when the parameter references a type parameter, and to determine whether the default expression works with a default generic argument type, which means that the default generic arguments feature could be considered an enhancement instead of an alternative approach.

A number of similar approaches have been discussed on the Swift Forums, one of them being [Pre-pitch] Conditional default arguments - #4 by Douglas_Gregor - Dis..., which relies on overloading, constrained extensions, and/or custom attributes and therefore has all of the issues outlined in the Motivation section. Allowing type inference from default expressions is a much cleaner approach in this regard, one that works for all situations without having to introduce any new syntax or custom attributes.

Future Directions

This proposal limits the use of inferable generic parameters to a single location in a parameter list because all default expressions are type-checked independently. It is possible to lift this restriction and type-check all of the default expressions together, which means that if a generic parameter is inferable from different default expressions, its type is going to be a common type that fits all locations (the action of obtaining such a type is called a type-join). It's not immediately clear whether lifting this restriction would always adhere to the principle of least surprise for users, so it would require a separate discussion if this proposal is accepted.

The simplest example that illustrates the problem is test<T>(a: T = 42, b: T = 4.2) -> T. This declaration creates a matrix of possible calls, each of which could be typed differently:

test(): T = Double, because the only type that fits both 42 and 4.2 is Double
test(a: 0.0): T = Double
test(b: 0): T = Int
let _: Int = test(): fails because T cannot be Int and Double at the same time.

xwu · February 24, 2022, 10:40pm

I've skimmed this text and have a vague sense of what is proposed, which seems to be reasonable. However, I can't be confident I understand fully.

Both the "Motivation" section and the "Detailed design" section go into a lot of detail, which is good, but they would benefit greatly from being structured in an inverted pyramid, with the most essential (and basic) points first and building up to the fullest details.

The "Introduction" should, ideally, be enough for an ordinary Swift user to understand at least in broad strokes what problem is being tackled and what the solution will be. However, without defining the terminology being used or giving examples, it's simply not possible for most readers to parse what's meant when you propose to "allow generic parameter inference from default expressions when generic parameters used in a parameter type cannot be inferred from any other parameter or through other generic parameters."

The "Proposed solution" section should be sufficient for readers who aren't trying to implement the feature (for which they'd see "Detailed design") to understand what's being proposed well enough to use the feature--at least the most essential parts even if they don't get all the nuances. As it is, if I read only the "Introduction" and "Proposed solution," I don't think I'd have the faintest clue what's going on.

xedin · February 24, 2022, 10:53pm

I have split Proposed Solution section in two with a code example. Does that help?

I wanted this sentence to define exactly when type inference is going to be allowed. I'm open to suggestions on how to clarify it for regular users.

xedin · February 24, 2022, 11:14pm

How about this instead:

Today, it's difficult to use default parameter values in generic functions because most default values have a concrete type. I propose to allow type inference for generic parameters from concretely-typed default parameter values (referred to as default expressions in the proposal) when the call site omits an explicit argument. Type inference would be allowed when generic parameters associated with a defaulted parameter cannot be inferred from any other location in a parameter list, or through other generic parameters, i.e. via same-type constraints.

hborla · February 24, 2022, 11:23pm

It's surprisingly difficult to come up with a concise summary of this feature for the introduction - Pavel and I just spent the last 30 minutes bikeshedding it over DM and still haven't come up with something totally satisfactory. I would love to get your feedback on the alternative that Pavel just posted.

The terminology required to describe this feature is also some of the hardest terminology to nail down because each term has a lot of nuance. Everyone knows the eternal struggle of "parameter" versus "argument". Throw the type-level variant -- which is crucial to this proposal -- into the mix and now you have "parameter" versus "argument" and "generic parameter" versus "generic argument".

I personally have always called the subject of this proposal "default arguments" because they turn into arguments at the call-site, and that's an important part of the mental model, but there are a bunch of different ways they are referred to elsewhere. TSPL calls them "default values", and the definition also seems to conflate "parameter" and "argument".

You can define a default value for any parameter in a function by assigning a value to the parameter after that parameter’s type. If a default value is defined, you can omit that parameter when calling the function.

If anyone has specific suggestions on how to clear up the terminology in the proposal, I'd love to hear them. Even with definitions, the terminology can be really hard to understand.

xwu · February 25, 2022, 4:09am

I think examples would be best here, particularly if they're used carefully to build up the terminology.

I don't really understand this sentence. That most default values have a concrete type seems pretty self-evident, but I cannot parse why that fact is the cause of any difficulty. I think you really need to illustrate the problem here concretely with an example.

This sentence is crystal clear to me.

I can understand what you're saying by this, but when parsed literally it is confusing: At first blush it looks like you are saying that type inference (of any sort) would be disallowed where type inference is currently possible. This is, of course, not at all the case; you are writing specifically about inferring a type from default value expressions, and it isn't really about allowing or disallowing anything but rather about its relative priority compared to other sources of information for type inference.

In other words, as I understand it, what you're saying is something like: "Type inference for a generic parameter using the type of a default expression will only take place where other sources of information currently used for type inference--such as same-type constraints or another parameter of the same type--aren't available. [Insert example here.]"

xedin · February 25, 2022, 4:27am

The difference here is that this is talking about declarations, so the change of language makes it seem like declarations with same-type constraints and concretely-typed default expressions are going to be accepted, but they are not i.e. <T, U>(_: U, _:T = 42) where U.Element == T is not going to be accepted by the compiler.

Maybe that sentence should be changed to:

Concretely-typed default expressions would still be rejected by the compiler if generic parameters associated with a defaulted parameter could be inferred at call site from any other location in a parameter list by an implicit or explicit argument. For example, declaration func compute<T, U>(_:T = 42, _: U) where U: Collection, U.Element == T is going to be rejected by the compiler because it's possible to infer a type of T from the first argument, but declaration func compute<T, U>(_: T = 42, _: U = []) where U: Collection, U.Element == Int is going to be accepted because T and U are independent.

Edit: After re-thinking this I think that the biggest problem with the previous sentence is that it inadvertently mixed call and declaration site together, so I think it makes sense to clarify use of inference.

xwu · February 25, 2022, 4:59am

xedin:

The difference here is that this is talking about declarations, so the change of language makes it seem like declarations with same-type constraints and concretely-typed default expressions are going to be accepted, but they are not i.e. <T, U>(_: U, _:T = 42) where U.Element == T is not going to be accepted by the compiler.

Maybe that sentence should be changed to:

Concretely-typed default expressions would still be rejected by the compiler if generic parameters associated with a defaulted parameter could be inferred at call site from any other location in a parameter list by an implicit or explicit argument. For example, declaration func compute<T, U>(_:T = 42, _: U) where U.Element == T is going to be rejected by the compiler because it's possible to infer a type of T from the first argument, but declaration func compute<T, U>(_: T = 42, _: U = []) where U: Collection, U.Element == Int is going to be accepted because T and U are independent.

OK, I have to admit then that I do not understand. I cannot parse why the first example will be rejected based on this description (other than that it is missing U: Collection, which I assume was a typo and not actually the point you’re trying to make), and I can’t understand (independently of what is written) why it’s the case in a way where I can help with the writing. Perhaps (if it’s not just me—certainly possible) then this reflects something more fundamental with the teachability of this feature as currently designed and scoped.

xedin · February 25, 2022, 5:13am

Yes, sorry I forgot about U: Collection. The first example is rejected because it would be impossible to verify that default expression matches the requirements of T since default expressions are type-checked when the declaration they belong to is type-checked instead of being implicitly inserted at call sites and type-checked together with other arguments when the overload that contains it is chosen.

xwu · February 25, 2022, 5:15am

Got it! I have to say, this is incredibly subtle (IMO)…

xedin · February 25, 2022, 5:18am

Yes, indeed! If you have any suggestions on how to express it better, I'd love to hear them.

xwu · February 25, 2022, 5:22am

I think your longer explanation is great! It makes a lot of sense.

That said, my worry is that it is really too subtle an explanation to be useful for users in practice—that is, there wouldn’t be much point for someone who wants to just use the proposed feature to actually reason through what you’ve explained: they could just as well attempt to write what might be forbidden, then mash the fix-it button (hopefully there will be one) in cases where the compiler complains. And for me that raises the question of why the compiler couldn’t just DWIM here.

DevAndArtist · February 25, 2022, 5:29am

Is that a typo as DefaultFlags does conform to Flags?!

I only quickly scanned the text so I apologize if I missed some point.

Can we also support this form?

struct Box<F: Flags> {
  var flags: F = DefaultFlags()
}

The init would be compiler generated here.

xedin · February 25, 2022, 9:06am

It sounds like I need to re-phrase that sentence. Even though DefaultFlags does conform to Flags it cannot be used for every possible F that can be inferred at a call site, only when F is DefaultFlags.

xedin · February 25, 2022, 9:07am

I don’t think we can without the default generic arguments feature; I talk about that in Alternatives Considered.

xedin · February 25, 2022, 9:10am

I don’t think there will be a fix-it; we just need to get a clear message across about why it doesn’t work. One idea I had is an educational note for each situation in which a declaration is rejected.

stephencelis · March 1, 2022, 5:34pm

A few folks have asked for more examples of where this pitch could be applied, so we'd like to provide a few concrete examples from a couple of our libraries, which would benefit from this pitch.

One of the more straightforward examples comes from our incremental parsing library, swift-parsing. We have a lot of parser types that come with a lot of initializer overloads just to specify some defaults. The Many parser is a prime example of this. It is a parser that can run a given parser many times on a string and accumulate all the results. It can optionally run a "separator" parser between each invocation, as well as a "terminator" parser on completion. If the separator and/or terminator parsers are omitted, we default to a parser that simply does nothing. We currently have 8 (!) initializers defined to support these permutations across 2 more general initializers (4*2), starting here.

With this pitch, we could bring things down to just 2 initializers with default separators and terminators specified:

public init(
  into initialResult: Result,
  atLeast minimum: Int = 0,
  atMost maximum: Int = .max,
  _ updateAccumulatingResult: @escaping (inout Result, Element.Output) throws -> Void,
  @ParserBuilder element: () -> Element,
  @ParserBuilder separator: () -> Separator = { Always(()) },
  @ParserBuilder terminator: () -> Terminator = { Always(()) }
) { … }

public init(
  atLeast minimum: Int = 0,
  atMost maximum: Int = .max,
  @ParserBuilder element: () -> Element,
  @ParserBuilder separator: () -> Separator = { Always(()) },
  @ParserBuilder terminator: () -> Terminator = { Always(()) }
) { … }
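A reduced sketch of the same idea, assuming a Swift 5.7 or newer toolchain. It is not the swift-parsing API: `MiniParser`, `Nothing`, `Comma`, `EndOfInput`, and `Repeated` are hypothetical stand-ins, and plain values replace the `@ParserBuilder` closures to keep the example small. One initializer covers all four combinations of omitted and supplied separator and terminator.

```swift
// Assumes Swift 5.7+ with type inference from default expressions.
// Hypothetical stand-ins, not the swift-parsing types.
protocol MiniParser { var name: String { get } }
struct Nothing: MiniParser { var name: String { "nothing" } }
struct Comma: MiniParser { var name: String { "comma" } }
struct EndOfInput: MiniParser { var name: String { "end" } }

struct Repeated<Separator: MiniParser, Terminator: MiniParser> {
    let separator: Separator
    let terminator: Terminator

    // Each generic parameter is used in exactly one defaulted parameter,
    // so either one can be inferred from its default independently.
    init(separator: Separator = Nothing(), terminator: Terminator = Nothing()) {
        self.separator = separator
        self.terminator = terminator
    }

    var summary: String { "\(separator.name)/\(terminator.name)" }
}

let neither = Repeated()
let separated = Repeated(separator: Comma())
let terminated = Repeated(terminator: EndOfInput())
let both = Repeated(separator: Comma(), terminator: EndOfInput())
print(neither.summary, separated.summary, terminated.summary, both.summary)
```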

Another example comes from our swift-composable-architecture library, where we define SwiftUI view helpers that can be configured with other views at initialization, and we provide overloads with default views.

Here we have an IfLetStore view, which safely unwraps an observable object's optional state or falls back to some default. We also have an initializer that effectively defaults the else branch to EmptyView:

public init<IfContent, ElseContent>(
  _ store: Store<State?, Action>,
  @ViewBuilder then ifContent: @escaping (Store<State, Action>) -> IfContent,
  @ViewBuilder else elseContent: @escaping () -> ElseContent
) where Content == _ConditionalContent<IfContent, ElseContent> { … }

public init<IfContent>(
  _ store: Store<State?, Action>,
  @ViewBuilder then ifContent: @escaping (Store<State, Action>) -> IfContent
) where Content == IfContent? { … }

If I understand correctly, this pitch would allow us to capture this work in a single initializer instead:

public init(
  _ store: Store<State?, Action>,
  @ViewBuilder then ifContent: @escaping (Store<State, Action>) -> some View,
  @ViewBuilder else elseContent: @escaping () -> some View = { /*EmptyView()*/ }
) { … }

xedin · March 1, 2022, 8:33pm

Thank you for the examples! All of the cases you have mentioned are going to work. The first example, which requires multiple overloads due to the combination of separator: and terminator: parameters, is a great showcase for the ergonomics improvements here, which I also mentioned briefly in the Motivation section.

Nobody1707 · March 1, 2022, 8:36pm

Not that this is anywhere near as big of a problem as your example, but would this allow something like:

protocol PseudorandomNumberGenerator: RandomNumberGenerator {
    // other stuff that we can't quite specify yet
    init<Source: RandomNumberGenerator>(from source: inout Source = &SystemRandomNumberGenerator())
}

Because I've wanted this since the Random Unification.

xedin · March 1, 2022, 8:57pm

inout parameters cannot be defaulted under the existing semantic rules, and this does not change under the proposed rules.
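Since an inout parameter cannot carry a default, the usual way to get a similar call-site experience is a convenience overload that creates the generator locally. A minimal sketch follows; the `roll` functions are hypothetical and unrelated to the protocol above.

```swift
// Generic entry point: the caller supplies the generator by reference.
func roll<G: RandomNumberGenerator>(using generator: inout G) -> Int {
    Int.random(in: 1...6, using: &generator)
}

// Convenience overload standing in for a "defaulted" inout argument.
func roll() -> Int {
    var generator = SystemRandomNumberGenerator()
    return roll(using: &generator)
}

let value = roll()
print(value)
```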