SE-0341: Opaque Parameter Declarations

ensan-hcl · February 16, 2022, 9:57am

I replied to the wrong address. This is a reply to @bjhomer.

I am one of the people who feel that "callee-constructed"-ness is essential to some. In my case, this feeling comes from the strangeness of having two different notions of sameness in the same language.
As we all know, generic parameters and generic result types are similar features. Under this proposal, generic parameters and 'reverse-generic' result types would also appear to be similar features, because they share the same some syntax.
As a result, if we introduce the proposed feature, Swift will have two different ways of classifying generics: one is whether something is generic or reverse-generic, and the other is whether it uses some or not. We don't usually need two different viewpoints, but we would have two.
If I instead consider "callee-constructed"-ness to be essential to some, then there is only one notion of sameness in Swift. This feels much more natural to me.

tera · February 16, 2022, 10:00am

When users learn that the advanced angle bracket syntax can be used for some parameter types, they will try to apply that same syntax to some result types and will be confused about why they can't.

Tino · February 16, 2022, 12:14pm

I somehow got reminded of something that was posted long ago:

Staying in this analogy, I lately got a strong feeling that there are some bricks which Core does not like anymore, but instead of searching or creating new stones, they try to reshape everything with lots of mortar…

I'm not opposing this particular proposal and don't think it will change Swift significantly, but small things add up, and it looks unavoidable that the language becomes more and more complex and less elegant :-(.

hooman · February 16, 2022, 9:33pm

My justification will need a full sized article and in the end, it will be just my opinion. Tbh, I don't have time to get into this discussion. Let's assume it is my gut feeling without justification and leave it at that.

Douglas_Gregor · February 16, 2022, 10:21pm

I see this proposal, SE-0309 (unlocking existentials), SE-0335 (existential any), and the lightweight syntax as all pieces of the same puzzle: both existential types and opaque types are getting more general and are growing the same syntax for representing constraints. We don't want one monolithic proposal, and I don't think it makes sense to try to build up a tower of conditional-accepts either. Rather, I think we treat this as one piece in a larger story: by itself it generalizes some in a natural way and makes a few things cleaner, and with more of the puzzle pieces in place it becomes more valuable.

I suppose it could, although it would be a lot easier for us to go refactor the code to make it nicer.

I had completely forgotten this! And yes, it would be much nicer to write out the some Sequence<Element> here in autocomplete.

If you haven't done so before, I recommend reading the Rust RFC for a similar feature, especially the section on learnability. I think this use of some lines up with programmer's intuition about arguments vs. return values, and who gets to choose those values, which extends naturally to types.

Doug

patrickgoley · February 16, 2022, 10:22pm

Ok. Let’s assume you’re right and everyone learns generics via Swift, all the more reason to get them right! Maybe you've got some feedback on the critiques above that doesn't involve writing a full article? Your input is valuable and would be much appreciated.

Yes, I think we should use different names for different concepts and I would be in favor of this syntax. Great suggestion.

I think "reverse generics" is a total misnomer and creates significant cognitive dissonance. See my points above about how opaque return types cannot, in any way, be rewritten as a generic type parameter. The return type is strictly not generic, it's just hidden. Once you've selected the return type (as the function author), it will only ever be that one type unless you choose to refactor its internals. When you write a generic function, the type variables can take on any type that fits the constraints, depending on how they are applied by the caller. That is what makes a function generic.

Also, the types of generic arguments are always unknown to the function author, even with the existing syntax. This proposal does nothing to make them more opaque than they already are. I think a better name for this proposal would be Anonymous Type Variables. It would allow you to create a generic type inline without having to declare it beforehand in the angle brackets, which is pretty cool, but I think not similar at all to the existing idea of some.

Ben_Cohen · February 16, 2022, 10:28pm

Can you define what in your view "generic" means, such that you can state opaque return types are strictly not generic?

patrickgoley · February 16, 2022, 11:08pm

Happy to do so, and please correct me if my understanding is incorrect. I'm hoping it is incorrect because I'm really confused by the notion that opaque return types are considered generic.

My understanding of generics, thanks for reading

Generic functions (or structs, etc) contain type variables that act as a placeholder that can be filled in with a real, concrete type that fits the constraints after the fact and which are unknown to the function author. Generics are sort of like a template. Depending on what types are supplied to the type variables (either within the author's module or by users of the module), the compiler can and will generate copies of the function (or struct, etc), with those selected types filled in. This is called monomorphization, and it's great because it allows polymorphism without dynamic dispatch (but at the cost of code size).

Opaque return types (in my understanding) do not create type variables that can participate in monomorphization. They are compiled to have a single, unchanging return type which is known at compile time of the module, even if the function is being exported in a library where it might be called by others.

Generic functions, on the other hand, must maintain their type variables after compilation (at least when publicly exported). This is so that they can be further monomorphized by users compiling their code against the library with the generic function. Otherwise we'd be limited to exporting functions that only use existentials which have performance drawbacks (eg dynamic dispatch).

To give an example, here's a function with an opaque return type:

func makeString() -> some StringProtocol {
    return "hello"
}

This can (and in my understanding, will) be compiled down to:

func makeString() -> String {
    return "hello"
}

with the caveat that some StringProtocol is preserved in the module's interface, so as not to expose the actual static type, avoiding dynamic dispatch and API fragility. But under the hood (namely at the SIL layer, I think), it really is just a function that returns a String (doesn't sound very generic to me, but who knows?).

Compare that (to what I think of) a generic function that has an explicit type variable:

func printDescription<T: CustomStringConvertible>(value: T) {
    print(value.description)
}

Perhaps if this function is only used internally to the module it's defined in, it can be monomorphized for all of its use cases, so that the type variables are eliminated and the original function disappears. But if the function is exported in the module's public interface, it can't be compiled away, because users of the library may want to supply their own custom types to the type variable T and generate additional monomorphizations.

That distinction is what, in my mind, makes opaque return types distinct from generics. Please do provide any insight if I'm off base, thanks in advance!
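A minimal sketch of the caller-chooses behaviour described in this post, assuming a hypothetical Point type and a describe function that are not from the thread. It only shows that one generic declaration accepts different concrete types at different call sites; whether the compiler emits a specialized copy for each is an optimization detail this example does not demonstrate.

```swift
// Hypothetical type, used only for this illustration.
struct Point: CustomStringConvertible {
    let x: Int
    let y: Int
    var description: String { "(\(x), \(y))" }
}

// One generic declaration; T is chosen separately at each call site.
func describe<T: CustomStringConvertible>(_ value: T) -> String {
    "value: \(value.description)"
}

let fromInt = describe(42)                   // T is Int here
let fromPoint = describe(Point(x: 1, y: 2))  // T is Point here

print(fromInt)
print(fromPoint)
```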

patrickgoley · February 16, 2022, 11:19pm

Thanks Doug, the Rust RFC was helpful, specifically this line regarding impl Trait

If you pick the value, you also pick the type

That gives me an intuition of what some means, both in parameter and return positions. However! This is a departure from true (or at least classical / historical) generics, where the type variables are always filled in by the caller (both in Swift and many other languages) and can vary depending on the context. So the fact that we're moving away from that notion is what makes me feel like some (at least in the return position) is tangential but not equivalent to generics.

Maybe someone who is new to programming will not be surprised by this and simply adopt the intuition found in the quote above. But to me, it's very surprising that we consider some in the return position generic, where the type is supplied by the callee, and limited to a single known (to the compiler) type that does not and cannot vary after the fact.

tera · February 17, 2022, 1:01am

That. Plus from a pure lexical standpoint, if these two are considered equivalent:

func foo<T: P>(param: T) { ... }
func foo(param: some P) { ... }

users will rightfully assume these should be also equivalent:

func foo() -> some P {}
func foo<T: P>() -> T {}

and be very much surprised to learn they are not!

A different keyword can work here and make it mentally (and otherwise) unambiguous:

func foo<T: P>(param: T) { ... }
<--->
func foo(param: generic P) { ... }

func foo<T: P>() -> T {}
<--->
func foo() -> generic P { ... }

and another thing of its own kind with no angle bracket innuendo:

func foo() -> some P {}

Alternatively something like this (with no extra keyword):

func foo<Apple: Comparable>(param: Apple) { ... }
<--->
func foo(param: Comparable Apple) { ... }

func foo<Apple: Comparable>() -> Apple {}
<--->
func foo() -> Comparable Apple { ... }

This latter form would allow repeating Apple more than once, allowing more advanced use cases.

func foo(apple: Comparable Apple, apple2: Apple) { ... }
    <--->
func foo<Apple: Comparable>(apple: Apple, apple2: Apple) { ... }

xwu · February 17, 2022, 1:31am

In the general case, the underlying return type is unknown to the compiler of the caller and it can vary.

To develop the intuition more fully, consider:

Given func f() -> some P and func g(_: some P)—aka func g<T: P>(_: T)—you can call g(f()), supplying the outer function with the result of the inner function without knowing the type.
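The composition described above can be sketched as follows. The protocol P, its label requirement, and the Hidden type are hypothetical names chosen for illustration, and some in parameter position assumes a compiler that implements this proposal (Swift 5.7 or later).

```swift
protocol P {
    var label: String { get }
}

// Hypothetical concrete type; callers of f() never name it.
struct Hidden: P {
    let label = "hidden"
}

// Callee picks the underlying type of the result.
func f() -> some P {
    Hidden()
}

// Caller picks the argument type; equivalent to func g<T: P>(_ value: T).
func g(_ value: some P) -> String {
    "received \(value.label)"
}

// The call site composes the two without ever spelling the concrete type.
let message = g(f())
print(message)
```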

Douglas_Gregor · February 17, 2022, 1:34am

This is precisely why deeper discussions of some in return position refer to reverse generics: the principle that there is a type variable that can be reasoned about abstractly but is not concretely known remains the same, but who chooses the type variable changes between generics (caller) and reverse generics (callee).

Doug

Paulo_Faria · February 17, 2022, 1:37am

tera:

That. Plus from a pure lexical standpoint, if these two are considered equivalent:
func foo<T: P>(param: T) { ... }
func foo(param: some P) { ... }
users will rightfully assume these should be also equivalent:
func foo() -> some P {}
func foo<T: P>() -> T {}
and be very much surprised to learn they are not!

I agree that I would also be confused by this.

Ben_Cohen · February 17, 2022, 2:10am

To be fair, my question was a bit of a trap, because generics is not as well defined a thing as you make out. It means different things in different languages, and my question was prompted by your saying that "the return type is strictly not generic", which is quite a bold statement about what is really a not so rigorously defined bit of terminology.

If we look for definitions out there, Stepanov's paper starts with:

Generic programming centers around the idea of abstracting from concrete efficient algorithms to obtain generic algorithms that can be combined with different data representations to produce a wide variety of useful software. For example, a class of generic sorting algorithms can be defined which work with finite sequences but which can be instantiated in different ways to produce algorithms working on arrays or linked lists.

But going by that definition alone, you could just say the Java 1.0 List interface is generic. But no-one thinks of it that way. Here's another paper What is Generic Programming? that gets us closer to what we think of as generics in Swift:

In the simplest view generic programming is equated to a set of language mechanisms for implementing type-safe polymorphic containers, such as List in Java. The notion of generic programming that motivated the design of the Standard Template Library (STL) advocates a broader definition: a programming paradigm for designing and developing reusable and efficient collections of algorithms.

This is in my mind what distinguishes generics in Swift from the alternative of type-erasing polymorphism you can achieve with existentials (or classes) where you have a variable that can hold (or point to) any one of a number of different types. Given this, some Foo in both parameter or return position is very much in the generics camp. It is type preserving, not type erasing. That does not mean you get to know what the type is. Inside a function, you are passed some (specific) Foo, but must write code that works for all Foo. Outside a function, you are passed back some (specific) Foo and must write code that works for all Foo.

You rightly point out that this opens up some opportunities for performance optimization. But a lot of what you lay out is not really what defines generics, but is more like implementation details of some generics systems, and some of it is not true of Swift. Swift does not perform full monomorphization like C++ or Rust: Swift only specializes functions as an optimization, and unspecialized generics in Swift still rely on dynamic dispatch through witness tables to achieve their polymorphism. But nor does it require everything to be a pointer like Java/ObjC: an array of [some Foo] will hold all the elements contiguously inline because they're all known to be the same type, instead of boxing them like [any Foo] does. And specialization is not specific to generics either. Swift also has an existential specializer, which can specialize a function that takes an existential to take a specific type when it can directly see a concrete type is being passed in.
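A rough sketch of the type-preserving versus type-erasing distinction drawn above. The protocol Foo, the types A and B, and both functions are hypothetical; the any spelling assumes Swift 5.6 or later and [some Foo] in parameter position assumes Swift 5.7 or later. The sketch shows only what the type system accepts, not how the elements are laid out in memory.

```swift
protocol Foo {
    var value: Int { get }
}

struct A: Foo { let value = 1 }
struct B: Foo { let value = 2 }

// Type-preserving: every element has the same (unnamed) concrete type.
func sumSame(_ items: [some Foo]) -> Int {
    items.reduce(0) { $0 + $1.value }
}

// Type-erasing: each element may be a different conforming type.
func sumAny(_ items: [any Foo]) -> Int {
    items.reduce(0) { $0 + $1.value }
}

let homogeneous = [A(), A()]           // inferred as [A]
let mixed: [any Foo] = [A(), B()]      // needs the existential spelling

let sameTotal = sumSame(homogeneous)
let mixedTotal = sumAny(mixed)
// sumSame(mixed) is expected to be rejected, because `any Foo`
// is not itself a single type conforming to Foo.

print(sameTotal, mixedTotal)
```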

Finally it might also help expand your definition of what is generics to know that in your example:

patrickgoley:

here's a function with an opaque return type:
func makeString() -> some StringProtocol {
    return "hello"
}
This can (and in my understanding, will) be compiled down to:
func makeString() -> String {
    return "hello"
}

When a function returning an opaque type is not inlinable, i.e. the caller cannot see it, then the calling compiler cannot assume the function returns a String. What's more, if the function is in an ABI-stable library, what value it returns can change over time. Without recompiling the caller, the library can be swapped out (e.g. by updating the OS) and the caller can receive an entirely different type to before. The caller can handle this because the caller is written to be generic over any StringProtocol, using the exact same mechanism that allows a pre-compiled ABI-stable generic function to take any specific type implementing StringProtocol as an argument. Hence the term reverse generics.
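To illustrate the caller's side of this, here is a small sketch that reuses the makeString example from the thread. The caller code is an assumption about typical usage: it touches the result only through StringProtocol, so it would keep working if the underlying type changed.

```swift
// Same shape as the example quoted above: the underlying type is hidden.
func makeString() -> some StringProtocol {
    return "hello"
}

let result = makeString()

// Only StringProtocol API is available on `result`.
let shouted = result.uppercased()
let length = result.count

// let s: String = result   // rejected: the caller cannot assume String

// An explicit conversion works for any StringProtocol value.
let copy = String(result)

print(shouted, length, copy)
```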

hborla · February 17, 2022, 2:49am

I don't think this is a given. The advanced programmers participating in this discussion thread might easily understand a function that looks like func foo<T: P>() -> T, but inputting a type parameter without a value, and returning a value of that type by, usually, initializing it through a protocol initializer or static method requirement is a fairly advanced thing to do in code. It's also very unclear to Swift newcomers how to supply a generic argument to such a function: this must be done via contextual type or coercion. It's becoming increasingly common to declare such a function as func foo<T: P>(type: T.Type) -> T, because it's a lot more obvious how to call this function and supply a generic argument.
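The calling conventions mentioned above can be sketched like this. The Fruit protocol, the Apple and Pear types, and both makeFruit functions are hypothetical names for illustration only.

```swift
protocol Fruit {
    init()
    var name: String { get }
}

struct Apple: Fruit {
    init() {}
    var name: String { "apple" }
}

struct Pear: Fruit {
    init() {}
    var name: String { "pear" }
}

// Type parameter used only in return position: T must come from context.
func makeFruit<T: Fruit>() -> T {
    T()
}

// Same idea, but the caller passes the type explicitly.
func makeFruit<T: Fruit>(type: T.Type) -> T {
    T()
}

let viaAnnotation: Apple = makeFruit()        // contextual type
let viaCoercion = makeFruit() as Pear         // coercion
let viaMetatype = makeFruit(type: Pear.self)  // explicit type argument

print(viaAnnotation.name, viaCoercion.name, viaMetatype.name)
```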

I don't think func foo<T: P>() -> T is something worth sugaring, nor do I think the some P syntax would be usable for this API pattern. If the return value is initialized via a static initializer or method call on the type parameter, the code would need to use leading dot syntax due to the inability to reference the type parameter declared by some P. At that point, it might be more readable to just name the type parameter. If you're trying to return some other value dynamically cast to that type, you need to name the type parameter anyway.

If I encountered a function returning some P, I also wouldn't find it intuitive to need to coerce the return value to a concrete type in order to call that function. Perhaps that's because I've already internalized the opaque return type model, but I've also been programming in Swift for several years and have never found input type parameters used only in return position very intuitive nor easy to use. Personally, I find the "input" and "return" type parameter model proposed here much more intuitive and useful based on how input and return values are used in practice. I also very much like how easy it becomes to turn a function accepting and returning existential types, e.g. func f(value: P) -> P, into one accepting and returning opaque types if the capabilities of existential types are not actually needed. For this reason, I also believe the proposed model will be more intuitive to programmers who have relied on subtype polymorphism as their primary abstraction tool before embracing Swift's generics system.

Douglas_Gregor · February 17, 2022, 2:50am

tera:

Plus from a pure lexical standpoint, if these two are considered equivalent:
func foo<T: P>(param: T) { ... }
func foo(param: some P) { ... }
users will rightfully assume these should be also equivalent:
func foo() -> some P {}
func foo<T: P>() -> T {}
and be very much surprised to learn they are not!

And as a corollary, a user sees

func f(i: Int) { }

where the caller gets to provide the value for the Int but

func g() -> Int { }

and rightfully assumes that the caller gets to decide what the value is?

Values flowing into function parameters and out of the return type is the natural way we think of functions. A function like the foo you argue for above is most likely a terrible API. A better API design for something that is parameterized over T and returns a T would take the type as one of its parameters:

func betterFoo<T: P>(type: T.Type) -> T {}

You see this in APIs like unsafeBitCast(_:to:) because relying on backward type inference is poor API design.
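As a concrete instance of that pattern, here is a short sketch using the standard library's unsafeBitCast(_:to:). The choice of Float and UInt32 is just for illustration; both are 32 bits wide, which the function requires.

```swift
let one: Float = 1.0

// The destination type is passed as a value, so no type annotation
// or coercion is needed at the call site.
let bits = unsafeBitCast(one, to: UInt32.self)

// The same bits are available through the safe bitPattern property.
let sameBits = one.bitPattern

print(bits == sameBits)
```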

So, I'm having trouble with this hypothetical person that is using some P in the return type, misinterpreting it as generics, and manages to go more than a few minutes further into their confusion without bumping into the type checker and learning the proper interpretation.

And I am having a lot of trouble justifying the introduction of a third completely different syntax (generic T) in the hope that it will save that person those few minutes, because there's a much higher cognitive overload to new keywords than there is to making existing concepts work in new places, so long as there is a decent reason why it's the same concept. We have that explanation already in this proposal.

Doug

tera · February 17, 2022, 3:30am

On that particular note:

Why is it terrible?
If there's agreement that it is terrible, should that possibility (type inference by return value) be taken out of Swift?

Ben_Cohen · February 17, 2022, 4:41am

This is a specific and extreme case of the broader idea that overloading only by return type, requiring type context to disambiguate, is accepted as bad practice when designing Swift APIs.

But just because you can compose some obviously bad code from basic constructs doesn't mean we need to go around special casing and forbidding them. You wouldn't write while true { }, or a function with a hundred parameters, but we don't ban them.
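For readers unfamiliar with the practice being discouraged, a minimal sketch of overloading only by return type follows. The parse functions are hypothetical and exist only to show why type context is needed at the call site.

```swift
// Two overloads that differ only in their return type.
func parse(_ text: String) -> Int {
    Int(text) ?? 0
}

func parse(_ text: String) -> Double {
    Double(text) ?? 0
}

// The caller has to supply type context to pick an overload.
let count: Int = parse("42")
let ratio = parse("42") as Double

// let unclear = parse("42")   // rejected as ambiguous without context

print(count, ratio)
```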

Ben_Cohen · February 17, 2022, 4:42am

Review Conclusion

The review period has ended and the proposal is accepted.