[Pitch 2] Light-weight same-type requirement syntax

stephencelis · February 22, 2022, 12:49am

While I liked @ktoso's mention of "type parameters," and thought "associated type parameters" might contrast nicely with "generic parameters", I've actually come around to think "primary associated types" feels really right.

I think any initial negative reaction to "primary" may have come from conflating it with the original pitch, which arbitrarily limited the feature to a single type and maybe emphasized the potential singularness of the word in my mind and the minds of others. Now that the limitation has been removed, the idea of a protocol having "multiple primary associated types" sounds great to me.

I also don't think any synonym fits better or is as searchable when learning about the topic.

hborla · February 22, 2022, 4:13am

As promised:

Slava_Pestov · February 22, 2022, 4:15am

The downside of introducing a new meaning of "type parameter" is that this term is already used, at least internally in the implementation, to mean "generic parameter or a nested type of another type parameter". So in the generic signature <T : Collection>, the type parameters are T, T.Element, T.SubSequence, T.SubSequence.Element, etc.
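
To make the implementation's sense of "type parameter" concrete, here is a minimal sketch. The function name `secondElement` is hypothetical and exists only for illustration; the point is that `T`, `T.SubSequence`, and `T.SubSequence.Element` all appear as types that depend on the one generic parameter `T`.

```swift
// Every type spelled with a leading `T` below is a "type parameter" in the
// implementation's sense: the generic parameter itself or a nested type of it.
func secondElement<T: Collection>(_ collection: T) -> T.SubSequence.Element? {
    // dropFirst() on a Collection returns its SubSequence.
    let rest: T.SubSequence = collection.dropFirst()
    // Collection guarantees SubSequence.Element == Element, so this is also a T.Element?.
    return rest.first
}
```

For example, `secondElement([10, 20, 30])` walks `T = [Int]`, `T.SubSequence = ArraySlice<Int>`, and `T.SubSequence.Element = Int`.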

xwu · February 22, 2022, 4:17am

What about (by contrast with "generic parameters") just "associated type parameters" (parsed not as "associated (type parameters)" but "associatedtype parameters")?

ktoso · February 22, 2022, 4:20am

Nice idea, @xwu

Not going to argue too strongly for the "type parameter" thing, but to me it really shows what they do, and the similarity to "generic parameter" is somewhat useful -- there'd be:

"parameters" - okay, just function parameters,
"generic parameters" - okay, generics, I know those, the <T> on functions and concrete types, and
"associatedtype parameters" - okay... so like associated types but like generic parameters... so on protocol declarations, and it's spelled similarly to <T>, which I know from generics, okay

Anyway, just my 2c about how one might learn those things. Up to folks deeper in the type system to say if it makes sense or not!

stephencelis · February 22, 2022, 4:47am

This is what I was trying to get at, but I can see how it muddies existing terminology when not read just right. That's exactly why I think "primary associated types" is actually much better, especially when it comes to having a more searchable term, which will aid in learnability in the long run.

hisekaldma · February 22, 2022, 11:05am

I think that would be helpful!

I'm not so sure you haven’t! The feature flag to enable this is -enable-parameterized-protocol-types, and that sounds much better to me than "protocols with primary associated types". If the overarching goal here is to help users take the step from Array<Int> to Collection<Int>, we really should call the things within the brackets "parameters" in both cases, or we’re making that step harder than it needs to be.

So let me throw "protocol parameter" as contrast to "generic parameter" into the hat. Ben’s observation above then becomes simply "protocol parameters will typically match the generic parameters of generic types that conform to the protocol". That has a very nice ring to it.
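
As a rough illustration of the step from Array<Int> to Collection<Int>, the sketch below assumes the feature in the form that later shipped in Swift 5.7, where Collection declares Element as its primary associated type. The function names are hypothetical.

```swift
// Requires Swift 5.7 or later (assumption: the shipped form of this pitch).

// Concrete generic type: the thing in the brackets is a generic argument.
func totalOfArray(_ values: Array<Int>) -> Int {
    values.reduce(0, +)
}

// Constrained protocol: same bracket spelling, but the argument pins the
// protocol's Element associated type instead of a generic parameter.
func totalOfCollection(_ values: some Collection<Int>) -> Int {
    values.reduce(0, +)
}
```

The second function accepts any conforming type whose Element is Int, such as a Set<Int> or a ClosedRange<Int>, which is the generalization the bracket syntax is meant to make easy to reach for.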

DevAndArtist · February 24, 2022, 10:18am

Some thoughts on why I still prefer a marker on the declaration side.

Consider an evolving protocol:

protocol AsyncSequence {
    associatedtype AsyncIterator: AsyncIteratorProtocol
    associatedtype Element where Self.Element == Self.AsyncIterator.Element
    func makeAsyncIterator() -> Self.AsyncIterator
}

protocol AsyncIteratorProtocol {
    associatedtype Element
    mutating func next() async throws -> Self.Element?
}

After primary associated types become a thing, the author of the protocol decides to extend this functionality with that feature.

Does the author need to #if around the entire protocol? Let's assume so.

#if condition
protocol AsyncSequence<Element> where Element == AsyncIterator.Element  {
    associatedtype AsyncIterator: AsyncIteratorProtocol
    func makeAsyncIterator() -> Self.AsyncIterator
}
#else 
// old version
#endif

protocol AsyncIteratorProtocol {
    associatedtype Element
    mutating func next() async throws -> Self.Element?
}

Okay, users of that protocol start to write some AsyncSequence<Element> everywhere.

Great. Now suppose those protocols are later extended with a Failure: Error associated type, in a future where Swift has typed throws. Just as some people have already used the some Publisher<A, ConcreteError> example, I would expect something similar to happen with AsyncSequence.

Do we need to do yet another #if dance here?

#if condition
// can we even mark this extension properly?
protocol AsyncSequence<Element, Failure: Error>
//                              ^~~~~~~~~~~~~~ is this extension even legal, ABI compatible?
  where 
  Element == AsyncIterator.Element,
  Failure == AsyncIterator.Failure
{
    associatedtype AsyncIterator: AsyncIteratorProtocol
    func makeAsyncIterator() -> Self.AsyncIterator
}

protocol AsyncIteratorProtocol {
    associatedtype Element
    associatedtype Failure: Error
    mutating func next() async throws<Failure> -> Self.Element?
}
#else

#if previous_condition
// previous version
#else
// old version
#endif

protocol AsyncIteratorProtocol {
    associatedtype Element
    mutating func next() async throws -> Self.Element?
}
#endif

While we're at it: does this break existing code such as some AsyncSequence<Element>, because of the sudden requirement of a second primary associated type?

I think a pure marker on the associated type wouldn't suffer from all this gigantic #if dance.

protocol AsyncSequence {
    @mark_marker_availability
    marker
    associatedtype Element where Self.Element == Self.AsyncIterator.Element

    @available(...)
    associatedtype Failure: Error where Self.Failure == Self.AsyncIterator.Failure

    associatedtype AsyncIterator: AsyncIteratorProtocol
    func makeAsyncIterator() -> Self.AsyncIterator
}

protocol AsyncIteratorProtocol {
    associatedtype Element
    @available(...)
    associatedtype Failure: Error
    mutating func next() async throws<Failure> -> Self.Element?
}

Request to amend `AsyncSequence`

So even if we had today's definition of AsyncIteratorProtocol, we could extend it later [*]:
@rethrows public protocol AsyncIteratorProtocol {
  associatedtype Element

  @available(Swift 5.6 or whatever)
  associatedtype Failure: Error = /*Never if the conforming type's next() is non-throwing, Error otherwise */

  mutating func next() async throws -> Element?
}
[...]

[*] There is one bit of metadata we'll need to record for rethrowing protocol conformances to make the defaulting work properly. It's not a big deal.

That's where this example originated. I would also like to know whether it's considered a breaking change to expose a set of primary associated types and then add another one in a later library iteration.

// today
protocol P {
  primary associatedtype A
  associatedtype B
}
some P<ConcreteA>

// future
protocol P {
  primary associatedtype A
  primary associatedtype B
}

// does this break?
some P<ConcreteA>

It seems to me that if the some P<ConcreteA> part was written with a non-sugar general form, it would still function just fine, however the special 'primary' syntax seems to lead us into a 'require and break' corner.
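
The contrast between the sugared and the general form can be sketched as follows. This assumes the spelling that eventually shipped in Swift 5.7 (`protocol P<A>` rather than the `primary associatedtype` keyword used above); the conforming type `Impl` and both function names are hypothetical.

```swift
// Requires Swift 5.7 or later (assumption: the shipped spelling of the feature).
protocol P<A> {
    associatedtype A
    associatedtype B
    var a: A { get }
}

// Hypothetical conforming type: A is inferred as Int, B is given explicitly.
struct Impl: P {
    typealias B = String
    let a: Int
}

// Sugared form: depends on A being in the primary associated type list.
func sugared(_ value: some P<Int>) -> Int {
    value.a
}

// General form: a plain same-type requirement. It names A directly, so it
// does not depend on which associated types the protocol marks as primary.
func general<T: P>(_ value: T) -> Int where T.A == Int {
    value.a
}
```

Both functions accept the same set of types today; only the first one is sensitive to later changes in the primary list.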

Please provide clarifications on that.

Slava_Pestov · February 24, 2022, 6:15pm

You need an #if dance either way: with primary associated types declared at the top of the protocol with a "generic parameter list", you need #if around the entire protocol. With a 'primary' keyword or attribute, you only need #if around the primary associated types in the protocol body.

Adding a new associated type to a protocol is legal as long as it has a default (otherwise, it is source and binary breaking since existing conforming types don't have a witness). So we need a syntax like so if we go with the "generic parameter list":

protocol AsyncSequence<Element, Failure: Error = MyDefaultError> {...}

The proposal and implementation as written allows you to specify zero, one or more primary associated types when referring to the protocol. So some AsyncSequence, some AsyncSequence<Int> and some AsyncSequence<Int, MyError> would all be valid with your example.

Slava_Pestov · February 24, 2022, 6:18pm

So to clarify:

Adding a new associated type is binary and source compatible, as long as it has a default
Making an existing associated type primary is binary compatible; also source compatible as long as it's added at the end of the list
Making an existing primary associated type non-primary is binary compatible but source breaking
Re-ordering primary associated types is binary compatible but source incompatible
Re-ordering non-primary associated types is binary compatible and source compatible
Removing an associated type entirely is binary incompatible and source incompatible
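
The first rule in the list above can be seen in a small sketch. The protocol `Container` and the type `IntBox` are hypothetical; the sketch covers source compatibility only and says nothing about ABI.

```swift
protocol Container {
    associatedtype Item
    // Imagine this associated type was added in a later library version.
    // Because it has a default, conforming types written earlier still compile.
    associatedtype Failure: Error = Never
    func first() -> Item?
}

// Written against the older protocol: it never mentions Failure.
struct IntBox: Container {
    let values: [Int]
    func first() -> Int? { values.first }
}
```

Without the `= Never` default, `IntBox` would stop compiling because it provides no witness for `Failure`, which is the source break described in the post above.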

hborla · February 24, 2022, 6:22pm

Note that you only need to do the #if dance if the library supports compiler versions that cannot parse the primary associated type syntax. So, if a library compatible with the Swift 5.6 compiler adopts primary associated types and then decides to add another primary associated type later, the library does not need to add another #if condition for the second primary associated type.
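
A minimal sketch of a single gate, assuming the syntax ends up available from Swift 5.7 (the version in which it later shipped) and that the library also supports older compilers. The protocol `Converter` and the type `Stringifier` are hypothetical. A `compiler(...)` condition is used because code in a branch that fails such a version check is not parsed, so older compilers never see the new syntax.

```swift
#if compiler(>=5.7)
// Newer compilers: Input and Output are primary associated types.
protocol Converter<Input, Output> {
    associatedtype Input
    associatedtype Output
    func convert(_ input: Input) -> Output
}
#else
// Older compilers: the same requirements without the bracketed list.
protocol Converter {
    associatedtype Input
    associatedtype Output
    func convert(_ input: Input) -> Output
}
#endif

// Conforming types are written once and compile in both configurations.
struct Stringifier: Converter {
    func convert(_ input: Int) -> String { String(input) }
}
```

Adding a further primary associated type later would only change the first branch, since every compiler that takes that branch already understands the syntax.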

Jumhyn · February 24, 2022, 6:44pm

I would really like to see this handled in a way that doesn't require duplicating the entire protocol body. IMO that imposes a pretty high maintenance burden on library authors to keep the different versions in sync. But I'm also not responsible for maintaining such a library so perhaps I am overreacting?

Is there a way to specify just the second of two primary associated types? It seems a bit strange to me to impose a de facto hierarchy on the primary associated types based on order. (Would placeholder types allow this to 'just work' as some AsyncSequence<_, MyError>?)

Jon_Shier · February 24, 2022, 6:54pm

As a maintainer of a library, I certainly wouldn't want to support such a bifurcation. I don't know if it's valuable enough to immediately drop older compiler support. Alamofire's ResponseSerializer could take advantage of it, but so few people use anything other than the built-in serializers that I don't know if it's important to support quickly.

(If Apple wants us to jump to newer Swift versions faster, they need to support older macOS versions longer.)

Slava_Pestov · February 24, 2022, 8:14pm

That should just be done with the full where clause syntax. Remember that the primary associated types feature is not intended to replace where clauses entirely; you still need them for more complex requirement specifications.

This could be made to work as long as primary associated type constraints are only valid in generic requirement position, but it introduces an ambiguity as soon as we allow primary associated type constraints on any for the types of values; the placeholder means "infer this from context", not "leave this unspecified". That is,

let a: Array<_> = [1, 2, 3]

infers the type of a as Array<Int> from the expression; it doesn't erase the element type to give you a hypothetical <T> Array<T> existential. Similarly, you would expect that

let a: any Sequence<_> = [1, 2, 3]

would infer the type of a as any Sequence<Int>, not any Sequence with Element erased.
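
The "infer this from context" meaning of the placeholder is easy to check for concrete generic types. This sketch assumes a compiler with placeholder types (Swift 5.6 or later) and deliberately leaves out the any Sequence<_> case, which is the hypothetical part of the argument above.

```swift
// Requires Swift 5.6 or later for placeholder types.

// The placeholder is filled in from the initializer expression.
let a: Array<_> = [1, 2, 3]

// Placeholders can be mixed with explicit arguments.
let d: Dictionary<String, _> = ["one": 1]

// The static types are fully concrete: nothing is erased.
print(type(of: a))
print(type(of: d))
```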

Slava_Pestov · February 24, 2022, 8:17pm

If it is any consolation, primary associated types do not require any runtime support nor do they introduce new ABI, so as long as you can use the new compiler you can still backward deploy code that uses the feature to older platform versions.

Jon_Shier · February 24, 2022, 8:48pm

That doesn’t really matter when people can’t run the version of Xcode required. macOS is technically Swift’s least supported platform, as far as versions and actually being able to ship software go.

John_McCall · February 24, 2022, 9:53pm

I don’t think it’s reasonable to accept minimizing #if complexity as an ongoing factor on language evolution. I’d be interested in knowing if there are other ways we can address this backward-compatible source library use case, though. In particular, when we’re printing module interfaces, we do have logic to emit #if conditions to allow the interfaces to be parsed by older tools. That logic isn’t perfect, but it might be a foundation for doing the same rewrite to arbitrary source. So we could have a tool that does a source-to-source translation and emits the redundant declarations necessary to make code interpretable by older compilers. Of course, maintainers would then have to actually run that tool when packaging their library for distribution.

gwendal.roue · February 24, 2022, 10:30pm

I'm sorry, but I do not understand. What would library maintainers have to do, for which purpose?

John_McCall · February 24, 2022, 10:38pm

I mean that we could make a source tool that turns e.g.

protocol Translator<Input, Output> {
  func translate(_: Input) -> Output
}

into:

#if $ProtocolPrimaryAssociatedTypes
protocol Translator<Input, Output> {
  func translate(_: Input) -> Output
}
#else
protocol Translator {
  associatedtype Input
  associatedtype Output
  func translate(_: Input) -> Output
}
#endif

for the benefit of source library maintainers who want to generate versions of their libraries that work with older versions of the compiler. Basically, a new version of the compiler would compile the library into source that can be compiled by older compilers.

Of course, maintainers would then have to run that tool in order to publish versions of their library instead of just having clients check out a tag of their repository. And they would also want to test that the output actually worked on older tools, but that's presumably not a new requirement.

The advantage is that, assuming the tool works, you get to just write code to the latest version of the compiler without having to manually maintain redundant declarations or whatever other #if complexity is necessary to support older compilers. The disadvantages are that you need the tool to exist and you need a sort of compilation phase to distribute backward-compatible versions of your library.

Ben_Cohen · February 24, 2022, 11:08pm

The package manager is also growing more support for custom build steps and build plugins, which might make custom preprocessing to strip primary associated types a tolerable option.