[Pitch] Primary Associated Types in the Standard Library

Hi all,

While SE-0346 is converging on a nice design, I think it's time to start discussing which associated types should be marked primary throughout the Standard Library.

To get things started, I've prepared a draft proposal (direct Markdown link, evolution PR) and an initial implementation that adds primary associated type declarations to 20 existing stdlib protocols.

The key part is this table; it lists all public protocols in the Standard Library that have associated type requirements, along with their proposed primary associated type (if any), as well as a list of other associated types.

| Protocol | Primary | Others |
| --- | --- | --- |
| Identifiable | ID | -- |
| Sequence | Element | Iterator |
| IteratorProtocol | Element | -- |
| Collection | Element | Index, Iterator, SubSequence, Indices |
| MutableCollection | Element | Index, Iterator, SubSequence, Indices |
| BidirectionalCollection | Element | Index, Iterator, SubSequence, Indices |
| RandomAccessCollection | Element | Index, Iterator, SubSequence, Indices |
| RangeReplaceableCollection | Element | Index, Iterator, SubSequence, Indices |
| LazySequenceProtocol | Elements | Element, Iterator |
| LazyCollectionProtocol | Elements | Element, Index, Iterator, SubSequence, Indices |
| SetAlgebra | Element | ArrayLiteralElement |
| OptionSet | Element | ArrayLiteralElement, RawValue |
| RawRepresentable | RawValue | -- |
| RangeExpression | Bound | -- |
| Strideable | Stride | -- |
| Numeric | -- | IntegerLiteralType, Magnitude |
| SignedNumeric | -- | IntegerLiteralType, Magnitude |
| BinaryInteger | -- | IntegerLiteralType, Magnitude, Stride, Words |
| UnsignedInteger | -- | IntegerLiteralType, Magnitude, Stride, Words |
| SignedInteger | -- | IntegerLiteralType, Magnitude, Stride, Words |
| FixedWidthInteger | -- | IntegerLiteralType, Magnitude, Stride, Words |
| FloatingPoint | -- | IntegerLiteralType, Magnitude, Stride, Exponent |
| BinaryFloatingPoint | -- | IntegerLiteralType, FloatLiteralType, Magnitude, Stride, Exponent, RawSignificand, RawExponent |
| SIMD | Scalar | ArrayLiteralElement, MaskStorage |
| SIMDStorage | -- | Scalar |
| SIMDScalar | -- | SIMDMaskScalar, SIMD2Storage, SIMD4Storage, ..., SIMD64Storage |
| Clock | Instant | -- |
| InstantProtocol | Duration | -- |
| AsyncIteratorProtocol | -- (1) | Element |
| AsyncSequence | -- (1) | AsyncIterator, Element |
| GlobalActor | -- | ActorType |
| KeyedEncodingContainerProtocol | -- (4) | Key |
| KeyedDecodingContainerProtocol | -- (4) | Key |
| ExpressibleByIntegerLiteral | -- | IntegerLiteralType |
| ExpressibleByFloatLiteral | -- | FloatLiteralType |
| ExpressibleByBooleanLiteral | -- | BooleanLiteralType |
| ExpressibleByUnicodeScalarLiteral | -- | UnicodeScalarLiteralType |
| ExpressibleByExtendedGraphemeClusterLiteral | -- | UnicodeScalarLiteralType, ExtendedGraphemeClusterLiteralType |
| ExpressibleByStringLiteral | -- | UnicodeScalarLiteralType, ExtendedGraphemeClusterLiteralType, StringLiteralType |
| ExpressibleByStringInterpolation | -- | UnicodeScalarLiteralType, ExtendedGraphemeClusterLiteralType, StringLiteralType, StringInterpolation |
| ExpressibleByArrayLiteral | -- | ArrayLiteralElement |
| ExpressibleByDictionaryLiteral | -- | Key, Value |
| StringInterpolationProtocol | -- | StringLiteralType |
| Unicode.Encoding | -- | CodeUnit, EncodedScalar, ForwardParser, ReverseParser |
| UnicodeCodec | -- | CodeUnit, EncodedScalar, ForwardParser, ReverseParser |
| Unicode.Parser | -- | Encoding |
| StringProtocol | -- | Element, Index, Iterator, SubSequence, Indices, UnicodeScalarLiteralType, ExtendedGraphemeClusterLiteralType, StringLiteralType, StringInterpolation, UTF8View, UTF16View, UnicodeScalarView |
| CaseIterable | -- | AllCases |

Notes:

(1) AsyncSequence and AsyncIteratorProtocol logically ought to have Element as their primary associated type. However, we have ongoing evolution discussions about adding a precise error type to these. If those discussions bear fruit, then the new Error associated type would need to also be marked primary. To prevent source compatibility complications, adding primary associated types to these two protocols is deferred to a future proposal.

Note that once we add primary associated types to a protocol, we will be forever stuck with them -- removing primaries, reordering the list, or (as of SE-0346) extending an existing list with a new type would all be source breaking changes.
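
For readers who have not followed SE-0346 closely, here is a minimal sketch of what a primary associated type declaration and its use site look like. `Container`, `IntBox`, and `total` are hypothetical names invented for illustration (none of them are part of the Standard Library), and the snippet assumes a Swift 5.7 or later toolchain.

```swift
// Hypothetical protocol, not part of the stdlib. The name in angle
// brackets marks `Item` as the primary associated type (SE-0346).
protocol Container<Item> {
    associatedtype Item
    var items: [Item] { get }
}

// Hypothetical conforming type; `Item` is inferred as Int.
struct IntBox: Container {
    var items: [Int]
}

// The bracketed spelling constrains the primary associated type.
// It is shorthand for:
//   func total<C: Container>(_ container: C) -> Int where C.Item == Int
func total(_ container: some Container<Int>) -> Int {
    container.items.reduce(0, +)
}
```

With these definitions, `total(IntBox(items: [1, 2, 3]))` evaluates to 6. If `Container` later grew a second primary, the spelling `Container<Int>` would stop compiling at every use site, which is the kind of source break described above.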

Notable points:

  1. AsyncSequence and AsyncIteratorProtocol are intentional omissions -- given that there are ongoing discussions about adding an Error type to these, it seems prudent to leave these out until we know more. (Per SE-0346, adding Error as a second primary would be a source-breaking change.)

  2. LazySequenceProtocol and LazyCollectionProtocol use Elements as their primary associated type, not Element. The idea here is that if we spell out LazyCollectionProtocol, we probably want to also say something about the wrapped collection, not just its element type -- for example, by constraining it to random access collections using some LazyCollectionProtocol<some RandomAccessCollection<String>>.

  3. We do not want protocols to declare multiple primaries unless we want to force every use site to list all of them, even if they only care about some. (E.g., if for some reason Collection declared Element and Index as its primaries, then every single use site that wanted to use the bracketed syntax would need to spell out an index type -- which would be quite unwieldy.)

  4. KeyedEncodingContainerProtocol and KeyedDecodingContainerProtocol originally had Key as their primary associated type. This was removed during the course of this discussion.
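
Point 2 can be made concrete with classic generic syntax, which does not depend on the outcome of this pitch. The function below spells out the constraint that `some LazyCollectionProtocol<some RandomAccessCollection<String>>` would abbreviate if `Elements` became the primary. `totalLength` is a hypothetical helper used only for illustration.

```swift
// Hypothetical helper. Constrains the wrapped collection (`Elements`),
// not just the element type, using a plain `where` clause.
func totalLength<L: LazyCollectionProtocol>(_ lazyStrings: L) -> Int
where L.Elements: RandomAccessCollection, L.Elements.Element == String {
    // `elements` exposes the underlying, non-lazy collection.
    lazyStrings.elements.reduce(0) { $0 + $1.count }
}
```

For example, `totalLength(["primary", "associated", "type"].lazy)` returns 21, since `[String].lazy` wraps an array, and arrays are random-access collections of `String`.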

As of Swift 5.6, the following public protocols don't have associated type requirements, so they are outside of the scope of this proposal.

Equatable, Hashable, Comparable, Error, AdditiveArithmetic,
DurationProtocol, Sendable, UnsafeSendable, Actor, AnyActor, Executor,
SerialExecutor, Encodable, Decodable, Encoder, Decoder,
UnkeyedEncodingContainer, UnkeyedDecodingContainer,
SingleValueEncodingContainer, SingleValueDecodingContainer,
ExpressibleByNilLiteral, CodingKeyRepresentable,
CustomStringConvertible, LosslessStringConvertible, TextOutputStream,
TextOutputStreamable, CustomPlaygroundDisplayConvertible,
CustomReflectable, CustomLeafReflectable, MirrorPath,
RandomNumberGenerator, CVarArg

Feedback would be most appreciated -- did I miss anything?

It might be worth grouping the table into protocols which are

  1. Candidates for this proposal:

    • they declare or inherit associated types.
  2. Not applicable:

    • they currently don't have associated types;
    • marker protocols (Sendable and UnsafeSendable);
    • type aliases (Unicode.Encoding and Unicode.Parser).

Are KeyedEncodingContainerProtocol and KeyedDecodingContainerProtocol commonly used?

I'd like to echo a concern from @Karl from the review thread - some Identifiable<Int> reads like "identifiable int"... which is not what this is. Given the accepted syntax (that does not require associated type names), I don't think it's a good idea to give Identifiable a primary associated type due to the possible confusion...
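
To make the two readings concrete, the sketch below writes the constraint in classic generic syntax: it is the `ID` associated type that gets pinned to `Int`, while `Int` itself does not conform to `Identifiable`. `User` and `ids(of:)` are hypothetical names for illustration.

```swift
// Hypothetical model type; its ID is inferred as Int from `id`.
struct User: Identifiable {
    let id: Int
    let name: String
}

// What `some Identifiable<Int>` would mean if ID became the primary:
// any type *identified by* an Int, not "an Int that is identifiable".
func ids<T: Identifiable>(of values: [T]) -> [Int] where T.ID == Int {
    values.map(\.id)
}
```

Here `ids(of:)` accepts an array of `User`, but passing `[1, 2, 3]` would not compile, because `Int` is not `Identifiable`.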

In the context of this discussion, I can totally appreciate how much this makes sense; but I am quite certain that in the heat of the moment using the protocol this would be a source of confusion. Since we'd be forever stuck with what we adopt, I wonder if in the first iteration of adoption we are better off to avoid making permanent choices that lend themselves to explanations in footnotes.

This would apply to Strideable as well, though, and more broadly to any *able (and *ing) protocol (since the "-able" always refers to an ability of Self and the primary associated type would never be Self).

It'd be pretty undesirable to avoid the use of a feature that's specifically for protocols on the large proportion of protocols which are named exactly as we recommend them to be. Rather, since the accepted syntax is what it is and the accepted naming recommendations for protocols are what they are, I think it's important that we use them together consistently and through experience have users come to read it in a way that "works."

Happy to see AsyncSequence and its iterator protocol being excluded. It'd be a real pain point if the issue of the missing Failure (not Error) associated type weren't addressed. Unfortunately, I think we're stuck with this issue until Swift gets some form of typed throws. Until then, AsyncIteratorProtocol cannot be properly patched.

I hope that works out :slightly_smiling_face: As I understand it, all these proposals exist at least in part to make generics more accessible to new users. Having to explain to new users that "yes, it does look like an Int that's identifiable, but you should instead read it in this different way" doesn't seem to gel with that.

Intuitively I don’t feel that Identifiable should have a primary associated type, but this clearly shows that we need a guideline for when to use primaries. The ones that have been mentioned so far are:

  • The subject type of functors, or “types that look like collections or publishers”
  • Types such that one would expect a generic implementation to parameterize them

Neither of these is ready to copy-paste into TSPL (although, while the first one was mine, I think the second is probably better).

Mmm, I like this...

I think it’s “right”, but not really very helpful to people who aren’t already comfortable designing generic types.

I 100% get this, but on the other hand, this is an inherent trade off in this feature -- we don't have labels for things within angle brackets, so it isn't at all obvious what Int means in MyProtocol<Int> or MyGenericStruct<Int>. To understand these, I have to go look at the declaration/documentation of the protocol/struct.

(Incidentally, the first cut of SE-0346 highlighted that we don't have a formal way to add API docs for type parameters of a generic type or generic function.)

If we were to take this objection seriously, then that would effectively mean that SE-0346 will be strictly limited to Sequence-style container protocols. Then again, I'm sure even Collection<Int> will confuse somebody: is it providing a collection interface to access individual bits in an integer?

My impression from reading the Core Team's verdict is that this is intended to be a general-purpose alternative to the classic generic declaration syntax, not some niche feature that is dedicated to sequences.

Yeah, that was the suggestion Keith Bauer made in the first SE-0346 review.

That works! But it doesn't necessarily make things obvious.

In the specific case of LazySequenceProtocol/LazyCollectionProtocol, this recommendation still leads to choosing Elements as the primary associated type rather than Element.

public struct LazySequence<Base: Sequence> {...}
public struct LazyPrefixWhileSequence<Base: Sequence> {...}
public struct LazyMapSequence<Base: Sequence, Element> {...}
...

I'm also noticing that the meaning of RawRepresentable<String> seems pretty obvious to me, despite the protocol using the *able naming convention, and despite the fact that it doesn't fit this recommendation -- raw-representable types aren't typically generic over their raw value.

Yeah, it's an intuitively appealing rule, but I don't mean to suggest that it's a perfect one. Probably the biggest problem is that it's hard to imagine generic models for a lot of these protocols. I don't know what a generic Clock would look like, for example.

A different rule might be that you can easily find the right simple preposition to read the type out loud:

  • Collection of Int
  • Identifiable as Int
  • SIMD of Float
  • Strideable with Int

I'm not sure if that rule gives us any clear counterexamples where we wouldn't have a primary associated type, though.
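
As a rough illustration of how two of those readings map onto constraints, here is a sketch in classic generic syntax. The function names `stepped` and `laneSum` are hypothetical, and the `where` clauses stand in for the bracketed spellings `Strideable<Int>` and `SIMD<Float>` that the pitch would enable.

```swift
// "Strideable with Int": the Stride type is pinned to Int.
func stepped<T: Strideable>(_ start: T, by steps: Int) -> T where T.Stride == Int {
    start.advanced(by: steps)
}

// "SIMD of Float": the Scalar type is pinned to Float.
func laneSum<V: SIMD>(_ vector: V) -> Float where V.Scalar == Float {
    var total: Float = 0
    // Walk every lane of the vector and accumulate its value.
    for lane in vector.indices {
        total += vector[lane]
    }
    return total
}
```

In both cases the bracketed type describes something associated with the conforming type (its stride, its scalar), never the conforming type itself.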

Scratching at this a little more -- as long as I am even vaguely familiar with what Identifiable is for, "identifiable integer" seems like an obviously wrong, surface-level misinterpretation. As convincing as this objection appears at first glance, it may not hold much actual water.

A bit off-topic:

Reading this particular example makes me wonder if the whole feature could have been designed differently, so that no change to the actual protocol would be required and we'd instead introduce a primary alias of some sort. Strawman syntax:

// or just a typealias in general
primaryalias CollectionOf<Element> = Collection where .Element == Element

CollectionOf<Int>
IdentifiableAs<Int>
SIMDOf<Float>
StrideableWith<Int>

:thinking:

Unfortunately the ship has sailed for that.

That feels like it would be even more confusing, as we'd lose the familiar protocol names.

If the language allowed labels for type parameters, generics might have ended up looking something like this:

Array(of: Int)
Dictionary(key: Int, value: String)
Range(of: String.Index)

This would've helped here:

Collection<of: Int>
Identifiable<by: Int>
Strideable<with: Int>

Of course, that ship has sailed, returned, sailed again, capsized, and is now resting somewhere at the bottom of the ocean. :smiling_imp:

I still have hopes for labels on generic type parameters, though. We don't have them now, but maybe one day we will. I'd love to see something like this:

struct Tuple</* generic labels on the generic pack */>: ExpressibleByTupleLiteral
enum OneOf</* generic labels on the generic pack */>

let value_1: Int | String = .1("swift")         // OneOf<Int, String>
let value_2: (a: Int | b: Int) = .a(42)         // OneOf<a: Int, b: Int>
let value_3: (x: Int, y: Int) = (x: 0, y: 1)    // Tuple<x: Int, y: Int>
let value_4: (label: String) = (label: "swift") // Tuple<label: String>

I agree it should be Elements. Elements is kind of what I refer to as Base in this post on the review of primary associated types:

I am not sure Elements will actually prove useful to anyone as a primary associated type, but it probably won't do any harm.

I was going to make this kind of list, but then I thought of Optional Int and, well…there's no preposition that goes there. We named the Optional enum with an adjective instead of a noun, and it is supposed to be read that way, so much so that we have the Int? spelling for it. So Identifiable Int is a natural reading even if we'd prefer people to say Identifiable by Int. (That doesn't mean we can't do Identifiable<Int>, though, just that there is that potential for confusion.)

Good idea -- I updated the original post accordingly. I also listed the rest of the associated types for each protocol. (Check out the list on StringProtocol. :nerd_face::face_with_raised_eyebrow:)

Note that Unicode.Encoding and Unicode.Parser are considered public protocols that currently happen to be typealiases to underscored protocol definitions, for very annoying technical reasons. (Which we should eventually get around to resolving...)

No -- in fact, given that Encoder/Decoder sadly forces these into type-erased boxes, there is very little reason for anyone to write generic functions over them. Therefore, there is no reason for these to have a primary associated type. :disappointed:

I updated the table accordingly.
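
As a reminder of what the type-erased boxes look like in practice: `Encoder.container(keyedBy:)` returns the concrete `KeyedEncodingContainer<Key>` struct rather than a generic type constrained to `KeyedEncodingContainerProtocol`, so user code rarely names the protocol at all. `Point` and `encodedJSON` below are hypothetical names for illustration, and `JSONEncoder` (from Foundation) is used only to exercise the conformance.

```swift
import Foundation

// Hypothetical type with a hand-written Encodable conformance.
struct Point: Encodable {
    var x: Int

    enum CodingKeys: String, CodingKey { case x }

    func encode(to encoder: Encoder) throws {
        // The result is the type-erased struct KeyedEncodingContainer<CodingKeys>,
        // not a generic type constrained to KeyedEncodingContainerProtocol.
        var container: KeyedEncodingContainer<CodingKeys> =
            encoder.container(keyedBy: CodingKeys.self)
        try container.encode(x, forKey: .x)
    }
}

// Hypothetical helper that runs the conformance through JSONEncoder.
func encodedJSON(_ point: Point) throws -> String {
    let data = try JSONEncoder().encode(point)
    return String(decoding: data, as: UTF8.self)
}
```

Because the use site only ever sees the concrete box, there is no place where a bracketed `KeyedEncodingContainerProtocol<Key>` spelling would naturally appear.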
