SE-0358: Primary Associated Types in the Standard Library

Bringing this review back to the top now that WWDC is over.

I tried to browse the search results. The vast majority were copies (not forks) of the standard library. Several others were for iterating an option set, all based on the same answer from Stack Overflow. I didn't find any interesting code, but that isn't an argument for or against OptionSet<RawValue>.

Codable.swift has a default implementation for option sets and enums. Your serialization example could use some RawRepresentable<Int>; a protocol composition such as some OptionSet & RawRepresentable<Int> is also supported by SE-0346.
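As a rough sketch of the some RawRepresentable<Int> spelling (assuming Swift 5.7 or later; the serialize function and the Permissions and Level types are made up for illustration and are not from the original example):

```swift
// Hypothetical helper: accepts any value whose raw value is an Int.
// Relies on RawRepresentable declaring RawValue as its primary
// associated type (Swift 5.7 and later).
func serialize(_ value: some RawRepresentable<Int>) -> String {
    String(value.rawValue)
}

// Hypothetical option set backed by Int.
struct Permissions: OptionSet {
    let rawValue: Int
    static let read = Permissions(rawValue: 1 << 0)
    static let write = Permissions(rawValue: 1 << 1)
}

// Hypothetical enum backed by Int.
enum Level: Int {
    case low = 1
    case high = 2
}

let permissions: Permissions = [.read, .write]
print(serialize(permissions))  // raw value of read combined with write
print(serialize(Level.high))   // raw value of the enum case
```

The same function accepts both the option set and the enum, without mentioning OptionSet at all.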


Option sets consume/produce either Element or Self. They're expressible by an array literal of Elements. Their RawValue is usually an implementation detail.

I can't think of a suitable preposition (from the second guideline) to describe OptionSet<RawValue>.


Have the guidelines been tested against the SDK? The following are defined in Foundation, but I'm unfamiliar with their usage.

| Protocol | Associated Types |
| --- | --- |
| SortComparator | Compared |
| ReferenceConvertible | ReferenceType |
| DecodableWithConfiguration | DecodingConfiguration |
| DecodingConfigurationProviding | DecodingConfiguration |
| EncodableWithConfiguration | EncodingConfiguration |
| EncodingConfigurationProviding | EncodingConfiguration |
| AttributeScope | DecodingConfiguration, EncodingConfiguration |
| FormatStyle | FormatInput, FormatOutput |
| ParseableFormatStyle | FormatInput, FormatOutput, Strategy |
| ParseStrategy | ParseInput, ParseOutput |
| DataProtocol | Regions, Element == UInt8, Index, Iterator, SubSequence, Indices |
| MutableDataProtocol | Regions, Element == UInt8, Index, Iterator, SubSequence, Indices |

To clarify, FooImpl<T> is a stand-in here for those uses of generics which are clear (that is, Array<Element>, etc.)—to reject the premise would be to reject that there can be any uses of generic parameters which are clear at the point of use, which I don't think is what you mean. I think Array<Element> is plenty clear, for example.

Hence, why I proposed a formulation of a rule that extends beyond Collection-like protocols, although it seems your point (2) is that it doesn't extend too far. That would be a fair critique.

One point of concern here is that Strideable types sometimes but not always are their own Stride, and having worked with (on, rather) this protocol quite extensively I can say that keeping the distinction straight is so important for the correctness of generic code.
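To make the distinction concrete, here is a small sketch using standard library types only; the midpoint helper is hypothetical and ignores overflow, which is exactly the kind of corner case that makes such generic code tricky:

```swift
// Some Strideable types are their own Stride...
assert(Int.Stride.self == Int.self)
assert(Double.Stride.self == Double.self)

// ...but others are not: UInt values are strided, Int does the striding.
assert(UInt.Stride.self == Int.self)

// Hypothetical helper that has to keep the two roles apart:
// T is the strided type, T.Stride is the striding type.
// Note: distance(to:) can overflow for values that are far apart;
// this sketch does not handle that.
func midpoint<T: Strideable>(_ a: T, _ b: T) -> T where T.Stride: BinaryInteger {
    a.advanced(by: a.distance(to: b) / 2)
}

let middle = midpoint(UInt(10), UInt(20))
print(middle)
```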

I really, really want to emphasize again how much I disagree with this proposed metric of usefulness. I really think it is a serious mistake to refer to the number of extensions in the defining library as the metric for providing some feature for end users.

One of the key reasons we add new APIs to the standard library (recalling the core team's guidance on when something meets the bar for inclusion) is that something commonly used is not trivially composed from existing APIs but rather difficult to implement correctly (lots of corner cases, etc.). As the author of a chunk of those extensions you list that constrain Strideable, I can say that they are safely in the "Don't try this at home, kids!" category of tricky, and some of the constraints I wrote (extension Strideable where Stride: FloatingPoint and extension Strideable where Self: FloatingPoint, Self == Stride—refer to my point above about distinguishing the striding and strided types) are directly because of the trickiness of implementation.

Citing the number of times we chose to do the work of handling corner cases in the standard library, precisely so that users don't have to, as a reason to give users an easier way to do the same thing is not the correct metric. I feel very strongly about this.

Because the syntax can be added in a future version in an ABI-compatible way but not removed if it's added now, and because I do have concerns about active harm (see above—namely, encouraging folks to parameterize protocols to write overly generic algorithms that are tricky to get right due to underspecified semantics, on the basis of where we are using constraints in the standard library for implementation detail purposes precisely to steer users away from that), my humble opinion is that we ought to be erring strongly on the other side and omitting adoption until we have a better sense of usefulness.

I feel similarly.

In particular, this language (though I realize it's not currently part of the proposal text):

would to me suggest that basically every protocol with a single associated type ought to make it primary. The noted exception of ExpressibleByIntegerLiteral doesn't to me provide very useful guidance, given that it exists basically to support compiler 'magic'. The question we repeatedly ask on this forum whenever new protocols are proposed or discussed is 'what useful generic algorithms would this enable?', so excluding protocols which aren't designed to be used in a generic context would seem to exclude almost nothing.

Also, it seems a bit odd to simultaneously allow multiple primary associated types at the language level, while discouraging ever using them in the design guidelines. If we think that their use is so rarely useful and likely to be confusing, should we potentially reconsider allowing multiple primary associated types at all, at least for Swift 5.7?

This just makes life difficult for someone who does have a use case for them, for a type that isn’t of the same nature as those in the standard library.

We should, though, be able to articulate the circumstances in which we would recommend such use. If you are that someone, would you be able to share what use cases you have in mind and what the guidance should then be?

I think it would be a yellow flag, at least, that we have made available a feature and then immediately recommended against its use. There ought to be something (even if niche) that can demonstrate how the feature carries its own weight.

Swift currently has many issues stemming from the legacy tuple splat. Once Swift gets variadic generics, the language could explicitly model (Int, Int) and (a: Int, b: Int) as distinct types that both conform to protocol Tuple<T...> where T... == (Int, Int), or in other words, Tuple<Int, Int>.

This would also apply to functions, which could be modeled as conforming to protocol Function<Args..., Return>. Significantly, this gives us a way to spell the type of generic closures: type(of: { _ arg: some BinaryInteger in arg.bitWidth }) == some Function<BinaryInteger, Int>.


Perhaps less esoterically, a protocol PairwiseIterator<Left, Right> might be handy for zip-like algorithms.

My question was rather geared towards examples of non-hypothetical protocols in use today that would benefit from support for multiple primary associated types in Swift 5.7, from which we can study guidelines that aren’t just “don’t do it.”

Wouldn’t the element type have to be (Left, Right), and thus give good reason for the use of a single primary associated type—i.e., PairwiseIterator<(Left, Right)>? This is why concrete examples are key here.

This is what I'd want to see as well. The proposal gestures towards this:

Of course, if the majority of clients actually do want to constrain both Key and Value , then having them both marked primary can be an appropriate choice.

but this feels a bit under-specified to me, especially when it appears under the pretty unequivocal heading Limit yourself to just one primary associated type.

This (and the tuple example) suggests to me a potential guideline along the lines of:

For a protocol with two (or more) associated types that are semantically symmetric (i.e., they could be swapped and the protocol would behave basically the same) it may be appropriate to make both primary.

but that feels fairly narrow and doesn't encompass the hypothetical Function protocol. Good real world examples would be quite valuable here, I think.

Just because the Element has to be a tuple doesn’t mean that the tuple should be the protocol’s single associated type. Why force programmers to type the extra parentheses? In fact, without additional constraints imposed by the protocol requirements, a protocol can’t actually enforce that its single associated type be a 2-tuple. That makes two good reasons for using two associated types, IMO.

(Edit: I added some extra detail at the top.)

The number of constraining cases in the defining library is, from my experience, a good indication of similar use outside of it. For the specific case of Strideable, a quick search of public repositories on GitHub readily uncovers a number of cases that would benefit from the more readable constraint syntax:

// Current code:
    public init<V>(
      value: Binding<V>,
      step: V.Stride = 1,
      onEditingChanged: @escaping (Bool) -> Void = { _ in },
      @ViewBuilder label: () -> Label
    ) where V: Strideable {...}
// Proposed option:
    public init(
      value: Binding<some Strideable<Int>>,
      step: Int = 1,
      onEditingChanged: @escaping (Bool) -> Void = { _ in },
      @ViewBuilder label: () -> Label
    ) {...}

// Current code:
extension BidirectionalCollection
where Index: Strideable, Iterator.Element: Comparable, Index.Stride == Int {...}
// Proposed option:
extension BidirectionalCollection
where Index: Strideable<Int>, Iterator.Element: Comparable {...}

// Current code:
public func +--><Bound: Strideable>(lhs: Bound, rhs: Bound) -> Range<Bound>
where Bound.Stride == Bound {...}
// Proposed option:
public func +--><Bound: Strideable<Bound>>(lhs: Bound, rhs: Bound) -> Range<Bound> {...}

// Current code:
extension TotallyOrderedSet where Element: Strideable, Element.Stride: SignedInteger {...}
// Proposed option: (assuming future work in this area)
extension TotallyOrderedSet<Strideable<SignedInteger>> {...}

// Current code:
public struct Real<T>: RealNumber where T: Strideable, T.Stride == T {...}
// Proposed option: (assuming future work in this area)
public struct Real<T: Strideable<T>>: RealNumber {...}

How many more of these examples do we need?

The new lightweight constraint syntax is, as explicitly stated, intended to (1) make "generic programming in Swift feel more natural and approachable" by providing a less scary alternative to classic where clauses, and (2) to enable new features that weren't previously possible (namely, constraining opaque result types (SE-0346) and existential types (SE-0353)).

These goals are dependent on protocol authors adding the necessary primary associated types, starting with the Standard Library. I don't think it would be right for the API guidelines to second guess these goals by artificially restricting the new lightweight constraint syntax to a small subset of standard protocols or by discouraging their general use.

I do believe that for many people, T: FloatingPoint & Strideable<T> is going to be easier to read/understand than T: FloatingPoint & Strideable where T.Stride == T. To me, this alone is reason enough to support this syntactic variant.
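For comparison, a minimal sketch of the two spellings side by side, assuming Swift 5.7 or later with Strideable declaring Stride as its primary associated type. The span functions are hypothetical, and the FloatingPoint constraint is left out to keep the example short:

```swift
// Classic spelling: same-type requirement in a where clause.
func spanClassic<T>(from a: T, to b: T) -> T
where T: Strideable, T.Stride == T {
    a.distance(to: b)
}

// Shorthand spelling: the same requirement expressed as Strideable<T>.
func spanShorthand<T: Strideable<T>>(from a: T, to b: T) -> T {
    a.distance(to: b)
}

// Both accept types that are their own Stride, such as Double and Int.
print(spanClassic(from: 1.5, to: 4.0))
print(spanShorthand(from: 2, to: 10))
```

Both declarations express the same generic signature; only the spelling of the constraint differs.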

I feel that your objection here is, at its core, a critique of Strideable itself, rather than the idea of giving it a primary annotation. I do agree with this assessment. I consider Strideable to have largely been an API design miss -- exactly because it's so impractical to define extensions on it without running into (mostly unresolvable) problems with e.g. handling overflow situations.

In addition, I do not think it would be overly difficult to deal with Strideable not providing a primary associated type. This is a protocol of marginal use. Will being able to type Strideable<Int> rather than <S: Strideable> where S.Stride == Int really make a difference either way? I sincerely doubt it.

However, here is my problem: how can we reasonably define what constitutes a useful enough primary associated type without inviting these sorts of subjective discussions every single time we consider a new protocol?

If we started fixing Strideable's flaws, then at exactly what point would it become an extensible enough protocol to gain a primary associated type? Would allowing a non-trapping overflow path on distance/advanced(by:) be enough? Do we need to expose a cleaned-up version of _step, too? Or is the concept of strideability somehow inherently incompatible with the shorthand constraint syntax?

What happens when someone complains that their preferred shorthand syntax doesn't work for Strideable, and supplies a good use case? Do we fire up a new Swift Evolution proposal every time that happens? Wouldn't it be more productive to shortcut subjective discussions about usefulness by encouraging protocols to gain primary types as long as the choice of which associated type to mark as primary is obvious?

I honestly don't see how we can draw a clearly defined API design line between Identifiable<ID>, RawRepresentable<RawValue> on one side and Strideable<Stride> on the other. The associated types in all three of these protocols are absolutely core to their existence, and so far I have not come across an objective reason not to give Strideable its obvious primary.
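For reference, this is roughly what the shorthand looks like for Identifiable<ID>, assuming Swift 5.7 or later. The User type and the position function are hypothetical, for illustration only:

```swift
// Hypothetical model type identified by an Int.
struct User: Identifiable {
    let id: Int
    let name: String
}

// Hypothetical lookup over any array of values identified by an Int.
// [some Identifiable<Int>] replaces
// <T: Identifiable>(... [T]) where T.ID == Int.
func position(ofID id: Int, in items: [some Identifiable<Int>]) -> Int? {
    items.firstIndex { $0.id == id }
}

let users = [User(id: 7, name: "Ada"), User(id: 9, name: "Grace")]
print(position(ofID: 9, in: users) as Any)
```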

The heuristic you proposed earlier was to look at the type arguments of generic types that conform to the protocol. This does not help distinguish between these three cases.

All the shorthand notation does is provide an alternative way to spell generic constraints. It doesn't make it easier to implement such extensions, but it does make them superficially easier to read -- as long as the role of the type name within angle brackets is clear, which (I believe) is obviously true for Strideable<some FloatingPoint>. Where exactly is the harm in that?

Should we just err on the other side and restrict primary associated types to Element types on container/stream protocols? That would hamstring this language feature, but it would certainly make the API guidelines easier.

The second hit for me is a function with the following signature:

func data<T>(forInterpretation interpretation: T.Type) -> T 
where
T:OptionSet,
T.RawValue == UInt64

With the proposed changes, this could be optionally written as:

func data<T: OptionSet<UInt64>>(forInterpretation interpretation: T.Type) -> T

The fifth hit is a struct declaration:

public struct OptionSetIterator<Element: OptionSet>: IteratorProtocol where Element.RawValue == Int

With the shorthand syntax, this could eventually be written as

public struct OptionSetIterator<Element: OptionSet<Int>>: IteratorProtocol

I can see a couple more examples of this sort on the first page.

(Note: depending on how far ahead we go in the future, the replacements above may need a small pinch of some keywords.)

"An option set of UInt64" is something I would not find difficult to grok if I heard it in polite conversation.

That said, if you feel there is a real possibility of confusion here, then that sort of settles the issue -- this is a marginal case, and I do not think it's worth pushing it through if there is a high potential for persistent confusion. (Especially given that I myself have got it wrong on the first try!)

Have the guidelines been tested against the SDK? The following are defined in Foundation, but I'm unfamiliar with their usage.

Yes, although with the same caveat. 😉

SortComparator<Compared> and ReferenceConvertible<ReferenceType> seem fairly obviously right to me -- I think the API guidelines should encourage adding primary associated types to these.

Superficially, FormatStyle<FormatInput, FormatOutput> and related protocols might be cases where it would make sense to have two primary assoc. types. However, I don't know enough about this family of format configuration protocols to say if it makes sense to constrain them on their input/output -- my instinct is yes, but I can't find any actual cases. (I do see they tend to be constrained in extensions of the form extension FormatStyle where Self == SomeStruct<T>, which I will not pretend to fully understand.)

I can maybe see an argument for [Mutable]DataProtocol<Regions>, but I feel it's not a particularly strong one.

AttributeScope is a helper protocol that enables nicer syntax for accessing attributes on an attributed string. I do not think it is designed to be constrained in practice, so my instinct is that it doesn't need a primary associated type.

I don't feel like the rest of these examples necessarily need primary associated type declarations, but I also have not gained enough practice using them to fully judge the matter.

The *WithConfiguration/*ConfigurationProviding protocols can go either way -- I am not familiar enough with their design to say whether it makes sense to constrain their associated types. FWIW, methods such as the one below do constrain them.

extension KeyedEncodingContainer {
  public mutating func encodeIfPresent<T, C>(
    _ t: T?, 
    forKey key: Self.Key, 
    configuration: C.Type
  ) throws
  where 
    T: EncodableWithConfiguration, 
    C: EncodingConfigurationProviding, 
    T.EncodingConfiguration == C.EncodingConfiguration
}

If we had primary associated types on EncodableWithConfiguration/EncodingConfigurationProviding, then the same declaration could be written as:

extension KeyedEncodingContainer {
  public mutating func encodeIfPresent<Config>(
    _ t: some EncodableWithConfiguration<Config>?, 
    forKey key: Self.Key, 
    configuration: (some EncodingConfigurationProviding<Config>).Type
  ) throws
}

It's shorter. Is it more readable? Arguably, yes. (For what it's worth, it is going to take a while until I get used to seeing explicit type parameters on the function declaration mixed with implicit type parameters arising from opaque parameter declarations -- but this is a direct consequence of SE-0341, not something we can (or should) reasonably handle through API design.)

Per the API guidelines I'm proposing, a minor use case like this would be enough to trigger the addition of the primary associated types. The role of Config in EncodableWithConfiguration<Config> seems eminently clear (as it happens, even more so than Array<Int>, as the WithConfiguration suffix serves as a sort of label for the type); this is less true for EncodingConfigurationProviding, but I feel there is still very little room for confusion. I'd be comfortable with the idea of these coding protocols supporting the shorthand syntax, but I also wouldn't press the matter if the folks who owned these protocols would decide not to do that.

Indeed! (It's interesting that we both allow niche use cases as long as it suits our argument. 😜)

The main reason the proposal does not encourage the use of multiple primary associated types is that I simply did not find enough practical examples to form a strong opinion on their use.

Multiple primary a.t.s come with a limitation that potentially restricts their practical usefulness: per SE-0346, clients must either constrain all primary types, or none at all. This isn't necessarily a big deal, but given the scarcity of real-life examples (and the even more dire lack of real-life experience with this feature in general), it does give me pause.

The dictionary protocol in the proposed guidelines comes from SE-0346, and it's an example where I can see some utility in having both types marked primary:

protocol DictionaryProtocol<Key, Value> {
  associatedtype Key: Hashable
  associatedtype Value
  ...
}

We do not have such a protocol (yet!), but it certainly seems reasonable to expect that it would work this way. Doing so would enable us to, e.g., define functions returning some DictionaryProtocol<String, Any>, and (with a bit of cleverness) we can repeat the Hashable constraint to even define a function over the protocol that only really wants to constrain Value:

func sumOfValues(in dict: some DictionaryProtocol<some Hashable, Int>) -> Int

Is this too clever, though? And what if I don't care about the Value, rather than the Key?

func sumOfKeys(in dict: some DictionaryProtocol<Int, ???>) -> Int

The workarounds suggested in the proposal seem a bit unsatisfying to me, as they invariably require adding an explicit type parameter to the function, weakening the reason for preferring the new syntax.
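A self-contained sketch of both situations, assuming Swift 5.7 or later. DictionaryProtocol remains hypothetical, and the allKeys/allValues requirements and the Dictionary conformance are invented here only so the example compiles:

```swift
// Hypothetical protocol with two primary associated types.
protocol DictionaryProtocol<Key, Value> {
    associatedtype Key: Hashable
    associatedtype Value
    var allKeys: [Key] { get }
    var allValues: [Value] { get }
}

// Invented conformance so the functions below have something to work on.
extension Dictionary: DictionaryProtocol {
    var allKeys: [Key] { Array(keys) }
    var allValues: [Value] { Array(values) }
}

// Constrains only Value; the Hashable bound on Key is repeated via `some`.
func sumOfValues(in dict: some DictionaryProtocol<some Hashable, Int>) -> Int {
    dict.allValues.reduce(0, +)
}

// Constrains only Key. Value has no bound to repeat, so this falls back
// to an explicit type parameter and a where clause.
func sumOfKeys<D: DictionaryProtocol>(in dict: D) -> Int where D.Key == Int {
    dict.allKeys.reduce(0, +)
}

let scores: [String: Int] = ["a": 1, "b": 2]
let labels: [Int: String] = [10: "x", 20: "y"]
print(sumOfValues(in: scores))
print(sumOfKeys(in: labels))
```

The second function shows the cost mentioned above: once an explicit type parameter is needed, the shorthand no longer buys much.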


Incidentally, but perhaps relatedly: the task of selecting primary associated types for a protocol feels very similar in spirit to the design task of choosing a list of type parameters for a generic type. The Swift API guidelines do not give much specific guidance about how to do the latter -- nor would I expect the guidelines to strictly dictate this matter, as the correct choice is very much dependent on the type's exact purpose.

In my head, the limitation that forces us to constrain all primary associated types very much rhymes with the limitation that we cannot supply default type values for generic type arguments. For better or worse, this prevents us from, e.g., using mixin type arguments to make, say, Hashable conformances customizable without having to manually wrap each Key in a custom type:

struct Dictionary<
  Key, 
  Value, 
  KeyHasher: Hashable & RawRepresentable = DefaultHasher<Key>
> where KeyHasher.RawValue == Key
{ ... }

(This is a very silly, non-workable example, I know -- but I hope it gets across a point.) The design of the language (intentionally or not) steers us away from defining types this way. We could technically do this sort of thing even without the ability to set defaults, but in practice spelling out all the type arguments in client code would be too annoying to consider in all but the most desperate cases. As a result, some things require more boilerplate, but we keep generic argument lists to the absolute minimum.

I suspect there is a sort of similar thing going on with multiple primary associated types. It ought to be possible to leave some of these positional "type arguments" unconstrained, but some DictionaryProtocol<String, _> isn't a thing we can spell. As of today it's unclear to me whether this limitation is going to be just a minor annoyance or a major obstacle -- and therefore I'd prefer if the API guidelines did not recommend such things until we know more.

That all makes sense to me, thanks for elaborating!

I think what maybe made me a bit confused was that the language from the proposal seems decidedly stronger than what you’ve expressed here—perhaps the guideline should be phrased more as “consider carefully before defining multiple primary associated types, it can result in a poor user experience” rather than the current “do not define multiple primary associated types”?

The Language Workgroup talked about this proposal and decided that we simply lack the experience to codify the proposed guidelines as general guidelines for the language. Karoy has agreed to weaken the wording in the proposal to simply describe these guidelines as having been useful for the standard library and to request further input from the community.

Karoy has also come to accept @benrimmington's argument that OptionSet's primary associated type has confusability problems, and he is no longer proposing to make that change.

Because the second change is a substantive change to the proposal, albeit one which has seen some discussion in the review thread, this review is being extended until next Monday, June 27th, 2022. We will continue the discussion in this thread.

Thank you for your patience; these adoption/guidelines proposals are always tricky and highly subjective, and they tend to drag out.

John McCall
Review Manager

I haven't seen examples using collection protocol composition.

Would I need to repeat the element type?

MutableCollection<String> & RandomAccessCollection<String>

Could I use a generic type alias?

typealias MutableRandomAccessCollection<Element>
= MutableCollection<Element>
& RandomAccessCollection<Element>

typealias MutableRandomAccessRangeReplaceableCollection<Element>
= MutableCollection<Element>
& RandomAccessCollection<Element>
& RangeReplaceableCollection<Element>

I think I asked @Slava_Pestov this when primary associated types were first pitched. As I recall, it's something that should eventually work but does not currently, but I can't find my post.
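In the meantime, one spelling that avoids repeating the element type is to constrain Element on only one of the protocols, since both refine Collection and share a single Element. This is a sketch assuming Swift 5.7 or later; it does not show whether the composition or the generic type alias is accepted, and the function name is hypothetical:

```swift
// Element is constrained once, through MutableCollection<String>;
// RandomAccessCollection is added as a plain conformance requirement.
func uppercaseFirst<C: MutableCollection<String>>(_ items: inout C)
where C: RandomAccessCollection {
    guard let first = items.first else { return }
    items[items.startIndex] = first.uppercased()
}

var names = ["alice", "bob"]
uppercaseFirst(&names)
print(names)
```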

I found your post, and some recent tests using type aliases, so I think those are supported now.

Protocol compositions are also supported in that test, but not in all positions, or with existentials.

(There's a nice example of "literate programming" by @codafi.)

The review period for SE-0358 has come to an end, and the Language Workgroup has decided to accept the proposal as it currently stands. Thank you for your continued engagement with this long review.

John McCall
Review Manager
