Ergonomics: generic types conforming "in more than one way"

The stdlib collection types implement index(_:offsetBy:) unconditionally (or at least, with no condition beyond Collection).

If you move index(_:offsetBy:) from the conditional extension into either the main body of CountSteps1 or an unconditional extension thereof, then it will work.

The reason your example crashes is twofold. First, offsetEnd(by:) is written in an extension of Collection. So within its body, all the compiler knows statically is that Self: Collection.

The call to index(_:offsetBy:) resolves to the function which satisfies the conformance of Self to the Collection protocol. And for CountSteps1 that is the default implementation from the stdlib, which traps on negative offsets.

Second, the conditional extension on CountSteps1 does not take part in its conformance to Collection, so the methods therein are not implementations of Collection requirements. (They are, however, implementations of BidirectionalCollection and RandomAccessCollection requirements.)

By moving those implementations out of the conditional extension and into an unconditional one (or the main body of the type), then they become part of the Collection conformance.

1 Like

Oh, I see now: the standard library just gives up on performance guarantees for its conditionally-conforming RandomAccessCollections :frowning:. That is really sad. I never would have allowed FlattenRandomAccessCollection to be collapsed in this way until we had a way to prevent that.

The reason your example crashes is twofold…

Yeah, I know why my example crashes; I designed it to show how the existing system is broken.

[Edit: The following is all wrong; see below]

The point is that if the type (CountSteps1) were a legit wrapper over its base, it would use the base's index to implement its own, and those operations wouldn't be available to it in the main body of the type, making the problem completely unfixable short of beautiful/awful existential hacks.

https://bugs.swift.org/browse/SR-12709

I believe that is not correct. The conditionally-conforming types provide their own implementations of Collection requirements, which forward to the base collection. Therefore, if the base provides performant random-access guarantees, then so does the wrapper.

(Where possible that is—of course LazyFilterCollection can only provide Bidirectional, not RandomAccess, whereas LazyMapSequence can do both.)

If CountSteps1 were a legit wrapper over its base, it would use the base’s index to implement its own, and it would forward index(_:offsetBy:) to its base, and that operation would be available to it in the main body of the type, making the problem entirely nonexistent.

My bad; FlattenCollection can't be random access. Sorry for the noise.

But it can be improved:

This is its implementation of index(_:offsetBy:), which is indeed only conditioned on T.Element: Collection. (The limitedBy version is a few lines down. The following applies to both, as well as distance(from:to:) and maybe some other stuff.)

Currently, its index(_:offsetBy:) method works by repeatedly stepping the index one position at a time. But it could be rewritten to “hop”, by checking the count of each outer element in turn. (The first one of course would check the distance from the starting point to whichever end is in the proper direction.)

When the outer elements are RandomAccess, this means it would not have to walk the inner elements at all. It would happily hop along the outer elements, subtracting their counts until the distance remaining to travel reaches (or crosses) zero. Then it would drop into the correct outer element, and hop to the desired position.

The tradeoff here is that in the non-RandomAccess case it would go more steps than necessary, because it would have to find count for whichever outer element contains the destination, rather than stopping immediately when it reaches the destination.

That could probably be rectified through some cleverness, since it does know the remaining distance ahead of time.

Basically, we would want something like index(_:offsetBy:limitedBy:) that returns a tuple of (indexReached: Index, actualDistanceTraveled: Int). Or alternatively, distance(from:to:butNotMoreThan:) with a similar return type.

This really needs some elaboration, because this is an area where implementation constraints have a significant impact on what is possible with the language design. The current implementation model is that a protocol conformance is described by a "witness table", which is a mapping of each protocol requirement to its "witness", i.e., the one declaration that satisfied that particular requirement. With generic types, that witness has to be one that works for any generic arguments. To get a different witness for different generic arguments, which is fundamentally what this thread is about, requires the Swift runtime to inspect the capabilities of the actual argument types dynamically, and perform some decision procedure to pick the best witness. There are effectively two places where we can introduce a decision procedure that works with the current runtime:

  • Within the implementation of a function itself. For example, if a single function had multiple bodies (e.g., each with a different where clause), the decision procedure would be at the beginning of the witness function.
  • When the witness table for a concrete type's conformance to a protocol is realized (e.g., Array<Int> conforming to P in @dabrahams's example). Here, the decision procedure would be in some runtime hooks we have during witness table realization. We can make decisions for each protocol requirement (independently).

Now, the fact that this decision process occurs at runtime implies a few things about our design constraints:

  • The process should be fast. We'll have to accept some dynamic lookups (e.g., does Int conform to Equatable?), but we probably have to avoid global lookups that produce an unknown set of potential candidates to rank.

  • The process should never fail. Overload resolution and similar ranking metrics have ambiguities. We should either define away those ambiguities (i.e., there is always a single best answer) or we need some kind of fallback (warn and pick at random? pick what today's language picks?).

    Doug

6 Likes

If we take the "multiple bodies with different where clauses" approach, it seems to me like we could bubble up the information in those where clauses to the conformance, so that we know statically that the conformance needs to take added conditional arguments in order to instantiate the witness table, so that we could still pass down the Int: Equatable conformance (or lack thereof) at the point of instantiation for the witness table instead of doing a global search.

Thanks for your post, Doug!

Would you be open to thinking differently about how this can work? I've always had in mind a system where these determinations can be made at compile time (while still allowing separate compilation of course). I've always thought viewing something that's semantically static through a dynamic lens has been limiting our implementation choices, and I think there are other possibilities.

Whether we can even begin that conversation depends on whether you agree that generics should/could be semantically static. If not, the options are indeed much more limited.

I don't think it's just about viewing the problem through a "dynamic lens". The important property I think we want to maintain in the generics system is modularity. That allows us to implement generics with semantically equivalent dynamic and static implementations, but that's not the only benefit; it also allows for separate compilation and analysis, avoids introducing fundamental compiler performance problems, and minimizes side effects from adding additional packages to a system.

Of course we want modularity, motherhood, :apple: :pie: and all other good things :wink:

I don't know why you think that's related to what I brought up. What I have in mind is modular and preserves all of the properties you listed except possibly the thing about semantically equivalent static and dynamic implementations—depending on what you mean by that—whose value I am inclined to question.

I think what I have in mind could even improve the modularity story, but it's hard to tell because you haven't been very precise about the properties you listed. It would be really helpful if you could unambiguously describe the semantics you think are crucial, ideally with examples that illustrate the differences between the results you want and those you would like to prevent, so we can evaluate both those judgements and any proposals that may arise from this thread.

Thanks,
Dave

In cases where we know it statically, yes, and that's a common case I bet we can optimize. In the general case, we won't know until runtime that the type will be Int, and someone will have to do the dynamic lookup.

Doug

I don't know what you have in mind, but I can spell out what I know is implementable (relatively) efficiently and where I see things as becoming problematic. Let's revisit your original example. The parts relevant to my examples are these:

// (A) Definition
protocol P {
    static var isEquatable: Bool { get }
}

// (B) Default implementation
extension P {
    static var isEquatable: Bool { false }
}

// (C) More-constrained default implementation
extension P where Self : Equatable {
    static var isEquatable: Bool { true }
}

// (D) Conformance
extension Array : P {}

What we get today is that, at the point (D) where Array is stated to conform to P, the compiler looks at both implementations (B) and (C). For this case, (C) is not applicable because the Array<Element> for an arbitrary Element is not guaranteed to be Equatable. Therefore, the compiler chooses (B) and we get the surprising behavior.

Now, we could imagine a different, conditional conformance of Array to P that only works for Element types that are themselves Equatable (and, thus, the Array is Equatable). Let's call it D' because I've always wanted to be a mathematician:

// (D') Alternative, conditional conformance
extension Array : P where Element: Equatable {}

Now, because we know that Array<Element> will be Equatable, both of the isEquatable implementations (B) and (C) are usable (their requirements are met), so we have to make a choice---should it be (B) or should it be (C)? Well, (C) is more specialized, i.e., it has additional requirements on top of (B), and therefore is assumed to be better (faster, more accurate, etc.), so we pick that one. That's what happens, as expected.

Now, what we want is to get (C) for when the Element type happens to be Equatable, and (B) when it is not Equatable. By looking at (B) and (C), we can encode a decision procedure here, and it's pretty easy. At the point where we know the Element type---whether it happens in the compiler's type checker, optimizer, or at runtime, we'd prefer that it not matter, as @Joe_Groff points out---we want to effectively do this check:

  if Element is Equatable {
    use implementation (C)
  } else {
    use implementation (B)
  }

I believe that's completely implementable. The compiler, seeing both (B) and (C), can emit that if/else block to determine which function to use for the witness of the conformance, so conformance (D) has the behavior we want.

But, there are limitations, so let's see where the model breaks.

First of all, there can be ambiguities. Let's extend the example with another protocol Q and another implementation of isEquatable:

protocol Q { }

// (E) Another constrained implementation
extension P where Self : Q {
    static var isEquatable: Bool { true }
}

// (F) Array conditionally conforms to Q
extension Array: Q where Element: Q { }

Now, conformance of Array to P has three potential implementations of isEquatable to consider: (B), (C), and (E). (C) is more specialized than (B), and (E) is more specialized than (B), but (C) and (E) can't really be ordered. So if we have a type like this that conforms to both Equatable and Q:

struct X: Equatable, Q { }

What happens for the conformance of Array<X> to P? All of (B), (C), and (E) apply, so I think this must be an error. If we model this case today by extending the conformance of Array to P with both conditions on its Element type, like this:

// (D'') conformance with an ambiguity
extension Array : P where Element: Equatable, Element: Q {}

we end up getting an ambiguity error:

t2.swift:2:16: note: multiple matching properties named 'isEquatable' with type 'Bool'
    static var isEquatable: Bool { get }
               ^
t2.swift:6:16: note: candidate exactly matches
    static var isEquatable: Bool { false }
               ^
t2.swift:10:16: note: candidate exactly matches
    static var isEquatable: Bool { true }
               ^
t2.swift:19:16: note: candidate exactly matches
    static var isEquatable: Bool { true }
               ^

I consider that to be the correct behavior here, because there is no "best" answer. I would be fine with the original conformance of Array to P (marked (D) at the beginning) becoming ambiguous and triggering an error at compile time. I think it would be a major problem with the design if this ambiguity could happen at run time.

To be clear, the fix here is not too complicated, and could possibly even be suggested by the compiler. You effectively need to provide another constrained implementation that is more specialized than both (C) and (E):

// (G) Yet another constrained implementation, better than the others!
extension P where Self : Equatable, Self: Q {
    static var isEquatable: Bool { true }
}

That fixes the problem, and we still have a (relatively simple) decision algorithm:

switch (Element is Equatable, Element is Q) {
  case (true, true): use implementation (G)
  case (true, false): use implementation (C)
  case (false, true): use implementation (E)
  case (false, false): use implementation (B)
}

This depends on having the complete set of potential witnesses visible at the point where the conformance is declared. If we were to move the definition of the protocol Q and the corresponding protocol P extension implementation (E) into another module, we'd lose the ability to select implementation (E) because it's not visible to us at build time. I consider that a good thing, but I could certainly imagine it causing some confusion when factoring code into a different module changes the overall behavior. And it's not something that I consider fixable, because even if you take the performance hit of doing a runtime lookup to find all of (B), (C), and (E), you've lost the ability to ensure (prior to runtime) that there can never be an ambiguity.

Doug

12 Likes

I appreciate the extensive reply, and I will dig into it in detail, but can we please first answer my earlier question of whether you consider it acceptable to say that all programs without existentials or non-final classes are semantically static? My definition of that term is simple: a semantically static program can be implemented without type-based dynamic dispatching provided all the sources are available to the compiler.

There's a whole class of backward-compatible ways you might extend Swift generics (or to "fix" things that are broken today, such as cross-module dispatching) that would make the semantics of some such programs dynamic. I want to know if you consider any of those points in the design space essential destinations for the language. The answer determines both how I understand the possible futures of Swift's generics system, and whether my ideas are worth discussing at all in this forum.

2 Likes

I think the important thing is not whether the type is Int, but whether the conformance's behavior is conditional on the availability of other conformances on the type or its associated types. Since we don't have any way currently of abstracting over constraints, the set of conditional arguments to a conformance seems to me like it should always be statically knowable.

I agree that it is statically knowable, but I have concerns with having the caller compute this information:

  • You would have to compute the union of all potential constraints to pass in (e.g., both Equatable and Q in my extended example), including evaluating all dependent constraints (e.g., you might need to evaluate both T: Collection and T.Element: Equatable for a different form of the example. Some of that computation will be redundant based on the decision procedure.

  • This makes the set of potential witnesses ABI, which I don't think we want for libraries that build with library evolution.

    Doug

Having the caller compute the information seems to me like the only way to maintain modularity, and thereby retain the ability to specialize in the optimizer and have any hope of rational behavior in the presence of retroactive conformances. It also seems to me like conditional witnesses ABI is not that terrible of an ABI constraint; if you add new conditional requirements, there's reasonable retroactive defaults for them either being "not present" in existing code, or falling back to dynamic lookup if that happens to work better in practice. If the conformance evolves in the other direction not to need the conditional requirements anymore, then the requirements become redundant ignored arguments, which is maybe inefficient but at least is not constraining.

2 Likes

As I read it, "semantically static" is equivalent to saying that one can monomorphize a Swift program without existential or non-final generics, as in (e.g.) Rust's generics system, specializing your way into having no dynamic dispatch.

That is almost true, although the ability to construct infinite types at runtime means it's not completely true. For example, run this:

protocol InfiniteTypes {
  associatedtype Assoc: InfiniteTypes
  var assoc: Assoc { get }
}

struct ToInfinity<T>: InfiniteTypes {
  var assoc: ToInfinity<Self> { .init() }
}

func recursive<T: InfiniteTypes>(_ value: T) {
  print(T.self)
  if Int.random(in: 0..<100) != 42 {
    recursive(value.assoc)
  }
}

recursive(ToInfinity<Int>())

You can't monomorphize that, because each recursive step builds a new ToInfinity<...> type.

Doug

8 Likes

That's what I thought I meant…

That is almost true, although the ability to construct infinite types at runtime means it's not completely true.

No fair; you just broke my brain.

/me makes repairs

I guess I don't think of this program as requiring type-based dynamic dispatch, because calls to recursive always end up in the same function body, regardless of the type of the argument. I am talking about being able to decide the program's control flow without having to make decisions based on types at runtime.

2 Likes

I’m pretty sure you can use mutual recursion with @Douglas_Gregor’s scheme to produce Turing-complete nested generics.

It’s, uh…left as an exercise to the reader! :-)

1 Like

Generally polymorphic recursion is supported. You can try it out

indirect
enum PolyList<A> {
    case empty
    case nonempty(A, PolyList<[A]>) // Yes [A] not A
}
protocol Monoid {
    static var empty: Self { get }
    func append(_ rhs: Self) -> Self
}
extension Array : Monoid where Element: Monoid {
    static var empty: Self { get { [] } }
    func append(_ rhs: Self) -> Self {
        return self + rhs
    }
}
func combine<A : Monoid>(_ list: PolyList<A>) -> A {
    switch list {
    case .empty: return A.empty
    case let .nonempty(x, list):
      return combine(list).reduce(x, {$0.append($1)}) // A gets instantiated to [A] here!
    }
}

As to why you'd want to do this, I asked a StackOverflow question which might be relevant: Applications of polymorphic recursion.

3 Likes
Terms of Service

Privacy Policy

Cookie Policy