Default Implementations And Additional Conditional Conformances

performance

(Dennis Vennink) #1

(Old title: Overloaded Method Resolutions.)

I'm not 100% sure if this is a bug or a misunderstanding on my part of how overloaded methods work.

Context

In order to conform to RandomAccessCollection, Zip2Collection (part of a proposal I'm working on) needs to (1) provide an Index that conforms to Strideable or (2) provide O(1) implementations of distance(from:to:) and index(_:offsetBy:) (on top of the requirements set by Collection and BidirectionalCollection). All other implementations, like count, index(_:offsetBy:limitedBy:), etc., are built on top of these. I've opted for the second strategy.

During benchmarking I discovered that the default implementation of count defined in Collection doesn't make use of our overloaded O(1) implementation of distance(from:to:) when using two RandomAccessCollections. It makes use of the default O(n) implementation from Collection.

Steps to Reproduce

  1. Create a local copy of the code:
    cd ~/Desktop && \
    git clone https://github.com/dennisvennink/Zip.git && \
    cd Zip && \
    git checkout 8031b4a00cc1394613a2cd226e19e73f247857ed && \
    swift package generate-xcodeproj && \
    open Zip.xcodeproj
    
  2. Comment out count in Sources/Zip/Zip2Collection.swift.
  3. Run the testCount() benchmark in Tests/ZipTests/Zip2CollectionBenchmarks.swift.
  4. Uncomment count.
  5. Run the benchmark again.

I see a huge difference in time performance (2000000% worse) on my machine (2,4 GHz Intel Core i7 MacBook Pro (17-inch, Late 2011) running macOS High Sierra 10.13.5 (17F77), Xcode 10.0 beta 6 (10L232m) and Swift 4.2 (swiftlang-1000.0.36 clang-1000.0.4)).

Workarounds

  1. Move distance(from:to:) and index(_:offsetBy:) to Collection.
  2. Reimplement count (and all other methods). ¯\_(ツ)_/¯

It might be that I'm overlooking something, hence not directly filing a JIRA.

Any thoughts are welcome.


(Ben Cohen) #2

@moiseev and @nnnnnnnn have bumped up against this a lot in their efforts and may have some advice to share.


(Nate Cook) #3

One thing to try would be to move your distance(from:to:) implementation into an unconstrained extension. It looks like everything in there should work properly for just a basic Collection, and then you'd get the dispatch you're hoping for.

I think the one change you'll need to make is to add this at the beginning, so that non-bidirectional collections don't crash:

if end < start {
    return -distance(from: end, to: start)
}

(Dennis Vennink) #4

That should work, yes.

This post from last week by @anthonylatsis contains an example that is very similar (doo is analogous to count and foo is analogous to distance(from:to:)).

I'll look into it. I think everything should get handled (tests are here, here, here and here).

Thank you.


(Nate Cook) #5

This is kind of what you're running into, because of the conditional conformance to RandomAccessCollection. Both count and distance(from:to:) are Collection protocol requirements, so you'd expect that your specialized implementations would always get called, even in generic contexts. However, because your specializations are conditionally constrained to when your collection is random access, they aren't called in a context that only knows about Collection conformance.


(Dennis Vennink) #6

Perfect. It all makes sense now.

Maybe we should add a caveat to the documentation (in bold)?

Conforming to the RandomAccessCollection Protocol

The RandomAccessCollection protocol adds further constraints on the associated Indices and SubSequence types, but otherwise imposes no additional requirements over the BidirectionalCollection protocol. However, in order to meet the complexity guarantees of a random-access collection, either the index for your custom type must conform to the Strideable protocol or you must implement the index(_:offsetBy:) and distance(from:to:) methods with O(1) efficiency.

Note: If the conformance to RandomAccessCollection has additional conditional constraints over protocols lower in the hierarchy, these methods should be put into an unconstrained extension.

(Or something similar.)


(Goffredo Marocchi) #7

nnnnnnnn Nate Cook
September 11
dennisvennink:
This post from last week by @anthonylatsis contains an example that is very similar ( doo is analogous to count and foo is analogous to distance(from:to:) ).

This is kind of what you're running into, because of the conditional conformance to RandomAccessCollection. Both count and distance(from:to:) are Collection protocol requirements, so you'd expect that your specialized implementations would always get called, even in generic contexts. However, because your specializations are conditionally constrained to when your collection is random access, they aren't called in a context that only knows about Collection conformance.

Thank you for your explanation :). Quite informative concise reference to dispatch rules and conditional conformance.


(Nate Cook) #8

That's a good idea! Filed SR-8728.


(Dennis Vennink) #9

Found another gotcha:

let xs: Zip2Collection<[Int], [Int]> = zip([0, 1, 2], [0, 1, 2])
var index = xs.endIndex

_ = xs.formIndex(&index, offsetBy: -1, limitedBy: xs.endIndex)
// Fatal error: Only BidirectionalCollections can be advanced by a negative amount

It calls Collection.formIndex(_:offsetBy:limitedBy:) (which is fine) which should internally call BidirectionalCollection.index(_:offsetBy:limitedBy:), but instead calls Collection.index(_:offsetBy:limitedBy:) (which traps for distance < 0).

Since Zip2Collection is a BidirectionalCollection where Collection1: RandomAccessCollection, Collection2: RandomAccessCollection this makes sense.

What is the right course of action here? Override every required member in Collection that provides a default implementation? Go through the stack trace / source of each of these members to see if there are more that cause issues? Is this documented anywhere?


(Jordan Rose) #10

The required members that need to be overridden are the ones that offer more functionality than the base implementation. Otherwise, it's just a difference in performance.