Inclusion of persistent collections with COW support as part of a standard library

Got it. The associated type is the show stopper. It really doesn’t work.

protocol Node {
    associatedtype E
}

var x: [Node]? = nil // error:

Protocol ‘Node’ can only be used as a generic constraint because it has Self or associated type requirements

Just as back in 2016 when I wrote the first version of it (only then the compiler was a bit crashy when facing typealiases and associated types in protocols…

The standard library uses type-erased wrappers (eg. AnyCollection) for that.

That sounds terrible… this way you pay the kind of performance & space penalty that you do in Java - in a language that has strong static generic type support…

I really prefer using generics they way they’re supposed to be used and pay the tiny, yet ugly price of having to fake an abstract base class 😬

2 Likes

Hello Sebastian,

We are many struggling with Swift subtleties. But since you ask for inclusion of your code into the standard library, you might understand why interested readers that happen to be familiar with the stdlib evaluate how well your code fits with the rest of that stdlib.

You surely agree that consistency in the standard library patterns is a quality. If AnyStuff type erasers are part of standard patterns, then so be it. It’s unlikely to change soon.

2 Likes

That is most certainly true. A version for inclusion would of course need to be modified this way. That would also open an interesting opportunity to actually performance test the two variants against each other 😊

I was just explaining the rationale behind the decision I made when I created this version of the collections 😊

Yes. I guess the Swift core team had to chose between many excellent and competing rationales until they picked the ones we have today. They’re not that bad. Even if they’re somewhat unusual.

Type erasure is well known territory, given its widespread use in java. That makes it not that unusual at all 😊

My version was created as a port from C++, that shows in these details. COW was added in swift.

It went Java -> (C# and C++) -> Swift

I also use the C++ version as the core for Objective C version (with id as the C++ template arguments) and the Python version (with PyObject *) 😊

I’ll create a version with type erasure in a different repo, incorporating the current swift core style, and report back 😊 Thanks a lot for your valuable feedback!

1 Like

Most of the software craftsmanship principle we try to reach juniors would tell us about iterative improvement not maintaining a status quo if a better solution is found. Question is, does this bring us forward or not? Do we trust the language not to evolve in a series of completely opposite direction/patterns and create further problems?

That’s right, Goffredo. The current shape of the standard lib has a few experimental spots (String, Codable, protocols, ABI, others) that are under active evolution, guided by core “manifestos”, implemented by narrower “proposals”, where “better solutions” are constantly discussed and compared.

Some other kinds of “better solution” would require a near-complete redesign of large parts of the stack. Or a separate library (Large Proposal: Non-Standard Libraries). Which would require a tremendous team effort to match the quality of the current state of the ecosystem Swift has built so far.

The price of admission is, for understandable reasons, raising.

I guess this kind of a general change would need to be driven by the development of the language itself. Once there are abstract and maybe pure abstract base classes (interface like), that would probably trigger a lot of change in the libraries. Until then, type erasure is not likely to be replaced throughout the code base, imo.

For information, abstract classes have been deferred: proposal, decision rationale.

They may ship eventually. I personally hope they do, because I, sometimes, write abstract classes (guarded with fatal errors). Rarely… when they happen to be the best choice :-)

Type erasure might more likely be replaced with generalized existential’s. I think the plan is to typealias the Any* family. From the generics manifesto:

typealias AnySequence<Element> = Any<Sequence where .Iterator.Element == Element>
4 Likes

Has anybody got an idea how to make

let unshared = isKnownUniquelyReferenced(&root)

work when root is of a protocol type extending AnyObject?

Computer says no:

Cannot invoke ‘isKnownUniquelyReferenced’ with an argument list of type ‘(inout PVNode?)’

with

protocol PVNode : AnyObject { ... }

and

var root: PVNode?

it worked fine when root was of type PVNode? and PVNode was a class…

1 Like

The details are more complicated. Swift supports separate compilation in addition to values conforming to protocols. This means that you must have some universal representation to operate on, as a public function cannot see all callers or their types. Of course, for internal or intra-module calls/types, the compiler aggressively specializes and de-virtualizes when possible (accounting for code-size tradeoffs).

Swift’s existential containers have inline buffers to store small values directly, say 2 or 3 pointers in size. Also, there’s been continual, active work (CC @Arnold) to stack allocate more and more language constructs.

More details on the implementation approach are at https://www.youtube.com/watch?v=ctS8FzqcRug
There’s also some (dated) info in the latter half of https://developer.apple.com/videos/play/wwdc2016/416/

1 Like

@sbohmann, as someone who has implemented persistent collections in the past I am unsure which functionality would not be supportable via the existing classes. Would the goal for these implementations be to offer non-functional (performance/memory) benefits?

To be more specific in my question - should Dictionary hypothetically be replaced in Swift 5 with a HAMT implementation (with allowances for a more traditional hashing strategy), would this result in its public API changing? My suspicion was that this would be transparent (barring side effects like iteration order)

The existing Array, Dictionary, and Set are not implemented using typical persistent collection algorithms today, but have immutability via CoW.

It is kinda terrible, but it is a limitation until we have generalized existentials.

So you know, type erasure is not the same in Swift as in Java. Type erasure is done by capturing the use of the concrete type within a function. So for example, AnySequence works by proxying the makeIterator() call - and to save space has implementations of all the other Sequence methods based on that call.

Wow, thanks for the additional info and the links, will read back them at home 😊

The difference is in the case of shared large collections. Copy on write did in my tests become very costly in this case. For small collections, I of course expect the existing built in array and dictionary to be way faster, given their relative algorithmic simplicity and massive optimization 😊 I would therefore and generally not propose replacing anything.

The use case is not only concurrency but every case with large statelessly shared collections. It doesn’t take concurrency or coroutines to make them necessary; stateless sharing is simply a way to go about things that has become quite popular because of the way it reduces overall complexity.

Swift’s classic collections with their COW support directly play into avoiding shared state. It’s just that the performance can really degrade for large collections, which is why all most programming languages supporting functional programming now have them 😊

Terms of Service

Privacy Policy

Cookie Policy