Inclusion of persistent collections with COW support as part of a standard library

code on github

Working code exists: A Persistent Vector, Persistent Hash Set, and Persistent Hash Map, all with COW support.

COW makes persistent collections particularly nice to use in Swift because support for a mutating API is possible in a safe and sane way without creating shared mutable state.

Plus, the mutated copies of persistent collections in case a collection is shared before local mutation are extremely cheap. In the non-shared case, direct manipulation is performed, just as in the case of the built in collection types.

The concurrency tag was added because of the usefulness of persistent collections for shared immutable state without a performance penalty in the case of local mutation.

1 Like

What do you mean by “persistent”? Is the data kept in nonvolatile storage?

Also, why is PVNode implemented as a class with fatalError() everywhere, rather than as a protocol?

Persistent Collections do not persist data in storage. They internally use trees as their data structures and can cerate a mutated copy by only replacing the path from root to the affected branch, retaining everything else, which makes mutating copies extremely cheap. The name "persistent collections" is unfortunate but was already there...

I really wanted to go for a protocol but the language just didn't let me. The problem is that protocols aren't real types, so they cannot be held by e.g. an array. Protocols are not like Java interfaces, which are real types. Protocols are really only useful for defining generic constraints. At least it used to be that way in swift 3. An abstract class would be the way to go, but what can you do... Objective C compatibility kind of enforces some restrictions in swift, or at least lifting them would require too big an effort, I guess...

You can...

In Swift:

protocol Node {
	var root: Node? { get }
}

class CyclicNode: Node {
	var root: Node? {
		return self
	}
}

var nodes: [Node] = [CyclicNode()]
for node in nodes {
	print(node.root)
}

In ObjC:

@protocol Node
@property (readonly, nullable) id<Node> rootNode;
@end

@interface CyclicNode: NSObject <Node>
@end

@implementation CyclicNode
-(id<Node>)rootNode {
	return self;
}
@end

NSArray<id<Node>> *nodes = @[ [[CyclicNode alloc] init] ];
for (id<Node> node in nodes) {
	NSLog(@"%@", [node rootNode]);
}

I am not sure how well does Swift export such things into ObjC, but I'd suggest to write this in ObjC if ObjC compatibility is required.

Now make Node generic :)

I think things fell down in my exploration tests as soon as the protocol was generic. You can't have a [Node] if Node is a protocol.

I do have a C++ implementation, on which I have based the Objective C version of the same thing (and the python version, too). The Java and C# are standalone.

I meant swift's Obective C compatibility in general...

Only the swift version has COW implemented (Maybe I'll add that in C++ later on, it is possible to find out whether a node is shared in a thread-safe way)

Got it. The associated type is the show stopper. It really doesn't work.

protocol Node {
    associatedtype E
}

var x: [Node]? = nil // error:

Protocol 'Node' can only be used as a generic constraint because it has Self or associated type requirements

Just as back in 2016 when I wrote the first version of it (only then the compiler was a bit crashy when facing typealiases and associated types in protocols...

The standard library uses type-erased wrappers (eg. AnyCollection) for that.

That sounds terrible... this way you pay the kind of performance & space penalty that you do in Java - in a language that has strong static generic type support...

I really prefer using generics they way they're supposed to be used and pay the tiny, yet ugly price of having to fake an abstract base class :grimacing:

2 Likes

Hello Sebastian,

We are many struggling with Swift subtleties. But since you ask for inclusion of your code into the standard library, you might understand why interested readers that happen to be familiar with the stdlib evaluate how well your code fits with the rest of that stdlib.

You surely agree that consistency in the standard library patterns is a quality. If AnyStuff type erasers are part of standard patterns, then so be it. It's unlikely to change soon.

2 Likes

That is most certainly true. A version for inclusion would of course need to be modified this way. That would also open an interesting opportunity to actually performance test the two variants against each other :blush:

I was just explaining the rationale behind the decision I made when I created this version of the collections :blush:

Yes. I guess the Swift core team had to chose between many excellent and competing rationales until they picked the ones we have today. They're not that bad. Even if they're somewhat unusual.

Type erasure is well known territory, given its widespread use in java. That makes it not that unusual at all :blush:

My version was created as a port from C++, that shows in these details. COW was added in swift.

It went Java -> (C# and C++) -> Swift

I also use the C++ version as the core for Objective C version (with id as the C++ template arguments) and the Python version (with PyObject *) :blush:

I'll create a version with type erasure in a different repo, incorporating the current swift core style, and report back :blush: Thanks a lot for your valuable feedback!

1 Like

Most of the software craftsmanship principle we try to reach juniors would tell us about iterative improvement not maintaining a status quo if a better solution is found. Question is, does this bring us forward or not? Do we trust the language not to evolve in a series of completely opposite direction/patterns and create further problems?

That's right, Goffredo. The current shape of the standard lib has a few experimental spots (String, Codable, protocols, ABI, others) that are under active evolution, guided by core "manifestos", implemented by narrower "proposals", where "better solutions" are constantly discussed and compared.

Some other kinds of "better solution" would require a near-complete redesign of large parts of the stack. Or a separate library (Large Proposal: Non-Standard Libraries). Which would require a tremendous team effort to match the quality of the current state of the ecosystem Swift has built so far.

The price of admission is, for understandable reasons, raising.

I guess this kind of a general change would need to be driven by the development of the language itself. Once there are abstract and maybe pure abstract base classes (interface like), that would probably trigger a lot of change in the libraries. Until then, type erasure is not likely to be replaced throughout the code base, imo.

For information, abstract classes have been deferred: proposal, decision rationale.

They may ship eventually. I personally hope they do, because I, sometimes, write abstract classes (guarded with fatal errors). Rarely... when they happen to be the best choice :-)

Type erasure might more likely be replaced with generalized existential's. I think the plan is to typealias the Any* family. From the generics manifesto:

typealias AnySequence<Element> = Any<Sequence where .Iterator.Element == Element>
4 Likes