I think we should have a protocol for dictionaries. There are already several types that could conform.
Dictionary<Key, Value> // from swift standard library
NSDictionary // from Foundation
NSCache<Key, Value> // from Foundation
NSMapTable<Key, Value> // from Foundation
OrderedDictionary<Key, Value> // from Swift Algorithms
Hi @aggie33 – the key when making a case for a new protocol like this is to show useful code that is generic over all these types.
What kind of operations on a generic protocol that lets you map a key onto a value would you write for all these types, if you had such a protocol? Are there equivalents of Sequence.map or MutableCollection.shuffle that could apply to all the potential conforming types you list?
One concrete use case I've run into is to allow customization of my group(by:) algorithm to allow callers to choose to use OrderedDictionary, or others:
There’s a simpler solution to that, which is to just stick with the initializer versions instead, which works today without a protocol. Creating a protocol purely to allow sugaring initializers as chained methods seems insufficient justification.
This is a bit of an anti-example, because this would not be a good idea. Implementing this via a subscript get/set defeats copy-on-write, so values[grouping, default: []].append(element) becomes a linear operation. Even once _modify becomes a full language feature, the implementation for modifying an element, creating a default initially if it isn’t in the dictionary, may not have a straightforward default implementation – determining that needs some experimentation.
This and your filter example also raise another question that would need answering: is this one protocol, or a hierarchy of protocols. To implement filter requires multiple things:
the ability to iterate (which could be done by also requiring Sequence possibly)
the ability to create a new empty instance of the dictionary
the ability to write values for keys as well as read them.
In the Collection hiearchy, these are all modeled by separate protocols.
Sequence for looping over values – should DictionaryProtocol refine it?
Collection for indexed subscript, equivalent to lookup up a value for a key
MutableCollection for changing elements at an index, equivalent to mutating a value for a key
RangeReplaceableCollection for (amongst other things) creating empty collections and also adding new elements to collections.
So is this really a hierarchy of protocols? If yes… this becomes even more complex addition, and so in turn needs more justification to ensure that complexity is useful. If no, that keeps the protocol nice and simple but at the cost of potentially ruling out conformance of some types to this protocol.
There’s also the question of whether standard algorithms on the protocol should be customization points. For example, a win with Dictionary.filter is that it can potentially do it more efficiently, using knowledge of the existing keys and their bucketing. This would only work if filter were a customization point. But if every useful algorithm on a protocol ends up needing to be a customization point, that’s not very scalable. This isn’t true of the collection hierarchy, but it might be of a dictionary protocol.
All this is to point out that what seems like a simple ask (just add a unifying protocol for all these similar things) can end up being a deceptively complex task
to be fair, this is a rather severe limitation of swift protocols that affects a lot of other use cases besides the pitched DictionaryProtocol. i would rather we prioritize fixing the underlying problem (which would probably either involve named subscripts or the ability to declare _modify as a protocol accessor requirement), than see it be used as a justification to shoot down ideas like DictionaryProtocol.
Defeating CoW is a much bigger deal for collections than is not having _modify made official for, say, integer types. It is a fair critique of a proposal if its centerpiece use case is one that cannot be made performant in the absence of future unrelated features.
I'm not sure this would really make such a great protocol.
If you look at the types which you suggested might conform:
Dictionary<Key, Value> // from swift standard library
NSDictionary // from Foundation
NSCache<Key, Value> // from Foundation
NSMapTable<Key, Value> // from Foundation
OrderedDictionary<Key, Value> // from Swift Algorithms
They all seem to be associative arrays, and so their APIs can superficially appear similar, but they also have wildly different behaviour which makes it difficult to write generic algorithms using them.
(NS)Dictionary is unordered. If you're copying its key-value pairs in to another data structure that isn't a dictionary, you will almost certainly want to sort them first.
OrderedDictionary is ordered. If you're copying its key-value pairs in to another data structure that isn't a dictionary, you should not sort them first, since the existing order is actually meaningful.
NSCache can drop values behind your back according to system resource pressure, so you can never really rely on it having any particular contents. Presumably for that reason, you can't iterate its contents, and it doesn't even have count or isEmpty properties. It is used in completely different contexts to the other types.
Much of NSMapTable is obsolete these days. Storing arbitrary pointers was only a useful feature because NSDictionary required keys and values to be subclasses of NSObject (a limitation which Swift's native Dictionary does not have) and most of NSPointerFunctions is necessarily unsafe.
There may be some value in associating a collection of weak references with a key, but even then I think it would be better modelled as a Dictionary<Key, WeakCollection<MyObject>> rather than having the whole table be mutated from any thread any time an object gets deallocated. It might be worth a specialised type.
Even then, it would suffer the same problem as NSCache that the contents change behind your back. Meaning I think such map-tables fill a similar role to caches in that they store ephemeral data, and aren't really used in the same contexts as the other dictionary types.
As swift-collections adds more specialised dictionary implementations, they might want to settle on some common API surface based on some expected generic algorithms (or not), but I think that is best done in the collections library at least for now.
just to be clear, the ask is not to make _modify “official” (it is already de facto official, the only kind of official that matters), the ask is to make it possible to declare an accessor requirement of _modify, one that cannot be witnessed by a set.
Can you expand on this, what exactly are you doing here and what is the issue. The protocol talks about a dictionary subscript and you are showing a seemingly unrelated Array's append operation. What's "grouping" here? Please show the whole example.
class IntRef {
var value: Int
init(_ value: Int) {
self.value = value
}
}
struct IntVal {
private var ref: IntRef
init(_ value: Int = 0) {
ref = IntRef(value)
}
var value: Int {
get { ref.value }
set {
guard isKnownUniquelyReferenced(&ref) else {
print("COW")
ref = IntRef(newValue)
return
}
print("NO COW")
ref.value = newValue
}
}
mutating func append(_ element: Int) {
value += element
}
}
extension [Int: IntVal] {
subscript(newSubscript key: Key) -> IntVal {
get { self[key]! }
set { self[key] = newValue }
}
}
protocol DictionaryProtocol {
subscript(protocolSubscript key: Int) -> IntVal { get set }
}
extension [Int: IntVal]: DictionaryProtocol {
subscript(protocolSubscript key: Key) -> IntVal {
get { self[key]! }
set { self[key] = newValue }
}
}
func test_builtinSubscript() {
var values: [Int: IntVal] = [0 : IntVal(1)]
values[0]!.append(0) // NO COW
}
func test_newSubscript() {
var values: [Int: IntVal] = [0 : IntVal(1)]
values[newSubscript: 0].append(0) // COW
}
func test_existential() {
var values: DictionaryProtocol = [0 : IntVal(1)]
values[protocolSubscript: 0].append(0) // COW
}
func test() {
test_builtinSubscript() // NO COW
test_newSubscript() // COW
test_existential() // COW
}
test()
I can see COW triggered for the existential case as you describe
However I also see it being triggered when I declare my subscript, which doesn't go through a protocol conformance Why would it be? So it's not just about going through protocol / existential which is problematic but the problem is more profound?
Edit:
I see. This fixes the COW issue with my custom subscript:
var a = values[0]!
a.append(0) // this should COW âś…
print(values[0]!)
values[0]! = a
var a = values[0]!
a.append(0) // but why this COW-ing? 🤔
values[0]! = a
BTW, why is that COW-ing? Can't compiler see that the original array is not used but overwritten right away?
The workaround (besides using _modify instead of set) could be resetting the element to nil (if it's optional) or to some placeholder value (if it's non optional):
let placeholder = IntVal()
var a = values[0]!
values[0] = placeholder
a.append(0) // NO COW
values[0] = a
the compiler does not know that the original array is overwritten after the append, unless the entire call-chain of the subscript is @inlinable to the caller. (or if you use the very dangerous @_semantics annotations.)