Adding more data structures to the standard library

I use Sets. Quickest way of de-duping a collection of values without any effort.

2 Likes

Sure, if you don’t need to preserve the order of elements.

1 Like

I've not yet encountered a situation where I wanted to preserve the order of a collection's elements but ignore subsequently repeated values of elements - in a simple task.

As part of a more complex algorithm I'd just iterate over the original collection, adding new elements to a 'parsed' Set and skipping those already present.

It's quite trivial to deduce an array while keeping elements, using a Set. But for some reason, very little people on StackOverflow ever show that.

Can you show that? Sets are just Maps with nil values right (said all golang users)?

1 Like
extension Sequence {
    /// Returns an array containing, in order, the first instances of
    /// elements of the sequence that compare equally for the keyPath.
    func unique<T: Hashable>(for keyPath: KeyPath<Element, T>) -> [Element] {
        var unique = Set<T>()
        return filter { unique.insert($0[keyPath: keyPath]).inserted }
    }
}
3 Likes

Probably GCD Dispatch IO? DispatchIO | Apple Developer Documentation

Dispatch I/O
An object that manages operations on a file descriptor using either stream-based or random-access semantics.

In a mini project i did one day for spending time i solved this problem by cleaning up the collection on next access. Not the best in terms of performances. :D

here

I have an idea for another way to do it. You create an internal class Key<K: Hashable> that wraps the key, and a reference to the Dictionary it's about to be stored in. Upon deallocation, you access the dict and nil out the associated value.

Your WeakHashableContainer can have different hash values for objects that are equal, which is illegal for Hashable and will leave your Dictionaries very confused :[

var a: NSObject? = NSObject()
var b: NSObject? = NSObject()

let aContainer = WeakHashableContainer(withObject: a!)
let bContainer = WeakHashableContainer(withObject: b!)

a = nil
b = nil

print(aContainer.hashValue == bContainer.hashValue) // false
print(aContainer == bContainer) // true

This would be a mutation of the underlying dictionary heap object, I assume? Then, I assume the blank entry stays until a mutation causes a resize event?

Personally I think we should also add Matrix<Element> type to Swift, because more and more newer techs like AR require matrix calculations so it'd be very convenient if we can have native matrix type support.

Of course you can create a matrix from Array type but it's not secure (e.g. even if you access a columnID that doesn't exist, as long as rowID * columsCount + columnID exists you get the result)

1 Like

just had to change the equality condition. Thank you for spotting this

    public static func ==(lhs: WeakHashableContainer<T>, rhs: WeakHashableContainer<T>) -> Bool {
        switch (lhs.weakReference, rhs.weakReference) {
        case let (.some(a), .some(b)): return a == b
        case (.some, .none), (.none, .some): return false
        case (.none, .none): return lhs.hashValue == rhs.hashValue
        }
    }

I think I might even compare just the hashValues... but somehow I feel it's wrong.

1 Like

Maybe we should open a discussion thread for each data structure? :) I feel like the thread is branched. :stuck_out_tongue:

3 Likes

it's the "upon deallocation" the problem. I might be wrong, but I didn't find any way to "listen" to object deallocation and react to it.

I misspoke by calling it a Dictionary, it would be your WeakDictionary type, and for this to work, it would need to be a class (so that every WeakKeyWrapper can capture a shared reference to it).

IDK how dict handles deletion. I would have thought that assigning nil for a given key is special cased to trigger deletion, and doesn't literally store nil.

Ah never mind, I guess the key would get released, not the key wrapper, so you can't handle the removal in the key wrapper's deinit. Nevermind :frowning:

The best I could do to get rid of the value upon key deallocation was the cleanup method call on each collection access... But this adds a tremendous O(n) overhead on each access to the collection.

I'm currently using that data structure but, being aware of the constraint, I use it for small collections to handle multicast delegate or coordinator patterns.

The behaviour of NSMapTable was not good enough in my use case (cleanup upon collection resize... depending on the usage this might even never happen and the value would leak for the entire app lifecycle)

If you really want to use that & you are Ok with being limited to Apple's platforms you can register for a dispatch memory pressure event, and do a cleanup on all of the weak dictionaries on memory pressure events.

It would "work" on macOS, but not be great (it could force page-ins during a memory pressure event). On iOS/tvOS it would work well (no page-outs on those systems, so memory pressure events are really "running low on RAM, will kill a process if things get worse")

This approach is likely acceptable for applications on iOS, but wouldn't be something we could use in daemons in general. In order to avoid "thundering herd" problems, the kernel doesn't wake up everything at once to receive memory pressure notifications, so by the time a process receives the warning, it's likely multiple other daemons would have been killed already. In general we've found that userspace memory notifications are much harder to make good use of than most people expected.

Of course this does leave us with the dilemma of how to implement weak collections usefully. Back when we were doing tracing garbage collection for ObjC, there was a "weak callback" system that allowed collections to update when the weak reference was cleared, but it was pretty tricky to use well. With ARC weak references the best available scheme was clearing things out on mutation, as has been mentioned here.

I haven't thought this through yet, but it's at least theoretically possible that Swift ARC could do better than ObjC ARC for this due to the different structure for weak references. It would need runtime changes at the very least though.