Adding more data structures to the standard library

WeakHashMap Can be easily implemented using a weak box to wrap keys or values within a regular Dictionary

I didn't consider this a possibility because the other requirements of a Java weak map. For example:

  • The values need to potentially be dereferenced/deallocated when the keys go away
  • Iteration cannot break if such changes happen due to actions on another thread.

IdentityHashMap Can easily implement using a weak box to wrap keys within a regular Dictionary :

Yes, with an extra step to box/unbox to use. It does have the advantage over java that you can control whether null keys/values are allowed.

It's interesting to note that IdentityHashMap is a linear probe algorithm, and not a bucket-chain like normal java HashMaps.

P.s. Can you please edit your table to add queue, deque, and stack, for sake of completeness?

As I noted, I left them all off because the whole category has no real Swift equivalents to the Java API. In particular, a lot of the Queue/Deque diversity in Java is based on the early encouragement to use many blocking threads for processing (such as creating a reading and a writing thread for each socket).

If you ignore such concurrency-related methods and go for a simpler API, you can use Array as a Queue. You can use Array directly as a Deque, or wrap it with a more efficient implementation that doesn't necessarily require a full new array/copy to push/pop off the front.

Other options in Java like priority deques don't have an easy corresponding underlying implementation (well, NSSortedMap).

For those reasons I punted on representing it in the table. And punting on representing it was why I made the table and my comments on the table separate posts :-)

If you suggest what to put (Queue/Deque interfaces to nothing on Swift side?), I'll edit the post.

Just my 2-cents: coding challenges are not the best optimization target for a language.

That said, I've recently run across a couple use-cases where it would be super useful to have alternatives to Dictionary for key-value storage for performance reasons, so I would be happy to have alternatives.

I don't fully understand CopyOnWriteArray(Set|List) , but I suspect their use cases are already covered by our value typed, CoWable Array and Set .

Oops, I missed this question.

These are both concurrency-oriented types. CopyOnWriteArrayList is basically Array in Swift. There is no optimization in Java to mutate rather than copying if you are operating on the only reference (because the GC doesn't expose that information).

CopyOnWriteArraySet is a wrapper around the list type. There is no guarantee of order, I suspect it is actually insertion order with O(n) scan for equality on contains/add/remove operations.

Both are meant to be inefficient in exchange for minimal locking. Probably the closest Java has to persistent data structures.

There is no concurrency-oriented hashed set in Java; you would likely use the concurrent hash map with null values for that.

Doesn't Swift code do that too? I know NIO is non blocking, but there's way more IO to worry about than just networking. I've written blocking code before using readline, was there something non-blocking I could have used?

You really shouldn't

This is basically mandatory for any non-toy example.

Do these data structures need "an easy corresponding underlying implementation" to be considered? I see that as a non-issue. If it doesn't exist, we can write it

I would prefer it, is shows a decent summary of Swift's state of affairs, and act's as a "TODO", or at the minimum "to be considered".

Wow, yep, it's linear searching for elements. Here's the JDK 9 implementation.

I've run into that a while ago. It seemed like a strange hole, given how easily it is filled (just wrap a ConcurrentHashMap<K, Void>)

1 Like

It's definitely not the highest priority use case, in that we shouldn't be optimizing for it at the cost of other use-cases, but I find it to be incredibly important.

Code problems are a really natural way for people to introduce themselves to a language. It's a frustrating user experience if they learn that batteries aren't included, and worse, that batteries aren't available:

  • either because the code problem site doesn't support adding dependancies (I haven't seen one that does)
  • or because they don't know about SPM/Cocoapods/Carthage. We can't expect people to have to dive into this deep-end just for an efficient Deque, or some other basic data structure.

Also, a large portion of my day-to-day usage of Swift is writing small scratch files. Small scripts, quick little algorithm tests, stuff like that. Usually with the aim of prototyping an idea in Swift (where I'm fast and proficient) prior to implementing it in a less familiar language. These contexts make it hard to use anything besides the Swift and Foundation modules.

2 Likes

How about adding all of these to another package? Do they really need to be added to the standard library? I do agree that they should be avalible in a blessed way in one of Apple’s repos like the protobuff support is currently vended.

If it's as ubiquitous as Foundation, and just requires import DataStructures, I'm all for it.

But I don't think, under any circumstance, that it should require a package manager, an Xcode project, or other "scaffolding". It should be batteries-included, whether in a plain file.swift, a playground, REPL or online code problem platform.

5 Likes

This is an open problem. Same as the python module. Upstreaming the Python module

I use Sets. Quickest way of de-duping a collection of values without any effort.

2 Likes

Sure, if you don’t need to preserve the order of elements.

1 Like

I've not yet encountered a situation where I wanted to preserve the order of a collection's elements but ignore subsequently repeated values of elements - in a simple task.

As part of a more complex algorithm I'd just iterate over the original collection, adding new elements to a 'parsed' Set and skipping those already present.

It's quite trivial to deduce an array while keeping elements, using a Set. But for some reason, very little people on StackOverflow ever show that.

Can you show that? Sets are just Maps with nil values right (said all golang users)?

1 Like
extension Sequence {
    /// Returns an array containing, in order, the first instances of
    /// elements of the sequence that compare equally for the keyPath.
    func unique<T: Hashable>(for keyPath: KeyPath<Element, T>) -> [Element] {
        var unique = Set<T>()
        return filter { unique.insert($0[keyPath: keyPath]).inserted }
    }
}
3 Likes

Probably GCD Dispatch IO? https://developer.apple.com/documentation/dispatch/dispatchio

Dispatch I/O
An object that manages operations on a file descriptor using either stream-based or random-access semantics.

In a mini project i did one day for spending time i solved this problem by cleaning up the collection on next access. Not the best in terms of performances. :D

here

I have an idea for another way to do it. You create an internal class Key<K: Hashable> that wraps the key, and a reference to the Dictionary it's about to be stored in. Upon deallocation, you access the dict and nil out the associated value.

Your WeakHashableContainer can have different hash values for objects that are equal, which is illegal for Hashable and will leave your Dictionaries very confused :[

var a: NSObject? = NSObject()
var b: NSObject? = NSObject()

let aContainer = WeakHashableContainer(withObject: a!)
let bContainer = WeakHashableContainer(withObject: b!)

a = nil
b = nil

print(aContainer.hashValue == bContainer.hashValue) // false
print(aContainer == bContainer) // true

This would be a mutation of the underlying dictionary heap object, I assume? Then, I assume the blank entry stays until a mutation causes a resize event?

Terms of Service

Privacy Policy

Cookie Policy