Understanding the internals of the Set type

I am hoping to learn a little more about the internals of the Set type: swift/Set.swift at main · apple/swift · GitHub

Some guidance on the below topics would be great:

  • Looks like the worst case runtime to make a Set from an Array is O(n). Is my understanding correct?
  • I see that Set uses VariantSetBuffer and NativeBuffer under the hood. Where can I learn more about these two? I have looked around, but I have been coming up short. Maybe I am missing something here.


1 Like

Do you have any specific questions? The declarations are in the file you linked to.

NativeSetBuffer (and RawNativeSetStorage) contains the Swift Set implementation:

VariantBuffer is a union of the potential *SetBuffers that might be providing the Set's backing storage. On non-ObjC platforms, there is a single case: .native(NativeSetBuffer). On Obj-C platforms, the Set might have been bridged from an NSSet, so we gain an additional case: .cocoa(CocoaSetBuffer).

1 Like

Ahh, I totally missed that they are defined in the same file, thanks!!

I don't have any specific questions at the moment. I am just trying to dig into the implementation details to better understand the runtime complexity of the different Set operations. This will definitely help!

Any thoughts on my first question? If not, no worries :).

It sure looks like it, since the initializers copy the array into their own buffer.

At least. Going from non-unique -> unique is going to be quadratic in the worst-case, because you have to test every element against every element you saw before. Actual performance will depend on the quality of your data's hash-values.

If a unique hash can be calculated in constant time (best case), the entire operation will be linear. If your data takes a long time to hash, or produces poor hash values which frequently collide, you will get more quadratic behaviour (worst case).

1 Like