[Announcement] Planning for Swift Collections v1.0

TLDR: We'd like to officially declare Swift Collections source stable and tag version 1.0 of it in the next week or two. Please speak up now if you have something you'd like to see changed in Deque, OrderedSet or OrderedDictionary -- this is the last opportunity to (easily) fix any API issues.

Swift Collections 1.0

Swift Collections currently declares its entire API unstable. However, as it gets adopted by more and more projects, it is becoming increasingly unlikely we would be able to make source breaking changes in future releases. It makes sense to accept this fact and declare the package source stable now.

Our plan is to tag version 1.0 of swift-collections in the next week or two. I have already created a branch for the new release (release/1.0), using the latest 0.0.5 tag as the starting point.

The 1.0 release will ship with the original three collection types, Deque, OrderedSet and OrderedDictionary.

What do we mean by source stability?

We expect to observe the rules of Semantic Versioning. Changes to the public API of the package that are expected to cause build failures in adopters can only be shipped in a new major release. The same goes for behavioral changes that are likely to change the meaning of existing code.

Note: Pretty much every change can technically break someone, so this isn't an exact science. At the end of the day, whether a change introduces acceptable risk boils down to a judgment call made by the package maintainers, based on information supplied by the community. When we get it wrong, I expect we will be able to correct mistakes after the fact by tagging a subsequent new release.

What's considered public API?

The public API of version 1.0 of the swift-collections package will consist of non-underscored declarations that are marked public in the following modules:

  • Collections
  • DequeModule
  • OrderedCollections

For a list of public APIs defined by each public type, see the documents in the Documentation folder of the repository.

Note that neither the test suite nor the benchmarks are part of the package's public interface -- these may continue to change at whim. (This includes the entire contents of the CollectionsTestsSupport module.)

By "underscored declarations" we mean declarations that have a leading underscore anywhere in their fully qualified name. For instance, here are some names that wouldn't be considered part of the public API, even if they were technically marked public:

  • OrderedCollections.OrderedSet._minimumCapacity(forScale:)
  • OrderedCollections._Hashtable.scale
  • _HashSupport.HashTable
  • DequeModule.Deque.init(_storage:)

Underscored APIs may continue to change in any release, including patch releases.
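
To illustrate the rule, here is a minimal sketch using a hypothetical `Counter` type (it is not part of swift-collections, and its member names are made up for this example). All members are marked `public`, but under the convention above only the non-underscored ones would count as public API.

```swift
// Hypothetical type for illustration only; not part of swift-collections.
public struct Counter {
    // Underscored: technically public, but excluded from the stable API.
    // Under the rules above, it could change in any release.
    public var _storage: [Int]

    public init() {
        _storage = []
    }

    // Non-underscored and public: covered by the source stability promise.
    public var count: Int {
        _storage.count
    }

    public mutating func record(_ value: Int) {
        _storage.append(value)
    }
}

var counter = Counter()
counter.record(3)
counter.record(5)
print(counter.count)
```

Client code that sticks to `count` and `record(_:)` would be covered by the stability promise; code that reaches into `_storage` would not.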

Future minor versions of the package may introduce changes to these rules as needed.

How often will we tag a new major release?

Given that this is a core package that is likely to become a dependency of many projects, I expect we will rarely if ever be able to release new major versions of it.

The stated intention of this package is to serve as a proving ground for stdlib additions, so we'll have at least one opportunity to make major API changes, at the time we propose these types for addition to the Standard Library.

In the (hopefully unlikely) case that we need to make sweeping API changes before that, we have the option to do that by adding new variants of existing types without removing the original implementations.

What about new additions?

A number of wonderful new data structure implementations are currently getting prepared to land on the main branch -- these will not be part of the 1.0 release, but they will ship in subsequent minor releases (1.1, 1.2, etc.) as soon as they're ready.

I expect new additions will become part of the public API as soon as they get included in a tagged release. (Although this may be considered on a case by case basis.)

Adopting new Swift releases

The Swift language hasn't stopped evolving, and we'd like this package to quickly embrace toolchain improvements that are relevant to its mandate. Accordingly, from time to time, we expect that new versions of this package will require clients to upgrade to the latest Swift toolchain release. (This allows the package to make use of new language/stdlib features, build on compiler bug fixes, and adopt new package manager functionality as soon as they are available.)

Requiring a new Swift release will only require a minor version bump.

The package manager's dependency resolution engine gracefully handles toolchain versioning, so people who are unable to upgrade their Swift toolchain will be able to continue using older package versions that still support it. We also have the option to support these older versions with backported bug fixes.
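
For adopters, a dependency declaration along the following lines should pick up 1.x releases automatically. This is a sketch under assumptions: the `MyApp` package and target names are hypothetical, the tools version is only an example, and the repository URL is the one commonly used for the package rather than something stated in this post.

```swift
// swift-tools-version:5.3
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical client package
    dependencies: [
        // `from:` accepts any 1.x.y release, so new minor versions are
        // adopted automatically, but a future 2.0 is not.
        .package(url: "https://github.com/apple/swift-collections.git", from: "1.0.0"),
    ],
    targets: [
        .target(
            name: "MyApp",
            dependencies: [
                // "Collections" is one of the three public modules listed above.
                .product(name: "Collections", package: "swift-collections"),
            ]
        ),
    ]
)
```

With a requirement like this, a client on an older toolchain would be expected to resolve to the newest 1.x release whose manifest that toolchain can still handle.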

Our expectation is that projects interested in adopting new package features are, as a rule, also likely to upgrade to new toolchain releases. Requiring a recent toolchain also reflects the reality that most engineering work on the package will target the latest Swift release and development snapshots of the next one; previous toolchain releases will receive limited testing at best.

Planned API changes before 1.0

We have a list of changes we'd like to make before tagging 1.0, and we're officially soliciting feedback on these and on anything else you'd like to see changed -- please reply to this post with your comments!

The changes below will all be included in an upcoming 0.0.6 tag, including deprecations for APIs we intend to remove.

Packaging changes:

  • Remove the dependency on the Swift Collections Benchmark package by moving the benchmark targets into a separate package, nested in a subdirectory of the same repository. (PR #86)

OrderedSet API updates:

  • Follow the example of the standard Set, and add an index(of:) method, in addition to the firstIndex(of:)/lastIndex(of:) methods we already have. (Issue #88)

OrderedDictionary API updates:

  • Remove subscript(offset:) from the top-level OrderedDictionary type. (Issue #89)

    This subscript uses inconsistent terminology (index vs offset), and it has a well-established, easy to use alternative in the elements view:

    /*  deprecated: */ d[offset: 2] // (key: "two", value: 2)
    /* replacement: */ d.elements[2] // (key: "two", value: 2)
    

    (Originally I wanted to also remove index(forKey:); however, as a result of discussions below, I agreed it makes sense to keep it.)

  • Rename the OrderedDictionary.modifyValue(...) family of methods to OrderedDictionary.updateValue(...), to better match the standard Dictionary.updateValue. (Issue #90)
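
As a rough sketch of how a rename like this can be staged through a deprecation, consider the hypothetical `Registry` type below. The type and its method signatures are assumptions made for illustration, not the package's actual declarations. The old spelling stays available and forwards to the new one, while the `@available` attribute makes the compiler warn about it and suggest the new name.

```swift
// Hypothetical type; it only illustrates the deprecate-then-remove pattern.
struct Registry {
    private var values: [String: Int] = [:]

    // New spelling that adopters should migrate to.
    mutating func updateValue(
        forKey key: String,
        default defaultValue: Int,
        with body: (inout Int) -> Void
    ) {
        body(&values[key, default: defaultValue])
    }

    // Old spelling, kept temporarily. Calling it still compiles,
    // but produces a deprecation warning naming the replacement.
    @available(*, deprecated, renamed: "updateValue(forKey:default:with:)")
    mutating func modifyValue(
        forKey key: String,
        default defaultValue: Int,
        _ body: (inout Int) -> Void
    ) {
        updateValue(forKey: key, default: defaultValue, with: body)
    }

    func value(forKey key: String) -> Int? {
        values[key]
    }
}

var registry = Registry()
registry.updateValue(forKey: "hits", default: 0) { $0 += 1 }
registry.modifyValue(forKey: "hits", default: 0) { $0 += 1 } // warns: deprecated
print(registry.value(forKey: "hits") ?? 0)
```

Shipping the deprecation in a pre-1.0 tag gives adopters a release in which both spellings compile, so they can migrate before the old one is removed.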

Isn’t Set’s method deprecated though and only offered for source compatibility? Seems like a strange thing to clone here.

Yeah, index(of:) is the old deprecated spelling of firstIndex(of:) for Collection types. I see @lorentey's point that it might make sense to provide it for types that guarantee element uniqueness (e.g. sets) where first and last index are not great names. However, that should probably be part of a discussion that also includes Set, if it's feasible to provide a non-deprecated implementation of index(of:) on the concrete type.
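
For reference, the non-deprecated spelling on the standard `Set` looks like this. The example uses only the standard library; the `fruits` set is made up for illustration.

```swift
let fruits: Set<String> = ["apple", "banana", "cherry"]

// firstIndex(of:) is the current, non-deprecated spelling.
// It returns an opaque Set.Index, or nil if the element is absent.
let bananaIndex = fruits.firstIndex(of: "banana")
if let index = bananaIndex {
    print(fruits[index]) // banana
}

// A set holds each element at most once, so the "first" match is the only one.
let durianIndex = fruits.firstIndex(of: "durian")
print(durianIndex == nil) // true
```

The "first" in the name carries no extra information for a set, which is the naming awkwardness being discussed here.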

Ha, well, that's embarrassing -- I forgot Set.index(of:) is deprecated. Introducing OrderedSet.index(of:) isn't as obvious an improvement as I thought it was, then. I'll remove that from the document and pretend it never happened. 🤡 (I think it would make some sense to have it specifically on Set and OrderedSet, as they guarantee that their elements are all unique. But I wouldn't want to spend time on arguing about it -- firstIndex(of:) works fine.)

This does make the suggested replacement for OrderedDictionary.index(forKey:) less palatable though.

/*  deprecated: */ d.index(forKey: "two")
/* replacement: */ d.keys.firstIndex(of: "two") // yuck

If the goal was to be consistent, then the stdlib should have also renamed index(forKey:) to firstIndex(forKey:)/lastIndex(forKey:).

My gut feeling is that we should still go ahead with the removal of both index(forKey:) and the offset subscript. There are some unresolved API design issues here that require a bit more thought. It'll be far easier to add new API after 1.0 than to change/remove it -- if we decide it's the right move, we can always reintroduce one or both of these in a minor version later.

Ignore if this is in the weeds — but Dictionary's index(forKey:) is not deprecated, so having that mirrored on OrderedDictionary too seems fine (assuming OrderedDictionary also ensures key uniqueness, which I believe it does).
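
For comparison, this is how the method behaves on the standard `Dictionary` (standard library only; the `numbers` dictionary is made up for illustration):

```swift
let numbers = ["one": 1, "two": 2, "three": 3]

// Dictionary.index(forKey:) returns an opaque Dictionary.Index,
// or nil when the key is not present.
let twoIndex = numbers.index(forKey: "two")
if let index = twoIndex {
    let entry = numbers[index]
    print(entry.key, entry.value) // two 2
}

let fourIndex = numbers.index(forKey: "four")
print(fourIndex == nil) // true
```

Note that `Dictionary` returns a dedicated index type here, whereas the ordered variant would hand back a plain integer position.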

The question is though: given that OrderedDictionary isn't a Collection, and it doesn't define an Index type, does it make sense for it to have an index(forKey:) method? What does it mean for a non-collection to have an index?

The method obviously does make sense in the wider context of all the various views of it (that are in fact collections with integer indices), but for OrderedDictionary itself, we haven't really decided if "index" is the right name. (As evidenced by the subscript(offset:) member.)

We do call them "index" in the API docs of remove(at:), swapAt(_:_:) and removeSubrange(_:), but those docs were adapted from the stdlib, and at the time I added them, OrderedDictionary was still a collection (with a custom Index). So the docs are in need of an update anyway.

And now that I look at these with a pedantic eye, remove(at:)/swapAt/removeSubrange don't seem to be good names, either -- the label should indicate the role of the integer parameter, so we should rather call them something like remove(atOffset:)/swapAtOffsets/removeOffsetRange instead. (The only reason the original names are acceptable for, say, Array, is that they come as part of its MutableCollection and RangeReplaceableCollection conformances.)

Still, the easy choice would be to favor consistency (with the views) over pedantry and commit to calling these numbers indices. We could then keep index(forKey:) and only remove subscript(offset:).

Would that sound good?

I just tagged Swift Collections 0.0.7 deprecating the APIs. On reflection, I agreed with @bzamayo that it makes sense to keep OrderedDictionary.index(forKey:), so that method remains fully supported.

There is still time before we tag 1.0 -- please speak up if you'd like to see API changes!

I think it should mirror Dictionary’s API surface as closely as possible, unless there is a good reason not to (as with the subscript ambiguity discussed in the README), and calling these numbers indices is a decent way to get there.

A PriorityQueue and an IndexedPriorityQueue might be useful data structures. It would also be good if, instead of requiring their Element to be Comparable, they took a closure (Element, Element) -> Bool to establish the order of elements.
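
To make the suggestion concrete, here is a minimal sketch of a binary heap ordered by a caller-supplied closure. The `ClosureHeap` name and its methods are hypothetical; this is not the API planned for swift-collections, just one way the idea could look.

```swift
// Hypothetical sketch; not the design planned for swift-collections.
struct ClosureHeap<Element> {
    private var storage: [Element] = []
    // Returns true when the first argument should be popped before the second.
    private let areInIncreasingOrder: (Element, Element) -> Bool

    init(by areInIncreasingOrder: @escaping (Element, Element) -> Bool) {
        self.areInIncreasingOrder = areInIncreasingOrder
    }

    var isEmpty: Bool { storage.isEmpty }

    mutating func insert(_ element: Element) {
        storage.append(element)
        // Sift the new element up while it is ordered before its parent.
        var child = storage.count - 1
        while child > 0 {
            let parent = (child - 1) / 2
            guard areInIncreasingOrder(storage[child], storage[parent]) else { break }
            storage.swapAt(child, parent)
            child = parent
        }
    }

    mutating func popFirst() -> Element? {
        guard !storage.isEmpty else { return nil }
        storage.swapAt(0, storage.count - 1)
        let result = storage.removeLast()
        // Sift the displaced root down to restore the heap property.
        var parent = 0
        while true {
            let left = 2 * parent + 1
            let right = left + 1
            var first = parent
            if left < storage.count, areInIncreasingOrder(storage[left], storage[first]) {
                first = left
            }
            if right < storage.count, areInIncreasingOrder(storage[right], storage[first]) {
                first = right
            }
            if first == parent { break }
            storage.swapAt(parent, first)
            parent = first
        }
        return result
    }
}

// The element type is a tuple, which cannot conform to Comparable.
var heap = ClosureHeap<(name: String, priority: Int)>(by: { $0.priority < $1.priority })
heap.insert((name: "write docs", priority: 3))
heap.insert((name: "fix crash", priority: 1))
heap.insert((name: "tag release", priority: 2))

var order: [String] = []
while let task = heap.popFirst() {
    order.append(task.name)
}
print(order) // ["fix crash", "tag release", "write docs"]
```

One trade-off of this design is that the closure is stored per instance, so two heaps of the same type can disagree about ordering, and the comparison may be harder for the optimizer to inline than a Comparable conformance.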

The 1.0 release will not ship with any new data structures, but we're already planning to ship heaps in a subsequent minor release.

I’m commenting here because of the Deque implementation. I think we had crippling performance issues with at least an early implementation (admittedly around version 0.0.2, from memory) and ended up copying the code into our project, maybe adding something like @usableFromInline to some key methods.

I am on holiday right now so I will have to revisit this on Monday. But just wanted to point this out because I’m not sure whether adding annotations to methods like that is considered source-compatible or not.

It’s quite possible it was a bit of a niche (and more general, in Swift) performance issue with a library being used within a library. I will have to ask my colleague when I’m back “online”.

A missing @inlinable/@usableFromInline attribute would be a bug we should fix -- luckily, it would not be a source breaking change, so it doesn't need to hold up the 1.0 release.

Deque is almost entirely marked @inlinable, though, so I wonder if there is a more subtle issue. (E.g., was the library code using Deque in a non-specialized generic context? If so, the library code itself probably needs to be marked @inlinable to allow specializations. The current Deque codebase isn't at all optimized for unspecialized use, although we have some (limited) options to improve things later if necessary. (These wouldn't be source-breaking, either.))
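
As a sketch of what these attributes look like in practice, here is a hypothetical `EventLog` library type together with a generic helper. Neither is taken from the Deque implementation; all names are assumptions made for illustration.

```swift
// Hypothetical library type; not taken from the Deque implementation.
public struct EventLog<Element> {
    // Internal storage must be @usableFromInline so that
    // @inlinable code is allowed to reference it.
    @usableFromInline
    internal var _items: [Element]

    @inlinable
    public init() {
        _items = []
    }

    // @inlinable exposes the body to client modules, which allows the
    // optimizer to specialize it for a concrete Element type.
    @inlinable
    public mutating func append(_ item: Element) {
        _items.append(item)
    }

    @inlinable
    public var count: Int {
        _items.count
    }
}

// A generic function in another library that wraps the type would need
// the same treatment; without @inlinable, its clients may end up calling
// an unspecialized version of this code.
@inlinable
public func fill<T>(_ log: inout EventLog<T>, with items: [T]) {
    for item in items {
        log.append(item)
    }
}

var events = EventLog<Int>()
fill(&events, with: [1, 2, 3])
print(events.count)
```

Adding these attributes to existing declarations does not change how clients spell their calls, which is consistent with the point above that such a fix would not be source-breaking.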

Swift Collections 1.0 is out now!
