Adding a LinkedList type to the Standard Library

Swift is missing a great many data types (see Adding more data structures to the standard library). I thought I may be able to get this going with suggesting the introduction of a LinkedList type to Swift.

Linked lists have numerous applications in programming. Not only will they make solving certain problems easier, it provides a foundation from which other types can be built, such as queues. Moreover, linked list are more or less a standard collection type in many other languages.

I decided to make LinkedList and doubly linked list. Right now, my current implementation abstracts away the nodes, providing the users with the algorithmic benefits of a linked list while having the same ease-of-use as any other collection.

If you would like to play around with LinkedList, feel free to check out the GitHub repo.

LinkedList:

public struct LinkedList<Element> {
    
    private var headNode: Node?
    private var tailNode: Node?
    
    /// The number of elements in the linked list.
    ///
    /// - Complexity: O(1)
    public private(set) var count: Int = 0
    private var id = ID()
    
    /// Creates a new, empty linked list.
    ///
    /// This is equivalent to initializing a linked list with an empty array literal.
    /// For example:
    ///
    ///     var emptyLinkedList = LinkedList<Int>()
    ///     print(emptyLinkedList.isEmpty)
    ///     // Prints "true"
    ///
    ///     emptyLinkedList = []
    ///     print(emptyLinkedList.isEmpty)
    ///     // Prints "true"
    public init() { }
    
    fileprivate class ID {
        init() { }
    }
}

//MARK: - LinkedList Node
extension LinkedList {
    
    fileprivate class Node {
        public var value: Element
        public var next: Node?
        public weak var previous: Node?
        
        public init(value: Element) {
            self.value = value
        }
    }
    
}

//MARK: - Initializers
public extension LinkedList {
    
    private init(_ nodeChain: (head: Node, tail: Node, count: Int)?) {
        guard let chain = nodeChain else {
            return
        }
        headNode = chain.head
        tailNode = chain.tail
        count = chain.count
    }
    
    init<S>(_ elements: S) where S: Sequence, S.Element == Element {
        if let linkedList = elements as? LinkedList<Element> {
            self = linkedList
        } else {
            self = LinkedList(chain(of: elements))
        }
    }
    
}

//MARK: Chain of Nodes
extension LinkedList {
    /// Creates a chain of nodes from a sequence. Returns `nil` if the sequence is empty.
    private func chain<S>(of sequence: S) -> (head: Node, tail: Node, count: Int)? where S: Sequence, S.Element == Element {
        var iterator = sequence.makeIterator()
        var head, tail: Node
        var count = 0
        guard let firstValue = iterator.next() else {
            return nil
        }
        
        var currentNode = Node(value: firstValue)
        head = currentNode
        count = 1
        
        while let nextElement = iterator.next() {
            let nextNode = Node(value: nextElement)
            currentNode.next = nextNode
            nextNode.previous = currentNode
            currentNode = nextNode
            count += 1
        }
        tail = currentNode
        return (head: head, tail: tail, count: count)
    }
}

//MARK: - Copy Nodes
extension LinkedList {
    
    private mutating func copyNodes(settingNodeAt index: Index, to value: Element) {
        id = ID()
        
        var currentIndex = startIndex
        var currentNode = Node(value: currentIndex == index ? value : currentIndex.node!.value)
        let newHeadNode = currentNode
        currentIndex = self.index(after: currentIndex)
        
        while currentIndex < endIndex {
            let nextNode = Node(value: currentIndex == index ? value : currentIndex.node!.value)
            currentNode.next = nextNode
            nextNode.previous = currentNode
            currentNode = nextNode
            currentIndex = self.index(after: currentIndex)
        }
        headNode = newHeadNode
        tailNode = currentNode
    }
    
    @discardableResult
    private mutating func copyNodes(removing range: Range<Index>) -> Range<Index> {
        
        id = ID()
        var currentIndex = startIndex
        
        while range.contains(currentIndex) {
            currentIndex = index(after: currentIndex)
        }
        
        guard let headValue = currentIndex.node?.value else {
            self = LinkedList()
            return endIndex..<endIndex
        }
        
        var currentNode = Node(value: headValue)
        let newHeadNode = currentNode
        var newCount = 1
        
        var removedRange: Range<Index> = Index(node: currentNode, offset: 0, id: id)..<Index(node: currentNode, offset: 0, id: id)
        currentIndex = index(after: currentIndex)
        
        while currentIndex < endIndex {
            guard !range.contains(currentIndex) else {
                currentIndex = index(after: currentIndex)
                continue
            }
            
            let nextNode = Node(value: currentIndex.node!.value)
            if currentIndex == range.upperBound {
                removedRange = Index(node: nextNode, offset: newCount, id: id)..<Index(node: nextNode, offset: newCount, id: id)
            }
            currentNode.next = nextNode
            nextNode.previous = currentNode
            currentNode = nextNode
            newCount += 1
            currentIndex = index(after: currentIndex)
            
        }
        if currentIndex == range.upperBound {
            removedRange = Index(node: nil, offset: newCount, id: id)..<Index(node: nil, offset: newCount, id: id)
        }
        headNode = newHeadNode
        tailNode = currentNode
        count = newCount
        return removedRange
    }
    
}


//MARK: - Computed Properties
public extension LinkedList {
    /// The element at the head (start) of the linked list.
    ///
    /// - Note: This value is the same as `.first`.
    var head: Element? {
        return headNode?.value
    }
    
    /// The element at the tail (end) of the linked list.
    ///
    /// - Note: This value is the same as `.last`.
    var tail: Element? {
        return tailNode?.value
    }
}

//MARK: - Sequence Conformance
extension LinkedList: Sequence {
    
    public typealias Element = Element
    
    public __consuming func makeIterator() -> Iterator {
        return Iterator(node: headNode)
    }
    
    public struct Iterator: IteratorProtocol {
        
        private var currentNode: Node?
        
        fileprivate init(node: Node?) {
            currentNode = node
        }
        
        public mutating func next() -> Element? {
            guard let node = currentNode else {
                return nil
            }
            currentNode = node.next
            return node.value
        }
        
    }
}

//MARK: - Collection Conformance
extension LinkedList: Collection {
    
    public var startIndex: Index {
        return Index(node: headNode, offset: 0, id: id)
    }
    
    public var endIndex: Index {
        return Index(node: nil, offset: count, id: id)
    }
    
    public var first: Element? {
        return head
    }
    
    public var isEmpty: Bool {
        return count == 0
    }
    
    public func index(after i: Index) -> Index {
        precondition(i.isMember(of: self), "LinkedList index is invalid")
        precondition(i.offset != endIndex.offset, "LinkedList index is out of bounds")
        return Index(node: i.node?.next, offset: i.offset + 1, id: id)
    }
    
    public struct Index: Comparable {
        fileprivate weak var node: Node?
        fileprivate var offset: Int
        fileprivate weak var listID: ID?
        
        fileprivate init(node: Node?, offset: Int, id: ID) {
            self.node = node
            self.offset = offset
            self.listID = id
        }
        
        
        /// A Boolean value indicating whether the index is a member of `linkedList`.
        ///
        /// - Parameter linkedList: the list being check for indxe membership.
        fileprivate func isMember(of linkedList: LinkedList) -> Bool {
            return listID === linkedList.id
        }
        
        public static func ==(lhs: Index, rhs: Index) -> Bool {
            return lhs.offset == rhs.offset
        }
        
        public static func <(lhs: Index, rhs: Index) -> Bool {
            return lhs.offset < rhs.offset
        }
    }
    
}


//MARK: - MutableCollection Conformance
extension LinkedList: MutableCollection {
    
    public subscript(position: Index) -> Element {
        get {
            precondition(position.isMember(of: self), "LinkedList index is invalid")
            precondition(position.offset != endIndex.offset, "Index out of range")
            guard let node = position.node else {
                preconditionFailure("LinkedList index is invalid")
            }
            return node.value
        }
        set {
            precondition(position.isMember(of: self), "LinkedList index is invalid")
            precondition(position.offset != endIndex.offset, "Index out of range")
            
            // Copy-on-write semantics for nodes
            if !isKnownUniquelyReferenced(&headNode) {
                copyNodes(settingNodeAt: position, to: newValue)
            } else {
                position.node?.value = newValue
            }
        }
    }
}



//MARK: - LinkedList Specific Operations
public extension LinkedList {
    /// Adds an element to the start of the linked list.
    ///
    /// The following example adds a new number to a linked list of integers:
    ///
    ///     var numbers = [0, 1, 2, 3, 4, 5]
    ///     numbers.prepend(0)
    ///
    ///     print(numbers)
    ///     // Prints "[0, 1, 2, 3, 4, 5]"
    ///
    /// - Parameter newElement: The element to prepend to the collection.
    ///
    /// - Complexity: O(1)
    mutating func prepend(_ newElement: Element) {
        replaceSubrange(startIndex..<startIndex, with: CollectionOfOne(newElement))
    }
    
    /// Adds the elements of a sequence or collection to the start of this
    /// linked list.
    ///
    /// The following example prepends the elements of a `Range<Int>` instance to
    /// a linked list of integers:
    ///
    ///     var numbers: LinkedList = [1, 2, 3, 4, 5]
    ///     numbers.append(contentsOf: -5...0)
    ///     print(numbers)
    ///     // Prints "[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]"
    ///
    /// - Parameter newElements: The elements to prepend to the collection.
    ///
    /// - Complexity: O(*m*), where *m* is the length of `newElements`.
    mutating func prepend<S>(contentsOf newElements: __owned S) where S: Sequence, S.Element == Element {
        replaceSubrange(startIndex..<startIndex, with: newElements)
    }
    
    /// Removes and returns the first element of the collection.
    ///
    /// Calling this method may invalidate all saved indices of this
    /// collection. Do not rely on a previously stored index value after
    /// altering a collection with any operation that can change its length.
    ///
    /// - Returns: The first element of the collection if the collection is not
    /// empty; otherwise, `nil`.
    ///
    /// - Complexity: O(1)
    @discardableResult
    mutating func popFirst() -> Element? {
        if isEmpty {
            return nil
        }
        return removeFirst()
    }
    
    @discardableResult
    mutating func popLast() -> Element? {
        if isEmpty {
            return nil
        }
        return removeLast()
    }
}

//MARK: - BidirectionalCollection Conformance
extension LinkedList: BidirectionalCollection {
    var last: Element? {
        return tail
    }
    
    public func index(before i: Index) -> Index {
        precondition(i.isMember(of: self), "LinkedList index is invalid")
        precondition(i.offset != startIndex.offset, "LinkedList index is out of bounds")
        if i.offset == count {
            return Index(node: tailNode, offset: i.offset - 1, id: id)
        }
        return Index(node: i.node?.previous, offset: i.offset - 1, id: id)
    }
    
}

//MARK: - RangeReplaceableCollection Conformance
extension LinkedList: RangeReplaceableCollection {
    public mutating func append<S>(contentsOf newElements: __owned S) where S: Sequence, Element == S.Element {
        replaceSubrange(endIndex..<endIndex, with: newElements)
    }
    
    public mutating func replaceSubrange<S, R>(_ subrange: R, with newElements: __owned S) where S: Sequence, R: RangeExpression, Element == S.Element, Index == R.Bound {
        
        var range = subrange.relative(to: indices)
        precondition(range.lowerBound.isMember(of: self) && range.upperBound.isMember(of: self), "LinkedList range of indices are invalid")
        precondition(range.lowerBound >= startIndex && range.upperBound <= endIndex, "Subrange bounds are out of range")
        
        // If range covers all elements and the new elements are a LinkedList then set references to it
        if range.lowerBound == startIndex, range.upperBound == endIndex, let linkedList = newElements as? LinkedList {
            self = linkedList
            return
        }
        
        var newElementsCount = 0
        
        // There are no new elements, so range indicates deletion
        guard let nodeChain = chain(of: newElements) else {
            
            // If there is nothing in the removal range
            // This also covers the case that the linked list is empty because this is the only possible range
            guard range.lowerBound != range.upperBound else {
                return
            }
            
            // Deletion range spans all elements
            if range.lowerBound == startIndex && range.upperBound == endIndex {
                headNode = nil
                tailNode = nil
                count = 0
                return
            }
            
            // Copy-on-write semantics for nodes and remove elements in range
            guard isKnownUniquelyReferenced(&headNode) else {
                copyNodes(removing: range)
                return
            }
            
            // Update count after mutation to preserve startIndex and endIndex validity
            defer {
                count = count - (range.upperBound.offset - range.lowerBound.offset)
            }
            
            // Move head up if deletion starts at start index
            if range.lowerBound == startIndex {
                // Can force unwrap node since the upperBound is not the end index
                headNode = range.upperBound.node!
                headNode!.previous = nil
                
            // Move tail back if deletion ends at end index
            } else if range.upperBound == endIndex {
                // Can force unwrap since lowerBound index must have an associated element
                tailNode = range.lowerBound.node!.previous
                tailNode!.next = nil
                
            // Deletion range is in the middle of the linked list
            } else {
                // Can force unwrap all bound nodes since they both must have elements
                range.upperBound.node!.previous = range.lowerBound.node!.previous
                range.lowerBound.node!.previous!.next = range.upperBound.node!
            }
            
            return
        }
        
        // Obtain the count of the new elements from the node chain composed from them
        newElementsCount = nodeChain.count
        
        // Replace entire content of list with new elements
        if range.lowerBound == startIndex && range.upperBound == endIndex {
            headNode = nodeChain.head
            tailNode = nodeChain.tail
            count = nodeChain.count
            return
        }
        
        // Copy-on-write semantics for nodes before mutation
        if !isKnownUniquelyReferenced(&headNode) {
            range = copyNodes(removing: range)
        }
        
        // Update count after mutation to preserve startIndex and endIndex validity
        defer {
            count += nodeChain.count - (range.upperBound.offset - range.lowerBound.offset)
        }
        
        // Prepending new elements
        guard range.upperBound != startIndex else {
            headNode?.previous = nodeChain.tail
            nodeChain.tail.next = headNode
            headNode = nodeChain.head
            return
        }
        
        // Appending new elements
        guard range.lowerBound != endIndex else {
            tailNode?.next = nodeChain.head
            nodeChain.head.previous = tailNode
            tailNode = nodeChain.tail
            return
        }
        
        if range.lowerBound == startIndex {
            headNode = nodeChain.head
        }
        if range.upperBound == endIndex {
            tailNode = nodeChain.tail
        }
        
        range.lowerBound.node!.previous!.next = nodeChain.head
        range.upperBound.node!.previous = nodeChain.tail
    }
    
}

//MARK: - ExpressibleByArrayLiteral Conformance
extension LinkedList: ExpressibleByArrayLiteral {
    public typealias ArrayLiteralElement = Element
    
    public init(arrayLiteral elements: ArrayLiteralElement...) {
        self.init(elements)
    }
}

//MARK: - CustomStringConvertible Conformance
extension LinkedList: CustomStringConvertible {
    public var description: String {
        return "[" + lazy.map { "\($0)" }.joined(separator: ", ") + "]"
    }
}

//MARK: - Equatable Conformance
extension LinkedList: Equatable where Element: Equatable {
    public static func ==(lhs: LinkedList<Element>, rhs: LinkedList<Element>) -> Bool {
        guard lhs.count == rhs.count else {
            return false
        }
        for (a, b) in zip(lhs, rhs) {
            guard a == b else {
                return false
            }
        }
        return true
    }
}

Is this a worthwhile addition to Swift? Are there any aspects of it that should be changed?

3 Likes

On a personal level, I think I'd oppose adding LinkedList to the standard library because I think it has exceedingly limited utility.

While you're right that linked lists are a fairly standard data structure, in practice most of that standardisation comes from their prevalence in CS and algorithmics classes. Linked lists have useful properties that make them helpful for framing discussions about algorithmic complexity, and they are straightforward enough to understand that they are a great building block for early data structures work. However, in practical design I can think of very few cases where linked lists are useful.

The standard comparison for a linked list is to an Array. There are many things linked lists are bad at that arrays are good at, mostly because arbitrary element lookup in a linked list is O(n), whereas in arrays it is O(1). This eliminates a wide range of useful algorithms: for example, large linked lists are very unpleasant to search as they cannot be efficiently partitioned.

Worse, however, even iterating a linked list linearly is slower than doing so on an array due to the cache-unfriendliness of a linked list. A linked list is a page-fault generation machine, as it's extremely common for list nodes to be allocated in unrelated pages of memory, whereas arrays have excellent cache coherency properties. This means that even the "performant" operations on lists (walking forwards and backwards) are slower than they would be on an array.

There is exactly one use-case where a linked list may be preferable to an array, which is building a FIFO. In this context the algorithms will append to the back and pop from the front. However, even here I think it's hard to make the case.

This is because you can use Array to build a perfectly serviceable circular buffer, and a circular buffer is a great implementation of a deque. SwiftNIO has just such an implementation that is used extensively throughout the codebase. It has all the benefits of Array plus the benefits of a list: appending and removing elements from the front or back is an amortised O(1) operation, so it's trivial to add and remove elements.

The only reason to prefer a list to an array here that I can see is in cases where you believe a large slab allocation cannot safely be made, or when you think you may need to build this queue to a very large size and you bear the copying costs associated with resizing. In cases like this SwiftNIO frequently reserves capacity in the circular buffer for something like 32 or 64 elements, avoiding the first several resizing operations.

The end result of all of this is that while a linked list is an important academic data structure, I think it is a very poor choice for almost all use-cases in practical programming, where arrays should almost always be preferred. For this reason I don't think the Swift standard library should ship it: instead, we should be encouraging the Swift standard library to build the solutions where a programmer may want a list, but would be better served by an abstraction over an array. The obvious starting place for my money is a deque, which I believe @lorentey has opinions on.

38 Likes

Although I agree with most of what you said, keep in mind that Array or any abstractions of Array such as Deque support these operations in amortized O(1) and not O(1) as LinkedList would. For apps this probably isn't a big deal, but it may prevent people from using these collections in other areas, such as real time systems.

1 Like

Yup, I totally grant that. My contention is that the attractive nuisance of providing the linked list causes a higher cost on the ecosystem than it provides in value to the cases that need it. If you're working in a hard real time system, you're sufficiently far from the mainstream use cases of most languages and libraries that you're essentially in a parallel ecosystem anyway.

1 Like

I also tried to write a linked list with value semantics recently, and I couldn't do it. The moment you reference concrete nodes in your indices, it breaks down once you consider generic algorithms over Collection during which the collection needs to be copied due to COW.

Take this sorting algorithm, for instance:

extension MutableCollection where Element: Comparable  {
    mutating func selectionSort() {
        var indexOfFirstUnsortedElement = self.startIndex

        while indexOfFirstUnsortedElement < self.endIndex {
            var indexOfSmallestUnsortedElement = indexOfFirstUnsortedElement
            var currentIndex = self.index(after: indexOfFirstUnsortedElement)

            while currentIndex < self.endIndex {
                if self[currentIndex] < self[indexOfSmallestUnsortedElement] {
                    indexOfSmallestUnsortedElement = currentIndex
                }

                self.formIndex(after: &currentIndex)
            }

            self.swapAt(indexOfFirstUnsortedElement, indexOfSmallestUnsortedElement)
            self.formIndex(after: &indexOfFirstUnsortedElement)
        }
    }
}

I actually expected this to crash, considering you verify the list IDs when accessing the subscript, but it simply returns an incorrect result. I haven't fully understood your implementation yet:

var list: LinkedList<Int> = [8, 5, 2, 9, 5, 6, 3]
let copy = list
list.selectionSort2()
print(list) // [2, 5, 2, 9, 5, 6, 3]

I'm not sure if it is possible to implement a value-type linked list properly for which such generic algorithms continue working.

I just fixed a bug wherein the id of the linked list is not updated on mutation. Now the code crashes with the error, LinkedList index is invalid. Currently, this code does not work because technically we are taking the index before the list is mutated. This creates an inconsistency when mutating the list later in the method as upon the initial mutation, indexOfFirstUnsortedElement is invalidated.

The tricky thing about copy-on-write semantics with LinkedList is that id is not updated upon its value being copied, in your example @jenox:

let copy = list

Because of copy-on-write, since neither of the two has been mutated, the id, any indices taken from list or copy before they have been mutated will be invalidated upon their mutation.

I am currently at work on a way to fix the problem.

@jenox, I think you are right. Unfortunately, I don't think it is possible to write these algorithms from a generic context. I've been trying for a while, but can't think of a way that this can be done. Do you have any ideas as to what can be done to remedy this?

I agree with @lukasa that linked lists, as a generic container data structure, are of limited utility. In most situations where a linked list is practically necessary, it's generally an intrusive linked list, where the data structure is an embedded part of the structures being linked together. Instead of trying to implement LinkedList<T> as a generic collection like Array, it might be interesting to consider what it'd look like as a protocol for types that form intrusive linked lists. As a protocol, you could let intrusive list types specify the minimal list contract (how to get to the previous and next nodes, if any), and build generic convenience traversal, insertion, and removal operations on top of that protocol.

26 Likes

I haven't found a solution to this problem when I tried building a linked list myself, other than inserting a dummy "copy if needed" operation before getting any indices in genetic algorithms — but that defeats the purpose in my opinion.

1 Like

I kicked this idea around a while back, and bumped on the problem of trying to have an entry potentially be part of multiple intrusive lists at once (that is, to have multiple list node structures in one parent structure). Presumably this is a fundamental limitation of such a model?

2 Likes

Good point. Without surfacing language support for multiple conformances, maybe you could use view adapters, similar to how String surfaces utf8, utf16, etc. collection views?

1 Like

Yes, I think that works.

Is there any specific reason to avoid multiple conformances, or is it only a lot of work on the compiler side? I would love to have something like Haskell's newtype to create similar types with different conformances...

Swift's underlying type system and runtime model allows for multiple conformances (with some exceptions, such as as? Protocol, where it isn't clear what conformance should be chosen if there are multiple candidates), but we've thus far avoided exposing the ability in the language because it raises a lot of design issues, and could be very confusing if exposed in the wrong way (and also, because we would have to solve the problem for the aforementioned exceptions). For most intents and purposes, a single-field struct or single-case enum is effectively a newtype in Swift already.

4 Likes

There is a big caveat with this, though - exposing view adapters (single-field structs which expose a different version of a conformance, like String/String.UTF8View) doesn't play well with copy-on-write. The view adapter ends up holding an additional reference to the underlying storage, meaning mutations will trigger a copy.

Technically it is correct and preserves value semantics to do that, but it would be great if we had a way of creating an inout version of a ViewAdapter, containing a reference to the underlying collection rather than a copy. Is that something ownership could help with? Or would we need something like _modify for computed properties?

3 Likes

I agree that's something we need. In theory, it should be possible for a view adapter to be borrowed in-place over the underlying value; since the types are layout-compatible, they can share the same memory, and you should be able to shared-borrow or exclusive-borrow the wrapper over the wrapped value. If we had something like unsafeBitCast that worked as a read and/or modify coroutine, that could do the job.

5 Likes

After fixing a bog in the replaceSubrange where the new chain is not linked in properly, I ran into this generic algorithm problem. I made a change that causes the sort example to now work correctly, but I have not tested more extensively to see if that solves everything.
What I did was to write a private function that returns a node for a given index: stored node if membership is proper, linearly skipping to it otherwise:

private func newNode(for index: Index) -> Node? {
	guard !index.isMember(of: self) else {
		return index.node
	}
	precondition(endIndex.offset - index.offset > startIndex.offset)
	// Search from start or end
	var currentNode: Node?
	if index.offset <= count / 2 {
		currentNode = firstNode
		var i = startIndex.offset
		while i < index.offset {
			i += 1
			currentNode = currentNode?.next
		}
	} else {
		currentNode = lastNode
		var i = endIndex.offset
		while index.offset < i {
			i -= 1
			currentNode = currentNode?.previous
		}
	}
	guard let result = currentNode else {
		preconditionFailure("Unexpected nil")
	}
	return result
}

with that in place, replace constructs like this:

precondition(i.isMember(of: self), "LinkedList index is invalid")
precondition(i.offset != endIndex.offset, "LinkedList index is out of bounds")
return Index(node: i.node?.next, offset: i.offset + 1, id: id)

with

guard let node = newNode(for: index) else {
    preconditionFailure("Node for index not found")
}
precondition(index.offset != endIndex.offset, "LinkedList index is out of bounds")
return Index(node:node.next, offset: index.offset + 1, id: id)

So basically, if an index seems to have a node belonging to a list with another id, go find the node, at the same offset, in the current list. This of course happens at O(N/2) cost, but allows proceeding.

Such indexes exist because they are value types and were formed before a COW operation happened and still point to the old nodes in the original list, but also still contain those offsets. Therefore this "fix" will only work if offsets themselves have not (yet) been invalidated. If elements were deleted or added to the new list after the COW, but before use of the index, there will still be a problem. This seems to be a common problem with indexes into MutableCollection anyway.

At this time I don't think linked lists belong in the Standard Library. They are almost always the wrong data structure to use in practice, so having them prominently available would probably be a bad idea. However, they do have two properties that I believe can make them very useful in niche use cases: index stability and "constant-time" insertions/removals. If some of these use cases turn out to be more important than I thought, then maybe I could change my mind, though!

By index stability, I expect that an index pointing to an element in a linked list will remain valid (pointing to the same item) until the item is removed, even if other items are added or removed from the list, before or after the indexed item.

In the cases I've come across so far, it was always the case that index stability could be substituted by content/id based lookups in a dictionary-like data structure. For example, keys in a Dictionary/OrderedDictionary behave somewhat like stable indices. But I can easily believe that it isn't always practical / convenient to do this.

Index stability comes easy* for linked lists, but it is tricky to implement in other kinds of data structures.

* (Nothing is ever really easy. The straightforward implementation of linked lists results in an unsatisfying, slow Comparable implementation; it's unclear to me if this can be fixed.)

The other major benefit of linked lists is "constant-time" insertions/removals at a particular index. This too can often be well enough approximated (e.g. by using a tree-based data structure), although this usually breaks index stability.

These two features can sometimes be important, so I wouldn't want to entirely dismiss the idea of implementing a linked list. But the use cases that require these features seem to be so rare that I don't expect we'd ever want to add a linked list type to the Standard Library -- and certainly not before adding some more practical data structures that can often replace them (such simple deques or a B-tree based list construct).


The best place to develop a linked list implementation is the same as with other data structures: in a package! Once the kinks have been hammered out, and folks have gained years of production experience using it, then it may be worth considering adopting it in the Standard Library -- it will certainly be easier to argue for the addition if we can be reasonably confident that the implementation is the right one.

(Generic data structure implementations usually need to be inlinable to achieve good performance. This (along with the stdlib's ABI stability requirements) usually makes it difficult to change/improve data structure implementations once they have shipped in a Swift release. We need to get them right the first time, because we usually won't be able to change the memory layout or overhaul the API later. This increases the stakes considerably -- there are few things sadder than finding out too late that we shipped something that is "obviously" suboptimal, but not being able to fix it.)

Therefore, unless something forces our hands*, I would strongly oppose adding a new general-purpose data structure to the Standard Library unless its implementation originates from a long standing, universally beloved package that has seen of years of widespread production use, and there is already clear consensus about the new type's (source and binary) interfaces.

* (Such as the new data structure being intimately tied to a high-profile language feature under development.)

For example, I think Swift Collections is a great place for developing new data structures that might end up in/near the Standard Library. (Landing a new data structure there still isn't easy, though -- it takes far less work than landing something in the stdlib, but the bar is still very, very high. For reference, we have several new data structures nearing completion, but the package is yet to release its first feature update.)

The quickest way to ship a data structure is to release your own package, like you already did. :+1:

(Tip: a general purpose linked list implementation needs to support an arbitrary number of items. E.g., destroying a long linked list must not overflow the stack.)

8 Likes

To properly implement copy-on-write value semantics, linked lists need to make a copy their entire contents* whenever a shared list is mutated. If the list nodes are regular class instances, then the new copy changes object identities of the nodes, invalidating any existing indices that reference them -- which is a nono, as such invalidation would e.g. violate MutableCollection requirements, and cancel out the promise of index stability.

* (A forward-only linked list would only need to make a copy of all nodes up to the mutation point.)

The easy (but very unpleasant) way out is for the list to implement reference semantics instead. Of course, this makes the list tricky to use in places that require Sendable conformance.

(Linked lists may be good candidates to be move-only types, once those become a thing.)

4 Likes

I really don't get the arguing against basic data structures in a standard library, like OrderedSet, OrderedDictionaries, Lists, Deque etc... what are you afraid of? That you add a few bytes to the standard library?

There is literally no downside to having a standard library that has many basic data structures.

Why force everbody to write everything from scratch all the time?

Swift is not so swift after all...