Adding a LinkedList type to the Standard Library

Swift is missing a great many data types (see Adding more data structures to the standard library). I thought I may be able to get this going with suggesting the introduction of a LinkedList type to Swift.

Linked lists have numerous applications in programming. Not only will they make solving certain problems easier, it provides a foundation from which other types can be built, such as queues. Moreover, linked list are more or less a standard collection type in many other languages.

I decided to make LinkedList and doubly linked list. Right now, my current implementation abstracts away the nodes, providing the users with the algorithmic benefits of a linked list while having the same ease-of-use as any other collection.

If you would like to play around with LinkedList, feel free to check out the GitHub repo.

LinkedList:

public struct LinkedList<Element> {
    
    private var headNode: Node?
    private var tailNode: Node?
    
    /// The number of elements in the linked list.
    ///
    /// - Complexity: O(1)
    public private(set) var count: Int = 0
    private var id = ID()
    
    /// Creates a new, empty linked list.
    ///
    /// This is equivalent to initializing a linked list with an empty array literal.
    /// For example:
    ///
    ///     var emptyLinkedList = LinkedList<Int>()
    ///     print(emptyLinkedList.isEmpty)
    ///     // Prints "true"
    ///
    ///     emptyLinkedList = []
    ///     print(emptyLinkedList.isEmpty)
    ///     // Prints "true"
    public init() { }
    
    fileprivate class ID {
        init() { }
    }
}

//MARK: - LinkedList Node
extension LinkedList {
    
    fileprivate class Node {
        public var value: Element
        public var next: Node?
        public weak var previous: Node?
        
        public init(value: Element) {
            self.value = value
        }
    }
    
}

//MARK: - Initializers
public extension LinkedList {
    
    private init(_ nodeChain: (head: Node, tail: Node, count: Int)?) {
        guard let chain = nodeChain else {
            return
        }
        headNode = chain.head
        tailNode = chain.tail
        count = chain.count
    }
    
    init<S>(_ elements: S) where S: Sequence, S.Element == Element {
        if let linkedList = elements as? LinkedList<Element> {
            self = linkedList
        } else {
            self = LinkedList(chain(of: elements))
        }
    }
    
}

//MARK: Chain of Nodes
extension LinkedList {
    /// Creates a chain of nodes from a sequence. Returns `nil` if the sequence is empty.
    private func chain<S>(of sequence: S) -> (head: Node, tail: Node, count: Int)? where S: Sequence, S.Element == Element {
        var iterator = sequence.makeIterator()
        var head, tail: Node
        var count = 0
        guard let firstValue = iterator.next() else {
            return nil
        }
        
        var currentNode = Node(value: firstValue)
        head = currentNode
        count = 1
        
        while let nextElement = iterator.next() {
            let nextNode = Node(value: nextElement)
            currentNode.next = nextNode
            nextNode.previous = currentNode
            currentNode = nextNode
            count += 1
        }
        tail = currentNode
        return (head: head, tail: tail, count: count)
    }
}

//MARK: - Copy Nodes
extension LinkedList {
    
    private mutating func copyNodes(settingNodeAt index: Index, to value: Element) {
        id = ID()
        
        var currentIndex = startIndex
        var currentNode = Node(value: currentIndex == index ? value : currentIndex.node!.value)
        let newHeadNode = currentNode
        currentIndex = self.index(after: currentIndex)
        
        while currentIndex < endIndex {
            let nextNode = Node(value: currentIndex == index ? value : currentIndex.node!.value)
            currentNode.next = nextNode
            nextNode.previous = currentNode
            currentNode = nextNode
            currentIndex = self.index(after: currentIndex)
        }
        headNode = newHeadNode
        tailNode = currentNode
    }
    
    @discardableResult
    private mutating func copyNodes(removing range: Range<Index>) -> Range<Index> {
        
        id = ID()
        var currentIndex = startIndex
        
        while range.contains(currentIndex) {
            currentIndex = index(after: currentIndex)
        }
        
        guard let headValue = currentIndex.node?.value else {
            self = LinkedList()
            return endIndex..<endIndex
        }
        
        var currentNode = Node(value: headValue)
        let newHeadNode = currentNode
        var newCount = 1
        
        var removedRange: Range<Index> = Index(node: currentNode, offset: 0, id: id)..<Index(node: currentNode, offset: 0, id: id)
        currentIndex = index(after: currentIndex)
        
        while currentIndex < endIndex {
            guard !range.contains(currentIndex) else {
                currentIndex = index(after: currentIndex)
                continue
            }
            
            let nextNode = Node(value: currentIndex.node!.value)
            if currentIndex == range.upperBound {
                removedRange = Index(node: nextNode, offset: newCount, id: id)..<Index(node: nextNode, offset: newCount, id: id)
            }
            currentNode.next = nextNode
            nextNode.previous = currentNode
            currentNode = nextNode
            newCount += 1
            currentIndex = index(after: currentIndex)
            
        }
        if currentIndex == range.upperBound {
            removedRange = Index(node: nil, offset: newCount, id: id)..<Index(node: nil, offset: newCount, id: id)
        }
        headNode = newHeadNode
        tailNode = currentNode
        count = newCount
        return removedRange
    }
    
}


//MARK: - Computed Properties
public extension LinkedList {
    /// The element at the head (start) of the linked list.
    ///
    /// - Note: This value is the same as `.first`.
    var head: Element? {
        return headNode?.value
    }
    
    /// The element at the tail (end) of the linked list.
    ///
    /// - Note: This value is the same as `.last`.
    var tail: Element? {
        return tailNode?.value
    }
}

//MARK: - Sequence Conformance
extension LinkedList: Sequence {
    
    public typealias Element = Element
    
    public __consuming func makeIterator() -> Iterator {
        return Iterator(node: headNode)
    }
    
    public struct Iterator: IteratorProtocol {
        
        private var currentNode: Node?
        
        fileprivate init(node: Node?) {
            currentNode = node
        }
        
        public mutating func next() -> Element? {
            guard let node = currentNode else {
                return nil
            }
            currentNode = node.next
            return node.value
        }
        
    }
}

//MARK: - Collection Conformance
extension LinkedList: Collection {
    
    public var startIndex: Index {
        return Index(node: headNode, offset: 0, id: id)
    }
    
    public var endIndex: Index {
        return Index(node: nil, offset: count, id: id)
    }
    
    public var first: Element? {
        return head
    }
    
    public var isEmpty: Bool {
        return count == 0
    }
    
    public func index(after i: Index) -> Index {
        precondition(i.isMember(of: self), "LinkedList index is invalid")
        precondition(i.offset != endIndex.offset, "LinkedList index is out of bounds")
        return Index(node: i.node?.next, offset: i.offset + 1, id: id)
    }
    
    public struct Index: Comparable {
        fileprivate weak var node: Node?
        fileprivate var offset: Int
        fileprivate weak var listID: ID?
        
        fileprivate init(node: Node?, offset: Int, id: ID) {
            self.node = node
            self.offset = offset
            self.listID = id
        }
        
        
        /// A Boolean value indicating whether the index is a member of `linkedList`.
        ///
        /// - Parameter linkedList: the list being check for indxe membership.
        fileprivate func isMember(of linkedList: LinkedList) -> Bool {
            return listID === linkedList.id
        }
        
        public static func ==(lhs: Index, rhs: Index) -> Bool {
            return lhs.offset == rhs.offset
        }
        
        public static func <(lhs: Index, rhs: Index) -> Bool {
            return lhs.offset < rhs.offset
        }
    }
    
}


//MARK: - MutableCollection Conformance
extension LinkedList: MutableCollection {
    
    public subscript(position: Index) -> Element {
        get {
            precondition(position.isMember(of: self), "LinkedList index is invalid")
            precondition(position.offset != endIndex.offset, "Index out of range")
            guard let node = position.node else {
                preconditionFailure("LinkedList index is invalid")
            }
            return node.value
        }
        set {
            precondition(position.isMember(of: self), "LinkedList index is invalid")
            precondition(position.offset != endIndex.offset, "Index out of range")
            
            // Copy-on-write semantics for nodes
            if !isKnownUniquelyReferenced(&headNode) {
                copyNodes(settingNodeAt: position, to: newValue)
            } else {
                position.node?.value = newValue
            }
        }
    }
}



//MARK: - LinkedList Specific Operations
public extension LinkedList {
    /// Adds an element to the start of the linked list.
    ///
    /// The following example adds a new number to a linked list of integers:
    ///
    ///     var numbers = [0, 1, 2, 3, 4, 5]
    ///     numbers.prepend(0)
    ///
    ///     print(numbers)
    ///     // Prints "[0, 1, 2, 3, 4, 5]"
    ///
    /// - Parameter newElement: The element to prepend to the collection.
    ///
    /// - Complexity: O(1)
    mutating func prepend(_ newElement: Element) {
        replaceSubrange(startIndex..<startIndex, with: CollectionOfOne(newElement))
    }
    
    /// Adds the elements of a sequence or collection to the start of this
    /// linked list.
    ///
    /// The following example prepends the elements of a `Range<Int>` instance to
    /// a linked list of integers:
    ///
    ///     var numbers: LinkedList = [1, 2, 3, 4, 5]
    ///     numbers.append(contentsOf: -5...0)
    ///     print(numbers)
    ///     // Prints "[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]"
    ///
    /// - Parameter newElements: The elements to prepend to the collection.
    ///
    /// - Complexity: O(*m*), where *m* is the length of `newElements`.
    mutating func prepend<S>(contentsOf newElements: __owned S) where S: Sequence, S.Element == Element {
        replaceSubrange(startIndex..<startIndex, with: newElements)
    }
    
    /// Removes and returns the first element of the collection.
    ///
    /// Calling this method may invalidate all saved indices of this
    /// collection. Do not rely on a previously stored index value after
    /// altering a collection with any operation that can change its length.
    ///
    /// - Returns: The first element of the collection if the collection is not
    /// empty; otherwise, `nil`.
    ///
    /// - Complexity: O(1)
    @discardableResult
    mutating func popFirst() -> Element? {
        if isEmpty {
            return nil
        }
        return removeFirst()
    }
    
    @discardableResult
    mutating func popLast() -> Element? {
        if isEmpty {
            return nil
        }
        return removeLast()
    }
}

//MARK: - BidirectionalCollection Conformance
extension LinkedList: BidirectionalCollection {
    var last: Element? {
        return tail
    }
    
    public func index(before i: Index) -> Index {
        precondition(i.isMember(of: self), "LinkedList index is invalid")
        precondition(i.offset != startIndex.offset, "LinkedList index is out of bounds")
        if i.offset == count {
            return Index(node: tailNode, offset: i.offset - 1, id: id)
        }
        return Index(node: i.node?.previous, offset: i.offset - 1, id: id)
    }
    
}

//MARK: - RangeReplaceableCollection Conformance
extension LinkedList: RangeReplaceableCollection {
    public mutating func append<S>(contentsOf newElements: __owned S) where S: Sequence, Element == S.Element {
        replaceSubrange(endIndex..<endIndex, with: newElements)
    }
    
    public mutating func replaceSubrange<S, R>(_ subrange: R, with newElements: __owned S) where S: Sequence, R: RangeExpression, Element == S.Element, Index == R.Bound {
        
        var range = subrange.relative(to: indices)
        precondition(range.lowerBound.isMember(of: self) && range.upperBound.isMember(of: self), "LinkedList range of indices are invalid")
        precondition(range.lowerBound >= startIndex && range.upperBound <= endIndex, "Subrange bounds are out of range")
        
        // If range covers all elements and the new elements are a LinkedList then set references to it
        if range.lowerBound == startIndex, range.upperBound == endIndex, let linkedList = newElements as? LinkedList {
            self = linkedList
            return
        }
        
        var newElementsCount = 0
        
        // There are no new elements, so range indicates deletion
        guard let nodeChain = chain(of: newElements) else {
            
            // If there is nothing in the removal range
            // This also covers the case that the linked list is empty because this is the only possible range
            guard range.lowerBound != range.upperBound else {
                return
            }
            
            // Deletion range spans all elements
            if range.lowerBound == startIndex && range.upperBound == endIndex {
                headNode = nil
                tailNode = nil
                count = 0
                return
            }
            
            // Copy-on-write semantics for nodes and remove elements in range
            guard isKnownUniquelyReferenced(&headNode) else {
                copyNodes(removing: range)
                return
            }
            
            // Update count after mutation to preserve startIndex and endIndex validity
            defer {
                count = count - (range.upperBound.offset - range.lowerBound.offset)
            }
            
            // Move head up if deletion starts at start index
            if range.lowerBound == startIndex {
                // Can force unwrap node since the upperBound is not the end index
                headNode = range.upperBound.node!
                headNode!.previous = nil
                
            // Move tail back if deletion ends at end index
            } else if range.upperBound == endIndex {
                // Can force unwrap since lowerBound index must have an associated element
                tailNode = range.lowerBound.node!.previous
                tailNode!.next = nil
                
            // Deletion range is in the middle of the linked list
            } else {
                // Can force unwrap all bound nodes since they both must have elements
                range.upperBound.node!.previous = range.lowerBound.node!.previous
                range.lowerBound.node!.previous!.next = range.upperBound.node!
            }
            
            return
        }
        
        // Obtain the count of the new elements from the node chain composed from them
        newElementsCount = nodeChain.count
        
        // Replace entire content of list with new elements
        if range.lowerBound == startIndex && range.upperBound == endIndex {
            headNode = nodeChain.head
            tailNode = nodeChain.tail
            count = nodeChain.count
            return
        }
        
        // Copy-on-write semantics for nodes before mutation
        if !isKnownUniquelyReferenced(&headNode) {
            range = copyNodes(removing: range)
        }
        
        // Update count after mutation to preserve startIndex and endIndex validity
        defer {
            count += nodeChain.count - (range.upperBound.offset - range.lowerBound.offset)
        }
        
        // Prepending new elements
        guard range.upperBound != startIndex else {
            headNode?.previous = nodeChain.tail
            nodeChain.tail.next = headNode
            headNode = nodeChain.head
            return
        }
        
        // Appending new elements
        guard range.lowerBound != endIndex else {
            tailNode?.next = nodeChain.head
            nodeChain.head.previous = tailNode
            tailNode = nodeChain.tail
            return
        }
        
        if range.lowerBound == startIndex {
            headNode = nodeChain.head
        }
        if range.upperBound == endIndex {
            tailNode = nodeChain.tail
        }
        
        range.lowerBound.node!.previous!.next = nodeChain.head
        range.upperBound.node!.previous = nodeChain.tail
    }
    
}

//MARK: - ExpressibleByArrayLiteral Conformance
extension LinkedList: ExpressibleByArrayLiteral {
    public typealias ArrayLiteralElement = Element
    
    public init(arrayLiteral elements: ArrayLiteralElement...) {
        self.init(elements)
    }
}

//MARK: - CustomStringConvertible Conformance
extension LinkedList: CustomStringConvertible {
    public var description: String {
        return "[" + lazy.map { "\($0)" }.joined(separator: ", ") + "]"
    }
}

//MARK: - Equatable Conformance
extension LinkedList: Equatable where Element: Equatable {
    public static func ==(lhs: LinkedList<Element>, rhs: LinkedList<Element>) -> Bool {
        guard lhs.count == rhs.count else {
            return false
        }
        for (a, b) in zip(lhs, rhs) {
            guard a == b else {
                return false
            }
        }
        return true
    }
}

Is this a worthwhile addition to Swift? Are there any aspects of it that should be changed?

1 Like

On a personal level, I think I'd oppose adding LinkedList to the standard library because I think it has exceedingly limited utility.

While you're right that linked lists are a fairly standard data structure, in practice most of that standardisation comes from their prevalence in CS and algorithmics classes. Linked lists have useful properties that make them helpful for framing discussions about algorithmic complexity, and they are straightforward enough to understand that they are a great building block for early data structures work. However, in practical design I can think of very few cases where linked lists are useful.

The standard comparison for a linked list is to an Array. There are many things linked lists are bad at that arrays are good at, mostly because arbitrary element lookup in a linked list is O(n), whereas in arrays it is O(1). This eliminates a wide range of useful algorithms: for example, large linked lists are very unpleasant to search as they cannot be efficiently partitioned.

Worse, however, even iterating a linked list linearly is slower than doing so on an array due to the cache-unfriendliness of a linked list. A linked list is a page-fault generation machine, as it's extremely common for list nodes to be allocated in unrelated pages of memory, whereas arrays have excellent cache coherency properties. This means that even the "performant" operations on lists (walking forwards and backwards) are slower than they would be on an array.

There is exactly one use-case where a linked list may be preferable to an array, which is building a FIFO. In this context the algorithms will append to the back and pop from the front. However, even here I think it's hard to make the case.

This is because you can use Array to build a perfectly serviceable circular buffer, and a circular buffer is a great implementation of a deque. SwiftNIO has just such an implementation that is used extensively throughout the codebase. It has all the benefits of Array plus the benefits of a list: appending and removing elements from the front or back is an amortised O(1) operation, so it's trivial to add and remove elements.

The only reason to prefer a list to an array here that I can see is in cases where you believe a large slab allocation cannot safely be made, or when you think you may need to build this queue to a very large size and you bear the copying costs associated with resizing. In cases like this SwiftNIO frequently reserves capacity in the circular buffer for something like 32 or 64 elements, avoiding the first several resizing operations.

The end result of all of this is that while a linked list is an important academic data structure, I think it is a very poor choice for almost all use-cases in practical programming, where arrays should almost always be preferred. For this reason I don't think the Swift standard library should ship it: instead, we should be encouraging the Swift standard library to build the solutions where a programmer may want a list, but would be better served by an abstraction over an array. The obvious starting place for my money is a deque, which I believe @lorentey has opinions on.

25 Likes

Although I agree with most of what you said, keep in mind that Array or any abstractions of Array such as Deque support these operations in amortized O(1) and not O(1) as LinkedList would. For apps this probably isn't a big deal, but it may prevent people from using these collections in other areas, such as real time systems.

1 Like

Yup, I totally grant that. My contention is that the attractive nuisance of providing the linked list causes a higher cost on the ecosystem than it provides in value to the cases that need it. If you're working in a hard real time system, you're sufficiently far from the mainstream use cases of most languages and libraries that you're essentially in a parallel ecosystem anyway.

I also tried to write a linked list with value semantics recently, and I couldn't do it. The moment you reference concrete nodes in your indices, it breaks down once you consider generic algorithms over Collection during which the collection needs to be copied due to COW.

Take this sorting algorithm, for instance:

extension MutableCollection where Element: Comparable  {
    mutating func selectionSort() {
        var indexOfFirstUnsortedElement = self.startIndex

        while indexOfFirstUnsortedElement < self.endIndex {
            var indexOfSmallestUnsortedElement = indexOfFirstUnsortedElement
            var currentIndex = self.index(after: indexOfFirstUnsortedElement)

            while currentIndex < self.endIndex {
                if self[currentIndex] < self[indexOfSmallestUnsortedElement] {
                    indexOfSmallestUnsortedElement = currentIndex
                }

                self.formIndex(after: &currentIndex)
            }

            self.swapAt(indexOfFirstUnsortedElement, indexOfSmallestUnsortedElement)
            self.formIndex(after: &indexOfFirstUnsortedElement)
        }
    }
}

I actually expected this to crash, considering you verify the list IDs when accessing the subscript, but it simply returns an incorrect result. I haven't fully understood your implementation yet:

var list: LinkedList<Int> = [8, 5, 2, 9, 5, 6, 3]
let copy = list
list.selectionSort2()
print(list) // [2, 5, 2, 9, 5, 6, 3]

I'm not sure if it is possible to implement a value-type linked list properly for which such generic algorithms continue working.

I just fixed a bug wherein the id of the linked list is not updated on mutation. Now the code crashes with the error, LinkedList index is invalid. Currently, this code does not work because technically we are taking the index before the list is mutated. This creates an inconsistency when mutating the list later in the method as upon the initial mutation, indexOfFirstUnsortedElement is invalidated.

The tricky thing about copy-on-write semantics with LinkedList is that id is not updated upon its value being copied, in your example @jenox:

let copy = list

Because of copy-on-write, since neither of the two has been mutated, the id, any indices taken from list or copy before they have been mutated will be invalidated upon their mutation.

I am currently at work on a way to fix the problem.

@jenox, I think you are right. Unfortunately, I don't think it is possible to write these algorithms from a generic context. I've been trying for a while, but can't think of a way that this can be done. Do you have any ideas as to what can be done to remedy this?

I agree with @lukasa that linked lists, as a generic container data structure, are of limited utility. In most situations where a linked list is practically necessary, it's generally an intrusive linked list, where the data structure is an embedded part of the structures being linked together. Instead of trying to implement LinkedList<T> as a generic collection like Array, it might be interesting to consider what it'd look like as a protocol for types that form intrusive linked lists. As a protocol, you could let intrusive list types specify the minimal list contract (how to get to the previous and next nodes, if any), and build generic convenience traversal, insertion, and removal operations on top of that protocol.

18 Likes

I haven't found a solution to this problem when I tried building a linked list myself, other than inserting a dummy "copy if needed" operation before getting any indices in genetic algorithms — but that defeats the purpose in my opinion.

1 Like

I kicked this idea around a while back, and bumped on the problem of trying to have an entry potentially be part of multiple intrusive lists at once (that is, to have multiple list node structures in one parent structure). Presumably this is a fundamental limitation of such a model?

1 Like

Good point. Without surfacing language support for multiple conformances, maybe you could use view adapters, similar to how String surfaces utf8, utf16, etc. collection views?

1 Like

Yes, I think that works.

Is there any specific reason to avoid multiple conformances, or is it only a lot of work on the compiler side? I would love to have something like Haskell's newtype to create similar types with different conformances...

Swift's underlying type system and runtime model allows for multiple conformances (with some exceptions, such as as? Protocol, where it isn't clear what conformance should be chosen if there are multiple candidates), but we've thus far avoided exposing the ability in the language because it raises a lot of design issues, and could be very confusing if exposed in the wrong way (and also, because we would have to solve the problem for the aforementioned exceptions). For most intents and purposes, a single-field struct or single-case enum is effectively a newtype in Swift already.

2 Likes

There is a big caveat with this, though - exposing view adapters (single-field structs which expose a different version of a conformance, like String/String.UTF8View) doesn't play well with copy-on-write. The view adapter ends up holding an additional reference to the underlying storage, meaning mutations will trigger a copy.

Technically it is correct and preserves value semantics to do that, but it would be great if we had a way of creating an inout version of a ViewAdapter, containing a reference to the underlying collection rather than a copy. Is that something ownership could help with? Or would we need something like _modify for computed properties?

3 Likes

I agree that's something we need. In theory, it should be possible for a view adapter to be borrowed in-place over the underlying value; since the types are layout-compatible, they can share the same memory, and you should be able to shared-borrow or exclusive-borrow the wrapper over the wrapped value. If we had something like unsafeBitCast that worked as a read and/or modify coroutine, that could do the job.

4 Likes
Terms of Service

Privacy Policy

Cookie Policy