Array.init(repeating:count:) Gotcha

Syre · March 9, 2021, 10:00pm

This initializer was the source of an extremely hard-to-track-down bug in my project.

The gotcha is of course that if you initialize a class instance in the repeating initializer, every element in the array is just a reference to the same object.
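A minimal sketch of the behavior described above, assuming a hypothetical `Node` class that is not from the original project. It counts how many distinct objects end up in the array:

```swift
// Hypothetical reference type, used only for illustration.
final class Node {}

// `Node()` is evaluated once, and the resulting reference is stored in every slot.
let nodes = Array(repeating: Node(), count: 3)

// Count the distinct object identities held by the array.
let distinctCount = Set(nodes.map { ObjectIdentifier($0) }).count
print(distinctCount)  // Prints "1"
```

Three slots, one object: any mutation made through one element is visible through all of them.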

Has anyone else fallen victim to this? Maybe we could mention this in the docs if it's a common issue?

/// Creates a new collection containing the specified number of a single,
/// repeated value.
///
/// Here's an example of creating an array initialized with five strings
/// containing the letter Z.
///
///     let fiveZs = Array(repeating: "Z", count: 5)
///     print(fiveZs)
///     // Prints "["Z", "Z", "Z", "Z", "Z"]"
///
/// - Parameters:
///   - repeatedValue: The element to repeat.
///   - count: The number of times to repeat the value passed in the
///     repeating parameter. count must be zero or greater.
@inlinable public init(repeating repeatedValue: Element, count: Int)

SDGGiesbrecht · March 9, 2021, 10:14pm

Syre · March 9, 2021, 10:18pm

Thanks, although that is a slightly different question which is why I made a new post.

Nobody1707 · March 10, 2021, 1:10am

This is why Array.init(unsafeUninitializedCapacity:initializingWith:) was added. You can implement the initializer you actually want with it:

extension Array {
    init(generating element: @autoclosure () -> Element, count: Int) {
        self.init(unsafeUninitializedCapacity: count) {
            buffer, initializedCount in
            for i in 0..<count {
                buffer[i] = element()
                initializedCount += 1
            }
        }
    }
}

hooman · March 10, 2021, 1:58am

Read the documentation carefully. It clearly states "a single, repeated value". The type of repeatedValue is the element type of the array, not a function (in this case an initializer or constructor) that produces a value of that type upon each invocation.

When you pass a call to a function that produces values of the element type as repeatedValue, the code evaluates the function once, produces a value (in your case, constructs a single object), and passes it as repeatedValue. This is the normal behavior for all function arguments (in most common programming languages).
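To make the evaluate-once rule concrete, here is a small sketch. `ValueFactory` and `makeValue()` are hypothetical names introduced only to count calls; they are not part of any library:

```swift
// Hypothetical helper that records how many times it has been called.
final class ValueFactory {
    private(set) var callCount = 0

    func makeValue() -> Int {
        callCount += 1
        return callCount
    }
}

let factory = ValueFactory()

// The argument expression is evaluated before the initializer runs: one call.
let repeated = Array(repeating: factory.makeValue(), count: 4)
let callsAfterRepeating = factory.callCount

// A closure passed to map is invoked once per element: four more calls.
let generated = (0..<4).map { _ in factory.makeValue() }

print(repeated)             // Prints "[1, 1, 1, 1]"
print(callsAfterRepeating)  // Prints "1"
print(generated)            // Prints "[2, 3, 4, 5]"
```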

On the other hand, I have seen many people surprised by this. If we could figure out what leads people to expect repeatedValue to accept a constructor function instead of a value, we might be able to improve the documentation.

Syre · March 10, 2021, 2:12am

Maybe I should elaborate a bit on how this "got me":

I had a type that I was using with the repeating init. Initially this type was a struct, so everything was great: no bugs. Eventually, I decided that this type actually needed to be a class, so I changed it, not remembering that I used it with the repeating init elsewhere. This was my tragic mistake because, as it turns out, it only manifested as a bug in a rather rare scenario in my project. There were no signs of a bug until months after I made the change from struct to class.

So, it's not so much that I didn't understand how the repeating initializer works, it's just that I accidentally ended up using it with a reference type!
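The refactoring hazard can be sketched as follows. `CellValue` and `CellObject` are hypothetical stand-ins for the same type before and after the struct-to-class change; the call site is identical in both cases:

```swift
// Hypothetical "before" version: a struct, so each array slot holds its own copy.
struct CellValue {
    var x = 0
}

// Hypothetical "after" version: a class, so each array slot holds a reference.
final class CellObject {
    var x = 0
}

var values = Array(repeating: CellValue(), count: 2)
values[0].x = 7   // changes only the first element

let objects = Array(repeating: CellObject(), count: 2)
objects[0].x = 7  // changes the single shared instance

print(values[1].x)   // Prints "0"
print(objects[1].x)  // Prints "7"
```

Nothing at the call site changes and the compiler raises no diagnostic, which is consistent with the bug staying hidden until some code path mutated an element.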

Based on this experience, I feel that the repeating initializer is effectively an easy-to-fall-into trap.

Perhaps documentation alone is not a strong enough solution, but I'm not sure that I have a better idea.

hooman · March 10, 2021, 2:31am

Changing a type from a struct into a class (and vice versa) is a pretty big refactoring. What would help in this instance is probably a smart refactoring tool with static analysis advanced enough to detect most such aliasing issues and at least add FIXME comments for you. You are very lucky if this is the only bug that crept into your code as a result of this change.

Back to Array(repeating:count:), maybe a different argument label instead of repeating would help, but it is too late for such a change. We could also call out this particular case explicitly in the documentation. If you have an idea, you can submit a bug report for a documentation improvement, or even propose the improved documentation via a pull request.

nnnnnnnn · March 10, 2021, 4:45pm

An addition to the documentation would be welcome! This definitely isn't the first time this has tripped people up.

Nobody1707:

This is why Array.init(unsafeUninitializedCapacity:initializingWith:) was added. You can implement the initializer you actually want with it:

extension Array {
    init(generating element: @autoclosure () -> Element, count: Int) {
        self.init(unsafeUninitializedCapacity: count) {
            buffer, initializedCount in
            for i in 0..<count {
                buffer[i] = element()  // 🚫 Subscripting uninitialized memory
                initializedCount += 1
            }
        }
    }
}

The buffer[i] = element() line writes a new instance to uninitialized memory, but the subscript is only to be used with initialized memory. This is unsafe, undefined behavior. To initialize elements of a buffer, you need to access the memory location through the buffer's base address (we obviously need a better interface for this directly on the buffer):

extension Array {
    init(generating element: @autoclosure () -> Element, count: Int) {
        self.init(unsafeUninitializedCapacity: count) {
            buffer, initializedCount in
            let baseAddress = buffer.baseAddress!
            for i in 0..<count {
                (baseAddress + i).initialize(to: element())
                initializedCount += 1
            }
        }
    }
}
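As a hedged aside on the "better interface" point: on toolchains that ship Swift 5.8 or later (an assumption about your environment), UnsafeMutableBufferPointer has an initializeElement(at:to:) method, which avoids the manual pointer arithmetic. A sketch under that assumption, where the `makingEach` label and the `Token` class are illustrative names only:

```swift
extension Array {
    /// Hypothetical variant of the initializer above; `makingEach` is an
    /// illustrative label, not a standard library API.
    init(makingEach element: @autoclosure () -> Element, count: Int) {
        self.init(unsafeUninitializedCapacity: count) { buffer, initializedCount in
            for i in 0..<count {
                // Initializes slot i in place; no previous value is read or destroyed.
                buffer.initializeElement(at: i, to: element())
                // Keep the count in step so the array knows how many slots are valid.
                initializedCount += 1
            }
        }
    }
}

// Hypothetical reference type, used only to check identity.
final class Token {}

let tokens = Array(makingEach: Token(), count: 3)
print(tokens[0] === tokens[1])  // Prints "false"
```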

Note that you can also accomplish this same "generating" behavior by calling map; this is how I normally write this:

let objects = (0..<n).map { _ in MyObject() }

young · March 10, 2021, 8:11pm

How does (0..<n), a Range, map to an Array? In other words, how do I know the resulting collection is an Array?

I didn't know this, so I have been writing the following (which, as I just learned, unnecessarily creates an extra array):

Array(0..<n).map { _ in MyObject() }

Jumhyn · March 10, 2021, 8:20pm

The type signature for Sequence.map guarantees the result is an array:

func map<T>(_ transform: (Self.Element) throws -> T) rethrows -> [T]
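A quick sketch to confirm this, using plain integers (the names are illustrative): calling map directly on the range already yields an array, so wrapping the range in Array first only adds an intermediate allocation.

```swift
let n = 3

// map on a Range returns [T] directly.
let squares = (0..<n).map { $0 * $0 }
print(type(of: squares))  // Prints "Array<Int>"
print(squares)            // Prints "[0, 1, 4]"

// Same result, but this first materializes [0, 1, 2] as a temporary array.
let viaArray = Array(0..<n).map { $0 * $0 }
print(viaArray == squares)  // Prints "true"
```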