Runtime crash when using [String](unsafeUninitializedCapacity:initializingWith:)

The following program demonstrates a runtime crash that occurs when using String (but not e.g. Float) with Array.init(unsafeUninitializedCapacity:initializingWith:):

func test<T>(value: T, count: Int) {
  let a = [T](unsafeUninitializedCapacity: count) { (p, c) in
    c = count
    for i in p.indices { p[i] = value }
  }
  print(a)
}
for _ in 0 ... 5 { test(value: 123, count: 4) }
for _ in 0 ... 5 { test(value: "abc", count: 4) }

Compiling and running this:

$ swiftc --version
Apple Swift version 5.2.4 (swiftlang-1103.0.32.9 clang-1103.0.32.53)
Target: x86_64-apple-darwin19.4.0
$ swiftc test.swift && ./test
[123, 123, 123, 123]
[123, 123, 123, 123]
[123, 123, 123, 123]
[123, 123, 123, 123]
[123, 123, 123, 123]
[123, 123, 123, 123]
["abc", "abc", "abc", "abc"]
Segmentation fault: 11

Is the code doing something which will result in undefined behavior or is this crash caused by a compiler or standard library bug?

EDIT: Also (as I see now), why does it print 6 lines of [123, 123, 123, 123]? I'd expect 5 ...
EDIT 2: never mind (thought I had written ..< instead of ...)

p is an UnsafeMutableBufferPointer with only uninitialized data.

The assignment operator deinitializes the old value before storing the new one, so it assumes the memory is already initialized. You need to initialize the memory first before using =, with the exception of trivial types. The APIs are just lenient for trivial types (like Int).

Instead, use one of the initialize functions.

  • (p.baseAddress! + i).initialize(to: value)
  • p.initialize(repeating: value)

etc.
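Applied to the test function from the question, the fix might look like the sketch below. The name makeFilled is hypothetical, and it returns the array instead of printing it so the result can be checked; the only substantive change is that the memory is initialized with initialize(repeating:) rather than assigned through the subscript.

```swift
// Hypothetical variant of the `test` function from the question.
func makeFilled<T>(value: T, count: Int) -> [T] {
  return [T](unsafeUninitializedCapacity: count) { buffer, initializedCount in
    // Initialize the raw memory instead of assigning through the subscript.
    buffer.initialize(repeating: value)
    // Report how many leading elements are now initialized.
    initializedCount = count
  }
}

print(makeFilled(value: 123, count: 4))    // [123, 123, 123, 123]
print(makeFilled(value: "abc", count: 4))  // ["abc", "abc", "abc", "abc"]
```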


You’re using a closed range. (0 ... 5) has six elements.

As for the bug - I think you should be using the initialize method rather than the subscript. IIRC, the subscript is an “assign” operation, which assumes a previously initialised instance is being overwritten with the new value.
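To make the initialize/assign distinction concrete, here is a minimal sketch using a manually allocated pointer. The names storage and snapshot are only for illustration; the point is the order of operations, not a recommended pattern.

```swift
// Allocate raw, uninitialized storage for three strings.
let storage = UnsafeMutablePointer<String>.allocate(capacity: 3)

// Step 1: initialize. Before this call the memory holds no valid String.
storage.initialize(repeating: "old", count: 3)

// Step 2: assign. The subscript setter releases the existing "old" value,
// which is only valid because this slot is already initialized.
storage[1] = "new"

// Copy the values out so they can be inspected after cleanup.
let snapshot = Array(UnsafeBufferPointer(start: storage, count: 3))
print(snapshot)  // ["old", "new", "old"]

// Step 3: deinitialize before handing the memory back.
storage.deinitialize(count: 3)
storage.deallocate()
```

Swapping steps 1 and 2 reproduces the situation in the question: the assignment would try to release whatever garbage happens to be in the slot.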

Ah, beat me to it.


The contract of init(unsafeUninitializedCapacity:initializingWith:) also includes

  • the values in the range of 0..<c are initialized, and
  • values in range of c... are uninitialized.

So be mindful of that as well.
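As a sketch of that contract when fewer elements are initialized than the capacity allows: the hypothetical helper nonEmpty below reserves room for every input string but initializes only the slots it actually uses, bumping the count (c above, initializedCount here) as it goes.

```swift
// Hypothetical helper: keeps only the non-empty strings from `source`.
func nonEmpty(_ source: [String]) -> [String] {
  return [String](unsafeUninitializedCapacity: source.count) { buffer, initializedCount in
    // `initializedCount` starts at zero.
    for s in source where !s.isEmpty {
      // Initialize the next free slot. Slots 0..<initializedCount stay
      // initialized, and everything after them stays untouched.
      (buffer.baseAddress! + initializedCount).initialize(to: s)
      initializedCount += 1
    }
  }
}

print(nonEmpty(["a", "", "b"]))  // ["a", "b"]
```

The force-unwrap of baseAddress is only reached when there is at least one element to write, so the empty-input case never touches it.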

Thank you! That makes sense. I ended up doing this in my original code:

(where i takes the values of the correct range.)

UnsafeMutableBufferPointer also has initialize(from:), which accepts a Sequence, if that works better.
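A minimal sketch of the initialize(from:) route, assuming the source sequence produces no more elements than the buffer can hold. The helper name labels and the "item N" strings are made up for illustration.

```swift
// Hypothetical helper: builds ["item 0", "item 1", ...] without a manual loop.
func labels(count: Int) -> [String] {
  return [String](unsafeUninitializedCapacity: count) { buffer, initializedCount in
    let source = (0..<count).lazy.map { "item \($0)" }
    // Returns the iterator's remaining state and the index one past the
    // last element that was written.
    let (_, end) = buffer.initialize(from: source)
    initializedCount = end
  }
}

print(labels(count: 3))  // ["item 0", "item 1", "item 2"]
```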

PS
They seem to specifically allow trivial types; I updated the comment above.

Yup, it didn't work for my actual use case though.

Yeah, I now remember this from previous unsafe pointer related discussions. Did you see this information in some documentation?
(Not that the exception matters very much in this particular case, as it seems simpler to just treat trivial types the same as non-trivial types here, and I guess it won't be less effective either.)

Only indirectly: all the related APIs have the same "or X must be a trivial type" exception when describing their initialization-state contract.

Though, it's missing from the subscript operator. Not sure if it's an oversight or if the subscript operates under a different rule.


If a type is trivially copyable (which is what we mean by “trivial”), then it is also trivially destructible. It couldn’t have acquired any external resources (like retaining an ARCed pointer), because then it wouldn’t be trivially copyable. Since it has no external resources, no explicit deinitialisation is needed, and assigning a new value can directly initialise the memory. I’m pretty sure that’s explicitly mentioned somewhere.

Edit: Yes, the docs for the assign method say it:

The region of memory starting at this pointer and covering count instances of the pointer’s Pointee type must be initialized or Pointee must be a trivial type. After calling assign(from:count:), the region is initialized.

Note the “or”
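One way to see which side of that "or" a type falls on is the standard library's underscored _isPOD function. This is only a sketch: _isPOD is not part of the stable public API, treating "POD" as equivalent to "trivial" in the documentation's sense is an assumption, and Point and Named are hypothetical example types.

```swift
// Hypothetical example types.
struct Point { var x: Float; var y: Float }  // only plain stored values
struct Named { var name: String }            // holds a String

// `_isPOD` is an underscored (unofficial) standard-library function.
print(_isPOD(Int.self))     // true
print(_isPOD(Point.self))   // true
print(_isPOD(String.self))  // false
print(_isPOD(Named.self))   // false
```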


Whether it needs an explicit deinitialization procedure doesn't really matter. The compiler could use the contract to optimize code in subtle ways, so I wouldn't rely on anything the API contract doesn't state.

Rather, it'd be nice to have explicit documentation stating that initialization states only apply to non-trivial types.

I saw that too, that's what makes me think it's an oversight in the subscript documentation.