ContiguousArray<Int> unexpectedly allocates discontiguously?

JamesWidman · October 12, 2023, 11:51pm

Hi all,

Given this:

var a: ContiguousArray<Int> = [ 11, 22 ]

... and with this program paused in LLDB at some point where a is in scope, the LLDB command:

frame variable --location a

produces output like this:

0x00000001000083b8: (ContiguousArray<Int>) a = 2 values {
0x0000600000521610:   [0] = 11
0x00006000005181b0:   [1] = 22
}

...which is confusing, because I expected that, in a ContiguousArray<Int>, if element 0 is located at address p, then element 1 would be located at address p + 8 (because MemoryLayout<Int>.stride == 8). I thought that's what "contiguous" means: that there is no gap between any two adjacent elements.

What am I missing?

version info: macOS Ventura 13.6 (22G120)

% lldb --version
lldb-1500.0.22.8
Apple Swift version 5.9 (swiftlang-5.9.0.128.108 clang-1500.0.40.1)

John_McCall · October 13, 2023, 12:57am

I think the most likely thing is some confusion/bug about how the debugger is printing the addresses of these elements.

ksluder · October 13, 2023, 4:18am

Indeed. 0x0000600000521610 is a very odd-looking pointer for arm64.

The elements themselves are stored in the tail-allocated storage of a ContiguousArrayStorage object. I couldn’t find the code in the lldb repo that constructs the in-memory representation of Swift collections like ContiguousArray, but I suspect that code is using bogus location values, perhaps out of necessity to make Swift’s collections fit lldb’s classically-C++ understanding of arrays.
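As a rough illustration of what "tail-allocated" means, here is a minimal sketch using the public ManagedBuffer type. It is only an analogy: the stdlib's internal _ContiguousArrayStorage is a different class, and the Int header here merely stands in for its count field. The names buffer and headerToElements are made up for the example.

```swift
// ManagedBuffer<Header, Element> allocates one object holding a header
// followed by raw storage for elements (tail allocation).
// Here the header is an Int that plays the role of a count.
let buffer = ManagedBuffer<Int, Int>.create(minimumCapacity: 2) { _ in 0 }

// Byte distance from the header field to the first element slot.
let headerToElements: Int = buffer.withUnsafeMutablePointers { header, elements -> Int in
    // Store two elements in the tail storage and record the count.
    elements.initialize(to: 11)
    (elements + 1).initialize(to: 22)
    header.pointee = 2
    // No separate element pointer is stored anywhere; the element address
    // is computed from the object's own address.
    return UnsafeRawPointer(header).distance(to: UnsafeRawPointer(elements))
}

print("elements start \(headerToElements) bytes after the header field")
```

The point of the sketch is that the elements live in the same allocation as the header, so their address is derived by offset rather than loaded from a stored pointer.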

JamesWidman · October 13, 2023, 4:22am

i trust that that's true...

at the same time, this:

(lldb) frame variable --depth 256 --location  --raw-output  a._buffer._storage 
scalar: (Swift._ContiguousArrayStorage<Swift.Int>) a._buffer._storage = 0x0000600000c5aaf0 {
scalar:   Swift.__ContiguousArrayStorageBase = {
scalar:     Swift.__SwiftNativeNSArrayWithContiguousStorage = {
scalar:       Swift.__SwiftNativeNSArray = {}
    }
0x0000600000c5ab00:     countAndCapacity = {
0x0000600000c5ab00:       _storage = {
0x0000600000c5ab00:         count = {
0x0000600000c5ab00:           _value = 2
        }
0x0000600000c5ab08:         _capacityAndFlags = {
0x0000600000c5ab08:           _value = 4
        }
      }
    }
  }
}

...leaves me feeling stumped about how to proceed wrt inspection of the Swift.__SwiftNativeNSArray subobject.

Presumably, it contains a pointer. If the program being paused/inspected were a C program, i would:

(lldb) expr -- *(T*)a.ptr

(Though, in C, there wouldn't be a base subobject.)

But i'm not sure how to get LLDB to print something useful about the Swift.__SwiftNativeNSArray base subobject of a._buffer._storage (and i used --raw-output, which you'd think would at least reveal the bit pattern of the pointer).

(note, i'm not trying to use the debugger to locate a bug; i'm just using it to try to build up an understanding of how ContiguousArray works.)

ksluder · October 13, 2023, 4:27am

It doesn’t. The pointer is implicit—it’s a pointer directly past the end of the header (countAndCapacity). You can see how ContiguousArray gets that pointer by using Builtin.projectTailElems here: https://github.com/apple/swift/blob/main/stdlib/public/core/ContiguousArrayBuffer.swift#L256

Good discovery that the storage is actually also a bridged NSArray. That implementation lives here: https://github.com/apple/swift/blob/main/stdlib/public/core/SwiftNativeNSArray.swift
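To make "directly past the end of the header" concrete, the arithmetic can be worked from the addresses in the --raw-output dump above. This is a sketch under two assumptions that the dump suggests but does not prove: the header ends one 8-byte word after _capacityAndFlags, and the elements start immediately after it on a 64-bit target. The constant names are made up for the example.

```swift
// Addresses copied from the --raw-output dump earlier in the thread.
let storageObject: UInt64 = 0x0000_6000_00c5_aaf0  // a._buffer._storage
let countField: UInt64    = 0x0000_6000_00c5_ab00  // countAndCapacity ... count
let capacityField: UInt64 = 0x0000_6000_00c5_ab08  // countAndCapacity ... _capacityAndFlags

// Offset of the count field from the start of the object.
let headerOffset = countField - storageObject

// Assumption: elements are tail-allocated right after _capacityAndFlags.
let wordSize = UInt64(MemoryLayout<Int>.stride)
let firstElement = capacityField + wordSize
let secondElement = firstElement + wordSize

print("count field offset:", headerOffset)
print("expected [0] at 0x" + String(firstElement, radix: 16))
print("expected [1] at 0x" + String(secondElement, radix: 16))
```

Under those assumptions, [0] and [1] would sit 8 bytes apart inside the storage object itself, which is not what the frame variable --location output at the top of the thread shows for the elements.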

jrose · October 13, 2023, 5:35am

Sidestepping LLDB for a second, you can see the current pointer used by ContiguousArray by using withUnsafeBufferPointer:

expr a.withUnsafeBufferPointer { print($0) }

The documentation here still says

The pointer passed as an argument to body is valid only during the execution of withUnsafeBufferPointer(_:) . Do not store or return the pointer for later use.

so, don't do that, but in practice it's going to be the current storage for the array.
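The same idea can be used to check contiguity directly, outside the debugger. This is a minimal sketch: it converts each element's address to an integer while the buffer pointer is still valid, so only integers leave the closure and no pointer is stored. The names addresses, elementStride, and isContiguous are made up for the example.

```swift
let a: ContiguousArray<Int> = [11, 22]

// Record the address of every element as a plain integer.
let addresses: [UInt] = a.withUnsafeBufferPointer { buffer -> [UInt] in
    guard let base = buffer.baseAddress else { return [] }
    return (0..<buffer.count).map { UInt(bitPattern: base + $0) }
}

// Adjacent elements should be exactly one stride apart.
let elementStride = UInt(MemoryLayout<Int>.stride)
let isContiguous = zip(addresses, addresses.dropFirst()).allSatisfy { pair in
    pair.1 - pair.0 == elementStride
}

print(isContiguous)
```

The addresses themselves vary from run to run, so the sketch compares only the differences between them.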

JamesWidman · October 13, 2023, 6:26am

ah, cool!

So naturally the next thing i try is:

(lldb) expr -- Builtin.projectTailElems(a._buffer._storage, Int.self)
error: <EXPR>:3:1: error: cannot find 'Builtin' in scope
Builtin.projectTailElems(a._buffer._storage, Int.self)
^~~~~~~

...though using Builtin seems like a reasonable thing to do in a debugging context...

(but only in a debugging context, since you probably don't want any long-lived, non-stdlib src depending on Builtins)

i mean, we can already touch registers in a debugging context, which is not something you're normally able to do in a swift src file.

(though, for the specific example above, i guess there's less motivation to allow Builtins since we can use withUnsafeBufferPointer in the debugger (as @jrose suggested), which passes the closure a buffer pointer derived from Builtin.projectTailElems())

ksluder · October 13, 2023, 6:44am

The Builtin module is only visible when building the standard library, which allows the stdlib implementation to be revlocked to the semantics of the compiler it is built with.

I don’t even think it’s a real module; I believe the compiler replaces references to Builtin symbols with literal SIL. Thus, if Builtin were available in the debugger, it would be from the version of Swift that the debugger was built with, not the one the host system’s standard library was built with. That could be problematic.