How to query the capacity of a `String` buffer?

is there a way to query the buffer capacity of a String, similar to Dictionary.capacity? i’m trying to debug high memory consumption, which is possibly caused by String allocations.

I don't think there is. However, it's unlikely that high memory consumption is caused by String allocations. What platform are you running on, and what's the observation that caused you to think of high memory consumption?

Can you use Instruments? It can track allocations, tell you where they came from, which type they are, etc. This is a common need when profiling software and there is tooling for it.

There is another, non-kosher way...

Alternatively, you could compile with -Xfrontend -disable-access-control to access things like self._guts._object.largeAddressBits. One problem you'll encounter is that lots of non-public symbols can't be linked to, so you can't just jump in and inspect self._guts._object.nativeStorage.capacity. You'll need to reverse engineer the checks (e.g. checking that the string has native storage and is not a literal/small/cocoa string, extracting the address, bitcasting to the correct type, etc.), but it's doable. Something like this, perhaps...

extension String {
  var nativeStorageCapacity: Int {
    // FIXME: DO NOT USE
    // This is super-unsafe. It doesn't even bother to check if this is a small/immortal/cocoa string.
    // For bridged strings especially it returns total junk values.

    let ptr = UnsafeRawPointer(bitPattern: self._guts._object.largeAddressBits)!
    let capacityAndFlags = ptr.load(fromByteOffset: 16, as: UInt64.self)
    // the lower 48 bits hold (capacity + 1)
    let capacity = capacityAndFlags & 0x0000_FFFF_FFFF_FFFF
    return Int(capacity)
  }
}

Running it through a quick test:

let stringA = String(decoding: ....) // Native
let stringB = String(contentsOf: ....) // Cocoa
var stringC = String(contentsOf: ....)
stringC.makeContiguousUTF8() // Cocoa -> Native

for str in [stringA, stringB, stringC] {
  print(str.count)
  print(str.nativeStorageCapacity)
}

33437
65496

38317
4388290560

169289
262104

As expected, the native capacity is in the same ballpark as the count, but greater. Cocoa strings return junk values because I didn't bother to handle them in this demo.

(And for anybody else discovering this: -Xfrontend -disable-access-control is very unstable. Experimentation is part of learning, so if you want to explore, do it, but don't actually ship code which needs this flag to build. It will likely break with some random stdlib update, and get you rejected from the App Store, now or later. The only supported APIs are the ones marked 'public'.)

Second the suggestion to use memory profiling tools; that's the right way to address this problem and has the great virtue of working with any type.

Separately, String.capacity doesn't really make sense, because String doesn't have a "capacity" measured in its elements. We plausibly could provide an API for a possibly-underestimated utf8 capacity, but it's not obvious how useful that really would be.
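To illustrate why a capacity "measured in elements" is ill-defined, and what can be estimated with public API alone, here is a minimal sketch. The helper estimatedMinimumHeapBytes(of:) is a hypothetical name, not a stdlib API. It assumes a 64-bit platform, where strings of up to 15 UTF-8 code units are stored inline, and it only gives a rough lower bound for natively stored strings; literals, bridged strings, and strings with extra reserved capacity are not modelled.

```swift
// Hypothetical helper (not part of the standard library): a rough lower bound
// on the heap bytes backing a natively stored Swift string, using public API only.
// Assumption: 64-bit platform, where strings of up to 15 UTF-8 code units
// are stored inline and need no heap allocation.
func estimatedMinimumHeapBytes(of string: String) -> Int {
    let utf8Count = string.utf8.count
    return utf8Count <= 15 ? 0 : utf8Count
}

// "é" (U+00E9) is one Character but two UTF-8 code units,
// so `count` and `utf8.count` disagree.
let short = "h\u{E9}llo"
let long = String(repeating: "\u{E9}", count: 100)

print(short.count, short.utf8.count)          // 5 6
print(estimatedMinimumHeapBytes(of: short))   // 0
print(long.count, long.utf8.count)            // 100 200
print(estimatedMinimumHeapBytes(of: long))    // 200
```

Summing such an estimate over the strings a data structure holds gives only a floor on their storage, not the real allocation sizes, which also include object headers and any growth slack.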

the first thing that i tried was looking for a profiling tool that could do this automatically. unfortunately Instruments is macOS-only, and heaptrack just attributes everything to swift_slowAlloc, which isn’t very helpful from a profiling standpoint. i’ve been dealing with this problem since march.

hence, why i am trying to gather some manual statistics through hacky methods like self._guts._object.largeAddressBits.

i wish it were appreciated more just how hard swift development is when you are outside of the apple ecosystem. i know that memory profilers are the best way to debug this, the only reason i am asking is because those tools don’t work on my platform.

I have never used heaptrack, but its README says that it captures a stack trace per allocation, rather than just the current frame. I would expect its analysis tools to be able to charge allocations to the callers of the swift runtime lib, since that's a normal operation for doing memory analysis of C and C++ programs as well; none of this is really unique to Swift.
