init(bytesNoCopy:length:encoding:freeWhenDone:) was deprecated in macOS 13.

If there's a need for something like this, there should be a new API that actually works (and that will be challenging at best because String fundamentally does not support wrapping an external buffer).

Yes, a new API is needed that accepts a C pointer, a length, an encoding, and a memory deallocation option.

I'll also add that this symmetry doesn't exist. String is not equivalent to Data, despite being a data container, just as Dictionary is not equivalent to Set. These two types should not be expected to have equivalent API surface.

Data and String can both be initialized from a C pointer, which is not the case for the other data containers (Set, Dictionary). This means that allowing the length and memory deallocation arguments for Data but not for String is not logical. Also, not every C pointer used to initialize a String is null-terminated.
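
For the non-null-terminated case specifically, a copying, length-based route exists in the standard library today. A minimal sketch, assuming the bytes are UTF-8; makeString(from:length:) is a hypothetical helper name used only for illustration, not an existing API:

```swift
// Hypothetical helper: builds a String from a C pointer plus an explicit length.
// No null terminator is required, because the length bounds the read.
func makeString(from pointer: UnsafePointer<UInt8>, length: Int) -> String {
    // View the raw memory as a collection of `length` bytes.
    let buffer = UnsafeBufferPointer(start: pointer, count: length)
    // `String(decoding:as:)` copies the bytes into String-owned storage,
    // replacing any invalid UTF-8 with U+FFFD.
    return String(decoding: buffer, as: UTF8.self)
}

// Usage: the source bytes carry no trailing zero byte.
let rawBytes: [UInt8] = [0x53, 0x77, 0x69, 0x66, 0x74]
let text = rawBytes.withUnsafeBufferPointer {
    makeString(from: $0.baseAddress!, length: $0.count)
}
print(text)
```

Note that this copies, so it covers the pointer, length and encoding parts of the request but not the no-copy or deallocation parts.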

This argument is incomplete. Lots of data structures can be initialized from pointers, as all pointers can be trivially wrapped into Collection types using the buffer pointer types.
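
As a rough illustration of the buffer pointer point, here is a small sketch. The function name collections(from:count:) is made up for this example; UnsafeBufferPointer and the Array and Set initializers are standard library API. Both conversions copy the elements:

```swift
// Hypothetical helper: wraps a C pointer and a count in UnsafeBufferPointer,
// which is a Collection, then initializes other collections from it.
func collections(from pointer: UnsafePointer<Int32>, count: Int) -> (array: [Int32], set: Set<Int32>) {
    let buffer = UnsafeBufferPointer(start: pointer, count: count)
    // Any initializer that accepts a Sequence works here; each one copies.
    return (Array(buffer), Set(buffer))
}

let values: [Int32] = [3, 1, 3, 2]
let result = values.withUnsafeBufferPointer {
    collections(from: $0.baseAddress!, count: $0.count)
}
print(result.array.count, result.set.count)
```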

But more importantly, "can be initialized from a pointer" does not sufficiently describe an API surface, nor does it require a specific API. The idea you're looking for is "can borrow a pointer and use it as backing storage". Swift's String doesn't have native API surface for doing this. @David_Smith suggests that String has no space in its representation for doing this, which means it is fundamentally incapable of expressing the API you've suggested.

The way String can do this is by being bridged. That is, you can construct an NSString, and then use the as operator to bring it across into Swift. This produces a "slow" Swift String, but it will work just fine.
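
A rough sketch of that bridging route, assuming an Apple platform (the as bridge needs Objective-C interop, hence the conditional compilation) and assuming the buffer was allocated with malloc so that NSString can free it. bridgedString(copying:) is a hypothetical helper name; whether NSString really avoids a copy internally is up to Foundation:

```swift
import Foundation

#if canImport(ObjectiveC)
// Hypothetical helper: places `text` in a malloc'ed buffer (standing in for a
// buffer handed over by a C library) and wraps it in an NSString.
func bridgedString(copying text: String) -> String? {
    let utf8 = Array(text.utf8)
    guard let raw = malloc(utf8.count) else { return nil }
    raw.copyMemory(from: utf8, byteCount: utf8.count)

    // freeWhenDone: true asks NSString to free() the buffer when it is done.
    guard let ns = NSString(bytesNoCopy: raw,
                            length: utf8.count,
                            encoding: String.Encoding.utf8.rawValue,
                            freeWhenDone: true) else {
        // On failure the buffer is not taken over, so release it here.
        free(raw)
        return nil
    }
    // Bridge into Swift; the result is a String backed by the NSString.
    return ns as String
}
#endif
```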

Be aware, though, that you won't be able to mutate such a Swift string (obtained by bridging from NSString or CFString) by mutating the underlying data buffer: it is not a true "noCopy" string.

Without having spent a lot of time thinking this through, fleshing out StringProtocol and introducing a SharedString type might be the way to go.

The reason I’m thinking along these lines is that, even if we did have room in String itself to do this, it would remove an important guarantee it makes, which is that its contents are guaranteed* to be valid UTF8 that can’t silently change behind your back.
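
To make the "can’t silently change" part concrete, here is a small sketch using only standard library calls. snapshotThenMutate() is a hypothetical name; the point is that a String created from a buffer holds its own copy, so later writes to the buffer are invisible to it:

```swift
// Hypothetical demo: create a String from a mutable buffer, then mutate the buffer.
func snapshotThenMutate() -> (snapshot: String, reread: String) {
    let buffer = UnsafeMutableBufferPointer<UInt8>.allocate(capacity: 5)
    defer { buffer.deallocate() }
    _ = buffer.initialize(from: "hello".utf8)

    // The String copies the five bytes into its own storage.
    let snapshot = String(decoding: buffer, as: UTF8.self)

    // Overwrite the first byte of the original buffer.
    buffer[0] = UInt8(ascii: "j")

    // A fresh String sees the change; the earlier one does not.
    let reread = String(decoding: buffer, as: UTF8.self)
    return (snapshot, reread)
}

let demo = snapshotThenMutate()
print(demo.snapshot, demo.reread) // prints "hello jello"
```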

I do think the “wrap a buffer from a C library in a temporary String-like thing without copying” use-case is interesting and potentially valuable, it’ll just require someone to do some very careful design work.

*yes technically you can break this guarantee with bridged NSStrings if you really try. I consider that a bug, it’s just extremely tricky to fix without breaking more important parts of string bridging.
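
Purely as a thought experiment, the "temporary String-like thing" could look something like the sketch below. BorrowedUTF8Text and all of its members are hypothetical and exist nowhere in the standard library or Foundation; it sidesteps the careful design work mentioned above (validation, mutation, StringProtocol conformance) and only shows the ownership shape: a borrowed pointer, a length, and a caller-supplied deallocator.

```swift
// Hypothetical type, not a real API: holds a borrowed UTF-8 buffer without copying.
final class BorrowedUTF8Text {
    private let bytes: UnsafeBufferPointer<UInt8>
    private let deallocator: (UnsafeBufferPointer<UInt8>) -> Void

    init(bytes: UnsafeBufferPointer<UInt8>,
         deallocator: @escaping (UnsafeBufferPointer<UInt8>) -> Void) {
        self.bytes = bytes
        self.deallocator = deallocator
    }

    // Give the buffer back to whoever knows how to free it.
    deinit { deallocator(bytes) }

    // Cheap queries can read the borrowed memory directly.
    var utf8Count: Int { bytes.count }

    // Crossing into a real String copies, and repairs invalid UTF-8,
    // which keeps String's own guarantee intact.
    func makeString() -> String { String(decoding: bytes, as: UTF8.self) }
}
```

Nothing here stops the owner of the buffer from mutating it while the wrapper is alive, which is exactly the problem a real design would have to address.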

String does have this already via __SharedStringStorage, we just don't have public API for it yet.

What's the magic combination of parameters to do that, if you know? I'd add it to my test suite.

I’d need to reread the code, but basically:

  • Start with an NSString you have access to the buffer of
  • Make sure it survives -copy intact (may require a subclass, I haven’t checked if NoCopy will work)
  • Make sure you get lazy bridging rather than eager (being an unknown subclass is enough, since we can’t know if there are unusual properties that need to round trip)
  • Mutate the buffer directly

Right, but using that opens up the soundness hole I mentioned. So we’d probably want to wrap it in something?

Wrap it in something to fix it up, or perhaps have the API that uses this always fail when the shared buffer is not valid UTF-8?

When do you re-check, though? On every access?
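
For comparison, this is what "fail if not valid UTF-8" looks like with the copying API that exists today. A minimal sketch; validatedString(from:length:) is a hypothetical helper name, while String(bytes:encoding:) is Foundation's failable initializer. Because the result is a copy, one check at construction is enough; a shared buffer would not have that property, which is the re-check problem raised above.

```swift
import Foundation

// Hypothetical helper: returns nil unless the bytes are valid UTF-8.
func validatedString(from pointer: UnsafePointer<UInt8>, length: Int) -> String? {
    let buffer = UnsafeBufferPointer(start: pointer, count: length)
    // Validates once and copies; later changes to the buffer cannot
    // invalidate the returned String.
    return String(bytes: buffer, encoding: .utf8)
}

let valid: [UInt8] = [0x6F, 0x6B]   // "ok"
let invalid: [UInt8] = [0x6F, 0xFF] // 0xFF never appears in valid UTF-8
let accepted = valid.withUnsafeBufferPointer {
    validatedString(from: $0.baseAddress!, length: $0.count)
}
let rejected = invalid.withUnsafeBufferPointer {
    validatedString(from: $0.baseAddress!, length: $0.count)
}
```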

Very interesting. It would be helpful if you could show a working example of that (only if and when you have time). I tried your recipe and failed straight away trying to subclass NSString (which is not easy to subclass, as it is a class cluster).