Shared Substrings

I haven’t used Rust but I have wanted to borrow a CString in Swift before. I agree with @dabrahams that we should want to keep String and Substring safe.

I would rather we opened up StringProtocol so that it is safe to add custom conforming types. One could be an UnsafeBorrowedString type of some kind with the appropriate APIs.

@dabrahams has a point. As pitched, this would be as incongruent as an Array that has unmanaged backing storage. Honestly I would love to have one of those for work with raw audio data, but I digress.

IMHO a closure-based API is the only viable way to achieve (most of) what is suggested here. And I think that could be very useful, despite not fulfilling the needs of something like a longer-lived MyURL type. A closure-based API would make it significantly harder to do something you shouldn’t with an unsafe (Sub)String instance.

Alternatively we could create a new (unsafe) type that conforms to StringProtocol. In this case (and maybe even in its absence) there would be a need to make more of Apple’s APIs accept StringProtocol rather than String. (I haven’t tried, but is this even feasible right now? Not sure if StringProtocol has Self requirements.) To me this seems to be the only reasonable option for the MyURL case. To me personally it is also the less interesting case, but I wouldn’t argue against a proposal for it by any means.

Roughly speaking, that's called Unsafe[Mutable]BufferPointer. It lacks RangeReplaceableCollection conformance and does not range-check your indices, but you can create a simple wrapper that adds that checking if you want.
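
A minimal sketch of such a wrapper, for illustration only. `CheckedBuffer` is a hypothetical name, not a standard library type, and it deliberately leaves ownership of the allocation with the caller:

```swift
/// Hypothetical wrapper (not part of the standard library) that traps on
/// out-of-range indices in every build configuration.
struct CheckedBuffer<Element> {
  let base: UnsafeMutableBufferPointer<Element>

  init(_ base: UnsafeMutableBufferPointer<Element>) {
    self.base = base
  }

  var count: Int { base.count }

  subscript(index: Int) -> Element {
    get {
      precondition(base.indices.contains(index), "index out of range")
      return base[index]
    }
    nonmutating set {
      precondition(base.indices.contains(index), "index out of range")
      base[index] = newValue
    }
  }
}

// Usage: the caller still owns the allocation and must free it.
let raw = UnsafeMutableBufferPointer<Int>.allocate(capacity: 3)
raw.initialize(repeating: 0)
let checked = CheckedBuffer(raw)
checked[1] = 7
let middle = checked[1]
let checkedCount = checked.count
raw.deallocate()
```

The wrapper only adds bounds checks; it does nothing about lifetime, so it is still an unsafe type in the sense discussed in this thread.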

Not much; you can just return it from the closure. Regardless, such an API would still violate the invariant I'm trying to defend: once correctly constructed, a Substring is safe to use in perpetuity.

StringProtocol is not meant to be used as an existential type, and it has associated types, so it can't be. To create APIs that work on any StringProtocol, you make them generic:

func f(s: String)

becomes

func f<T: StringProtocol>(s: T)
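
As a small illustration of that pattern, here is a sketch of one generic function accepting both a String and a Substring without copying either into the other. The function name `countVowels(in:)` is made up for the example:

```swift
// Hypothetical helper: generic over StringProtocol, so it accepts String,
// Substring, or any other conforming type.
func countVowels<S: StringProtocol>(in text: S) -> Int {
  var total = 0
  for character in text where "aeiou".contains(character) {
    total += 1
  }
  return total
}

let whole = "shared substrings"
let firstWord = whole.prefix(6)            // Substring("shared")
let vowelsInWhole = countVowels(in: whole)
let vowelsInFirstWord = countVowels(in: firstWord)
```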

I appreciate your reply here @dabrahams, thanks for taking the time 🙂

I was just yesterday trying to change the capacity of an UnsafeMutableBufferPointer without deallocating and reallocating the entire thing (and avoiding a manual copying step in between) like you can with Array, i.e. array.removeLast(100), but it doesn't appear possible to just change the count directly. I could probably just construct a new Unsafe..BufferPointer with the same base address and the reduced count, but is that "legal"?

On that note, it would be incredible to have an API on Array that looks like removeLast(_ k: Int, keepingCapacity: Bool) as an analogue to removeAll(keepingCapacity: Bool); I probably wouldn't need the Unsafe API at that point.

Sorry for the tangent here, back to the actual topic...

Do you mean something like

let buffer: [UInt8] = ...
var escapedString: String?
buffer.withUnsafeTemporaryUTF8String { string in
  escapedString = string // evil?
}

I wonder if it's possible to disallow that from happening at the compiler level? In any case, wouldn't this be much the same as illegally saving the buffer from any of the other withUnsafe... closures?

First of all, I would think it should be obvious that You Shouldn't Do That™ (if you do, though, I do see the significant semantic difference of the resulting type in one still being called Unsafe and not in the other); secondly, wouldn't Swift actually "do better" here by creating a copy of the String and its underlying buffer automatically?

I am not familiar with the runtime enough to know whether it'd be possible (maybe it's already implemented) to create a copy of the String and maybe even verify its UTF8-compatibility if it is escaped, but it doesn't seem impossible or undesirable to me as a layman.

I do see an edge case of someone escaping a substring instead of the String itself here. Would it be unviable to have it still reference buffer in that case? I think the major issue we're facing is that some of the suggestions in this thread are talking about making temporary strings from entirely unmanaged (unsafe) memory, which would indeed make this edge case unviable. But using the array of (U)Int8s as above it doesn't seem impossible.


This appears to be a major hurdle given how pretty much all Apple APIs work (including parts of the Swift stdlib, IIRC). Foundation's APIs changing to generically take StringProtocol, for example, seems extremely unlikely.

The possibility of using protocols with associated type requirements as if they were existential types seemed to be swimming around a couple of years ago, but maybe it's a non-starter in reality (?). Until we can do that (or change significant amounts of API), this option doesn't seem viable for most use cases. It would mean copying to String every time devs want to use the custom type for anything not defined on StringProtocol directly, which kind of defeats the purpose.

When the doc comment says "The buffer must not be mutated while there are any strings sharing it.", that is meant to include deallocation.

So, I notice a subtle difference in your requirements here: for Substring, you take issue with a "correct invocation of [the] initializer" returning an invalid instance, but for Array, the requirement is looser: it must be a "correct initialization".

I would argue that passing a closure that fails to initialize all of its elements to Array.init(unsafeUninitializedCapacity:initializingWith:) is not a correct initialization, and it results in the same kind of "unsafe instance" as a shared Substring whose owner cannot guarantee the memory's lifetime. Yes, the objects are of the correct type and return from the initializer without any failure indication (so I guess they are "apparently correct invocations"), but they don't do what the API says they need to do.

And the issues with one are not easier to spot, nor easier to debug than the other. The buggy closure might fail to initialize any element of the Array, so perhaps users wouldn't even notice until much later in their programs. Array does not guarantee that all of its memory is correctly constructed after this call (because it is partly relying on the user), and any operation like calling .first or .dropFirst(5).first might in fact access uninitialized memory.
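
For reference, a minimal sketch of what a correct use of that initializer looks like: every slot reported through `initializedCount` must actually have been initialized by the closure.

```swift
// Correct usage: initialize exactly the elements that are reported.
let squares = [Int](unsafeUninitializedCapacity: 4) { buffer, initializedCount in
  for i in 0..<4 {
    (buffer.baseAddress! + i).initialize(to: i * i)
  }
  // Reporting a larger count than was initialized would hand out
  // uninitialized memory through an otherwise ordinary Array.
  initializedCount = 4
}
```

The buggy variant discussed above is the same closure with the loop removed or shortened while `initializedCount` is still set to 4.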

Firstly: this doesn't change anything in practice about Substring. The source of its data might not always be a String-owned buffer any more, but it will be a String-managed buffer, with ARC ensuring that the owner object lives for at least as long as the last Substring. Nothing changes for users of Substring.

Since I'm describing deallocation as a mutation, there is a single fundamental requirement for safety: all other references to the buffer must copy before mutating while owner is shared. This implicitly means they have to check that owner is not multiply-referenced before deallocating buffer. If you get a buffer from C and there are no other references, or they never mutate/deallocate - great! You may want to free the memory afterwards to be a good citizen, but that situation will never be unsafe.

Some things we can take just from this:

  1. Substrings constructed from shared Arrays are always safe.
  2. ManagedBuffers are safe if they use copy-on-write.

Is it 100% bullet-proof? Of course not. But at that point we're talking about things like over-releasing ManagedBuffers using Unmanaged, and those are rare use-cases with pretty much nothing anybody can do about them. We have very good tools which can help detect use-after-frees, and anybody that deep in unmanaged and unsafe APIs should be well aware of the need to test what they've created.

This thread has lots of discussion about the use-cases, and the effect on performance would be dramatic. This performance can be unlocked in a safe way, and it is not very difficult to do. I'd say it's harder to mess up than Array.init(unsafeUninitializedCapacity:initializingWith:).

OK, I understand your intent. As a technical matter, mutation, deallocation, and the ending of lifetimes are orthogonal, and it's the last of these that you want to ask the user to guarantee does not happen until the Substring is deallocated.

So, I notice a subtle difference in your requirements here: for Substring, you take issue with a "correct invocation of [the] initializer" returning an invalid instance, but for Array, the requirement is looser: it must be a "correct initialization".

That difference in phrasing was not intended to convey any difference in meaning, and I can't imagine what difference you could read into it. I never mentioned “returning an invalid instance.” I'm concerned with an invocation returning an instance that later becomes invalid.

I would argue that passing a closure that fails to initialize all of its elements to Array.init(unsafeUninitializedCapacity:initializingWith:) is not a correct initialization,

Agreed.

and it results in the same kind of "unsafe instance" as a shared Substring whose owner cannot guarantee the memory's lifetime.

I assume that by “owner” you mean the thing passed as the owner parameter, whose doc comment says:

If the word “optional” is to have any meaning at all, it has to mean I could pass something that's not really an object (e.g. Int.self as AnyObject), so I can build a Substring that is backed by unmanaged memory. If that's correct usage, it creates an instance that can become invalid to use at any time.

Firstly: this doesn't change anything in practice about Substring. The source of its data might not always be a String-owned buffer any more, but it will be a String-managed buffer, with ARC ensuring that the owner object lives for at least as long as the last Substring.

If that's the API you intend to propose, you need to change more things in the doc comment, including getting rid of the word “optional.” Especially for an unsafe API, it's crucial that it be documented rigorously, and that the conditions for using it correctly are simple to understand. I'm sorry to be hard-nosed about this, but the burden really has to fall on the proposer to get that right.

Since I'm describing deallocation as a mutation, there is a single fundamental requirement for safety: all other references to the buffer must copy before mutating while owner is shared. This implicitly means they have to check that owner is not multiply-referenced before deallocating buffer.

I don't understand what multiple references have to do with it. If even a single reference exists to the owner, you have to assume there's a SubString somewhere that depends on the owner.

If you get a buffer from C and there are no other references, or they never mutate/deallocate - great! You may want to free the memory afterwards to be a good citizen, but that situation will never be unsafe.

A typical C API may pass your callback a buffer that's only good for the duration of the callback's execution.

Presumably, you need the Substring so you can pass it to some API. That API is free to copy and store the Substring as long as it wants, and if it doesn't do so today it's free to start doing so tomorrow, and it won't tell you when it's done with the instance. Unless you have a way to keep the bytes in the buffer alive until that hypothetical instance goes away, you've created an unsafe instance.

This is different from the Array case because the guarantees you are asking for are dynamic properties of code that has yet to be executed at the time of construction, and cannot usually be given by the code constructing the Substring or by the Substring's own initializer.

This is qualitatively different in the ways I've described. It sounds to me like you're minimizing the importance of, or simply failing to recognize, the differences. Until we're actually acknowledging those differences and—as a community—making a rational choice about whether we want the library to go in that direction, I'm going to be opposed to it.

I'm all for unlocking performance. That said, I seriously doubt there are enough Substrings running around existing APIs that this change would enable a drop-in speedup. Instead, many Strings would need to change to Substring. At that point, we may as well start talking about making APIs generic over StringProtocol so that when you really need performance you can pass unmanaged bytes that don't even incur ARC overhead when copied.

That's not a capacity change but a length change. Yes, that would totally be legal.

b = .init(rebasing: b.dropLast(100))

will do what you want. You can create an extension method if you'd rather spell it b.removeLast(100).
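
A sketch of what that extension method might look like. It assumes trivially-destructible elements (such as raw audio samples), because the dropped elements are not deinitialized, and it relies on the base address staying the same so the original allocation can still be freed:

```swift
extension UnsafeMutableBufferPointer {
  /// Shrinks the view by `k` elements without touching the allocation.
  /// The dropped elements are not deinitialized, so this sketch is only
  /// appropriate for trivial element types.
  mutating func removeLast(_ k: Int) {
    precondition(k >= 0 && k <= count, "cannot remove more elements than the buffer holds")
    self = UnsafeMutableBufferPointer(rebasing: self.dropLast(k))
  }
}

var samples = UnsafeMutableBufferPointer<Int>.allocate(capacity: 5)
_ = samples.initialize(from: 0..<5)
let originalBase = samples.baseAddress   // unchanged by removeLast(_:)
samples.removeLast(2)
let remaining = Array(samples)
let remainingCount = samples.count
originalBase?.deallocate()
```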

On that note, it would be incredible to have an API on Array that looks like removeLast(_ k: Int, keepingCapacity: Bool) as an analogue to removeAll(keepingCapacity: Bool); I probably wouldn't need the Unsafe API at that point.

I don't understand what you expect that to do. removeLast already keeps capacity, and if you really want to discard all excess capacity you can always a.reserveCapacity(0), though it's expensive: it has to allocate new memory and copy all the array elements.
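
A quick way to observe the first point, as a sketch. It relies on how the current standard library implements in-place removal on a uniquely referenced array, which is an implementation detail rather than a documented guarantee:

```swift
var numbers = Array(0..<1_000)
let capacityBefore = numbers.capacity
numbers.removeLast(100)                  // shortens the array in place
let capacityAfter = numbers.capacity     // storage is not shrunk
```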

Such APIs normally allow you to do it this simply:

let escapedString = buffer.withUnsafeTemporaryUTF8String { $0 }

I wonder if it's possible to disallow that from happening at the compiler level?

Yes, language features supporting pinned, non-copyable, and non-escaping instances of existing types are an interesting direction that might make it possible to use safe types with unmanaged backing stores. But that's a huge project that requires its own proposals.

In any case, wouldn't this be much the same as illegally saving the buffer from any of the other withUnsafe... closures?

No, those closures all get passed unsafe types.
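
To make the contrast concrete, here is a minimal sketch using the existing `Array.withUnsafeBufferPointer`. The closure parameter has an explicitly unsafe type, and the only safe value that leaves the closure is a String that owns its own copy of the bytes:

```swift
let bytes: [UInt8] = Array("hello".utf8)

let copied: String = bytes.withUnsafeBufferPointer { unsafeBytes in
  // `unsafeBytes` is an UnsafeBufferPointer<UInt8>; it must not outlive
  // this closure. Decoding copies the bytes into String-owned storage.
  String(decoding: unsafeBytes, as: UTF8.self)
}
```

Escaping `unsafeBytes` itself would be a bug, but the type name warns about it; a String returned from a hypothetical withUnsafeTemporaryUTF8String would carry no such warning.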

Those APIs already traffic in String, not Substring, so this proposal has the same problem.

This hits the nail on the head with my main objection as well. Trying to reïnvent Substring conceptually as something other than "a slice of a String" simultaneously with the pitched APIs provides no real advantages to creators of these externally-backed strings and makes the results harder to use, because of the current design of Substring and of other APIs that consume Strings. That conceptual change is out-of-scope for the APIs being proposed here; if folks want to propose overhauling the Swift string APIs to reduce the friction between String, Substring, and StringProtocol, then those ideas ought to be discussed holistically in their own API-design-focused proposal.

My draft implementation linked in the first post had a single goal in mind: remove the performance hit for Strings when interop-ing with C APIs that transfer ownership of a block of ASCII/UTF-8 bytes to the caller. In order for that to succeed with the APIs we have in Swift today, the type used to represent this must be String, because that is the type that most libraries/clients are going to be working with.

I do agree with concerns about APIs that have the potential to create unsafe Strings that corrupt or blow up later on during execution, but I think there's room for APIs like this to exist as long as they're named and documented such that the unsafety is clear and we show how to use them properly, as we do with the uninitialized Array initializer. If a Swift API designer is wrapping a C API that returns a char buffer and they trust that the C API is implemented correctly w.r.t. transferring ownership of that memory to them, then I think it's reasonable for the Swift author to use an API such as this (but with String, not Substring) to relay that trust and avoid further costly allocations/copies.

But, do you appreciate the ways in which this is different from the Array initializer? It seems to me such a change would totally undermine the claim that memory-unsafe APIs are always labeled as such. Any API trafficking in Substring (or String in your world) could become unsafe without ever using such a labeled operation. Why is that OK?

Without weighing in on the wider proposal too much: it is never guaranteed that unsafe APIs manifest their undefined behaviour at their point of use. Indeed, much of the problem with unsafe APIs is that they manifest undefined behaviour far away from the point of use, in otherwise safe code.

Consider Unmanaged, which allows you to freely modify the retain/release counts of Swift objects, potentially causing them to be deallocated while strong references exist elsewhere. This can lead to unsafe behaviour far away from the usage site of Unmanaged.
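
For illustration, a sketch of the balanced case, where the manual +1 is consumed exactly once. The `Token` class is a placeholder for the example:

```swift
// Placeholder class for the example.
final class Token {
  let id: Int
  init(id: Int) { self.id = id }
}

// passRetained adds a +1 that ARC no longer tracks.
let handle = Unmanaged.passRetained(Token(id: 42))
let opaque = handle.toOpaque()           // e.g. handed to a C callback context

// takeRetainedValue consumes that +1 and returns a normal strong reference.
let recovered = Unmanaged<Token>.fromOpaque(opaque).takeRetainedValue()
// Calling takeRetainedValue() (or release()) a second time would over-release
// the object; the resulting crash could surface in unrelated, "safe" code.
```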

With that said, I also agree that the proposed functionality that @allevato is referring to is fundamentally different to the unsafeUninitializedCapacity initializers on Array. Any API that relies on borrowing a pointer absolutely must be accompanied by an owner parameter that can be used to manage the lifetime of the pointer. This is fundamental, IMO. Array does not have this problem because it never borrows memory, but the proposed Substring initializer does. I don’t mind APIs of this shape (with an owner parameter), but we need to be cognizant of their behaviour.

Yes! This is what I have been trying to communicate:

No type can guarantee no safety violations in the face of buggy code elsewhere in the system wreaking havoc via unsafe APIs. Fundamentally I don't agree that safety exists at the type level: types simply are an abstraction for grouping code and its related data, and while you can combine safe code to create systems which are some degree of "safe", as soon as there is any unsafe code anywhere in the system, the whole thing becomes potentially unsafe.

That said, I’m not ignorant of how this allows different kinds of bugs to manifest as unsafe behaviour. But isn’t that also true of bridged NSArrays? Depending on how the bridging is implemented, bugs on the Objective-C side could easily amount to unsafe operations once bridged to Swift.


For those wanting to bridge C Strings, what about the following API as a compromise to creating shared Strings directly?

The nice thing about this is that it gets cleanly separated from the unsafe operation (making a string/substring referencing arbitrary memory). It is a broadly useful API, beyond simply sharing C strings, and might help users struggling with our split String/Substring model.

I understand that Substring is not the most convenient thing to result from using this API (I'm well aware how rare it is to see an API giving out Substrings to anything), but I think something like this could be a workable compromise.

I assure you I am well aware and took that point fully into consideration before posting. This API is still fundamentally different from other unsafe APIs on safe types.

That's a good start.

It sounds like you still may not appreciate part of the way in which this API is different, though. Once the Array initializer completes, there's nothing you can do to corrupt the instance that doesn't involve something that will obviously create undefined behavior, operating on the Array instance itself, like:

a.withUnsafeMutableBufferPointer { _ = $0.baseAddress!.deinitialize(count: 1) }

Before invoking the proposed Substring initializer, on the other hand, there is at least one extant unsafe reference to what will become its internals. That unsafe reference is effectively “leaked” by the initializer, rather like what you'd get from this:

extension Array {
  /// Creates an instance containing `source` and replaces `leak` with a
  /// pointer to the contained element.
  init(_ source: Element, leak: inout UnsafeMutablePointer<Element>) {
    self.init(CollectionOfOne(source))
    leak = withUnsafeMutableBufferPointer { $0.baseAddress! }
  }
}

although at least in this case, the leaked value has its source in an initializer that we can easily label “unsafe” by naming. The proposed Substring initializer could easily be used to initialize multiple instances referring to the same buffer. It's not clear whether the buffer memory is then mutable, and if so, what (if anything) makes such usage illegal. It's very difficult to clearly spell out the things you can do safely with an initializer like this, and the proposal as pitched certainly doesn't do that.

To be clear: I don't “mind” APIs of this shape either. I just think we should be extremely circumspect about introducing one. It changes the landscape in a fairly profound way and it certainly shouldn't be done casually.

OK, so boiling this down: what you concretely mean is that this API differs from the one on Array in that you can create multiple substrings pointing to the same backing store, in such a way that they can violate the value-type guarantees of the implementation of Substring. Is that about right?

If it is, I understand your argument, though I don’t think it’s anywhere near fatal, for two reasons. Firstly, there is prior art in the form of Data(bytesNoCopy:). This initializer creates a Data that does not own its storage. For a Data implemented in this way, any mutating operation that must modify the backing store (i.e. that would be guarded by an isKnownUniquelyReferenced call) automatically CoWs to new storage. Essentially the implementation assumes that the data is always multiply referenced.

This would also be safe for Substring, though of course we can go one better. The AnyObject that is associated with the lifetime of the backing storage can be required to be unique for any shared backing storage, such that a call to isKnownUniquelyReferenced on that object behaves as intended.
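
For readers less familiar with the pattern being referred to, here is a minimal copy-on-write sketch built on `isKnownUniquelyReferenced`. `SharedBytes` and `Storage` are illustrative names only; this is not how String, Substring, or Data are actually implemented:

```swift
// Illustrative reference-typed backing store.
final class Storage {
  var bytes: [UInt8]
  init(_ bytes: [UInt8]) { self.bytes = bytes }
}

// Illustrative value type that copies its storage before mutating
// whenever the storage might be visible to someone else.
struct SharedBytes {
  private var storage: Storage

  init(_ bytes: [UInt8]) { storage = Storage(bytes) }

  var contents: [UInt8] { storage.bytes }

  mutating func append(_ byte: UInt8) {
    if !isKnownUniquelyReferenced(&storage) {
      storage = Storage(storage.bytes)   // copy on write
    }
    storage.bytes.append(byte)
  }
}

let first = SharedBytes([1, 2])
var second = first        // shares storage with `first`
second.append(3)          // triggers the copy; `first` is unaffected
```

A type that "assumes the data is always multiply referenced" corresponds to taking the copying branch unconditionally.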

With all of that said, I don’t think litigating this is terribly useful. I don’t object to the API shape, and we can clearly document our way out of this issue. Unsafe APIs force their programmer to meet certain contractual requirements, and this one is no different in that regard.

My objection is much more foundational: I don’t think I want String or Substring to get any more complex! The type is already complex enough, and each new case in the backing implementation further slows down the code paths for basic type operations. I’d much rather see a new type to encode this idea than to further overload these two.

Roughly, but if we want to be precise, that's too specific. The unsafe pointer passed in is generally available for code to do whatever it wants with it. One way to ban that might be by documenting that using the initializer invalidates the passed memory as-if by a call to UnsafeMutablePointer.deallocate. I'm not sure whether the proposer is willing to live with the restrictions implied (e.g. you don't even get to read the memory after the initializer is called), though. Regardless, we'd need some text that formally implies the set of things you can do after a correct invocation of this initializer without invoking undefined behavior.

The way this is different from the unsafe Array initializer is that the Array initializer's documentation doesn't need any such text. The things you can do without invoking undefined behavior are all implied by the usual language rules.

If it is, I understand your argument, though I don’t think it’s anywhere near fatal, for two reasons. Firstly, there is prior art in the form of Data(bytesNoCopy:).

Sorry, but anything that comes from Foundation, C, or ObjC—all of which predate Swift—doesn't set precedent as far as I'm concerned. This style of API is “grandfathered in” to allow interop with existing code, but it doesn't follow the principles of the standard library.

Near as I can tell, this proposal doesn't need to add a case to the backing implementations, FWIW.

The proposal makes it clear that the created instances are immutable. This is already in String's implementation - it won't mutate in-place unless it has native (i.e. String-created), unique storage.

Shared strings are already part of String’s implementation and have been in the ABI since 5.0. The comments say that it is being used for bridged literals, so it seems there are reasons to have it besides user-created shared strings.

I’m not sure that it is really feasible to extend StringProtocol at this point; lots of code relies on those being the only conforming types. It would be an interesting option to explore for strings with a fixed or maximum length, but it would also make code which uses this feature even less convenient to use. I don't think it's a complete substitute for this feature.

Yeah, this would basically be a non-starter for types which want to manually lay out their code-units (e.g. in a ManagedBuffer).

Anyway, I'm certainly hearing your message that the method names and documentation, as written, are not clear enough for you to support. I've spent a bit of time rewording it to be more explicit about exactly when it is safe to mutate/deallocate the buffer:

extension Substring {

  /// Creates an immutable `Substring` which references the UTF-8 data
  /// at the given buffer address.
  ///
  /// - Warning: If there are any other references to the memory in `buffer`,
  ///  it is crucial that they do not mutate or deallocate that memory
  ///  while there are any `Substring`s referencing it.
  ///
  /// The `owner` argument, if it is given, is retained by the created `Substring` and any copies of it
  /// and released as those objects are destroyed. The `owner` object thus communicates
  /// whether there could be any `Substring`s referencing `buffer`. Specifically:
  ///
  ///   - If there are no other strong references to `owner`, its deinitialization
  ///     means there are definitely no `Substring`s referencing `buffer`,
  ///     and `buffer` may now safely be mutated or deallocated.
  ///   - If a strong reference to `owner` is unique, there are definitely no `Substring`s
  ///     referencing it, and it may safely be mutated or deallocated.
  ///   - If a weak reference to `owner` is nil, there are definitely no `Substring`s
  ///     referencing it, and it may safely be mutated or deallocated.
  ///
  /// If none of these conditions are met, it must be assumed that there
  /// are live `Substring` instances referencing `buffer`, and any other references
  /// may only be used for reading.
  ///
  /// This initializer does not try to repair ill-formed UTF-8 code unit
  /// sequences. If any are found, the result of the initializer is `nil`.
  ///
  /// - Parameters:
  ///   - buffer: An `UnsafeBufferPointer` containing the UTF-8 bytes that
  ///     should be shared with the created `Substring`.
  ///   - owner: An optional object that communicates whether any live
  ///     `Substring`s reference `buffer`.
  ///
  public init?(unsafeSharedStorage buffer: UnsafeBufferPointer<UInt8>, owner: AnyObject?)
}

(and similar for Array/ManagedBuffer, but let's focus on one for now)
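
The proposed initializer does not exist, so it cannot be called here, but the owner conditions in the doc comment can be observed with ordinary ARC. A minimal sketch, assuming a hypothetical `BufferOwner` class that frees the buffer in its deinit:

```swift
// Hypothetical owner: the buffer lives exactly as long as this object.
final class BufferOwner {
  let buffer: UnsafeMutableBufferPointer<UInt8>

  init(copying bytes: [UInt8]) {
    buffer = .allocate(capacity: bytes.count)
    _ = buffer.initialize(from: bytes)
  }

  deinit {
    // Runs only once nothing retains the owner any more, which under the
    // proposal would include every Substring sharing `buffer`.
    buffer.deallocate()
  }
}

var owner: BufferOwner? = BufferOwner(copying: Array("shared".utf8))
weak var observer = owner

let aliveWhileOwned = (observer != nil)
owner = nil                                   // last strong reference goes away
let releasedAfterwards = (observer == nil)    // the "weak reference is nil" condition
```

With the pitched API, each created Substring would hold an extra strong reference to `owner`, so `observer` would stay non-nil until the last Substring was destroyed.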

That doesn't really change anything fundamental, since a pointer to the String's supposedly-immutable storage is effectively being leaked by the API.

I find that statement both alarming and implausible. StringProtocol was introduced with the intention that there would eventually be other models of the protocol; if that never happens, it has failed in its mission. The kind of code that could be relying on the fact that there are no other models is really ugly, and involves as? casting an instance of StringProtocol-constrained generic type to String and Substring. Are you saying that's common?

Thanks for the revised documentation; I'll take a look when I get a moment.

I don't understand at all why you'd say that. What I suggested only affects what a user is allowed to do with the unsafe pointer after the [Sub]String is initialized. You're free to initialize the bytes in that memory any way you choose before [Sub]String.init is invoked.

Invalidating the memory is a mutating operation. For a type like MyURL which uses a ManagedBuffer's tail-allocated storage for code-units, it would mean all copies become invalid:

struct MyURL {
  var storage: ManagedBuffer<Header, UInt8>

  var serialized: String {
    return String(allowingLongTermStorageOf:
      Substring(unsafeSharedStorage: storage,
                              range: 0..<storage.header.count)!
    )
  }
}

let urlA = MyURL(...)
let urlB = urlA
print(urlB.serialized)
// urlA now points to invalidated memory.

Working around this requires re-introducing the very copies that shared strings are meant to avoid.

This is very different from the C buffer use-case: in this scenario, we actually do want to share the buffer and to reclaim uniqueness once the sharing is over.

Perhaps not common, but anybody spending much time doing low-level text processing with StringProtocol is going to be quick to notice the lack of methods such as withUTF8 (it nearly made it, but was left out to save witness table entries). I've seen force-downcasting used to implement that particular method in maybe 3 or 4 entirely unrelated projects (not to mention this rather juicy-looking patch for the standard library). Considering the kind of code we're talking about and how rarely I go digging around in the bowels of other projects, I consider that to be a relatively high occurrence.

EDIT: Elaborating on that patch because it's quite interesting. Had it been accepted, and because the function with the withUTF8 hack is inlinable, AFAIK we would never be able to add another conformance to StringProtocol. That forced downcast to Substring might have been inlined in to apps and libraries, which would trap when trying to parse integers from the new StringProtocols. That patch was reviewed by multiple, expert-level Swift developers who have worked extensively on the standard library, and not one of them caught it. Given that, how many errors of this type might exist in 3rd-party libraries? Might any of Apple's stable OS libraries be making similar assumptions?
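
To show the shape of the problem, here is a sketch of the downcasting pattern next to one that does not depend on the set of conforming types. Both method names are invented for this example, and neither is the code from the patch mentioned above:

```swift
extension StringProtocol {
  /// The fragile pattern: assumes String and Substring are the only
  /// conforming types.
  func withUTF8ByDowncasting<R>(
    _ body: (UnsafeBufferPointer<UInt8>) throws -> R
  ) rethrows -> R {
    if var string = self as? String {
      return try string.withUTF8(body)
    }
    var substring = self as! Substring   // traps for any other conformer
    return try substring.withUTF8(body)
  }

  /// A sketch that works for any conformer: use contiguous storage when
  /// the UTF-8 view offers it, otherwise fall back to a temporary copy.
  func withUTF8ByCopyingIfNeeded<R>(
    _ body: (UnsafeBufferPointer<UInt8>) throws -> R
  ) rethrows -> R {
    if let result = try utf8.withContiguousStorageIfAvailable(body) {
      return result
    }
    return try Array(utf8).withUnsafeBufferPointer(body)
  }
}

let byteCount = "héllo".withUTF8ByCopyingIfNeeded { $0.count }
let tailByteCount = "héllo".dropFirst().withUTF8ByDowncasting { $0.count }
```

The second version trades a possible copy for not trapping, which is the opposite trade-off from the inlinable hack described above.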

Perhaps it was a mistake for StringProtocol's documentation to say:

Only the String and Substring types in the standard library are valid conforming types.

I suspect that lots of other code is probably interpreting that as a guarantee that new conformers will never exist.

I know what you're getting at, but technically, as I tried to point out earlier, I'm pretty sure it is not.

For a type like MyURL which uses a ManagedBuffer's tail-allocated storage for code-units, it would mean all copies become invalid.

True.

Working around this requires re-introducing the very copies that shared strings are meant to avoid.

Depends on your use-case, which is why I said I didn't know if that restriction would be acceptable to you… but yeah, I see why it's a problem for the use you show here. Dealing with this properly may require introducing the missing parts of ownership into the type system.

Perhaps not common, but anybody spending much time doing low-level text processing with StringProtocol is going to be quick to notice the lack of methods such as withUTF8 (it nearly made it, but was left out to save witness table entries).

*@dabrahams reads from The Big Book of Bad Words, © 1981 Profanity & Sons*

I've seen force-downcasting used to implement that particular method in maybe 3 or 4 entirely unrelated projects (not to mention this rather juicy-looking patch for the standard library). Considering the kind of code we're talking about and how rarely I go digging around in the bowels of other projects, I consider that to be a relatively high occurrence.

*@dabrahams is forced to agree*

I think I'd probably choose to break code that makes that assumption, but I understand that it's a judgement call.

My knowledge of things like move-only types isn't up to what it should be, so please correct any false assumptions I have here, but for the use case I'm primarily interested in (Swift calling a C API that transfers ownership of a const char * that I want to put into a String with an owner who frees it when it goes out of scope), would it be correct to say that we need a way to express that:

  • the C API returns a unique/move-only pointer to the buffer
  • the String initializer takes a unique/move-only pointer to the buffer

...and that would alleviate the concerns of poking a hole in the safety model where the pointer could leak through?

In that situation the owner: AnyObject model falls apart, because as currently pitched the owner is a caller-provided object that also retains a reference to the buffer so that it can free it. But since the ownership manifesto makes reference to move-only types being able to provide a deinit, a separate owner wouldn't be necessary in that case; we'd just need to provide a way for the unique pointer to know what it should call to deallocate the memory it points to.
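
As a rough sketch of the "unique pointer with a deinit" idea, the following assumes Swift 5.9 or later, where noncopyable types (`~Copyable`) became available; that feature did not exist when this thread was written. `OwnedUTF8Buffer` is a hypothetical type, and because String has no initializer that adopts such a buffer, the sketch still copies when producing a String:

```swift
// Hypothetical move-only owner of a UTF-8 buffer (requires Swift 5.9+).
struct OwnedUTF8Buffer: ~Copyable {
  private let bytes: UnsafeMutableBufferPointer<UInt8>

  // Stands in for a C API that hands over ownership of a byte buffer.
  init(copying text: String) {
    let utf8 = Array(text.utf8)
    bytes = .allocate(capacity: utf8.count)
    _ = bytes.initialize(from: utf8)
  }

  // The unique value knows how to free its memory; no separate owner object.
  deinit {
    bytes.deallocate()
  }

  // Still copies: String cannot adopt the buffer today.
  borrowing func makeString() -> String {
    String(decoding: bytes, as: UTF8.self)
  }
}

func roundTrip(_ text: String) -> String {
  let owned = OwnedUTF8Buffer(copying: text)
  return owned.makeString()
  // `owned` is destroyed here and its deinit frees the buffer exactly once.
}

let roundTripped = roundTrip("from C")
```

Because the value cannot be copied, there is exactly one place where the buffer can be freed, which is the property the two bullets above are asking for.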
