SE-0485: OutputSpan: delegate initialization of contiguous memory

Douglas_Gregor · May 21, 2025, 7:39am

Hi everyone,

The review of SE-0485 "OutputSpan: delegate initialization of contiguous memory" begins now and runs through June 3, 2025.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to the review manager via the forum messaging feature. When contacting the review manager directly, please keep the proposal link at the top of the message.

Try it out

Development toolchains that include this feature are available:

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer in your review:

What is your evaluation of the proposal?
Is the problem being addressed significant enough to warrant a change to Swift?
Does this proposal fit well with the feel and direction of Swift?
If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available at https://github.com/swiftlang/swift-evolution/blob/main/process.md .

Thanks for contributing to Swift!

Doug Gregor
Review Manager

Nobody1707 · May 21, 2025, 1:46pm

+1 This seems perfectly reasonable to me.

xwu · May 21, 2025, 2:31pm

I have some specific questions or clarifications and accompanying suggestions—apologies if they have been addressed in pitch as my memory isn't what it used to be:

  /// Create an OutputSpan with zero capacity
  @lifetime(immortal)
  public init()

I'm not seeing any APIs to resize the capacity. If so, is this initializer a useful operation?

Could it become an attractive nuisance since users may end up with an output span that they can't really manipulate? If so, should we be offering a parallel API init(capacity:) so users can just as ergonomically start with a totally uninitialized output span with nonzero capacity?

Initializing an OutputSpan from a Sequence must use every element of the source. Initializing an OutputSpan from IteratorProtocol will copy as many items as possible, either until the input is empty or the OutputSpan 's available storage is zero.

I agree that it does make sense to have an API for those cases where the number of elements in the source isn't known a priori. But, because understanding the preconditions is essential for correct use, I worry that a stricter precondition for Sequence than Iterator is subtle without some tweaks to the shape of the API.

Should perhaps the APIs with more lenient preconditions be named something like appendUntilFilled(from:)? Should we offer such an API for not just iterators but also sequences?

On the topic of understanding preconditions being essential for correct use, the proposal doesn't specify whether the following APIs on OutputRawSpan have the "Sequence-like" preconditions or "Iterator-like" preconditions described above:

extension OutputRawSpan {
  /// Initialize the span's bytes with every byte of the source.
  @lifetime(self: copy self)
  public mutating func append<T: BitwiseCopyable(
    contentsOf source: consuming some Sequence<T>, as type: T.Type
  )

  /// Initialize the span's bytes with every byte of the source.
  @lifetime(self: copy self)
  public mutating func append<T: BitwiseCopyable(
    contentsOf source: Span<T>, as type: T.Type
  )

  /// Initialize the span's bytes with every byte of the source.
  @lifetime(self: copy self)
  public mutating func append(
    contentsOf source: RawSpan
  )

  /// Initialize the span's bytes with every byte of the source.
  @lifetime(self: copy self)
  public mutating func append(
    contentsOf source: UnsafeRawBufferPointer
  )
  // ...
}

That is, is it a precondition failure if the output raw span's suffix doesn't have enough capacity to be filled with the entire contents of the respective sequence, span, raw span, or unsafe raw buffer pointer; or, will the output raw span be appended to until filled?

Instead of the "ideal" solution above, this proposal includes append() functions in the following form:

extension OutputSpan {
  mutating func append(contentsOf: consuming some Sequence<Element>)
  mutating func append(copying: borrowing some Container<Element>)
}
extension OutputSpan where Element: ~Copyable {
  mutating func append(consuming: consuming some ConsumableContainer<Element>)
  mutating func append(moving: inout some RangeReplaceableContainer<Element>)
}

I have a quibble about the last of these four APIs (and the corresponding "actual" APIs in the proposal that use the same naming scheme):

One of the things I think we do right in the Swift collection APIs is to use fromContentsOf: in order to distinguish doing something with the container as a unit when it's passed as the argument or doing something with the elements or contents of the container.

Here, the last API—append(moving:)—reads as though it is moving the container, but as you describe in the text, it's about moving the elements inside the container.

I think, therefore, that append(movingElementsOf:) would be the correct naming here. It seems pedantic but it preserves the distinction that we have done well to date to preserve. (In this particular case, I additionally prefer "elements of" over "contents of" even though sometimes we use these interchangeably, because "elements" correctly connotes some-but-not-necessarily-all elements while "contents" can be read as referring to all the elements.)

fclout · May 21, 2025, 4:03pm

I'm +1 overall on the proposal. I think OutputSpan and RawOutputSpan solve meaningful problems. However, I'd like to call out the following points, which IMO aren't super visible from the proposal:

OutputSpan is sort of halfway between value semantics and reference semantics: it depends on existing memory and has a "reference" to it, but the number of initialized values in the referenced storage is squarely a value of the OutputSpan. One of the resulting oddities, if you'll allow C++ nomenclature, is that you can turn a "const OutputSpan" into a mutable one just with a consuming assignment from a let to a var.

A consequence of OutputSpan requiring mutability is that it must typically be passed as an inout value. However, this means OutputSpan could hypothetically also be reinitialized to somebody else's underlying storage, in which case you end up in a situation that doesn't really make sense anymore. This is intended to be caught with finalize, which requires you to pass back in the original storage pointer to confirm you have the right OutputSpan.

At the same time, making OutputSpan a reference type is not the obviously correct thing to do either, because that would allow you to alias the storage. With aliasing storage, span and mutableSpan need a runtime check to verify you have exclusive access. (It's also not a given that @lifetime currently works on reference types!)

If I understood correctly this interaction, when you don't call finalize() on OutputSpan, on deinit, it will deinitialize all elements that were added. However, it won't revert any other changes, including modifications to existing elements or removed elements. (Also, how does it do that? The detailed design doesn't seem to specify that it remembers the original count.)

glessard · May 21, 2025, 8:14pm

That's not correct. If you don't call finalize(), then it will deinitialize all elements in the memory the OutputSpan references. And that could be more elements than just was added.

Note that this function exists primarily to catch programmer errors, whether these are from the implementor of the container (author of a closure-taking function) or from the user of the container (who might have written a flawed closure.)

The need to defend against buffer replacement has been a thing since Array.withUnsafeMutableBufferPointer() was originally added. It would be useful to have a type of binding that supports calling mutating functions of the value, but doesn't support replacing a value. Swift's mutation model seems at odds with that idea, however.

benrimmington · May 23, 2025, 1:21pm

If a public init() is useful, should it also be added to Span and MutableSpan?

glessard · May 23, 2025, 1:53pm

There is some backfilling to be done with Span and MutableSpan, now that we have a story for proposing functions with non-escapable returns. Initializers are a big gap, as are the extracting methods on Span & RawSpan.

lorentey · May 28, 2025, 12:11am

extension OutputSpan where Element: ~Copyable {
  /// Repeatedly append an element to this `OutputSpan`.
  @lifetime(self: copy self)
  public mutating func append(repeating repeatedValue: Element, count: Int)
}

This variant of append needs to require Element to be copyable.

extension OutputSpan where Element: ~Copyable {
  mutating func append(consuming: consuming some ConsumableContainer<Element>)
  mutating func append(movingElementsOf: inout some RangeReplaceableContainer<Element>)
}
extension OutputSpan /* where Element: Copyable */ {
  mutating func append(contentsOf: consuming some Sequence<Element>)
  mutating func append(copyingElementsOf: borrowing some Container<Element>)
}

These are workable names! At the same time, I think adopting them would reduce the overall readability of Swift code, and it would reinforce our reputation for unnecessary verbosity. So please let me push back on this a little.

Here, the last API—append(moving:) —reads as though it is moving the container, but as you describe in the text, it's about moving the elements inside the container.

Is this going to be a real source of confusion, or an imaginary one? What would it mean to append to an output span by "moving a container"? What would it mean to insert into a ring buffer by "copying" an array? Can it mean anything else other than moving/copying their elements?

If it cannot mean anything other than that, then why would we want to keep highlighting this distinction every time we call these?

These functions (and especially their counterparts on the container protocols) will be invoked relatively often; for a certain audience, they will be as prominent as append(contentsOf:) is. I think it makes sense for these names to be smoothed down, free of unnecessary ceremony.

Nobody1707 · May 28, 2025, 8:11pm

If we do need to support that distinction, append(movingFrom:) seems like a better name than append(movingElementsOf:),

glessard · May 28, 2025, 11:22pm

If we have an Optional<OutputSpan<T>>, for example, we may need to materialize an empty OutputSpan<T> for the .none case. We end up needing this for many variants of non-owning types, and accordingly expect that when we propose initializers for Span and MutableSpan, empty initializers will be part of them as well.

Resizing is an operation performed by the type owning the memory; we've come across this issue with Kevin Perry's serialization API work (The future of serialization & deserialization APIs), and are working on a pattern to handle it.

I'm open to suggestions for the iterator-taking function; it may be that the append(from:) I suggested may be too close to "the good name." I would be opposed to a lenient form of the Sequence-taking function because this forces us to return an IteratorProtocol instance as part of a tuple, and if you have to handle an iterator it is simply nicer and more intentional to pass an inout iterator instance to the function; the provenance is clearer.

I didn't include the note in the doc-comments, but the introductory paragraph above that block says: "Initializing an OutputRawSpan from a Sequence or other container type must use every element of the source." I'll update the doc-comments for clarity.

I rearranged code without a compiler, clearly. Fixed here.

xwu · May 29, 2025, 2:45am

The real, not imaginary confusion that I actually had, and which I have again after a few days of not looking at the proposal, is the answer to the question of what is left of the moved-from container.

That is to say, the method takes the argument inout, so it'll give the container back to me, but in what state is that container and its elements returned to me? A label that makes it clear that some elements will be moved but possibly not all would make sense (or, as I've genuinely forgotten, is it the case that all the elements are going to be moved as a precondition and I'll get back an empty container?).

It's for that reason that I'm not fussed about the naming of append(copying:)—although to preserve a nice symmetry we could do append(movingFrom:) and append(copyingFrom:) as suggested by @Nobody1707—but ultimately there is no similar confusion that's possible with respect to a borrowed container of copyable elements.

AronL · May 30, 2025, 10:52am

Is there a motivation for using different names for the "same" property on OutputSpan and OutputRawSpan?

extension OutputSpan where Element: ~Copyable {
  /// The number of additional elements that can be added to this `OutputSpan`
  public var freeCapacity: Int { get } // capacity - count
}

extension OutputRawSpan {
  /// The nmuber of uninitialized bytes remaining in this `OutputRawSpan`
  public var available: Int { get } // capacity - byteCount
}

To me, it would be more natural to use the same variable name for both, and I think freeCapacity would work well for both available elements and bytes.

glessard · May 30, 2025, 5:08pm

This is an editing error; it's freeCapacity for both. There is a PR with fixes to the proposal here.

Douglas_Gregor · June 12, 2025, 8:44pm

Hey all,

The Language Steering Group has decided to accept this proposal. The acceptance involves some small modifications (see the prior link), the largest of which is deferring the bulk append operations that still have some design and naming issues to sort out. Thanks for all of the feedback here!

Doug