[Pitch] OutputSpan

How does that differ from this method in the proposal?

extension Array {
  public mutating func append<E: Error>(
    addingCapacity: Int,
    initializingWith initializer: (inout OutputSpan<Element>) throws(E) -> Void
  ) throws(E)
}

This is a really interesting direction. I'm curious if you have measured what sort of runtime performance overhead do you end up seeing from this approach?

I think we probably should make some API concessions to out-of-order initialization without forcing users to go all the way to UnsafeMutableBufferPointer and lose all safety. If we can make something like David's suggestion efficient enough, that would be very appealing. If not, I could also live with a combination of:

  1. an insert of a BitwiseCopyable type at arbitrary offset (including into the uninitialized region) that is bounds-checked but does not update count (this is a safe operation)
  2. an unsafe API that lets the user set count.

Obviously, the fully-safe API would be preferred, but this would be quite usable for most out-of-order initialization scenarios, and doesn't depend on any complex lifetime wrangling.

a small terminological question, as i don't immediately grasp why the proposed type name uses the word 'output'. what is 'output' intended to convey here?

3 Likes

This type is primarily a data sink, so it's an output.
(Other ideas are welcome, of course.)

1 Like

Waving my hands slightly, the three Span flavors (Span, MutableSpan, OutputSpan) correspond to in-, inout-, and out-parameters, respectively.

A Span can be read from at any valid index, but not written to.

A MutableSpan can be read from or written to at any valid index.

Both of those types maintain the invariant that all elements are always initialized, which means that there's no way to pass an "empty" one to a function or closure and say "write your output into this uninitialized region." OutputSpan's role is to fill that and related niches, hence "output". It happens to also be useful for some other patterns, however.

It could have a more "what it is" name, like "PartiallyInitializedSpan" or whatever, but in this case I think a "what it's for" name is probably more appropriate.

6 Likes

It seems to me that the purpose of this new type is to initialize a span of values, it will often be used in initializers, and its primary operation is to initialize individual elements.

Seeing as MutableSpan is named for what you can do with it, perhaps we should follow that pattern and name the new type something like InitializableSpan.

2 Likes

It doesn't, I just missed it!

1 Like

I have played around with the prototype a bit and noticed that the appendBytes(_:) method is missing to append a single BitwiseCopyable value. However, there is append(repeat:count:) which I ended up using. It feels a bit weird that the single element append has the word Bytes in the method name while all other methods don't have that, even though they all just append bytes. I think we should drop the Bytes suffix for the single element append or do we run into any ambiguity?

1 Like

I don't expect an ambiguity. The "Bytes" suffix came from UnsafeMutableRawPointer.storeBytes(), and it would be better to be consistent either with or without the suffix.

2 Likes

I haven't measured the performance but it should be minimal as it just checks a Bool if it has been initialized.

The generalized version that supports multiple elements, properly handles errors and removes would look something like this:

extension OutputRawSpan {
  /// Reserves enough storage to append `count` number of `Data` elements to `self`.
  /// `body` is called with two output spans. The fist, `insertionPromise` is required to be fully initialized
  /// before `body` returns unless the closure throws an error.
  /// The second, `output`, can be used to append further elements up to it's capacity.
  /// ``OutputSpan`` in `insertionPromise` can be initialized at any point during the execution of
  /// `body` as long as it fully initialized at the end.
  /// If `body`throws an error  `self` is effectively reset to the state before `delayedAppend` was called and the error is rethrown.
  @lifetime(self: copy self)
  public mutating func delayedAppend<Data: BitwiseCopyable, Result: ~Copyable, Error>(
    for dataType: Data.Type = Data.self,
    count: Int,
    _ body: (
      _ insertionPromise: inout OutputSpan<Data>,
      _ output: inout OutputRawSpan
    ) throws(Error) -> Result
  ) throws(Error) -> Result
}

and could be used like:

let output = OutputRawSpan(...)
output.delayedAppend(for: Int.self, count: 2) { insertionPromise, output in
    // already append elements after the promised elements
    output.append(...)
    // initialize promised elements
    insertionPromise.append(...)
}

The implementation simply checks at the end that that insertionPromise is fully initialized which shouldn't add much overhead.

We could also add convinces if only a single element needs to be added.

extension OutputRawSpawn {
  public struct InsertionPromise<Data>: ~Copyable, ~Escapable {
    @lifetime(self: copy self)
    public mutating func initialize(to element: Data) 
  }
    
  @lifetime(self: copy self)
  public mutating func delayedAppend<Data: BitwiseCopyable, Result: ~Copyable, Error>(
    for dataType: Data.Type = Data.self,
    _ body: (
      _ insertionPromise: inout InsertionPromise<Data>,
      _ output: inout Self
    ) throws(Error) -> Result
  ) throws(Error) -> Result 
}

Now that I have typed out InsertionPromise it looks very much like an approximation of an out parameter.

Full implementation for the curious ones.
extension OutputRawSpawn {
  @lifetime(self: copy self)
  public mutating func delayedAppend<Data: BitwiseCopyable, Result: ~Copyable, Error>(
    for dataType: Data.Type = Data.self,
    count: Int,
    _ body: (
      _ insertionPromise: inout OutputSpan<Data>,
      _ output: inout OutputRawSpan
    ) throws(Error) -> Result
  ) throws(Error) -> Result {
    let prefixSize = count * MemoryLayout<Data>.stride
    precondition(self.freeCapacity >= prefixSize, "OutputRawSpan capacity overflow")
    
    return try self.withUnsafeMutableBytes { (buffer, initializedCount) throws(Error) -> Result  in
      let insertionOffset = initializedCount
      let tailOffset = insertionOffset + prefixSize
      let insertionPoint = buffer[insertionOffset..<tailOffset].assumingMemoryBound(to: Data.self)
      var insertionPromise = OutputSpan(buffer: insertionPoint, initializedCount: 0)
      let tailBuffer = buffer[tailOffset...]
      // rebasing self so that no element before this point can be removed
      // as memory is currently not guaranteed to be continuously initialized
      var rebasedOutput = OutputRawSpan(buffer: tailBuffer, initializedCount: 0)
      
      let result = try body(&insertionPromise, &rebasedOutput)
      precondition(insertionPromise.capacity == insertionPromise.count, "delayed append did not initialize all promised elements")
      initializedCount &+= insertionPromise.finalize(for: insertionPoint)
      initializedCount &+= rebasedOutput.finalize(for: tailBuffer)
      
      return result
    }
  }
}

extension OutputRawSpawn {
  public struct InsertionPromise<Data>: ~Copyable, ~Escapable {
    var insertionPoint: UnsafeMutablePointer<Data>
    var initialized: Bool = false
        
    @lifetime(self: copy self)
    public mutating func initialize(to element: Data) {
      if initialized {
        // Should we allow double initialization?
        insertionPoint.pointee = element
      } else {
        insertionPoint.initialize(to: element)
        initialized = true
      }
    }
        
    consuming func finalize(for pointer: UnsafeMutablePointer<Data>) -> Int {
      guard pointer == insertionPoint else {
        preconditionFailure("InsertionPromise identity mismatch")
      }
      precondition(
        initialized,
        "forgot to fulfill insertion promise"
      )
      
      return 1
    }
  }
    
  @lifetime(self: copy self)
  public mutating func delayedAppend<Data: BitwiseCopyable, Result: ~Copyable, Error>(
    for dataType: Data.Type = Data.self,
    _ body: (
      _ insertionPromise: inout InsertionPromise<Data>,
      _ output: inout Self
    ) throws(Error) -> Result
  ) throws(Error) -> Result {
    precondition(self.freeCapacity >= 1, "OutputRawSpan capacity overflow")
    
    return try self.withUnsafeMutableBytes { (buffer, initializedCount) throws(Error) -> Result  in
      let insertionPoint = buffer.baseAddress.unsafelyUnwrapped.advanced(by: initializedCount).assumingMemoryBound(to: Data.self)
      var insertionPromise = InsertionPromise(insertionPoint: insertionPoint)
      let tailBuffer = buffer[(initializedCount + MemoryLayout<Data>.size)...]
      // rebasing self so that no element before this point can be removed
      // as memory is currently not guaranteed to be continuously initialized
      var rebasedOutput = OutputRawSpan(buffer: tailBuffer, initializedCount: 0)
      
      let result = try body(&insertionPromise, &rebasedOutput)
      initializedCount &+= insertionPromise.finalize(for: insertionPoint)
      initializedCount &+= rebasedOutput.finalize(for: tailBuffer)
      
      return result
    }
  }
}
2 Likes

If we're using closures, I don't think that we need InsertionPromise:

extension OutputRawSpan {
	/// Calls the `action` closure with this `OutputRawSpan` split in two:
	/// a fixed-size prefix that must be fully initialized by the time `action`
	/// returns, and a suffix that has no particular initialization 
	/// requirements.
	///
	/// You can use this method to initialize the contents of OutputRawSpan
	/// with a header structure followed by a tail array, when you first need to
	/// produce the contents of the array and then initialize the header
	/// accordingly.
	@lifetime(self: copy self)
	public mutating func reserving<E, T: ~Copyable>(
		byteCount: Int,
		_ action: (
			reserved: inout OutputRawSpan,
			rest: inout OutputRawSpan
		) throws(E) -> T
	) throws(E) -> T
	
	@lifetime(self: copy self)
	public mutating func reserving<E, T: ~Copyable, RerserveFor: BitwiseCopyable>(
		for type: ReserveFor.Type,
		count: Int = 1,
		_ action: (
			reserved: inout OutputSpan<ReserveFor>,
			rest: inout OutputRawSpan
		) throws(E) -> T
	) throws(E) -> T
}

OTOH, a non-escapable type that consumes the source OutputRawSpan could do away with the closures, which makes this easier to use in async contexts:

struct InsertionPromise: ~Escapable, ~Copyable {
	var prefix: OutputRawSpan
	var suffix: OutputRawSpan
	
	// (requires `prefix` to have been fully initialized, etc)
	@lifetime(copy self)
	consuming func finalize() -> OutputRawSpan
}

extension OutputRawSpan {
	@lifetime(copy self)
	consuming func reserve(byteCount: Int) -> InsertionPromise
}

(Symmetric APIs for typed prefixes left as an exercise to the reader.)

This design leverages the fact that getting an inout OutputRawSpan in a closure requires it to contain something on exit; if you consume it to create an InsertionPromise, you eventually have to consume the InsertionPromise to put it back in the inout OutputRawSpan.

3 Likes

That is quite nice, especially that it composes with async code!

One key advantage of the closure is that it can ignore the finalization depending on if an error is thrown or not. If an error is thrown the closure can just rollback everything and not validate that everything is initialized. It seems without a closure we at least need another function next to finalize(), something like func reset() -> OutputRawSpawn that just returns the original state and doesn't validate that the prefix was fully initialized.

The design of the rest of the APIs currently don't really compose with async either as all of the safe APIs only expose Output[Raw]Span in a closure that is sync with the exception of the empty and immortal initializer. Do we have an idea how we will make this work with async code as well?

1 Like

Yeah, we would need commit/rollback instead of just finalize to support that.

The proposal doesn't have any way to create an OutputSpan in an asynchronous context, but we've said in other contexts that closure-based functions are unwieldy with effects and we would prefer to have fewer of them.

If people eventually find that closures are getting restrictive, we could offer a pattern like this too:

struct ArrayBuilder<T>: ~Copyable {
	init(maximumCapacity: Int)
	var contents: OutputSpan<T>
	consuming func finalize() -> [T]
}

That doesn't play nice with @lifetime(borrow pointer) on the UMBP OutputSpan initializer since ArrayBuilder would have to hold that pointer (but I've expressed reservations before on lifetime limits tied to pointers since they're copiable).

2 Likes

The proposal doesn't explain why UnicodeScalarView is extended with the same APIs as String.

Instead of the append methods (on Array and String), could there be withOutputSpan(_:) methods (on Array and String.UTF8View) which can modify the initialized prefix and uninitialized suffix? For example:

var text = "\u{1F601}"
print(text)
// "😁"
// (U+1F601: GRINNING FACE WITH SMILING EYES)

// Use the existing API on `String`.
text.reserveCapacity(5)

// Use the proposed API on `String.UTF8View`.
text.utf8.withOutputSpan {
  // Replace the last code unit.
  $0.removeLast() //-> 0x81
  $0.append(0xB8)

  // Append a truncated scalar.
  $0.append(0xC2)
}

print(text)
// "😸�"
// (U+1F638: GRINNING *CAT* FACE WITH SMILING EYES)
// (U+FFFD: REPLACEMENT CHARACTER)

The labels of the first parameters could include "repairing" and "validating". For example:

extension String {
  public init<E: Error>(
    repairingUTF8WithCapacity: Int,
    _ body: (inout OutputSpan<UTF8.CodeUnit>) throws(E) -> Void
  ) throws(E)

  public init<E: Error>?(
    validatingUTF8WithCapacity: Int,
    _ body: (inout OutputSpan<UTF8.CodeUnit>) throws(E) -> Void
  ) throws(E)
}

(I'm assuming that the first initializer is also supposed to rethrow an error.)


Could there be safer replacements of the withUnsafeTemporaryAllocation functions?

public func withTemporaryAllocation<R: ~Copyable, T: ~Copyable>(
  of type: T.Type,
  capacity: Int,
  _ body: (inout OutputSpan<T>) throws -> R
) rethrows -> R
public func withTemporaryAllocation<R: ~Copyable>(
  byteCount: Int,
  alignment: Int,
  _ body: (inout OutputRawSpan) throws -> R
) rethrows -> R

(Apparently, stack allocations aren't compatible with typed throws.)

Simply because UnicodeScalarView should have as much as possible the same API as String does. The main difference between the two is how the algorithms operate: as a collection of Character or as a collection of UnicodeScalar. Text that is intended primarily to be parsed by a computer should usually be processed according to the simpler set of algorithms; this has been a win in the rewrite of URL, for example.

This could possibly be in addition to append. I discussed this with Karoy, and the least bad idea to name this operation was edit()… but that's not great either.

These slipped my mind. Also a good use for OutputSpan!

2 Likes

I'm personally happy with either, but if we have withOutputSpan only, it needs to be able to request a minimum amount of additional capacity at the same time.

1 Like

Could the existing reserveCapacity(_:) method be used instead? (Although the documentation warns that it can prevent "amortized constant-time performance".)

Would it be simpler to always refer to the (total) capacity, rather than the free/available/additional capacity?

  • OutputSpan has capacity and freeCapacity properties.
  • OutputRawSpan has capacity and available properties.
  • Array.append has an addingCapacity argument label.

It seems strange for UnicodeScalarView to have APIs that use UTF-8 code units. But if you want the same APIs, should UnicodeScalarView have span and/or utf8Span properties?


Should ArraySlice, ContiguousArray, and CollectionOfOne also be extended?

This is a copy-editing mistake, they should have the same name (and do in the prototype.) As for addingCapacity, we could also call it reservingCapacity or something like that.

UnicodeScalarView should have utf8Span; it can already vend a span of UTF8 code units via its utf8: UTF8View property.

ContiguousArray has the same extensions as Array in the prototype; I did not do so for ArraySlice, as it did not have the unsafeUninitializedCapacity initializer before; adding the append method definitely makes sense though. CollectionOfOne could get an initializer with the same restrictions as InlineArray<1, T>, but should it?

AFAIK, we have all the tools we need today to remove or replace trailing objects from an array efficiently, so I don't think those will be the main reasons people call withOutputSpan. The other reason they could call it for is to add new elements. If adding new elements is going to be the main reason people do this, it's more likely that you'll get it right on the first try if there's a capacity argument to the method. If you really did want to remove elements from the end, you can still request additional capacity of 0.

3 Likes

Would it be a good idea for OutputSpan to store a mutable pointer to its count instead of storing the count inline? Right now, if one passes an OutputSpan to a function, the only way to get back the count is for the function to take it as inout. The function then might unexpectedly replace the OutputSpan entirely. In contrast, MutableSpan can be passed as consuming and still work.

2 Likes