SE-0370: Pointer family initialization improvements and better buffer slices

Some existing and proposed APIs return an (Iterator, Index) tuple.

  • Can tuple labels be added or changed?
    The updated: label doesn't seem correct.

  • Can @discardableResult be added?
    This seems equally useful, whether source is a collection or sequence.


The detailed design might be improved with primary associated types.

  • C: Collection<UInt8>
  • C: Collection<Element>
  • S: Sequence<Element>
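For illustration, the constraints above read naturally in the angle-bracket form enabled by primary associated types (SE-0346). The `checksum` function below is a made-up example, not part of the proposal:

```swift
// Hypothetical example: `some Collection<UInt8>` is sugar for a generic
// parameter constrained to Collection with Element == UInt8.
func checksum(_ bytes: some Collection<UInt8>) -> UInt8 {
  bytes.reduce(0, &+)  // wrapping sum of all bytes
}

print(checksum([0x01, 0x02, 0x03]))  // 6
```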

Adding @discardableResult is fair. I believe it could also be added to the old API.

The tuple labels can't be added to the old API (doing so could break some existing source code), but we could adjust the labels being added. However, I made the APIs added to Slice identical to their counterparts on U[M][R]BP, which means that some of the "new" APIs don't get tuple labels, because the older ones they mimic do not. Is this the right decision?

Thanks for pointing out the primary associated types. This proposal's writing predated those.

Adding tuple labels shouldn't be source-breaking, should it?

(There is one narrow exception I can think of, where the added labels happen to match labels on the left-hand side of a tuple destructuring expression but in a different order, causing an unexpected tuple shuffle -- but that is both unlikely and, if the proper labels are chosen, would only fix existing buggy code.)

I don't think this is a good idea -- the result indicates important failure cases, so it ought to always be checked. (At minimum in a debug assertion, such as assert(unwritten.next() == nil && initialized == target.endIndex).) Marking these functions @discardableResult would encourage ignoring these error cases, leading to potential data loss and/or security issues.
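As a concrete sketch of the pattern being advocated -- checking the result rather than discarding it -- here is a minimal, self-contained example using the existing Sequence-based API (the buffer size and contents are arbitrary):

```swift
let source = [10, 20, 30, 40]
let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 4)
defer { buffer.deallocate() }

// Destructure the returned tuple and check both components instead of
// discarding them: the iterator must be exhausted and the buffer full.
var (unwritten, initialized) = buffer.initialize(from: source)
assert(unwritten.next() == nil && initialized == buffer.endIndex)
print(Array(buffer))  // [10, 20, 30, 40]
```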

Should the Collection-taking overloads also return the index of the first unwritten element in the source?

    func initialize<S>(from source: S) -> (S.Iterator, Index) where S: Sequence, S.Element == Element
+++ func initialize<C>(fromElements: C) -> Index where C: Collection, C.Element == Element

The existing function returns this information using an iterator (and I agree that is not ideal; I've seen significant overheads from copying large iterators), whereas for the new function, I would need to manually calculate the number of elements initialised, then calculate an offset index in the source collection. That's another O(n) operation.

let buffer: UnsafeMutableBufferPointer<X> = ...
let source: some Collection<X> = ...

let bufferIdx = buffer.initialize(fromElements: source)
let numWritten = bufferIdx - buffer.startIndex
let firstUnwritten = source.index(source.startIndex, offsetBy: numWritten)
if firstUnwritten < source.endIndex {
  // We didn't write everything from 'source'.
  assert(bufferIdx == buffer.endIndex, "The buffer must be full")
}

If it returned that information as an index, things would be simpler:

let buffer: UnsafeMutableBufferPointer<X> = ...
let source: some Collection<X> = ...

let (sourceIdx, bufferIdx) = buffer.initialize(fromElements: source)
if sourceIdx < source.endIndex {
  // We didn't write everything from 'source'.
  assert(bufferIdx == buffer.endIndex, "The buffer must be full")
}

I think I had dismissed the idea because of the worst-case tuple shuffle. If that is an acceptable risk, then it would be nice to add the missing labels on the returned tuples of the old API.

I think the Collection-taking overloads should indeed return something to indicate whether there still remain unprocessed items in the source. (Otherwise they too would be inviting data loss bugs.)

It's tricky though -- if they returned a collection index, then calculating that would generally introduce undesirable overhead. Unfortunately, returning an iterator wouldn't be practical, either, and returning a count (or an index into the target collection, like in the proposal) seems way too indirect to be practical.

What if the Collection overloads would simply trap if the source wasn't entirely consumed by the operation? It seems unwise to rely on callers to check this condition, as folks are prone to forget to do so, leading to potential data loss and/or other issues.

Clients who need to initialize a buffer from a partial collection can simply slice the collection before passing it to these methods.
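With the current Sequence-based API, that slicing approach looks like this (a runnable sketch; the source and buffer size are arbitrary):

```swift
let source = Array(0..<10)
let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 4)
defer { buffer.deallocate() }

// Slice the source so it exactly fills the buffer, then verify both
// postconditions: nothing left over, nothing uninitialized.
var (leftover, end) = buffer.initialize(from: source.prefix(buffer.count))
precondition(leftover.next() == nil && end == buffer.endIndex)
print(Array(buffer))  // [0, 1, 2, 3]
```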

For what it's worth, the fact that the additions from this proposal (as well as those from SE-0334 and SE-0349) are all non-ABI means that it is possible to implement identical functionality in extensions outside of the standard library. This is only the case because all the additions are in the form of functions, with no new types. This is not typical of SE proposals! I’d also note that there is more than one way to polyfill, including internal extensions in one's own apps or libraries, in addition to a dedicated package.

This being said, the polyfill question is orthogonal to the question of whether these additions actually belong in the standard library.


Given that the proposal also includes these operations on UMBP slices, it would in fact make sense to go the whole way: let's get rid of the return values altogether, and simply always trap if the source collection does not precisely match the size of the target buffer.

Callers who actually do need to initialize/assign partial buffers would need to slice the source collection or target buffer before calling these methods. This would eliminate a whole class of bugs and force folks to be more deliberate about their unsafe buffer code, which seems very much desirable.


If you partially initialise a buffer, you might not know the precise amount of content each operation will write.

For example, in WebURL, writing the normalised URL happens in 2 passes - one is a dry-run which calculates the expected overall length of the string, then each URL component is written individually. We track the length of each major component, because it will be needed later by the final object (hasn't been needed for writing purposes until now).

Sometimes writing a URL component involves several writes - for example, each path component is written to the buffer individually. We don't track the expected length of each, individual path component; only the overall expected length of the path. Since there can be an arbitrary number of path components, tracking this information would require a dynamically-resizable Array, which is a very significant overhead indeed.

 overall path length is known
 ┌───┴─────┐
/foo/bar/baz
 └┬┘ └┬┘ └┬┘
  p1  p2  p3  - p1, p2, p3 are separate writes; lengths unknown

It would also be expensive to calculate this before each write, as these components can involve lots of lazy processing (e.g. lazily filtering tabs and newlines, lazily adding percent-encoding).

The existing APIs guarantee that they won't write beyond the bounds of the buffer, so even if there's some funny business going on (e.g. the input is a generic Collection provided by the user; their implementation might have a bug and return an incorrect number of elements), we won't overwrite the allocation. Once all components have been written, we check that the entire buffer has been filled, so there are also no uninitialised gaps. Even if the input Collection does have a bug, you'll get garbled content (can't do anything about that) or a precondition failure, not UB. It's really difficult to write generic code that accounts for all of these things (even bugs in generic types), but it is just about doable.

overall path length is known
 ┌───┴─────┐
/foo/bar/baz
 └┬┘ └┬┘ └┬┘
  p1  p2  p3  - p1, p2, p3 are separate writes; lengths unknown

If I understand it correctly, this is an argument for only trapping if the source collection isn't entirely consumed by the operation, allowing the target buffer to have a suffix that is untouched by the operation.

My motivation in recommending trapping if the source collection is too short was that otherwise callers would need to carefully remember to compare the returned index against target.endIndex. This is true, but as your example shows, there is some real value in that:

let source: some Collection<some Collection<Foo>> = ...
let count = source.reduce(into: 0) { $0 += $1.count }

let target = UnsafeMutableBufferPointer<Foo>.allocate(capacity: count)
var remainder = target[...]
for component in source {
  let end = remainder.initialize(fromElements: component) // traps if no space
  remainder = remainder[end...]
}
precondition(remainder.isEmpty, "Underfilled buffer")

This is certainly quite a bit nicer (and much, much easier to get right) than the awkward code we currently have to write:

var start = target.startIndex
for component in source {
  let remainder = UnsafeMutableBufferPointer(rebasing: target[start...])
  var it: Component.Iterator // We can't even spell this with the `some Collection` syntax 😩
  (it, start) = remainder.initialize(from: component)
  precondition(it.next() == nil, "Overflowing buffer")
}
precondition(start == target.endIndex, "Underfilled buffer")

Forgetting to check that it.next() == nil is a very common issue, and so rolling that check directly into initialize(fromElements:) seems very much desirable.
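For reference, here is a self-contained, compilable version of the awkward pattern above, using the pre-SE-0370 Sequence-based API (the data is illustrative):

```swift
let source: [[Int]] = [[1, 2], [3], [4, 5, 6]]
let count = source.reduce(0) { $0 + $1.count }

let target = UnsafeMutableBufferPointer<Int>.allocate(capacity: count)
defer { target.deallocate() }

var start = target.startIndex
for component in source {
  let remainder = UnsafeMutableBufferPointer(rebasing: target[start...])
  var (it, next) = remainder.initialize(from: component)
  precondition(it.next() == nil, "Overflowing buffer")
  start += next  // `next` is an index into the rebased `remainder`
}
precondition(start == target.endIndex, "Underfilled buffer")
print(Array(target))  // [1, 2, 3, 4, 5, 6]
```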

I retract my second suggestion -- initialize(fromElements:) should only trap if the source collection is too large, not when it's too small. (This has the benefit of preserving the proposed API, just requiring a semantic change.)

(Note: While there is a similar argument for allowing copying only a prefix of the source buffer, that seems far less convincing to me -- primarily because (as we discussed) allowing that would require the method to return an index, which would complicate both the implementation and (more importantly) the API contract. Developers with such use cases will need to either fall back to the sequence-based APIs (which remain available), or properly slice their input collections.)

Forgetting to check that the returned value is the endIndex of the target buffer is still going to be an issue, but as long as the function isn't marked @discardableResult, at least there will be a warning if someone forgets to check it. (It helps that it's a simple value in this case, rather than a scary tuple.)

In the slice case, there is a chance for confusing endIndex with count -- as in doing precondition(result == target.count) rather than precondition(result == target.endIndex) -- which might be worth calling out in the docs.
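The distinction matters because a buffer slice shares its parent's indices, as this small demonstration shows:

```swift
let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 8)
defer { buffer.deallocate() }
buffer.initialize(repeating: 0)

// A slice keeps the parent buffer's indices, so its endIndex is not
// its count: checking `result == slice.count` would be a bug here.
let slice = buffer[4...]
print(slice.count)       // 4
print(slice.startIndex)  // 4
print(slice.endIndex)    // 8
```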

Thanks for the feedback so far. I’d like to use some of the feedback to make changes to the proposal.

First, I’d like to use @xwu’s suggestion of fromContentsOf: as a replacement for my proposed fromElements: argument label for the copy-from-Collection functions. It's much better.

Second, as noticed by @benrimmington, one returned tuple had the wrong label. As @xwu noted, adding labels to an existing returned tuple has an extremely small chance of being actually source breaking. Given that, I’d like to standardize the labels of all the tuple returning functions to (unwritten: Iterator, index: Index). (The two cases that return the tuple (unwritten: Iterator, initialized: UnsafeMutableBufferPointer) do not change.)

Finally and most interestingly, Karoy’s arguments are convincing me that we can make an improvement to those copy-from-Collection functions by requiring the receiving buffer to have enough space to receive every element from the source. My initial inclination was to permissively defer entirely to the user, but that does make the APIs more complicated to use in many cases. This was informed by experience with the Sequence-based API we’ve had to use before, where a requirement is stated but is spottily enforced, resulting in capricious behavior in practice, where one had to double-check the result regardless. These new APIs, being Collection-based, can straightforwardly require that buffer.count be at least as large as source.count. This won’t change the function signatures, but will change the documentation. Note also that if a user needs the behaviour initially proposed, it can easily be composed from available information: let index = buffer.initialize(fromElements: source.prefix(buffer.count)).


Actually, I think your original suggestion was good, if it only affects new APIs. I'd prefer the following to return Void, and trap if the source and target have different counts. Otherwise, it isn't clear whether the returned index belongs to the source or the target.

  • initialize(fromContentsOf:)
  • update(fromContentsOf:)
  • moveInitialize(fromContentsOf:)
  • moveUpdate(fromContentsOf:)

It seems easy to initialize or update only the target's prefix (or suffix, etc.)

target.prefix(source.count).update(fromContentsOf: source)

Note that other, pre-existing functions such as UnsafeMutableRawBufferPointer.copyMemory() and UnsafeMutableRawBufferPointer.copyBytes() already have the source.count <= destination.count precondition we are proposing here.
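For example, with the existing raw-buffer API, a destination larger than the source is fine, while a source larger than the destination traps (a runnable sketch):

```swift
let src: [UInt8] = [1, 2, 3, 4]
let dst = UnsafeMutableRawBufferPointer.allocate(byteCount: 8, alignment: 1)
defer { dst.deallocate() }

// copyBytes(from:) already requires source.count <= destination.count.
dst.copyBytes(from: src)
print(Array(dst.prefix(src.count)))  // [1, 2, 3, 4]
```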


For clarification, @glessard: will the implementation actually eagerly check this precondition (with potential performance implications if count isn’t O(1)), or will it trigger a precondition failure when the destination buffer is filled and there are still more elements left in the source collection?

A goal of these functions is to be able to take advantage of a contiguous-storage fast path, where count can be evaluated eagerly and the copy lowers to memcpy (no iteration). The fallback is an element-by-element copy, where there is probably no reason to check source.count before starting.
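A rough sketch of that two-path shape (my own illustration, not the actual stdlib implementation; `copyIn` is a made-up name):

```swift
func copyIn<C: Collection>(
  _ source: C, to target: UnsafeMutableBufferPointer<C.Element>
) -> Int {
  // Fast path: contiguous storage lets us check the count eagerly and
  // perform a bulk copy (which can lower to memcpy for trivial types).
  if let copied = source.withContiguousStorageIfAvailable({ src -> Int in
    precondition(src.count <= target.count, "source too large")
    _ = target.initialize(from: src)
    return src.count
  }) {
    return copied
  }
  // Fallback: element-by-element copy, checking bounds as we go rather
  // than paying for a potentially non-O(1) `count` up front.
  var i = 0
  for element in source {
    precondition(i < target.count, "source too large")
    target.baseAddress!.advanced(by: i).initialize(to: element)
    i += 1
  }
  return i
}

let target = UnsafeMutableBufferPointer<Int>.allocate(capacity: 4)
defer { target.deallocate() }
print(copyIn([7, 8, 9], to: target))  // 3
```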


(Note on potential future work: as an extension to the contiguous-collection case, these functions could also make good use of a future stdlib facility to directly iterate over the contiguous storage chunks of a piecewise-contiguous collection (such as deques, B-trees, or HAMTs). I'm hoping we'll eventually introduce public API to replace the (limited and annoyingly over-constrained) Sequence._copyContents method with something along those lines. Adopting such a facility would not require any changes in the proposed APIs, just their implementation.)


Guillaume has revised the proposal in accordance with his post earlier in the thread; you can see the exact changes in the PR. These changes are now live in this review; we are looking for feedback on the revised proposal.

The revisions include some changes that introduce some risk of source incompatibility (introducing tuple labels on some existing result types). The Language Workgroup is tentatively comfortable with the changes proposed here because we see the risk as very low, but we'll make sure that adequate investigation is done. I don't think we need feedback on this point unless someone has concrete evidence of problems they're seeing on real code; but if you do have such evidence, please feel free to raise it with us, even after the review period.

This proposal review is open for another five days or so; I think that should be sufficient for the community to provide feedback on the revised proposal, especially since the revisions were already raised in the thread (and in fact there's already been feedback on them). If this does spark interesting discussion leading up to the end date, we'll just keep the review going.


SE-0370 has been accepted. Thank you all for your contributions.
