Array Initializer with Access to Uninitialized Buffer

That actually should work so long as I ensure the C code is either setting a value type or ensuring initialize is called on the memory right?

well C structs are always (mechanically) value types so yeah. of course you have to manage the lifetimes yourself, even if their backing storage is managed by the swift Array, but you already knew that

Agreed. :confused: Writing this proposal has also made me want these additions:

extension UnsafeMutableBufferPointer {
  func initialize(to element: Element, at i: Int) {
    self.baseAddress!.advanced(by: i).initialize(to: element)
  }
  
  func deinitialize(at i: Int) {
    self.baseAddress!.advanced(by: i).deinitialize(count: 1)
  }
}

Were those considered when we revamped the pointer types?

no. there was a lot of debate over supporting these APIs on buffer ranges though but we decided it was out of the scope of the proposal. personally i think it would be worth it to overload + on UnsafeBufferPointer or at least a method advanced(fromBase:) that would do

static 
func +<T> (a:UnsafeBufferPointer<T>, b:Int) -> UnsafePointer<T>?
{
    return .init(bitPattern: Int(bitPattern: a.baseAddress) + MemoryLayout<T>.stride * b)
}

and we could then use all our existing APIs on the return value

Another alternative would be a named subscript:

extension UnsafeMutableBufferPointer {
    subscript(pointerOffset offset: Int) -> UnsafeMutablePointer<Element> {
        return baseAddress! + offset
    }
}

Either of those would let me write the the stable partition without dropping down to manipulating pointers:

extension Collection {
  func stablyPartitioned(by belongsInFirstPartition: (Element) -> Bool) -> [Element] {
    return Array<Element>(unsafeUninitializedCapacity: count) { buffer in
      var low = 0
      var high = buffer.count
      for element in self {
        if belongsInFirstPartition(element) {
          buffer[pointerOffset: low].initialize(to: element)
          low += 1
        } else {
          high -= 1
          buffer[pointerOffset: high].initialize(to: element)
        }
      }
      
      buffer[high...].reverse()
      return buffer.count
    }
  }
}

i feel like it would be weird for one UnsafeBufferPointer subscript to return an element and another to return a pointer to an element,, it also flies in the face of every existing convention for the square brackets

1 Like

Since the subscript is read-only, this could also be a function:

extension UnsafeMutableBufferPointer {
    func pointer(offset: Int) -> UnsafeMutablePointer<Element> {
        return baseAddress! + offset
    }
}

extension Collection {
  func stablyPartitioned(by belongsInFirstPartition: (Element) -> Bool) -> [Element] {
    return Array<Element>(unsafeUninitializedCapacity: count) { buffer in
      var low = 0
      var high = buffer.count
      for element in self {
        if belongsInFirstPartition(element) {
          buffer.pointer(offset: low).initialize(to: element)
          low += 1
        } else {
          high -= 1
          buffer.pointer(offset: high).initialize(to: element)
        }
      }
      
      buffer[high...].reverse()
      return buffer.count
    }
  }
}
3 Likes

@tkremenek announced that this proposal is now scheduled for review. That announcement includes this pull request but that still has the inout initializedCount instead of a return value. Is this a mistake or will this be a point of discussion during review?

1 Like

The proposal and implementation both have a revision coming this week. Thanks!

1 Like

Hello, people who have been involved in this thread — I've updated the proposal one last time before the review. It now includes both the initializer and the mutating method, since there are valuable use cases for both. I think I've described the preconditions and semantics such that they minimize the issues I described above.

3 Likes

Is this the latest version? Looks good to me, and I'm glad you could bring back the mutating method. Just a few notes/questions:

  • There seems to be a mix of returning the initialised count and inout initializedCount in the document (e.g. the first example has return 5, but that might be the only place).
  • It looks like you changed your mind on this point in the final version so that the mutable version could return Result and the closure signatures would match. Is being able to return Result from the mutating method important here? The inout version is definitely easier to misuse, so I'm trying to understand the tradeoffs and use cases here. Actually, reading more carefully because this isn't mentioned in the alternatives, I guess inout is also required for your guarantees around throwing, so similar questions apply there as to how valuable it is to be able to throw (while not having to manually deinitialise anything).
  • There is a trailing “The” at the end of the “Proposed solution” section.
  • withUnsafeMutableBufferPointerToFullCapacity(capacity:_:) maybe there's no way to avoid the repetition of capacity here while preserving clarity, but it looks repetitive. On the other hand, verbosity is not really a problem (and can even be a virtue) for these low-level methods that should be rarely used or used with care.

It's good. With 2 nits so far:

  • I think the word 'full' in withUnsafeMutableBufferToFullCapacity is a little iffy.
  • the documentation for buffer in the mutating function could be a bit clearer.

I definitely don’t want to hold up this proposal so feel free to ignore this but i do have some thoughts on the new draft

  • This is obviously a very complex and easy-to-misuse API so I wonder if it’s a good idea to contort it to support the stuff like Result return values and throwing that the simpler memory APIs support. I personally think throw should be dropped since, as you said in the proposal, throwing is a really hard thing to do correctly with this API, so there are benefits to forcing people to exit through return and not throw.

This means that a user who needs to throw from inside the closure has one of two options. Before throwing, they must:

deinitialize any newly initialized instances or re-initialize any deinitialized instances, or
update initializedCount to the new count.

  • i don’t like it, but i’m fine with count being :inout if you think of an Array as conceptually a buffer pointer, a capacity, and a count, and the “Array” as an aggregate as being passed :inout to the closure. I forsee a lot of confusion between buffer.count (which is really a capacity) and count though.
withUnsafeMutableBufferPointerToFullCapacity(capacity:_:)
  • this signature is way longer than it needs to be. I think everyone is in agreement that this function’s name has to be long to communicate what it really does, but it could be made a little more compact. especially the word capacity which shows up twice in the signature. Maybe
withUnsafeMutableBufferPointerToStorage(capacity:_:) 

or

withUnsafeMutableBufferPointerToReservedCapacity(_:_:) 
withUnsafeMutableBufferPointer(reservingCapacity:_:) 

which shaves a few characters off the name at least. (yes i know “reserve capacity” isn’t right for the initializer, but it really is what this function is doing basically) The current proposed name borders on unusable,, if we assume it’s being called in an unnested method (meaning, with 2 levels on indentation) it already overruns a standard 80-character line gutter by at least 25ish characters.

    func foo()
    {
        ...
                                                                                // ← gutter stops here
        let result:Bool = array.withUnsafeMutableBufferPointerToFullCapacity(capacity: array.capacity)
        {
            
        }
    }
2 Likes

Thanks for the feedback! Updated proposal is now here.

The only change to what's actually being proposed is the method is now withUnsafeMutableBufferPointerToStorage(capacity:_:) (thanks @taylorswift!)

3 Likes

The caveats about throwing should probably be in the doc comment. I almost was with @taylorswift on dropping it, but I think it's worth leaving in just so that algorithms that use this can stay rethrows.

1 Like

Great, thanks for the update and all your work. I look forward to it coming up for review. I would only say that “unsuitable” in last part of this statement seems too strong to me:

Removing the throws capability from the closure would solve this problem and simplify the new APIs' semantics, but would be inconsistent with the other APIs in this space and would make them unsuitable to use as building blocks for higher-level operations like stablyPartitioned(by:) .

since you could juggle the error out of the closure even if it wasn't throwing itself. It would definitely be much more messy though and it's a great example of why the throwing closure is probably nicer in practice, even though the inout count is easier to misuse.

1 Like

The problem with "juggling the error" is that you can't use it to build rethrows functions, because the compiler can't prove that you'll only rethrow whatever the inner closure throws.

EDIT: Which is not the end of the world, but it is kind of unfortunate for a building block that's this low in the abstraction stack.

1 Like

I’m fairly certain that rethrows does not impose that requirement. This same concept of “juggling the error” came up last month and last winter, and DispatchQueue.sync does exactly that in the standard library by calling _syncHelper.

1 Like

That's a compiler bug, SR-680. But maybe whatever workaround we come up with for DispatchQueue can be generalized if we ever fix it. (Clearly we can't break DispatchQueue.sync.)

1 Like

Good point, I forgot that all the workarounds for this are hacks/bugs.