Ironing out `ManagedBuffer` API wrinkles

lorentey · August 5, 2020, 4:44am

As uncovered in PR #31686, the API of ManagedBuffer (and it's less-liked sibling ManagedBufferPointer) implicitly assume that it's possible to easily retrieve the size of a dynamically allocated chunk of memory. Platforms often allow this through functions like malloc_size, malloc_usable_size or _msize -- but it turns out not all of them do. In particular, OpenBSD evidently doesn't support this.

This makes these stdlib APIs suboptimal -- on OpenBSD, the implementation would need to bend over backwards store the allocated capacity within the class instance, even though it is practically always duplicated in Header. The extra storage would waste memory and complicate element access in every ManagedBuffer instance, even in the (hopefully) vast majority of cases where ManagedBuffer.capacity is never called outside of the closure passed to the create method.

I believe the stdlib's APIs shouldn't be needlessly difficult to implement on particular platforms. ManagedBuffer's implicit requirement for a working malloc_size seems unreasonable.

This seems like as good an excuse as any to start discussing a revision of these APIs.

Here is a list of problems I have run across while using the current APIs over the years:

(This is the motivating problem above.) It isn't possible to efficiently implement ManagedBuffer on some platforms, like OpenBSD.
ManagedBuffer.create works by passing a half-initialized object to its closure, and hoping that the closure will only access members that happen to work properly in this invalid state. This leaves me with a distinctly unclean feeling every time I have to call it -- Swift code should never need to deal with half-initialized class instances.
The return type of the ManagedBuffer.create class function is declared as the concrete type ManagedBuffer<Header, Element> rather than Self, even though the function is documented to create a "most-derived" class instance. This forces us to manually insert a dynamic downcast after practically every single invocation of it.
ManagedBuffer is designed to be subclassed, but it is polluting the namespace of its custom subclasses with public member names that could just as well be left available. (By e.g. moving them to a utility struct like ManagedBufferPointer. In fact, look, they're already duplicated there!)

This is especially problematic in the case of capacity, which would often be a desirable member name for a custom buffer. ManagedBuffer's final definition for it relies on malloc_size, which makes it "nontrivial to compute", and thus inappropriate for use in high-performance code. Managed buffer instances often end up with at least two capacity properties, capacity and header.capacity. Using the wrong one leads to sadness and bugs. (It doesn't help that the one less often useful has the more attractive name!)
We also need a ManagedRawBuffer, patterned after UnsafeMutableRawBufferPointer.
The name ManagedBuffer arguably needs an Unsafe prefix, as this construct requires manual memory management. For example, ManagedBuffer doesn't by itself know how to deinitialize its tail-allocated storage -- this must be very carefully done in a subclass.

Problem #4 would probably deserve a high-level proposal that could also complete the lower-level raw pointer APIs into a well-integrated subsystem. I think the ship has sailed for #5 at this point, unless we want to deprecate the entire ManagedBuffer API family and replace them with a far better design, which seems unlikely.

But I believe we can tweak things to resolve #0, #1, #2, and #3. First of all, we should consider deprecating the create function we have now:

extension ManagedBuffer {
  @available(*, deprecated: ...)
  public final class func create(
    minimumCapacity: Int,
    makingHeaderWith factory: (ManagedBuffer<Header, Element>) throws -> Header
  ) rethrows -> ManagedBuffer<Header, Element> { ... }
}

Instead, we should introduce the pair of functions below:

extension ManagedBuffer {
  /// Create a new instance of the most-derived class, calling `headerFactory`
  /// with a buffer pointer to the storage space that will become part of the
  /// returned class instance.
  ///
  /// The buffer's `count` reflects the actual capacity that was allocated.
  /// Depending on platform-dependent allocation behavior, this may sometimes be
  /// larger than requested. (On some platforms, determining the exact
  /// allocation size may incur nontrivial costs; if you can't make use of such
  /// "free" storage space, then it's better to call the
  /// `create(exactCapacity:headerFactory:)` variant instead.)
  ///
  /// The factory closure is allowed to initialize items in the buffer; however,
  /// if it needs to throw an error, then it must return the buffer to its
  /// default uninitialized state before doing so.
  public final class func create(
    minimumCapacity: Int,
    headerFactory: (UnsafeMutableBufferPointer<Element>) throws -> Header
  ) rethrows -> Self { ... }

  /// Create a new instance of the most-derived class, calling `headerFactory`
  /// with a buffer pointer to the storage space that will become part of the
  /// returned class instance.
  ///
  /// The buffer's `count` is guaranteed to be equal to `exactCapacity`, even if
  /// the allocation resulted in more storage space than requested.
  ///
  /// The factory closure is allowed to initialize items in the buffer; however,
  /// if it needs to throw an error, then it must return the buffer to its
  /// default uninitialized state before doing so.
  public final class func create(
    exactCapacity: Int,
    headerFactory: (UnsafeMutableBufferPointer<Element>) throws -> Header
  ) rethrows -> Self { ... }
}

As far as I can judge, these functions are exactly as powerful as the original create(minimumCapacity:makingHeaderWifth:), but they are far more convenient (and less dangerous) to use.

(ManagedBufferPointer's initializer will need to be similarly updated. This is left as an exercise to the reader for now.)

To fix the largest name pollution issue as well as the portability problem, we should consider deprecating ManagedBuffer.capacity and ManagedBufferPointer.capacity with no replacements:

extension ManagedBuffer {
    @available(*, deprecated)
    public final var capacity: Int { ... }
}
extension ManagedBufferPointer {
    @available(*, deprecated)
    public final var capacity: Int { ... }
}

I suggest leaving the rest of the ManagedBuffer members as is to minimize source compatibility issues.

Speaking of source compatibility -- the new create(minimumCapacity:headerFactory:) method would introduce an ambiguity for typical use sites that use the trailing closure syntax:

.../swift/stdlib/public/core/Hashing.swift:135:18: error: ambiguous use of 'create(minimumCapacity:makingHeaderWith:)'
    let buffer = self.create(minimumCapacity: hashTable.bucketCount) { _ in
                 ^
.../swift/stdlib/public/core/ManagedBuffer.swift:62:27: note: found this candidate
  public final class func create(
                          ^
.../swift/stdlib/public/core/ManagedBuffer.swift:85:27: note: found this candidate
  public final class func create(
                          ^

Suggestions to work around / fix this would be most welcome! E.g., can we find another name for the new variant to prevent a clash?

Do you think these problems would be worth fixing? Like and comment below, etc. etc.

lukasa · August 5, 2020, 6:04am

Oh my goodness, yes, let’s. In your list I’d like to firmly vote for (2), (3), and (4). I think (1) and (5) are important but not necessarily critical.

The fact that both of these functions take a pointer to Element in a closure called headerFactory was quite surprising to me. It took quite a while for me to work out why that was. Given that we must return a Header from this function, there is no way that I can possibly forget to do it, so should we instead call this initializer or something similar to indicate its dual purpose?

Should we move the label into the function name? createWithMinimumCapacity?

Karl · August 5, 2020, 6:10am

Generally a big +1!

I always found this really weird. The new object, which hasn't even been initialized yet, obviously has no state which could be helpful when constructing the header, so why even have this weird closure constructor?

Now, with the new initializer you propose, that closure is the only place where you know the actual allocated capacity of the buffer. But I wonder if this closure design really even makes sense. Why do you need to initialize the elements as part of creating the buffer? Unlike Array, you always have direct access to ManagedBuffer's backing storage, so you don't really gain anything by combining these operations in a single, more complex function call (which now has to consider what happens if the closure throws, etc). Another approach might be something like:

public final class func create(
  minimumCapacity: Int,
  initialHeader: Header,
) -> (Self, capacity: Int) { ... }

Also, I'm not sure what the value is in splitting the minimum/exact capacity initializers. As you say, the extra capacity is "free space" that you didn't ask for - you can ignore it if you like.

This is really important - the function signature doesn't match the documentation, and it's really important to be clear about how to use ManagedBuffer safely.

Really big +1. I don't even think this is very arguable. Also, the documentation needs to really emphasise this point. This is literally all it says on the matter right now:

Note that the Element array is suitably-aligned raw memory . You are expected to construct and---if necessary---destroy objects there yourself, using the APIs on UnsafeMutablePointer<Element> . Typical usage stores a count and capacity in Header and destroys any live elements in the deinit of a subclass.

What does "typical usage" mean? When wouldn’t I destroy live objects in my deinit? To me that reads like "lots of people are saying you should destroy elements in your deinit. I don't know if it's true, but everyone's saying it". It's just - yes; especially as it currently is (passing you uninitialised objects and so on), it needs to be named Unsafe. And this documentation needs to be more assertive.

lorentey · August 5, 2020, 6:48am

Yes, that's a much better name for it!

We don't need to do that -- although it's often convenient to initialize elements within the initializer closure, it is rarely necessary to do so. However, initializing the elements is just one reason why one may want access to the buffer in the header initializer. As I understand it, a more important reason is that the initial value for Header may depend on the location of the buffer -- e.g., when the header needs to contain direct pointers to certain buffer segments etc. (One usecase for this is heterogeneous buffers (like the storage of Set or Dictionary) -- although this particular usecase would be better served by dedicated managed buffer variants.) It is possible to work around this by using optionals or dummy pointer values etc., but a closure-based create seems less complicated in this case.

If this has proven too awkward for the simple cases, we could add a create overload that takes a direct Header value. Unfortunately adding a "real" initializer seems impossible.

The size of the extra capacity is not necessarily easy to access -- forcing every ManagedBuffer allocation to call malloc_size could cause a performance problem.

lorentey · August 5, 2020, 6:48am

I just thought of an additional problem, a potential bug in the current ManagedBuffer implementation:

When create's closure throws, the header property remains uninitialized. However, the runtime/compiler isn't aware of this, and it will run the just-allocated object's deinitializer, as usual. When the custom deinit override accesses the header property, bad things will happen.

lukasa · August 5, 2020, 6:53am

The behaviour of minimumCapacity would also be consistent with general behaviour of malloc: when you call malloc you get a buffer at least as big as the one you asked for. You generally don't get to assume anything about that larger size, though (except with things like malloc_size), and must program as though you only had the space you asked for.

If we go whole-hog and get realloc support in here too then this becomes a perfect match for the behaviour of allocators more broadly.

Karl · August 5, 2020, 7:05am

Yeah, this is what confused me - the only information that object has is its own address, but ManagedBuffer doesn't actually expose its layout (and unless the header is a type imported from C, we cannot assume its layout). The pointers used by withUnsafePointerTo{Header/Elements} are documented as only being valid for the closure scope, so the header shouldn't ever contain direct pointers in to the buffer.

Of course, this is a little bit ridiculous - every instance of ManagedBuffer does have a stable pointer to its buffer for as long as that instance is alive, but that's stdlib-internal. One thing we might consider is exposing pointers which can escape as part of a (pointer, owner) pair, or documenting that these pointers are guaranteed to be stable for a given instance.

My recent pitch for shared Substrings needed a specific entrypoint for ManagedBuffer because of the lack of escapable pointers.

Ok, that makes sense.

I would really like realloc support!

lorentey · August 5, 2020, 7:40am

This is an excellent point! If we cannot justify the closure-based create without escaping pointers, then we should give up on one (or both) of these things.

(FWIW, switching to a direct Header value would resolve the use-site ambiguity. I suspect we rather ought to give up pretending these pointers aren’t stable, though. AFAIK the real reason these particular pointers are only exposed through closure-scoped members is because any direct use requires careful use of _fixLifetime/withExtendedLifetime to prevent premature deallocation of the object. Fixing this requires some language-level improvements that are still on the todo list.)

Oh yes indeed!

lukasa · August 5, 2020, 8:18am

I want to counsel caution here: if we add support for realloc these pointers really won't be stable!

While we're here though: pointers are massive. If you've gone to the trouble of writing a ManagedBuffer, you are clearly trying to obtain some kind of performance advantage. You can likely obtain more by replacing Int-width pointers to your interior storage by smaller-type (e.g. UInt16) offsets into that storage. This is stable across any pointer shenanigans and saves space.

So maybe the "no escaping the pointers" design actually helps push us towards a better programming pattern?

mayoff · August 5, 2020, 3:22pm

Call it make instead of create.

taylorswift · August 5, 2020, 8:35pm

how would they not be stable? isn’t the whole point of a ManagedBuffer that the reference to the class instance is the same as the pointer to the buffer? otherwise you are no better off than just wrapping an UnsafeRawBufferPointer in a class.

lukasa · August 5, 2020, 8:36pm

Yup, you’re right, I’m clearly losing my mind in my old age.

benrimmington · August 5, 2020, 10:50pm

Would create(unsafeUninitializedCapacity:initializingWith:) be an appropriate method name?

I'm not familiar with ManagedBuffer, but this API seems similar to SE-0245 and SE-0263.

dabrahams · August 6, 2020, 5:20am

It was explicitly considered that the user of ManagedBuffer might not be willing to pay for the cost of the call, but not that no such call might be available. I see that the existence of the capacity property makes that assumption, and I agree it's a mistake. On platforms where no such call is available, the right thing would be to report the minimumCapacity, but because the user is allowed to invoke capacity long after initialization is complete, that value might not be stored anywhere. Yes, on FreeBSD, you could request enough extra memory to store minimumCapacity, but as you point out, that would be wasteful. Oops.

ManagedBuffer.create works by passing a half-initialized object to its closure, and hoping that the closure will only access members that happen to work properly in this invalid state. This leaves me with a distinctly unclean feeling every time I have to call it -- Swift code should never need to deal with half-initialized class instances.

I know what you mean, but technically, that's not correct. All stored properties of the object (of which you may have a maximum of zero) have been initialized, so it's a fully-initialized object with some additional raw storage. The lifetime of data in that raw storage must be explicitly managed by deinit, so all use cases involve explicitly managing lifetimes of values in raw memory.

And this is why I think your feelings of uncleanliness are not entirely justified: every direct use of ManagedBuffer necessarily involves “hoping” (well, let's be honest—it's actually “reasoning about and ensuring”) that you don't access uninitialized memory. I mean, if you're not careful, you could access any non-initialized element at any time. I'm quite certain that no alternative to the existing API that efficiently serves its use-cases will ever make that impossible. Also, let's be honest again: this is an initialization, and during initialization there's no guarantee, in the general case, that your instance is in a “valid state” (i.e. satisfies all its invariants). The Swift language prevents your stored properties from being accessed before they're initialized, but that's a far cry from saying the instance is in a valid state when you access them.

If you're unhappy about is the fact that ManagedBuffer and subclasses aren't allowed to have stored properties with the all associated guarantees offered by the language, so you have to manually manage the lifetime of the Header, I'm 100% in agreement. If I'd had the tools needed to make that possible, I'd definitely have done so.

The return type of the ManagedBuffer.create class function is declared as the concrete type ManagedBuffer<Header, Element> rather than Self , even though the function is documented to create a "most-derived" class instance. This forces us to manually insert a dynamic downcast after practically every single invocation of it.

Good point. At the time of implementation it wasn't efficient to do so, but now I think I'd use FactoryInitializable to make it an initializer.

ManagedBuffer is designed to be subclassed, but it is polluting the namespace of its custom subclasses with public member names that could just as well be left available. (By e.g. moving them to a utility struct like ManagedBufferPointer . In fact, look, they're already duplicated there!)

The thinking is that if you're trying to control API surface, you probably don't even want to expose a class' inheritance relationship with ManagedBuffer publicly, so you'll be making the class instance a private/internal member of a public struct (or, if you must, a public class). ManagedBuffer was designed to be usable for 99% of all cases without the need for a “utility struct” like ManagedBufferPointer. I'm open to a different philosophy—e.g. that anyone using something advanced like ManagedBuffer should be prepared to deal with ManagedBufferPointer in order to use it—but the API duplication had its own reasonable rationale that doesn't deserve ridicule.

Also note that because Swift declines to specialize methods of generic classes, it may be inadvisable to do very much with subclasses of ManagedBuffer. Every time I head down that road I end up walking it back and putting all the logic in a struct that wraps ManagedBuffer or ManagedBufferPointer. FWIW.

This is especially problematic in the case of capacity , which would often be a desirable member name for a custom buffer. ManagedBuffer 's final definition for it relies on malloc_size , which makes it "nontrivial to compute", and thus inappropriate for use in high-performance code.

Yes, but that's not a problem. It was anticipated that you'd just use minimumCapacity—to which you already have “free” access during initialization—if you can't afford the cost of malloc_size. The problem is that the design makes capacity reporting available indefinitely, even when the library doesn't have minimumCapacity available anymore.

Managed buffer instances often end up with at least two capacity properties, capacity and header.capacity . Using the wrong one leads to sadness and bugs. (It doesn't help that the one less often useful has the more attractive name!)

Good point.

We also need a ManagedRawBuffer , patterned after UnsafeMutableRawBufferPointer .

Why? The need for that seems rare at best, so putting it in the standard library API does not seem at all obvious to me, especially when you can easily build it on ManagedBuffer[Pointer]<UInt8> or whatever.

The name ManagedBuffer arguably needs an Unsafe prefix, as this construct requires manual memory management. For example, ManagedBuffer doesn't by itself know how to deinitialize its tail-allocated storage -- this must be very carefully done in a subclass.

Yes. I think this might have been discussed and rejected as “too late/unimportant to change,” so you might look back at history. But I'd support it.

Problem #4 would probably deserve a high-level proposal that could also complete the lower-level raw pointer APIs into a well-integrated subsystem. I think the ship has sailed for #5 at this point, unless we want to deprecate the entire ManagedBuffer API family and replace them with a far better design, which seems unlikely.

We should! These components predate the existence of Builtin.allocWithTailElems_N and those capabilities should be available to users. How many times have I said to myself, "that leads me to make an array of optionals and… then I guess I won't do that because of the 7 bytes of internal fragmentation in common cases. I wish I could make a collection that's semantically an array of optionals but that uses a bitmap to indicate which elements are alive… I suppose I could hack that on top of ManagedBuffer… but it would be too horrible for words. Sigh.”?

Too many.

We should introduce the pair of functions below:

A couple of minor variations to consider:

Use a factory init as mentioned above.
Have a single API to which that passes a (non-escaping) closure for computing the actual capacity to the caller's closure.

Speaking of source compatibility -- the new create(minimumCapacity:headerFactory:) method would introduce an ambiguity for typical use sites that use the trailing closure syntax:…
Suggestions to work around / fix this would be most welcome! E.g., can we find another name for the new variant to prevent a clash?

Just make a single API as suggested above is an easy way to prevent ambiguity.

lukasa · August 6, 2020, 6:18am

I’ve needed it a couple of times. This has mostly occurred when working with C structures that use tail-allocated array types. The difficulty there is that you end up unavoidably with utility functions that do a pointer binding dance to ensure you never use the pointer with the incorrect binding semantics. It’s just not a very good idea to have a type that is binding memory to types B and C so prominently expose a function with memory bound to type A.

3405691582 · August 6, 2020, 1:36pm

Moreover, as a side note, I always found it a little strange that ManagedBuffer would use a SIL builtin to allocate tail memory but we would go around and ask the platform to introspect the size of the allocation instead.

(Of course, something Swift-side could be tracking allocation sizes, though I don't Swift should be making memory introspection part of any guaranteed API. However, it's still really something interesting to think about.)

dabrahams · August 6, 2020, 4:02pm

The SIL builtin just serves to give ARC the information it needs, but like malloc in general, it only guarantees a minimum allocation size and doesn't report the actual allocated size back to the caller. In fact, in most implementations it gets lowered to a malloc, so it doesn't know the size.

John_McCall · August 6, 2020, 7:03pm

In general, being able to recover the capacity of the allocation seems like a necessity for every practical use-case. You wouldn't need it with a statically-sized buffer, but I'm not really sure why you'd use ManagedBuffer for that in the first place. Using malloc_size is an optimization on two fronts: it lets you avoid storing the capacity separately, and it lets you use the full available memory if that's larger than was requested. But if that optimization isn't available on a platform, or if it's excessively slow, I don't think we should consider it a non-starter to just store the capacity.

Chris_Lattner3 · August 6, 2020, 7:14pm

Separate from the API improvements, the elimination of a dependence on malloc_size seems like a pretty big change. Is this only a problem for OpenBSD, or is this a pervasive issue?

If this is primarily OpenBSD specific, would it make sense to just configure the Swift runtime to build against a non-system malloc implementation that supports malloc_size?

-Chris

3405691582 · August 6, 2020, 8:05pm

The fact that malloc_size is not present on OpenBSD spurred a discussion about whether changes could be made to ManagedBuffer and ManagedBufferPointer which would lead to being able to deal with a possible lack of malloc_size availablity coming out in the wash.

If I understand your suggestion correctly, I would personally find a requirement that Swift has a dependency on a particular allocator supporting malloc introspection rather unpalatable, especially since introspection features aren't actually part of any standard, but instead merely present on a number of platforms. Mandating a non-platform allocator to be used would be somewhat of a regression from other features that a platform allocator may give you.

I was able to make some changes elsewhere to remove the dependency without too much pain because it was fairly straightforward in those cases to just optionally track the allocation size manually if introspection wasn't available, and those uses comprised something of an optimization rather than "load-bearing" functionality. ManagedBuffer and ManagedBufferPointer are the only cases remaining. However, because ManagedBufferPointer computes capacity as part of an extension, we can't store this value; it is calculated by introspection each time it is called, so malloc introspection is an inherent part of the way these classes operate and is difficult to work around this any other way. The PR referenced above uses the actual allocation to store the capacity.