The pitfall of malloc_size for porting and its necessity

I want to ask about this before I send a PR or two, because I'm not exactly confident what's happening here.

Let's talk about ContiguousArrayBuffer first, because it's the simplest. In init, we have

      _storage = Builtin.allocWithTailElems_1(
         _ContiguousArrayStorage<Element>.self,
         realMinimumCapacity._builtinWordValue, Element.self)

      let storageAddr = UnsafeMutableRawPointer(Builtin.bridgeToRawPointer(_storage))
      let endAddr = storageAddr + _swift_stdlib_malloc_size(storageAddr)
      let realCapacity = endAddr.assumingMemoryBound(to: Element.self) - firstElementAddress

      _initStorageHeader(
        count: uninitializedCount, capacity: realCapacity)

Now, the use of malloc_size here is somewhat objectionable, since when porting, not all platforms are going to support this, and its use, e.g., in Linux, is described as "for debugging and introspection." These uses of malloc_size also feels questionable since we have abstracted memory allocation with Builtin.allocWithTailElems but then we do an end-run around SIL here and ask the system what it's done, assuming that allocWithTailElems's allocator is going to be the same as the system's, and that raises more questions than answers.

But in fact, it seems like this is all unnecessary. I would assume that Builtin.allocWithTailElems will always allocate at least realMinimumCapacity * MemoryLayout<Element>.stride bytes in addition to the memory required for the thing itself (that is, type paraphrasing, firstElementAddress - storageAddr).

So am I missing something, or why shouldn't we just let realCapacity = realMinimumCapacity? Sure, the system allocator may allocate you more memory than you need, but it ought not allocate you less than you required, and malloc_usable_size(3) for example says "Although the excess bytes can be overwritten by the application without ill effects, this is not good programming practice: the number of excess bytes in an allocation depends on the underlying implementation."

But even that's the easy case; ManagedBuffer is far trickier. If you don't have malloc_size afforded to you by the platform, to avoid invoking it, the simplest solution is to track allocations somewhere in a global, static context -- map from pointers to sizes whenever allocWithTailElems is called. Either that, or Header needs to obey some protocol/contract about storing capacity -- which would require an evolution process, because it's public API -- or track the tail-allocated capacity as a separate field, but ManagedBufferPointer is rather aggressive about what stored properties exist alongside everything (c.f. _checkValidBufferClass). My Swift-fu is not yet strong enough to provide strong answers these questions, I'm afraid...

Any use of malloc_size should just be an optimization. People porting to new platforms should be able to safely implement it as a shim that just does

func malloc_size(_ requestedSize: Int) -> Int { return requestedSize }

That wouldn't work, though. Recall _swift_stdlib_malloc_size takes a pointer and returns the amount of memory allocated to it. malloc generally requires you to separately track the size of the hunk of memory you've requested be allocated. In ContiguousArrayBuffer, this isn't too bad, since the allocation and the capacity calculation are close together, but it certainly isn't in ManagedBufferPointer, hence the problem.

Oh, right, I was thinking of malloc_good_size rather than malloc_size since I'd been looking into using that instead. Yes, this is a bit awkward. We'd like to keep the optimization, but we should figure out how to make it conditional so people porting don't need to do it if it doesn't make sense on their platform.

4 Likes

Cool, sounds like I'm not completely barking up the wrong tree then.

_swift_stdlib_malloc_size() should probably be updated to take both pointer and tracked size as argument, so it will be able to simply returns the size on platform that don't support malloc_size, and it will force code writer to properly track allocated memory size, as it will be required to pass it.

If recomputing the tracked size is too expensive or impractical, maybe we could also add a boolean _stdlib_has_malloc_size stub and conditionalize all our uses of it on whether that returns true.