I just noticed this post yesterday. Having worked with SwiftNIO's ByteBuffer and having tried to optimize mem usage in my packages, I have some opinions I want to share.
First of all, as I see it, the ideal is that once a process owns "some-bytes" in its memory space, it never copies any part of those bytes again. So IMO we should push for no-copy usage as far as is practically possible.
SwiftNIO does a good job at this by introducing readerIndex and writerIndex to the mix, while owning a pointer to the some-bytes. Basically a lower bound and an upper bound on what the current instance should care about, relative to all the bytes the ByteBuffer owns.
The good thing about this is that when you care about only a part of the some-bytes, you can just move the reader/writer index without copying bytes around. That's only some integer arithmetic, which is way cheaper than memcpy or the like.
You might ask, "Why not just keep writerIndex, and always align the base pointer to the start of the bytes?" I'd say that wouldn't be the end of the world, but keeping readerIndex is still worth it IMO so you can walk back through the bytes if required. For example, one thing I have experience with is DNS, where the wire format allows a pointer value that refers to an earlier position in the bytes you've received, where a domain name is stored. So when you receive such a pointer value, you need to temporarily walk back through the bytes to find the domain name you're supposed to parse.
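To make the index idea concrete, here is a rough sketch. `ByteCursor`, `readLabel` and `demoWalkBack` are names I made up for illustration; this is not SwiftNIO's ByteBuffer API, and the label parsing is heavily simplified compared to real DNS names (one label only, no validation of untrusted offsets).

```swift
// Hypothetical stand-in type for illustration; NOT SwiftNIO's ByteBuffer.
struct ByteCursor {
    private let storage: [UInt8]          // the "some-bytes"
    private(set) var readerIndex: Int     // lower bound of the readable bytes
    private(set) var writerIndex: Int     // upper bound of the readable bytes

    init(_ bytes: [UInt8]) {
        storage = bytes
        readerIndex = 0
        writerIndex = bytes.count
    }

    var readableBytes: Int { writerIndex - readerIndex }

    // Consuming a byte only moves readerIndex.
    mutating func readByte() -> UInt8? {
        guard readerIndex < writerIndex else { return nil }
        defer { readerIndex += 1 }
        return storage[readerIndex]
    }

    // Hands out a slice of the storage and advances readerIndex past it.
    mutating func readSlice(length: Int) -> ArraySlice<UInt8>? {
        guard length >= 0, length <= readableBytes else { return nil }
        defer { readerIndex += length }
        return storage[readerIndex ..< readerIndex + length]
    }

    // Temporarily walk back to an earlier offset, then restore the position.
    // A real parser would validate the offset instead of trapping.
    mutating func withReaderIndex<T>(at offset: Int, _ body: (inout ByteCursor) -> T) -> T {
        precondition(offset >= 0 && offset <= writerIndex, "offset out of range")
        let saved = readerIndex
        readerIndex = offset
        defer { readerIndex = saved }
        return body(&self)
    }
}

// Simplified: reads a single length-prefixed label.
func readLabel(_ cursor: inout ByteCursor) -> String? {
    guard let length = cursor.readByte(),
          let bytes = cursor.readSlice(length: Int(length)) else { return nil }
    return String(decoding: bytes, as: UTF8.self)
}

func demoWalkBack() -> (first: String?, second: String?, remaining: Int) {
    // "foo", a zero terminator, then a 2-byte pointer (0xC0 0x00) back to offset 0.
    var cursor = ByteCursor([3, 0x66, 0x6F, 0x6F, 0x00, 0xC0, 0x00])
    let first = readLabel(&cursor)
    _ = cursor.readByte()                 // skip the terminator
    guard let hi = cursor.readByte(), let lo = cursor.readByte(), hi & 0xC0 == 0xC0 else {
        return (first, nil, cursor.readableBytes)
    }
    let offset = Int(hi & 0x3F) << 8 | Int(lo)   // low 14 bits are the offset
    let second = cursor.withReaderIndex(at: offset) { (c: inout ByteCursor) in readLabel(&c) }
    return (first, second, cursor.readableBytes)
}
```

The walk-back is just saving an integer, setting it, and restoring it. With only a writerIndex and a moving base pointer, the earlier bytes would no longer be reachable.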
Furthermore, now that Swift has decent non-copyable support, I think these kinds of types should default to being non-copyable, with wrapper types that turn the non-copyable type into a reference-counted object or similar.
Rust has an Arc type for this (as it does for other basic functionality like Cow, Box<dyn Trait>, etc.), but this is easier in Swift: any class gives you Box and ARC together. I'd say that class should also come with some CoW already, but I'll have to think a bit deeper to decide for sure and to see whether we should use separate types for CoW, ARC, etc. Possibly we might end up adding such types to the language itself, just so not everybody has to manually introduce them in their own library.
So: one type plus wrapper types. A base non-copyable, minimum-overhead type that can be wrapped into CoW/ARC etc. on demand.
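A minimal sketch of that shape, assuming Swift 5.9 or later for `~Copyable`. `UniqueBytes` and `SharedBytes` are hypothetical names, and I've left out the CoW part entirely since I haven't decided how it should look.

```swift
// Hypothetical names; a sketch of "base non-copyable type + wrapper", not a proposed API.
struct UniqueBytes: ~Copyable {
    private let base: UnsafeMutableRawBufferPointer

    // The one copy happens here, when the bytes enter our ownership.
    init(copying bytes: [UInt8]) {
        base = .allocate(byteCount: bytes.count, alignment: 1)
        base.copyBytes(from: bytes)
    }

    var count: Int { base.count }

    subscript(index: Int) -> UInt8 {
        precondition(index >= 0 && index < base.count, "index out of range")
        return base[index]
    }

    // Unique ownership means exactly one deallocation.
    deinit { base.deallocate() }
}

// Any class gives us Box + ARC together: the wrapper is copyable and
// reference-counted, while the bytes inside keep a single owner.
final class SharedBytes {
    private let bytes: UniqueBytes

    init(_ bytes: consuming UniqueBytes) { self.bytes = bytes }

    var count: Int { bytes.count }
    subscript(index: Int) -> UInt8 { bytes[index] }
}

func demoSharedBytes() -> (count: Int, last: UInt8, sameObject: Bool) {
    let shared = SharedBytes(UniqueBytes(copying: [1, 2, 3]))
    let alias = shared                    // retains the object; no bytes are copied
    return (alias.count, alias[2], alias === shared)
}
```

Code that never needs sharing uses `UniqueBytes` directly and pays nothing for reference counting. Only code that opts in to `SharedBytes` pays for it.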
We should also consider having a stack-or-heap bag-of-bytes type, usually named "Small<bag-of-bytes>" or "Tiny<bag-of-bytes>", for when there are only a few bytes (usually 16 or 24), like how String stores small strings of up to 15 UTF-8 bytes inline on 64-bit platforms (except in rare cases, for example when it had to repair a sequence of UTF-8 bytes).
I have not thought everything through, but I think the Vision should mention how such a type fits into it. It can be pretty helpful if we can't get all the way to the optimal zero-copy case, because if we do have to make copies, such a "Small<bag-of-bytes>" can be clutch for performance.
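Roughly what I mean by a "Small<bag-of-bytes>", as a sketch only. `SmallBytes` and its layout (two 64-bit words plus a count, 16 bytes inline) are assumptions I picked for illustration; a real implementation would pack things much tighter, the way String does.

```swift
// Hypothetical type and layout, for illustration only.
enum SmallBytes {
    static let inlineCapacity = 16

    case inline(low: UInt64, high: UInt64, count: UInt8)   // no heap allocation
    case heap([UInt8])                                     // fallback for larger payloads

    init(_ bytes: [UInt8]) {
        guard bytes.count <= SmallBytes.inlineCapacity else {
            self = .heap(bytes)
            return
        }
        var low: UInt64 = 0
        var high: UInt64 = 0
        // Pack byte i into bits 8*i ..< 8*i+8 of the matching word.
        for (i, byte) in bytes.enumerated() {
            if i < 8 {
                low |= UInt64(byte) << UInt64(8 * i)
            } else {
                high |= UInt64(byte) << UInt64(8 * (i - 8))
            }
        }
        self = .inline(low: low, high: high, count: UInt8(bytes.count))
    }

    var count: Int {
        switch self {
        case .inline(_, _, let n): return Int(n)
        case .heap(let bytes): return bytes.count
        }
    }

    var isInline: Bool {
        if case .inline = self { return true }
        return false
    }

    subscript(index: Int) -> UInt8 {
        precondition(index >= 0 && index < count, "index out of range")
        switch self {
        case .inline(let low, let high, _):
            let word = index < 8 ? low : high
            return UInt8(truncatingIfNeeded: word >> UInt64(8 * (index % 8)))
        case .heap(let bytes):
            return bytes[index]
        }
    }
}
```

Copying the inline case just copies a few machine words, so when a copy can't be avoided for a short payload, it at least doesn't involve an allocation.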
The last thing is that we should also consider how such type(s) interact with parsing/serialization. Possibly not through special-case types, but it should be clear what we're aiming for.
Such type(s) should be similar to the new ParserSpan in apple/swift-binary-parsing on GitHub, but with some critical differences.
It should avoid copying memory around when a sequence of bytes in the bag is needed in the same form. Again, ByteBuffer allows this by moving indexes around. Such a parser type should also allow this.
IMO the best way to implement this might be an enum with one case for the raw non-copyable type and another for a CoW and/or ARC type (or similar).
So during parsing, if bytes need to be copied, the enum can switch on the fly from something like a Span to a CoW and/or ARC type.
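A rough sketch of that enum idea. `ParsedBytes` is a hypothetical name, and `ArraySlice` is only standing in for "something like a Span" here, with a plain array standing in for the CoW/ARC type. The ASCII lowercasing is just an example of a step that needs different bytes than the input has.

```swift
// Hypothetical type; ArraySlice stands in for a Span-like view in this sketch.
enum ParsedBytes {
    case view(ArraySlice<UInt8>)   // a window onto the input's bytes
    case owned([UInt8])            // separate bytes, produced only when needed

    var isView: Bool {
        if case .view = self { return true }
        return false
    }

    var slice: ArraySlice<UInt8> {
        switch self {
        case .view(let slice): return slice
        case .owned(let bytes): return bytes[...]
        }
    }

    private static func isUppercaseASCII(_ byte: UInt8) -> Bool {
        byte >= 0x41 && byte <= 0x5A
    }

    // Stays a view when the bytes are already in the needed form;
    // switches to the owned case only when the bytes must change.
    func lowercasedASCII() -> ParsedBytes {
        let current = slice
        guard current.contains(where: ParsedBytes.isUppercaseASCII) else { return self }
        return .owned(current.map { ParsedBytes.isUppercaseASCII($0) ? $0 | 0x20 : $0 })
    }
}

func demoParsedBytes() -> (firstIsView: Bool, secondIsView: Bool, second: [UInt8]) {
    let input = Array("www.Example.com".utf8)
    let first = ParsedBytes.view(input[0..<3]).lowercasedASCII()    // "www": unchanged
    let second = ParsedBytes.view(input[4..<11]).lowercasedASCII()  // "Example": must change
    return (first.isView, second.isView, Array(second.slice))
}
```

The parser's caller sees one type either way, and the copy only happens on the path that actually requires it.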
To take it one step further, I hope we can someday rewrite/modify the internals of some of the common (stdlib) types we currently have so they can share their internal storage as much as possible.
For example, if you've received UTF-8 bytes over the wire, optimally no copies should be needed to make a String out of those bytes.
The way I see this possibly happening is for String to use ARC'd and CoW'd wrapper types around the incoming bag-of-bytes type. Then, when creating a new String from some UTF-8 bytes in the bag-of-bytes, you only need to tell String to view that part of the bag-of-bytes as its data.