I'm gathering UTF-8 in a [UInt8] buffer as part of a percent-encoding transformation. Is there any point in using this initializer instead of the normal one? Since percent-encoding is guaranteed to make the String at most 3x larger, would it make sense to preallocate 3x the size of the original?
Percent-encoding isn't guaranteed to make the string 3x larger. That's the worst case, which only occurs when literally every byte in the source needs to be encoded.
This initializer is still useful for percent-encoding, but you need to calculate the length in advance. This is quite typical of C APIs that perform character transcoding: often you do the transcoding twice, with the first pass using a null buffer to simulate the transcoding and return the buffer size needed to hold the result.
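To illustrate, here's a minimal sketch of that two-pass pattern using the standard library's `String(unsafeUninitializedCapacity:initializingUTF8With:)` initializer. The `isUnreserved` and `percentEncoded` helpers are hypothetical, and the encode-set here is just RFC 3986's unreserved characters for simplicity:

```swift
func isUnreserved(_ byte: UInt8) -> Bool {
    // RFC 3986 unreserved characters: ALPHA / DIGIT / "-" / "." / "_" / "~"
    switch byte {
    case UInt8(ascii: "a")...UInt8(ascii: "z"),
         UInt8(ascii: "A")...UInt8(ascii: "Z"),
         UInt8(ascii: "0")...UInt8(ascii: "9"),
         UInt8(ascii: "-"), UInt8(ascii: "."),
         UInt8(ascii: "_"), UInt8(ascii: "~"):
        return true
    default:
        return false
    }
}

func percentEncoded(_ source: [UInt8]) -> String {
    // First pass: compute the exact output length
    // (1 byte if unreserved, 3 bytes "%XX" otherwise).
    let length = source.reduce(0) { $0 + (isUnreserved($1) ? 1 : 3) }
    let hexDigits: [UInt8] = Array("0123456789ABCDEF".utf8)
    // Second pass: write directly into the String's uninitialized buffer,
    // returning the number of bytes actually initialized.
    return String(unsafeUninitializedCapacity: length) { buffer in
        var i = 0
        for byte in source {
            if isUnreserved(byte) {
                buffer[i] = byte
                i += 1
            } else {
                buffer[i] = UInt8(ascii: "%")
                buffer[i + 1] = hexDigits[Int(byte >> 4)]
                buffer[i + 2] = hexDigits[Int(byte & 0x0F)]
                i += 3
            }
        }
        return i
    }
}
```

Because the length is exact, the second pass fills the buffer completely and the String never over-allocates, unlike preallocating 3x and hoping most of it goes unused.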
There are potentially ways we can improve this. See: "Amend/Augment SE-0322: Temporary Uninitialized Buffers to return nil if heap allocation is required".
Otherwise, I'd recommend using WebURL's percent-encoding/decoding APIs (which can't easily be documented with DocC right now, because they extend stdlib types and protocols). They include lazy encoding/decoding, which can be really useful for algorithms that can early-exit (e.g. searching for a particular string, where any of its bytes may be encoded).

I spent a lot of time tuning these APIs to ensure they give optimal code-gen, and they are benchmarked on a variety of hardware platforms (I noticed some big ARM vs Intel differences). You can define your own encode-sets easily, and they even support form-encoding properly (I see so many attempts at implementing form-encoding that get it wrong).