I'm currently implementing a custom data structure based on two buffers, which hold the characters to the left and to the right of the cursor respectively. Sure, I could just use [Character] arrays instead. But since I also know C, I wanted to learn how to manage memory manually in Swift, and hopefully squeeze out some extra performance.
My current approach is to store two UnsafeMutableBufferPointer<Character> values inside a noncopyable struct. Memory allocation would happen in the initializer, and deallocation in the deinitializer.
However, I am not sure how to handle the (de)initialization of Characters inside the buffers.
Since Character is a copyable value type which (funnily) is backed by a String, which itself may reference memory managed by ARC, can blindly copying Character instances into manually allocated buffers bypass ARC and cause memory leaks?
Is it safe to just write buffer[i] = character, or do I need to explicitly call initialize(to:) and deinitialize(count:) when assigning or replacing elements?
If manual (de)initialization is necessary, what’s the correct way to manage a buffer of Characters, especially when replacing or removing elements, or when using them as a parameter or return value of a function?
I would be grateful if anyone could help me understand proper memory management in Swift. Things seem a little less "easy" than in C (although it's also easier there to mess up badly :P).
You should definitely just use an Array if you can. But, since you said this is a thought exercise for memory management...
buffer[i] = character is fine if the elements in buffer have already been initialized, because setting an element via the subscript is equivalent to deinitializing what is currently there (releasing it) and then initializing the element in the buffer with the new value (copying it in).
If the elements in buffer haven't been initialized yet, then it would be an error to use the subscript because it would attempt to release an element that wasn't initialized. The documentation for subscript mentions this:
Uninitialized memory cannot be initialized to a nontrivial type using this subscript. Instead, use an initializing method, such as initializeElement(at:to:).
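To make the difference between initializing and assigning concrete, here is a minimal sketch. It assumes a toolchain where initializeElement(at:to:) and deinitializeElement(at:) are available on UnsafeMutableBufferPointer (Swift 5.8 or later, as far as I know). The function name demoInitializeThenAssign is made up for this example.

```swift
// Hypothetical demo function, not part of any library.
func demoInitializeThenAssign() -> [Character] {
    // Allocate raw, uninitialized storage for four Characters.
    let buffer = UnsafeMutableBufferPointer<Character>.allocate(capacity: 4)

    // Slots 0 and 1 are uninitialized, so they must be initialized, not assigned.
    buffer.initializeElement(at: 0, to: "a")
    buffer.initializeElement(at: 1, to: "é")

    // Slot 0 now holds a valid Character, so subscript assignment is fine:
    // the old value is released and the new one is copied in.
    buffer[0] = "b"

    // Copy the two initialized elements out before tearing the buffer down.
    let snapshot = [buffer[0], buffer[1]]

    // Deinitialize only the slots that were initialized, then free the memory.
    buffer.deinitializeElement(at: 0)
    buffer.deinitializeElement(at: 1)
    buffer.deallocate()

    return snapshot
}
```

Slots 2 and 3 are never touched here, which is fine as long as nothing reads, assigns, or deinitializes them.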
In general, if you're managing a manual buffer of Characters (or any non-trivial/non-POD value), you would need to either
Initialize every element in the buffer to some default value when you allocate it and ensure that elements are always initialized, or
Leave unused elements uninitialized, and track in some sidecar storage which elements are initialized and which aren't.
This is because, before you deallocate the buffer, you need to manually deinitialize all the elements in it, either by calling deinitialize() (which assumes they're all initialized) or by looping through and calling deinitializeElement(at:) (which lets you limit yourself to only the elements that are initialized, assuming you're tracking that separately).
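The second option (leave unused slots uninitialized and track the initialized range separately) could look roughly like the sketch below. The type CharacterStack and its members are hypothetical names for illustration, and it assumes noncopyable structs with deinit are available (Swift 5.9 or later). Here the "sidecar storage" is just a count, because the initialized elements always form a prefix of the buffer.

```swift
// Hypothetical type for illustration: one half of a two-buffer text structure.
struct CharacterStack: ~Copyable {
    private let storage: UnsafeMutableBufferPointer<Character>
    // Invariant: slots 0..<count are initialized, the rest are not.
    private(set) var count = 0

    init(capacity: Int) {
        storage = .allocate(capacity: capacity)
    }

    mutating func push(_ character: Character) {
        precondition(count < storage.count, "buffer is full")
        // The target slot is uninitialized, so initialize it instead of assigning.
        storage.initializeElement(at: count, to: character)
        count += 1
    }

    mutating func pop() -> Character? {
        guard count > 0 else { return nil }
        count -= 1
        // Moves the value out and leaves the slot uninitialized again.
        return storage.moveElement(from: count)
    }

    deinit {
        // Deinitialize exactly the initialized prefix, then free the memory.
        for index in 0..<count {
            storage.deinitializeElement(at: index)
        }
        storage.deallocate()
    }
}
```

This sketch does not grow the buffer when it fills up; a real implementation would have to allocate a larger buffer and move the elements over.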
As far as passing the values around or returning them, you shouldn't need to worry about it once you're talking about a Character value instead of something inside the buffer. Reading an element from the buffer gives you a notional copy of it that you can work with independently, and writing an element into the buffer copies it in so that the original value going out of scope doesn't impact the one in the buffer.
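A small sketch of that independence, again with a made-up function name (demoIndependentCopies) and assuming the same element-wise initialization methods as above:

```swift
// Hypothetical demo function, not part of any library.
func demoIndependentCopies() -> (copy: Character, current: Character) {
    let buffer = UnsafeMutableBufferPointer<Character>.allocate(capacity: 1)
    buffer.initializeElement(at: 0, to: "x")

    // Reading through the subscript yields an independent copy of the element.
    let copy = buffer[0]

    // Replacing the element in the buffer does not affect the copy.
    buffer[0] = "y"
    let current = buffer[0]

    buffer.deinitializeElement(at: 0)
    buffer.deallocate()

    // Both values are ordinary Characters and outlive the buffer.
    return (copy, current)
}
```

Once a value has been read out of the buffer, it is managed like any other Swift value, so returning it or passing it to another function needs no extra care.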
Do you need O(1) random access into either buffer? If not, it would be much simpler to allocate a pair of UInt8 buffers and store UTF-8 encoded text there instead.
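As a rough sketch of what the UTF-8 variant might look like for a single buffer: since UInt8 is a trivial type, no per-element deinitialization is needed before deallocating. The function name demoUTF8Storage is made up for this example, and decoding the bytes back into a String is just one possible way to get at whole Characters again.

```swift
// Hypothetical demo function, not part of any library.
func demoUTF8Storage() -> String {
    let text = "héllo"
    let bytes = Array(text.utf8)

    // UInt8 is trivial: no reference counting is involved for the elements.
    let buffer = UnsafeMutableBufferPointer<UInt8>.allocate(capacity: bytes.count)
    _ = buffer.initialize(from: bytes)

    // Decode the stored bytes back into a String; iterating the String
    // then yields whole Characters, including multi-code-point ones.
    let decoded = String(decoding: buffer, as: UTF8.self)

    // No deinitialize step is required for a trivial element type.
    buffer.deallocate()
    return decoded
}
```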
@Slava_Pestov The problem with UTF-8 is decoding and traversing it. While doing that is relatively easy for single code points, I have yet to learn how to deal with characters consisting of more than one code point. I'll think about it though.
Thanks for the article about gap buffers though! Their resizing doesn't seem too difficult to implement so I may switch over to one. Might result in better cache locality or something :P