This, BTW, always puzzles me: conceptually the buffer contents are no different from the two variables for the reading and writing positions. Let's say the buffer is tiny – only two bytes – and let's represent those two bytes as two extra variables, A and B. We'd have four variables in total:
R // read position, Int
W // write position, Int
A // first byte of buffer, Byte
B // second byte of buffer, Byte
The typical approach is to access R & W atomically, using acquire/release memory orderings (or some combination of them). However, when reading or writing the buffer contents, no atomic operations or locks are used – just a plain memcpy (which could be implemented as a manual loop), and in this particular extreme example plain accesses to the A & B variables.
Oversimplifying, assuming a single reader and a single writer [1], and assuming reads / writes are of size 1:
// writer:
1. grab W non-atomically (there's just one writer – us)
2. grab R atomically
3. based on the two obtained values, if there's space available to be written †:
a) write to A or B (depending upon the obtained W position)
b) increment W atomically
† - note that there is no time-of-check vs time-of-use issue here: there are no other writers besides this very code, so the available space cannot shrink under us.
// reader:
4. grab R non-atomically (there's just one reader – us)
5. grab W atomically
6. based on the two obtained values, if there's anything to read ††:
a) read A or B (depending upon the obtained R position)
b) increment R atomically
†† - note that there is no time-of-check vs time-of-use issue here: there are no other readers besides this very code, so the contents available to read cannot shrink under us.
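The two step lists above can be sketched in C++ like so (a hypothetical minimal SPSC ring with a two-byte buffer mirroring A and B; the `TinyRing`/`push`/`pop` names are mine, not from any library, and positions grow monotonically with `index = pos % 2`):

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Minimal single-producer/single-consumer ring, capacity 2 bytes.
struct TinyRing {
    std::atomic<size_t> R{0};  // read position
    std::atomic<size_t> W{0};  // write position
    uint8_t buf[2];            // the "A" and "B" bytes

    // Writer side (steps 1-3): only one thread may call this.
    bool push(uint8_t byte) {
        size_t w = W.load(std::memory_order_relaxed);  // 1. we are the only writer
        size_t r = R.load(std::memory_order_acquire);  // 2. grab R atomically
        if (w - r >= 2) return false;                  // 3. full, no space
        buf[w % 2] = byte;                             // 3a. plain (non-atomic) write
        W.store(w + 1, std::memory_order_release);     // 3b. publish the new W
        return true;
    }

    // Reader side (steps 4-6): only one thread may call this.
    bool pop(uint8_t &out) {
        size_t r = R.load(std::memory_order_relaxed);  // 4. we are the only reader
        size_t w = W.load(std::memory_order_acquire);  // 5. grab W atomically
        if (r == w) return false;                      // 6. nothing to read
        out = buf[r % 2];                              // 6a. plain (non-atomic) read
        R.store(r + 1, std::memory_order_release);     // 6b. free the slot
        return true;
    }
};
```

Note how the release store in 3b is paired with the acquire load in 5 (and symmetrically 6b with 2) – that pairing is what the whole scheme hinges on.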
The puzzling part is that in the reader/writer pseudocode above, #3a and #6a are done non-atomically. This can only work reliably if #3b synchronizes not just its own operand – the W variable itself – but everything currently out of sync. Is this guaranteed to happen, or are there weak architectures on which it isn't?
[1] - @taylorswift's use case would require support for more than one writer and a single reader, so the pseudocode outlined above is not sufficient. Note that it is crucial not to get yourself into this situation:
// WRONG, DON'T DO THAT:
let oldPos = atomicAdd(&writePos, size)
write to the buffer based on oldPos
as the reader might get control after writePos has been updated but before the contents is written fully, or at all. In the case of multiple writers I'd probably protect writerPos with a lock:
// not ideal (as there's an O(n) operation under lock) but correct:
lock.withLock {
write to the buffer based on writerPos with e.g. `memcpy`
(note that you'd need one or two memcpy's to handle buffer wrapping.)
writerPos += size
}
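Fleshed out, that locked multi-writer path might look like this in C++ (a sketch under assumptions of mine: a power-of-two capacity, a made-up `LockedRing` type, and the available-space check omitted for brevity; `std::mutex` stands in for the lock above):

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <mutex>

// Multi-writer ring sketch: writerPos is guarded by the lock, and the
// memcpy happens *before* the position is advanced, so a reader never
// observes a position covering not-yet-written bytes.
struct LockedRing {
    static constexpr size_t kCap = 8;  // capacity in bytes
    uint8_t buf[kCap] = {};
    size_t writerPos = 0;              // guarded by lock
    std::mutex lock;

    // Available-space check against the reader position omitted here;
    // a real ring must verify `size` fits before copying.
    void write(const uint8_t *data, size_t size) {
        std::lock_guard<std::mutex> guard(lock);
        size_t idx = writerPos % kCap;
        size_t firstChunk = std::min(size, kCap - idx);
        std::memcpy(buf + idx, data, firstChunk);               // up to the end…
        std::memcpy(buf, data + firstChunk, size - firstChunk); // …then the wrapped tail
        writerPos += size;  // advance only after the contents is in place
    }
};
```

The two memcpy calls are exactly the wrap handling mentioned in the comment above; when the write doesn't wrap, the second memcpy copies zero bytes and is a no-op.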
The reader code (in the single-reader case) could still use atomics, but I'd probably not bother and would make the code more consistent by using the lock on the reader side as well. Keep the code under the lock as small as possible (e.g. in the reader's case, memcpy the data to a temporary buffer †††; it would obviously be a very bad idea to perform the actual file write while holding the lock).
††† - a bonus of using a temporary (scratch) buffer on the reader side: you'd be able to do a single "write to file" operation even when the data you are writing out is located at the end of the ring buffer and wraps to its beginning. Without a temporary buffer you'd need to perform two I/O operations.
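The scratch-buffer idea on the reader side could look like this (again just a sketch; the free function `copyOut` and its parameters are invented for illustration):

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `size` bytes starting at `readerPos` out of a wrapping ring into
// a contiguous scratch buffer, so the caller can issue a single write
// syscall on the result instead of two.
std::vector<uint8_t> copyOut(const uint8_t *ring, size_t cap,
                             size_t readerPos, size_t size) {
    std::vector<uint8_t> scratch(size);
    size_t idx = readerPos % cap;
    size_t firstChunk = std::min(size, cap - idx);
    std::memcpy(scratch.data(), ring + idx, firstChunk);               // tail of the ring
    std::memcpy(scratch.data() + firstChunk, ring, size - firstChunk); // wrapped head
    return scratch;
}
```

Only this copy needs to happen under the lock; the file write on the returned scratch buffer happens after the lock is released.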
PS. The actual code that implements the RingBuffer takes about 50 lines... much shorter than this post.