There is some code such as "swap(&immediateTasksCopy, &self._immediateTasks)" and "swap(&decoder, &self.decoder)".
What are the benefits of using swap over assignment? How did you know to use it instead of assignment? Is there some all-knowing book?
Each of these cases uses swap for a different reason. It's easiest to explain these by looking at the specific cases.
First, let's consider swap(&immediateTasksCopy, &self._immediateTasks) in its wider context:
defer {
    var iterations = 0
    var drained = false
    var scheduledTasksCopy = ContiguousArray<ScheduledTask>()
    var immediateTasksCopy = Deque<UnderlyingTask>()
    repeat { // We may need to do multiple rounds of this because failing tasks may lead to more work.
        self._tasksLock.withLock {
            // In this state we never want the selector to be woken again, so we pretend we're permanently running.
            self._pendingTaskPop = true
            // reserve the correct capacity so we don't need to realloc later on.
            scheduledTasksCopy.reserveCapacity(self._scheduledTasks.count)
            while let sched = self._scheduledTasks.pop() {
                scheduledTasksCopy.append(sched)
            }
            swap(&immediateTasksCopy, &self._immediateTasks)
        }
        // Run all the immediate tasks. They're all "expired" and don't have failFn,
        // therefore the best course of action is to run them.
        for task in immediateTasksCopy {
            self.run(task)
        }
        // Fail all the scheduled tasks.
        for task in scheduledTasksCopy {
            task.fail(EventLoopError.shutdown)
        }
        iterations += 1
        drained = immediateTasksCopy.count == 0 && scheduledTasksCopy.count == 0
        immediateTasksCopy.removeAll(keepingCapacity: true)
        scheduledTasksCopy.removeAll(keepingCapacity: true)
    } while !drained && iterations < 1000
    precondition(drained, "EventLoop \(self) didn't quiesce after 1000 ticks.")
    assert(self.internalState == .noLongerRunning, "illegal state: \(self.internalState)")
    self.internalState = .exitingThread
}
In this context, we are in a defer block inside our event loop run function. This means we're doing cleanup: we're stopping the EL. The swap occurs in a loop over our pending tasks, where we are pulling items out of our immediateTasks and our scheduledTasks.
The loop exists because our cleanup of these tasks may cause the user code to enqueue more tasks. We want to ensure we don't leak them, so we need to keep handling them until no further tasks exist.
Additionally, the two task lists are protected by a lock. We don't want to repeatedly take-and-drop that lock in order to pop from the tasks, so we take it and copy the elements out.
Our goal here is to minimise how costly this is, so we want to avoid allocating new task arrays all the time. If all we did was a literal shallow copy (let immediateTasksCopy = self._immediateTasks), the copy and the original would share one buffer, so any attempt to enqueue a new task would trigger copy-on-write and cause another heap allocation.
Instead, we create a single new tasks array, and swap it with the regular one. This is essentially free, and ensures that at any time there is only one reference to either the copy or the original. When we're done with the copy, we clear it out (keepingCapacity: true), and then swap it back with the original on the next pass through the loop. This is the cheapest possible way to achieve this goal.
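To make the pattern concrete, here is a minimal sketch of a swap-based drain loop. It is not NIO's code: TaskQueue, enqueue, drain and drainDemo are hypothetical names invented for illustration, and NSLock stands in for the event loop's own lock.

```swift
import Foundation

// Hypothetical stand-in for the event loop's task storage; not NIO's real type.
final class TaskQueue {
    private let lock = NSLock()
    private var pending: [() -> Void] = []

    func enqueue(_ task: @escaping () -> Void) {
        lock.lock()
        defer { lock.unlock() }
        pending.append(task)
    }

    /// Runs tasks until none remain; returns the number of passes that ran tasks.
    func drain() -> Int {
        var batch: [() -> Void] = []
        var rounds = 0
        while true {
            lock.lock()
            // Exchange the two arrays: `pending` receives the emptied batch
            // (capacity kept), and `batch` receives the queued tasks.
            swap(&batch, &pending)
            lock.unlock()

            if batch.isEmpty { return rounds }
            // The lock is not held here, so a task may enqueue more work.
            for task in batch { task() }
            rounds += 1
            batch.removeAll(keepingCapacity: true)
        }
    }
}

// Usage: the first task enqueues a second one, so two passes are needed.
func drainDemo() -> (rounds: Int, log: [String]) {
    let queue = TaskQueue()
    var log: [String] = []
    queue.enqueue {
        log.append("first")
        queue.enqueue { log.append("second") }
    }
    let rounds = queue.drain()
    return (rounds, log)
}
```

Because each array only ever has one owner, appending to pending while batch is being processed never triggers a copy-on-write of the batch being iterated.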
Why don't we do this with scheduledTasksCopy? Frankly, we probably should!
Next, let's look at swap(&decoder, &self.decoder). Again, in context:
var possiblyReclaimBytes = false
var decoder: Decoder? = nil
swap(&decoder, &self.decoder)
assert(decoder != nil) // self.decoder only `nil` if we're being re-entered, but .available means we're not
defer {
    swap(&decoder, &self.decoder)
    if buffer.readableBytes > 0 && possiblyReclaimBytes {
        // we asserted above that the decoder we just swapped back in was non-nil so now `self.decoder` must
        // be non-nil.
        if self.decoder!.shouldReclaimBytes(buffer: buffer) {
            buffer.discardReadBytes()
        }
    }
    self.buffer.finishProcessing(remainder: &buffer)
}
let decodeResult = try body(&decoder!, &buffer)
// If we .continue, there's no point in trying to reclaim bytes because we'll loop again. If we need more
// data on the other hand, we should try to reclaim some of those bytes.
possiblyReclaimBytes = decodeResult == .needMoreData
return .didProcess(decodeResult)
The goal here is to defend against re-entrant code. It is possible for us to end up calling into this function recursively. That's a problem, as the law of exclusivity will forbid us from touching the decoder again while the outer call is still using it. To make re-entry safe, we need to nil it out. The easiest way for us to do that is to swap a nil value into the existing space, and pull the current value out into a temporary.
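A stripped-down sketch of the same idea follows. LineCounter, Processor, withCounter and reentrancyDemo are hypothetical names made up for this example, not NIO types; the sketch only assumes that the stored value lives in an Optional property.

```swift
// Hypothetical types for illustration only; not NIO's actual API.
struct LineCounter {
    var lines = 0
}

final class Processor {
    private var counter: LineCounter? = LineCounter()

    var lineCount: Int? { counter?.lines }

    /// Hands `body` exclusive access to the counter.
    /// Returns false if it is called re-entrantly.
    func withCounter(_ body: (inout LineCounter) -> Void) -> Bool {
        var local: LineCounter? = nil
        // Move the value out; `self.counter` is nil from here on.
        swap(&local, &self.counter)
        // A nil value means an outer call currently owns the counter.
        guard local != nil else { return false }
        // Put the value back on every exit path.
        defer { swap(&local, &self.counter) }
        body(&local!)
        return true
    }
}

// Usage: the inner call happens while the outer call owns the counter.
func reentrancyDemo() -> (outer: Bool, inner: Bool, lines: Int?) {
    let processor = Processor()
    var inner = true
    let outer = processor.withCounter { counter in
        counter.lines += 1
        inner = processor.withCounter { $0.lines += 100 }
    }
    return (outer, inner, processor.lineCount)
}
```

The inner call finds nil and backs off instead of starting a second, overlapping mutable access to the same stored property.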
Thanks for the explanation. Honoured to get such an in-depth answer. Learnt a lot from this.
Good to learn more about the "Law of Exclusivity" and what "Re-entrancy" is.
I want to discuss this part a bit more. How is a swap "essentially free"?
A swap produces no long-lived temporary values, so the refcounts are stable before and after the operation. The operation itself also logically doesn't change the refcount of either value: it just replaces the bytes stored in the first location with those in the second location, and vice-versa.
As an example, consider the optimized compilation of the swap of two arrays:
func rearrange(_ first: inout [Int], _ second: inout [Int]) {
    swap(&first, &second)
}
generates the following x86 assembly:
output.rearrange(inout [Swift.Int], inout [Swift.Int]) -> ():
        mov     rax, qword ptr [rdi]
        mov     rcx, qword ptr [rsi]
        mov     qword ptr [rdi], rcx
        mov     qword ptr [rsi], rax
        ret
Here we have no refcount operations and no allocations. All we do is copy to two registers, then write the registers back to the opposite locations. This is as cheap as it gets, and given that the values are going to be in cache, we can consider it to be essentially free.
This is quite different to actually creating a temporary, which must emit at least a refcount operation.
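The same effect can be observed from Swift without reading assembly. The sketch below uses two hypothetical helpers, storageAddress(of:) and swapDemo(), to check that after a swap each variable simply holds the other's existing buffer, so no elements were copied.

```swift
/// Returns the address of the array's element storage as an integer
/// (0 if the array has no storage). Helper invented for this example.
func storageAddress(of array: [Int]) -> UInt {
    array.withUnsafeBufferPointer { buffer in
        UInt(bitPattern: buffer.baseAddress)
    }
}

func swapDemo() -> Bool {
    var first = [1, 2, 3]
    var second = [4, 5, 6, 7]
    let firstBefore = storageAddress(of: first)
    let secondBefore = storageAddress(of: second)

    swap(&first, &second)

    // Each variable now points at the other's original buffer.
    return storageAddress(of: first) == secondBefore
        && storageAddress(of: second) == firstBefore
        && first == [4, 5, 6, 7]
        && second == [1, 2, 3]
}
```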
Beautiful. Thank you for this wisdom. And this compiler link!
As others noted, swap is usually faster than anything involving temporary variables because the compiler treats it as "atomic" from an ARC perspective, and thus avoids inserting pointless retains and releases (among a few other optimisations).
Note though that swap isn't always the fastest way to swap two values. Sometimes it's faster to do:
(a, b) = (b, a)
Apparently the compiler recognises this specially and knows how to avoid unnecessary work (like retain-release activity) even for complex types. This method may also be more broadly applicable as complex ownership semantics develop in the Swift language.
From what I've seen so far the difference is small, so not something I suggest you worry about, generally. But if you do see swap taking a non-trivial amount of time in a hot path, you can try the tuple swap method instead and see if it happens to help in that specific situation.
You also get the same result with an exchange function using the new ownership features.
func exchange<T: ~Copyable>(_ lhs: inout T, _ newValue: consuming T) -> T {
    let old = consume lhs
    lhs = newValue
    return old
}
func rearrange(_ first: inout [Int], _ second: inout [Int]) {
    second = exchange(&first, consume second)
}
This produces the same machine code as if you had done (first, second) = (second, first), and as a slight bonus produces no retains on -Onone, whereas the (first, second) = (second, first) approach has two retain/release pairs on -Onone. As you can see here.
How about this?
func exchange<T: ~Copyable>(_ lhs: inout T, _ rhs: inout T) {
    let old = consume lhs
    lhs = consume rhs
    rhs = old
}
Well, that would just be swap. But yes, I do believe that is a valid implementation.
Also, I only just thought of it, but since T is ~Copyable, you probably don't need the explicit consumes at all.
func swap<T: ~Copyable>(_ lhs: inout T, _ rhs: inout T) {
    let old = lhs
    lhs = rhs
    rhs = old
}
Should be sufficient.