I have an "embarrassingly parallel" problem: computing a function on every pixel. Some values are already computed; a special value, notComputed, marks the ones that are not. In the old C++ pthreads code that I'm porting, the work was divided into parts, with each thread accessing different scanlines, so there was no locking or synchronization on the shared buffer. The buffer is already allocated, with some values already computed: when the user resizes or scrolls a view, old values are preserved while newly revealed regions need to be calculated.
Swift Structured Concurrency is so much easier to work with than the pthreads API, by the way. Many thanks to everyone involved.
I am brand new to using Swift Concurrency. My first attempt at parallelizing in Swift looks like this:
@Sendable func compute(rows: ClosedRange<Int>, data: UnsafeMutableBufferPointer<Pixel>) async {
    for y in rows {
        // Only pixels still marked notComputed are evaluated; existing values are kept.
        for x in 0 ..< size.x where data[x + y * size.x] == notComputed {
            let color = scaledColor(colorExpr.evaluate(on: grid, x, y))
            if Task.isCancelled { return }
            data[x + y * size.x] = color
        }
    }
}

let pixels = UnsafeMutableBufferPointer<Pixel>(start: &colorMap[0], count: colorMap.count)
await withTaskGroup(of: Void.self) { taskGroup in
    // Each task gets its own contiguous, non-overlapping band of scanlines.
    for k in 0 ..< numProcessors {
        let firstLine = firstRow + k * numScanLinesPerTask
        let lastLine = min(firstLine + numScanLinesPerTask - 1, lastRow)
        if lastLine >= firstLine {
            taskGroup.addTask { await compute(rows: firstLine ... lastLine, data: pixels) }
        }
    }
}
That duplicates the behavior in the original code, and works as intended as far as I can tell.
The compiler doesn't allow use of colorMap directly, as shared mutable state. Is subverting that with UnsafeMutableBufferPointer, knowing that each task accesses disjoint rows, merely poor style, or is it incorrect in ways I don't understand yet? The compiler warns that "Initialization of 'UnsafeMutableBufferPointer<Pixel>' results in a dangling buffer pointer". Is using that dangling pointer across parallel tasks merely working by accident at the moment?
Several questions:
1. Is the approach above wrong in ways I don't understand?
   1a. Is using the dangling pointer incorrect?
   1b. Is sharing the dangling pointer across parallel tasks incorrect?
2. What would be best practice?
3. Can it be done in a better style without introducing an extra copy into and out of each task?
4. Can it be done without actor isolation of the shared mutable state?
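To make the last three questions concrete, here is a minimal sketch of the kind of alternative I am wondering about. It is not my real code, and I would welcome corrections. PixelBuffer, fillMissing, computeAll, and samplePixel are hypothetical names invented for illustration, and Pixel is assumed to be a trivial type such as UInt32. The idea is that a class allocates and owns the storage for its whole lifetime, so no pointer outlives the memory it refers to, and the @unchecked Sendable annotation records my own promise (not verified by the compiler) that concurrent tasks only touch disjoint rows.

```swift
// Hypothetical stand-ins for the types in my project.
typealias Pixel = UInt32
let notComputed: Pixel = 0

// Placeholder for the real per-pixel function (an assumption for this sketch).
func samplePixel(_ x: Int, _ y: Int) -> Pixel { Pixel(1 + x + y) }

/// Owns the pixel storage, so the buffer pointer stays valid until deinit.
/// `@unchecked Sendable` is a manual promise: tasks must write disjoint rows.
final class PixelBuffer: @unchecked Sendable {
    let width: Int
    let height: Int
    private let storage: UnsafeMutableBufferPointer<Pixel>

    init(width: Int, height: Int) {
        self.width = width
        self.height = height
        storage = .allocate(capacity: width * height)
        storage.initialize(repeating: notComputed)
    }

    // Pixel is assumed trivial, so deallocating without deinitializing is fine.
    deinit { storage.deallocate() }

    /// Mutable view of one scanline; valid only while the PixelBuffer is alive.
    func row(_ y: Int) -> UnsafeMutableBufferPointer<Pixel> {
        UnsafeMutableBufferPointer(rebasing: storage[(y * width) ..< ((y + 1) * width)])
    }

    /// Read-only access for checking results.
    subscript(x: Int, y: Int) -> Pixel { storage[x + y * width] }
}

/// Fills pixels still marked notComputed in the given rows, in place.
func fillMissing(in buffer: PixelBuffer, rows: Range<Int>) {
    for y in rows {
        if Task.isCancelled { return }
        let row = buffer.row(y)
        for x in 0 ..< buffer.width where row[x] == notComputed {
            row[x] = samplePixel(x, y)
        }
    }
}

/// Splits the rows into contiguous, non-overlapping bands, one child task per band.
func computeAll(_ buffer: PixelBuffer, taskCount: Int) async {
    guard buffer.height > 0, taskCount > 0 else { return }
    let rowsPerTask = (buffer.height + taskCount - 1) / taskCount
    await withTaskGroup(of: Void.self) { group in
        for start in stride(from: 0, to: buffer.height, by: rowsPerTask) {
            let rows = start ..< min(start + rowsPerTask, buffer.height)
            group.addTask { fillMissing(in: buffer, rows: rows) }
        }
    }
}
```

Nothing is copied into or out of the child tasks here, and no actor is involved; the cost is that the disjoint-rows guarantee lives in my head rather than in the type system. Whether that trade-off counts as acceptable practice is exactly what I am asking.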