I am getting an EXC_BAD_ACCESS when using withTaskGroup in the following way, depending on how big the structure being built is. I would be thankful for any idea on how to solve this.
(The demo project is on GitHub in case anyone would like to try it; the path packagePath in the Swift file has to be adjusted. When started in debug mode from within Xcode, it shows the error; a release build run in the terminal gives zsh: bus error ./BadAccessDemo2.)
Thanks.
import Foundation
import SwiftXMLC

@main
struct Test {
    static func main() async throws {
        // !!! adjust path before running: !!!
        let packagePath = "/Users/stefan/Projekte/BadAccessDemo2"
        let paths = [
            // small example:
            "\(packagePath)/test1.xml",
            // same structure, but a little bigger:
            "\(packagePath)/test2.xml",
        ]
        await paths.forEachAsyncThrowing { path in // (forEachAsyncThrowing defined below; same result without it)
            // OK in both cases:
            await inner(path: path, i: 1)
            if #available(macOS 10.15, *) {
                await withTaskGroup(of: Void.self) { group in
                    func outer() async {
                        group.addTask {
                            // OK for smaller example, EXC_BAD_ACCESS for the larger example:
                            await inner(path: path, i: 2)
                        }
                    }
                    await outer()
                    for await _ in group {}
                }
            } else {
                print("wrong OS version")
            }
        }
    }
}

func inner(path: String, i: Int) async {
    let document = XDocument()
    do {
        let data = try Data(contentsOf: URL(fileURLWithPath: path))
        do {
            // building a structure:
            try XParser().parse(fromData: data, eventHandlers: [XParseBuilder(document: document)])
            // writing it back to another file as a test:
            let copyPath = "\(path).copy\(i).xml"
            document.write(toFile: copyPath)
            print("\(copyPath) written")
            print("press RETURN to continue..."); _ = readLine()
        }
        catch {
            print(error.localizedDescription)
        }
    }
    catch {
        print(error.localizedDescription)
    }
}

extension Sequence {
    func forEachAsyncThrowing (
        _ operation: (Element) async throws -> Void
    ) async rethrows {
        for element in self {
            try await operation(element)
        }
    }
}
Yes, thanks. That should allow at least some investigation without necessarily having to run the example app. What version of Xcode are you running? (I'd also suggest updating your OS.)
Hey I think I know what the problem is on this one,
extension Sequence {
    func forEachAsyncThrowing (
        _ operation: (Element) async throws -> Void
    ) async rethrows {
        for element in self {
            try await operation(element)
        }
    }
}
Since self is referenced inside the Sequence extension in for element in self, and you call await withTaskGroup(of: Void.self) { group in on every iteration, my understanding is that each time a task is created or nested inside another task, which is itself inside another task, it doesn't know which thread to go back to. I would test using unowned or weak self.
I wonder whether it is valid to call group.addTask asynchronously like this, since the documentation says:
Don't use a task group from outside the task where you created it. In most cases, the Swift type system prevents a task group from escaping like that because adding a child task to a task group is a mutating operation, and mutation operations can't be performed from a concurrent execution context like a child task.
OK, then I do not know how to control the parallelism (how many "work items" are created at once). I need to do this because each work item can be large.
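One common way to bound the number of work items in flight while still adding every child task from the task that created the group is sketched below. This is only an illustration under assumptions: processAll, maxConcurrent, and work are hypothetical names that do not appear in the demo project, and a deployment target that supports Swift concurrency is assumed.

```swift
/// Runs `work` for every path, with at most `maxConcurrent` child tasks at a time.
/// All calls to addTask happen in the body of withTaskGroup itself,
/// never from inside a child task or a nested async function.
func processAll(
    _ paths: [String],
    maxConcurrent: Int,
    work: @escaping @Sendable (String) async -> Void
) async {
    precondition(maxConcurrent > 0, "maxConcurrent must be at least 1")
    await withTaskGroup(of: Void.self) { group in
        var iterator = paths.makeIterator()

        // Start the first batch of child tasks.
        for _ in 0..<maxConcurrent {
            guard let path = iterator.next() else { break }
            group.addTask { await work(path) }
        }

        // Each time one child task finishes, start the next pending one.
        while await group.next() != nil {
            if let path = iterator.next() {
                group.addTask { await work(path) }
            }
        }
    }
}
```

With the demo's function this could be called as await processAll(paths, maxConcurrent: 2) { path in await inner(path: path, i: 2) }. It only limits parallelism; it does not by itself address the crash discussed in this thread.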
But I have a much worse problem: after changing some code in my real application (not the demo app), I also get the EXC_BAD_ACCESS without withTaskGroup, when only doing something that corresponds to the await inner(path: path, i: 1) call outside withTaskGroup. I could not reproduce this with my demo project. It seems to have something to do with async/await combined with overly complicated call chains and big data structures.
It is really unsatisfactory, because everything works fine with smaller data structures and/or less complicated call chains, the data structures and call chains being of the same kind in both cases. So I have some nested async/await calls inside my main() async throws, and every time I change some subtle little thing I may or may not get this crash, as soon as the data gets too large. I really do not see anything I am doing that could be wrong (besides maybe what you said, but as I said, I now get the error even without withTaskGroup).
Either async/await is a very difficult thing that you really, really need to use "just the right way", or this feature and/or Swift 5.5 is "unstable" / behaves incorrectly. I am thinking about removing async/await completely from my application and waiting until it gets more stable (if it is indeed "unstable"). This would be a pity, because it seemed to be an easy-to-use mechanism and I really would like to be able to process my work items asynchronously.
Without having actually run the sample (I'm not with my macOS 12 machine atm), the crash log makes me somewhat suspicious of a stack overflow, and looking at the sample data and SwiftXMLC source, these two stored variables in particular look like potential suspects.
Have you run your sample in a debugger and examined the full stack trace after the EXC_BAD_ACCESS?
Well, the structure gets big, but not too big when async/await is not used: even with much bigger files I never get any error. So I know the data structure gets big; the point is that one gets the error only after adding async/await. How can this be, or should one not use async/await with such large data structures? And the data structure is OK even in the async/await case where the error occurs: it is written to a file again, and the error occurs just when returning from that function.
The error says (see above):
VM Region Info: 0x16fcdfff0 is in 0x16fcdc000-0x16fce0000; bytes after start: 16368 bytes before end: 15
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
Stack 16f4e0000-16fcdc000 [ 8176K] rw-/rwx SM=PRV thread 0
---> STACK GUARD 16fcdc000-16fce0000 [ 16K] ---/rwx SM=NUL ... for thread 1
Stack 16fce0000-16fd68000 [ 544K] rw-/rwx SM=PRV thread 1
I am currently replacing async/await with code using semaphores.
Yes. As you can see from even the abbreviated stack trace the error occurs while XElements are being deallocated. To elaborate a bit on what I wrote above, I suspect that the variables I linked lead to the synthesized deinits calling each other recursively, which means that call depth would be dictated directly by the length of the chain of elements with the same name.
I'd hazard a guess that the threads the default executor uses might have different stack size configurations, or maybe having a few more calls above your code is enough to make a difference.
Either way, I really recommend you to look at the full stack trace, as then you will know for certain whether recursion/stack depth are actually the problem.
If they are, async/await merely shone a light on an architecture problem that might have surfaced in a number of other ways (different stack size configurations depending on OS/device/...), and would also have a decent impact on performance of deallocating your models, even if it doesn't crash.
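One way to probe the stack-size guess above is to run the same work on a dedicated thread whose stack size is set explicitly, and see whether the crash threshold moves. The following is a diagnostic sketch only, not a fix; runOnThread is a hypothetical helper name, and the 64 MB value in the usage note is an arbitrary assumption.

```swift
import Foundation

/// Runs `work` on a new thread with the given stack size (in bytes)
/// and blocks the caller until `work` has returned.
func runOnThread(stackSize: Int, _ work: @escaping @Sendable () -> Void) {
    let finished = DispatchSemaphore(value: 0)
    let thread = Thread {
        work()
        finished.signal()
    }
    // The stack size has to be set before the thread is started.
    thread.stackSize = stackSize
    thread.start()
    finished.wait()
}
```

For example, runOnThread(stackSize: 64 << 20) { ... } could wrap the synchronous parse-and-release part of the demo. If the larger file then works, and fails again with a small stack size, that would support the stack overflow theory.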
...Just one more thought: I am not implementing any deinits, so why should there be functions calling each other if it is just a matter of deallocating objects? Can any chain of weakly referenced objects lead to a stack overflow like that?
You are not implementing a deinit, but the compiler synthesizes one that does the necessary ARC housekeeping. It would release the next node, which in turn would call that node's deinit, which releases the one after it, and so on. As far as I understand it, any linked list, weak or not, would run into this given sufficient length.
For reference, here you can see a deinit of a linked list node, written to avoid the same problem I suspect is at fault in your example. I wouldn't put too much trust in the linked list implementation itself, but the strategy used by the deinit should apply.
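A minimal sketch of that deinit strategy follows. It is not the linked implementation and not SwiftXMLC code; Node, DeinitCounter, and makeChain are hypothetical names used only for illustration. The idea is to detach the rest of the chain in a loop, so that each node is released with next already set to nil and its own deinit has nothing left to recurse into.

```swift
/// Counts deinit calls, only so the effect can be observed.
final class DeinitCounter {
    var count = 0
}

final class Node {
    var next: Node?
    let counter: DeinitCounter

    init(next: Node?, counter: DeinitCounter) {
        self.next = next
        self.counter = counter
    }

    deinit {
        // Take over the rest of the chain and walk it iteratively.
        var cursor = next
        next = nil
        // Stop at the first node that something else still references,
        // so shared tails are left untouched.
        while isKnownUniquelyReferenced(&cursor), let node = cursor {
            cursor = node.next
            node.next = nil
            // `node` is released at the end of this iteration; its deinit
            // finds `next == nil` and returns without recursing.
        }
        counter.count += 1
    }
}

/// Builds a chain of `length` nodes without recursion.
func makeChain(length: Int, counter: DeinitCounter) -> Node? {
    var head: Node? = nil
    for _ in 0..<length {
        head = Node(next: head, counter: counter)
    }
    return head
}
```

With this deinit, dropping the last reference to the head of a very long chain keeps the call depth constant instead of growing with the chain length. Whether the same approach fits the sibling and child links of the XML tree would need to be checked against the actual model.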
(Edit: I'll have a look at your sample when I'm back with my macOS 12 machine as well.)