Hello! I've come across some strange behavior when appending strings leading to an out-of-bounds crash. I've figured out a solution/workaround, but I'm trying to understand why the problem happens.
I have the following code:
func startListening() {
    self.listening = true
    DispatchQueue.global(qos: .default).async {
        repeat {
            var maybeData: String = ""
            var lastRead: String? = nil
            repeat {
                let (isReadable, _) = try! self.socket.isReadableOrWritable(waitForever: false, timeout: 10)
                if isReadable {
                    lastRead = try! self.socket.readString()
                    if lastRead != nil {
                        maybeData += lastRead!
                    }
                } else {
                    lastRead = nil
                }
                // usleep: sleep for the specified number of microseconds.
                // This is needed to prevent a crash in the call to Self.messageBodies below.
                // It appears that the `maybeData += lastRead!` assignment is NOT atomic.
                // Without this sleep, `maybeData` in `messageBodies` may be an incomplete value.
                // In one run instance, the payload for `workspace/symbol` came in three chunks,
                // of 130656, 261312, and 339179 bytes. When it later crashed in `messageBodies`,
                // the length of `inputString` in that method was 130656 bytes.
                usleep(1000)
            } while lastRead != nil
            let bodies = Self.messageBodies(inputString: maybeData)
            for body in bodies {
                self.handle(messageBody: body)
            }
        } while self.listening
    }
}
This code polls for data from a TCP socket (connected to a process on the same computer), and keeps reading data until there's nothing left to read. The read data may contain multiple messages, so I split the data in Self.messageBodies by seeking ahead (e.g. let substring = [bodyStartIndex..<bodyEndIndex], where bodyEndIndex is determined from a Content-Length prefix in each message).
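For reference, a minimal sketch of this kind of Content-Length splitting might look like the following. It is not the actual messageBodies implementation. It assumes each message starts with a header of the form "Content-Length: N" terminated by a blank line (\r\n\r\n), as in the Language Server Protocol's base protocol; the post does not show the real header format. The names splitMessages and firstIndex(of:in:from:) are hypothetical. The sketch works on raw bytes rather than String indices, because Content-Length counts bytes and Swift treats "\r\n" as a single Character, and it checks that the whole body is present before slicing.

```swift
import Foundation

/// Hypothetical helper: index of the first occurrence of `needle` in `haystack`
/// at or after `start`, or nil if it does not occur.
func firstIndex(of needle: [UInt8], in haystack: [UInt8], from start: Int) -> Int? {
    guard !needle.isEmpty, haystack.count - start >= needle.count else { return nil }
    for i in start...(haystack.count - needle.count) {
        if haystack[i..<(i + needle.count)].elementsEqual(needle) { return i }
    }
    return nil
}

/// Hypothetical stand-in for a Content-Length splitter (assumed framing, see above).
/// Returns every complete body found in `buffer`, plus the unconsumed bytes of a
/// trailing message that has not fully arrived yet.
func splitMessages(_ buffer: [UInt8]) -> (bodies: [String], remainder: [UInt8]) {
    let separator = Array("\r\n\r\n".utf8)
    let lengthPrefix = "Content-Length:"
    var bodies: [String] = []
    var cursor = 0

    while cursor < buffer.count {
        // The header block is incomplete until the blank line has arrived.
        guard let headerEnd = firstIndex(of: separator, in: buffer, from: cursor) else { break }
        let bodyStart = headerEnd + separator.count

        // Look for the Content-Length line among the header lines.
        let headerText = String(decoding: buffer[cursor..<headerEnd], as: UTF8.self)
        var declaredLength: Int? = nil
        for line in headerText.split(whereSeparator: { $0.isNewline }) where line.hasPrefix(lengthPrefix) {
            declaredLength = Int(line.dropFirst(lengthPrefix.count).trimmingCharacters(in: .whitespaces))
        }

        // Malformed or missing length: skip this header block (a choice made for the sketch).
        guard let length = declaredLength, length >= 0 else {
            cursor = bodyStart
            continue
        }

        // Bounds check: stop if the body has not fully arrived, instead of slicing past the end.
        guard length <= buffer.count - bodyStart else { break }

        bodies.append(String(decoding: buffer[bodyStart..<(bodyStart + length)], as: UTF8.self))
        cursor = bodyStart + length
    }
    return (bodies, Array(buffer[cursor...]))
}
```

With a splitter shaped like this, the caller would keep the returned remainder and prepend it to whatever the next read delivers, rather than assuming that every drain of the socket ends exactly on a message boundary.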
This was crashing with an out-of-bounds error. It happened maybe 10% of the time on an Intel Mac Pro, and 100% of the time on my M1 Max MacBook Pro. Through some trial-and-error, I figured out that the usleep(1000) in the middle of the code above consistently makes the crash go away, and removing that usleep(1000) makes it come back on the M1 Max. That's the only change I have to make to trigger or avoid the crash, so it seems like there's a timing-related problem here.
My guess is that maybeData += lastRead! isn't actually happening atomically/synchronously, and at the time that Self.messageBodies is called, maybeData is stale (i.e. the value it was on a previous iteration of the inner loop). As noted in the code comment, this guess comes from my observation that when it does crash for out-of-bounds, the length of inputString in the debugger is the same length as one of the values for lastRead during a prior iteration of the inner loop. These strings are hundreds of thousands of bytes long, in case that's significant.
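One way to check this guess in isolation is to append chunks of the same sizes to a local String, with no socket involved, and inspect the length after each append. The sketch below does that with dummy contents; the function name runningLengths is hypothetical, and the chunk sizes are the ones from the code comment. Since += on a local var that no other thread touches is an ordinary synchronous mutation in Swift, the length should already include each chunk by the time the next statement runs.

```swift
/// Hypothetical experiment: append dummy chunks of the given sizes to a local String
/// and record the UTF-8 length observed immediately after each append.
func runningLengths(chunkSizes: [Int]) -> [Int] {
    var accumulated = ""
    var lengths: [Int] = []
    for size in chunkSizes {
        // Dummy ASCII payload standing in for one socket read.
        let chunk = String(repeating: "x", count: size)
        accumulated += chunk
        // Read the length on the very next statement, with no sleep in between.
        lengths.append(accumulated.utf8.count)
    }
    return lengths
}

// Chunk sizes taken from the code comment in startListening().
let observed = runningLengths(chunkSizes: [130_656, 261_312, 339_179])
print(observed)
```

If the lengths always add up here, a short inputString would more likely mean that the inner loop exited before the rest of the payload had arrived, though that remains a hypothesis to verify against the real socket.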
I'm not sure if this guess is correct. I'm curious if anyone has a definitive idea of what the problem is, why it happens all the time on Apple Silicon but only sometimes on Intel, or an explanation of why adding the usleep(1000) fixes it.