Potentially non-atomic string concatenation — curious to understand what's happening

Hello! I've come across some strange behavior when appending strings leading to an out-of-bounds crash. I've figured out a solution/workaround, but I'm trying to understand why the problem happens.

I have the following code:

func startListening() {
    self.listening = true
    DispatchQueue.global(qos: .default).async {
        repeat {
            var maybeData: String = ""
            var lastRead: String? = nil
            repeat {
                let (isReadable, _) = try! self.socket.isReadableOrWritable(waitForever: false, timeout: 10)
                if isReadable {
                    lastRead = try! self.socket.readString()
                    if lastRead != nil {
                        maybeData += lastRead!
                    }
                } else {
                    lastRead = nil
                }
                // usleep: sleep for the specified number of microseconds.
                // This is needed to prevent a crash in the call to Self.messageBodies below.
                // It appears that the `maybeData += lastRead!` assignment is NOT atomic.
                // Without this sleep, `maybeData` in `messageBodies` may be an incomplete value.
                // In one run instance, the payload for `workspace/symbol` came in three chunks,
                // of 130656, 261312, and 339179 bytes. When it later crashed in `messageBodies`,
                // the length of `inputString` in that method was 130656 bytes.
                usleep(1000)
            } while lastRead != nil
            let bodies = Self.messageBodies(inputString: maybeData)
            for body in bodies {
                self.handle(messageBody: body)
            }
        } while self.listening
    }
}

This code polls for data from a TCP socket (connected to a process on the same computer), and keeps reading data until there's nothing left to read. The read data may contain multiple messages, so I split the data in Self.messageBodies by seeking ahead (e.g. let substring = [bodyStartIndex..<bodyEndIndex], where bodyEndIndex is determined from a Content-Length prefix in each message).

This was crashing with an out-of-bounds error. It happened maybe 10% of the time on an Intel Mac Pro, and 100% of the time on my M1 Max Macbook Pro. Through some trial-and-error, I figured out that the usleep(1000) in the middle of the code above consistently makes the crash go away, and removing that usleep(1000) makes it come back on the M1 Max. That's the only change I have to make to trigger or avoid the crash, so it seems like there's a timing-related problem here.

My guess is that maybeData += lastRead! isn't actually happening atomically/synchronously, and at the time that Self.messageBodies is called, maybeData is stale (i.e. the value it was on a previous iteration of the inner loop). As noted in the code comment, this guess comes from my observation that when it does crash for out-of-bounds, the length of inputString in the debugger is the same length as one of the values for lastRead during a prior iteration of the inner loop. These strings are hundreds of thousands of bytes long, in case that's significant.

I'm not sure if this guess is correct. I'm curious if anyone has a definitive idea of what the problem is, why it happens all the time on Apple Silicon but only sometimes on Intel, or an explanation of why adding the usleep(1000) fixes it.

1 Like

There's no Swift concurrency here, and there's really no GCD asynchronicity either, since (unless I've missed something that's staring me in the face) all of the code in the outer repeat is 100% synchronous and therefore executed sequentially.

Given the numbers in your comments, the only logical explanation I can see would be that self.socket.readString() returned the following series of return values for that payload:

  • String with length 130656.
  • nil (perhaps a timeout or some other transient error?)
  • String with length 261312.
  • String with length 339179
  • nil

That would likely fail at a bounds check if Self.messageBodies does not carefully check that the length of its input is valid according to the internal structure of messages.

The only other possible question I have concerns self.handle(messageBody:). If Self.messageBodies in fact returns an array of Substrings instead of an array of Strings, and if self.handle allows message bodies to escape to another thread or queue, you might end up with a data race on the String to which the Substring is related.

Either way, some judicious logging of the exact sequence of events should give you insight into what's gone wrong.

P.S. Welcome to the Swift forums!

1 Like

I believe I know the problem you are addressing.

The readString method of the IBM BlueSocket library just reads the bytes on the socket hoping that the network is fast enough and that all the bytes of a incoming string are contained in a single read. In reality the incoming string may be spread in multiple chunks and must be recovered using multiple reads up to a string termination.

Here is an extension of the Socket class with a readString method which reads continuously and accumulates in a string until termination is detected.

extension Socket {
    ///
    /// Read a string from the socket with optional terminator
    /// Example for reading a string terminated by CRLF:
    ///     try readString(terminator: "\r\n")
    /// - Returns: String containing the data read from the socket
    ///
    public func readString(terminator: String = "", timeout: UInt = 0) throws -> String? {
        //var counter = 0
        if terminator == "" {
            return try readString()
        } else {
            var string = ""
            repeat {
                let result = try Socket.wait(for: [self], timeout: timeout)
                if result == nil {
                    // timeout
                    break
                }
                else {
                    if let str = try readString() {
                        string += str
                        if str.hasSuffix(terminator + "\n") {
                            break
                        }
                    }
                }
            } while true
            return string
        }
    }
}

And following a small client demo.

import Foundation
import Socket

print("Hello GreenSocket !")

do {
    let socket = try Socket.create(family: .inet, type: .stream, proto: .tcp)
    let serverAddress = "127.0.0.1"  // localhost
    let serverPort = 1337

    try socket.listen(on: Int(serverPort))
    try socket.connect(to: serverAddress, port: Int32(serverPort))

    var hello = String(repeating: "Hello from client...", count: 1000)
    let terminator = "THE END"
    hello += terminator
    
    var data = Data()
    let result = try socket.read(into: &data)

    print("bytesRead = \(result)")
    if result > 0 {
        let messageString = String(decoding: data, as: UTF8.self)
        print(messageString)
    }
    
    try socket.write(from: hello)
    
    data.removeAll()
    
    let response = try socket.readString(terminator: "THE END", timeout: 5000)
    
    if let response = response {
        print(response)
    }
    
    socket.close()

}
catch {
    print("Socket error: \(error)")
}

First start the echoe server demo, and on the same machine start the client demo.

I made a BlueSocket library fotk and added support for Windows. Here is my work hope it helps you.

Thanks @QuinceyMorris! I agree there's no concurrency involved here … self.handle eventually does things on the main queue, but that's after the crash in Self.messageBodies.

I've tried adding some logging (and taking out the usleeps to reproduce the crash):

func startListening() {
    self.listening = true
    DispatchQueue.global(qos: .default).async {
        repeat {
            var maybeData: String = ""
            var lastRead: String? = nil
            repeat {
                let (isReadable, _) = try! self.socket.isReadableOrWritable(waitForever: false, timeout: 10)
                if isReadable {
                    lastRead = try! self.socket.readString()
                    if lastRead != nil {
                        print("Read length \(lastRead!.lengthOfBytes(using: .utf8))")
                        maybeData += lastRead!
                    } else {
                        print("Read nil")
                    }
                } else {
                    print("Not readable")
                    lastRead = nil
                }
            } while lastRead != nil
            let bodies = Self.messageBodies(fromSocketString: maybeData)
            for body in bodies {
                self.handle(messageBody: body)
            }
        } while self.listening
    }
}

The log output I get is a bunch of "Not readable" (as expected), and, interspersed throughout:

// Some smaller payloads that I'm expecting; never seen a problem with these
Read length 701
Read length 232
Read length 1452
Read length 208
Read length 208
Read length 209
Read length 172
// The big payload
Read length 130656
Read length 253536
Read length 346955

"Read nil" never shows up.

It is interesting (and a bit unexpected to me) that the chunked lengths are repeatably the same every time I run this code, especially since the lengths don't seem to be any recognizably-obvious threshold.

Re:

If Self.messageBodies in fact returns an array of Substring s instead of an array of Strings

It does not; Self.messageBodies calls the String constructor on each substring slice that it finds :smiley:


@Datagram yes that is the problem! I don't think I can use your specific extension though — I'm reading messages following the LSP spec, which, as far as I can tell, don't have a specific terminator. Instead, they're only delineated by the Content-Length header. If I remember right, I've seen socket payloads where multiple messages are concatenated right after one another without even a newline in between.

For this LSP simple base protocol I suggest to start with my GreenSocket demo example named SimpleTCPMessageDemo. Its based on an additional Socket read method with the capability to read exactly n-bytes.

GreenSocket additional method for reading data from a socket (TCP/UNIX).

GreenSocket provides the additional read method.

  • read(into data: inout Data, length: Int, timeout: UInt) - This function reads exactly length data bytes on a socket and returns it in the Data object that was passed. Optional timeout is in milliseconds. This read method help building message layer where we are waiting to read exactly n-bytes. The method may throws on socket errors in addition to specialized ReadLengthError error type. This allows easy handling of timeout or remote closed connection errors. See the SimpleTCPMessageDemo.

The SimpleTCPMessageDemo show how to use it to send and receive a simple message with a header (the message length) and a payload (a string in your case).

SimpleTCPMessageDemo source extract:

//
// This extension add send / receive message methods - implementing a rudimentary message layer
// The message protocol has two parts: a 16-bit header followed by n-bytes payload
// The 16-bit header value represent the payload size in bytes
//
extension Socket {
    
    func sendMessage(data payload: Data) throws -> Int {
        let header  = UInt16(payload.count)
        let headerData = withUnsafeBytes(of: header.bigEndian) { bytes in Data(bytes) }
        return try self.write(from: headerData + payload)
    }
    
    func recvMessage(into data: inout Data, timeout: UInt) throws -> Int {
        // Read the header
        var headerData = Data()
        let headerCount = try self.read(into: &headerData, length: MemoryLayout<UInt16>.size, timeout: timeout)
        if headerCount > 0 {
            // Then read the payload
            // Compute the payload length from the header value
            let payloadLength = headerData.withUnsafeBytes { bytes in bytes.load(as: UInt16.self) }.bigEndian
            print("recvMessage: Received header: \(headerCount) bytes, value: \(payloadLength)")
            
            // Read the remaining payloadLength bytes
            print("recvMessage: Read remaining \(payloadLength) bytes...")
            return try self.read(into: &data, length: Int(payloadLength), timeout: timeout)
        }
        return 0
    }
}

I think this fit the LSP header/string simple base protocol requirement and can be used as a starting point for your use case.