Parsing a Pascal extended-ASCII string using the `BinaryParsing` library

What is the recommended way to parse a string when the layout is [length: 16 bit][char: 8 bit x length]?

Something like this should work (assuming you have a raw buffer pointer to your Pascal string):

func parseString(from buffer: UnsafeRawBufferPointer) -> String? {
    guard buffer.count >= 2 else { return nil }

    // loadUnaligned (Swift 5.7+) avoids a trap when the buffer is not 2-byte aligned.
    let length = Int(buffer.loadUnaligned(as: UInt16.self))
    let end = length + 2

    guard buffer.count >= end else { return nil }

    return String(decoding: buffer[2..<end], as: UTF8.self)
}

This assumes that the buffer stores the length in host byte order. If the length is stored big-endian and you need to convert, you could do something like:

    let length = Int(UInt16(bigEndian: buffer.loadUnaligned(as: UInt16.self)))

[EDIT: add endianness]
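
If you'd rather not depend on host byte order or alignment at all, here is a minimal sketch that assembles the length from the two prefix bytes directly. It assumes the length is stored big-endian and the payload is UTF-8; the function name `parseBigEndianPascalString` is made up for illustration.

```swift
// Hypothetical helper: assumes a big-endian 16-bit length prefix and a UTF-8 payload.
func parseBigEndianPascalString(from buffer: UnsafeRawBufferPointer) -> String? {
    guard buffer.count >= 2 else { return nil }

    // Build the length byte by byte: no alignment requirement,
    // and the result does not depend on the host byte order.
    let length = Int(buffer[0]) << 8 | Int(buffer[1])
    let end = 2 + length

    guard buffer.count >= end else { return nil }

    return String(decoding: buffer[2..<end], as: UTF8.self)
}

// Length 5, followed by the ASCII bytes of "Hello".
let bytes: [UInt8] = [0x00, 0x05, 0x48, 0x65, 0x6C, 0x6C, 0x6F]
let parsed = bytes.withUnsafeBytes { parseBigEndianPascalString(from: $0) }
```

For a little-endian prefix, swap the roles of the two bytes.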

Sorry, I should have been clearer. I'm asking how to do it properly using the BinaryParsing package specifically.

For that layout, you'd want something like this:

import BinaryParsing

extension String {
   init(parsingPascalExtended input: inout ParserSpan) throws {
      // 16-bit big-endian length prefix, followed by that many bytes of text.
      let count = try Int(parsing: &input, storedAsBigEndian: UInt16.self)
      self = try String(parsingUTF8: &input, count: count)
   }
}

That assumes the count is stored as an unsigned, big-endian 16-bit integer. You can see all the different integer parsers in the package documentation.

Does that work for you?

I think this method should work for strings that use the basic ASCII set. In my case, the strings could use other single-byte encodings such as Windows-1251 or Windows-1252.

I tried copying the String extensions from the package and adjusting them, but they use internal methods, so I forked it: "Add parser for extended ASCII string" (HeMet/swift-binary-parsing@4befa15 on GitHub).
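
For context on why UTF-8 decoding is not enough here: bytes at or above 0x80 in Windows-1251 or Windows-1252 are usually not valid UTF-8, so `String(decoding:as:)` would replace them with U+FFFD. Below is a rough sketch of the decoding step on its own, using Foundation's `String.Encoding` rather than anything from BinaryParsing. The helper name `parsePascalString(from:encoding:)` is hypothetical, and a big-endian length prefix is assumed.

```swift
import Foundation

// Hypothetical helper, not part of BinaryParsing.
// Assumes a big-endian 16-bit length prefix followed by single-byte-encoded text.
func parsePascalString(from bytes: [UInt8], encoding: String.Encoding) -> String? {
    guard bytes.count >= 2 else { return nil }

    let length = Int(bytes[0]) << 8 | Int(bytes[1])
    let end = 2 + length

    guard bytes.count >= end else { return nil }

    // Foundation decodes the payload using the requested legacy code page.
    return String(data: Data(bytes[2..<end]), encoding: encoding)
}

// "Привет" encoded as Windows-1251, with a length prefix of 6.
let cyrillic: [UInt8] = [0x00, 0x06, 0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2]
let russian = parsePascalString(from: cyrillic, encoding: .windowsCP1251)

// "café" encoded as Windows-1252, with a length prefix of 4.
let latin: [UInt8] = [0x00, 0x04, 0x63, 0x61, 0x66, 0xE9]
let french = parsePascalString(from: latin, encoding: .windowsCP1252)
```

The encoding has to be known from elsewhere (a file header, a locale setting, and so on), since the bytes themselves do not say which code page they use.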
