raw buffer pointer load alignment

According to the docs
<https://developer.apple.com/documentation/swift/unsaferawbufferpointer/2632297-load>,
the load(fromByteOffset:as:) method requires the instance to be “properly
aligned”. Does that mean if I have raw data meant to be interpreted as

0 1 2 3 4 5 6
[ Int8 | Int16 | Int16 | Int8 | Int8 ]

I can’t just load the Int16 from byte offset 1?

Hi Kelvin,

You can't just dereference that pointer as an Int16 without causing undefined behaviour (the same is true in C), but it's not really an issue. Just do this, assuming `ptr` is the raw buffer pointer into your data and `index` is the byte offset where your Int16 lives:

    var value: Int16 = 0
    withUnsafeMutableBytes(of: &value) { valuePtr in
        valuePtr.copyBytes(from: UnsafeRawBufferPointer(start: ptr.baseAddress!.advanced(by: index),
                                                        count: MemoryLayout<Int16>.size))
    }

you can have the whole thing generic too for <T: FixedWidthInteger>

    var value: T = 0
    withUnsafeMutableBytes(of: &value) { valuePtr in
        valuePtr.copyBytes(from: UnsafeRawBufferPointer(start: ptr.baseAddress!.advanced(by: index),
                                                        count: MemoryLayout<T>.size))
    }

-- Johannes
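
As an illustration, here is the copy approach from the reply above applied to the 7-byte layout in the question. This is only a sketch: the byte values are invented, and the file is assumed to store its integers little-endian.

```swift
// Seven bytes laid out as [ Int8 | Int16 | Int16 | Int8 | Int8 ] (made-up values).
let data: [UInt8] = [0x01, 0x34, 0x12, 0x78, 0x56, 0x02, 0x03]

let raw: Int16 = data.withUnsafeBytes { (ptr: UnsafeRawBufferPointer) -> Int16 in
    let index = 1   // byte offset of the first Int16; not a multiple of 2
    precondition(index + MemoryLayout<Int16>.size <= ptr.count)

    // Copy the bytes into an aligned local instead of loading through a misaligned pointer.
    var value: Int16 = 0
    withUnsafeMutableBytes(of: &value) { valuePtr in
        valuePtr.copyBytes(from: UnsafeRawBufferPointer(start: ptr.baseAddress! + index,
                                                        count: MemoryLayout<Int16>.size))
    }
    return value
}

// The copied bytes are still in file order, so state the byte order explicitly.
let decoded = Int16(littleEndian: raw)
print(decoded)
```

The local `value` is always correctly aligned for its own type, which is why the copy sidesteps the alignment requirement of load(fromByteOffset:as:).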

···

On 8 Nov 2017, at 4:54 pm, Kelvin Ma via swift-users <swift-users@swift.org> wrote:

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

yikes there’s no less verbose way to do that? and if the type isn’t an
integer there’s no way to avoid the default initialization? Can this be
done with opaques or something?

···

On Wed, Nov 8, 2017 at 7:04 PM, Johannes Weiß <johannesweiss@apple.com> wrote:

Hi Kelvin,

well, it's 5 lines for the generic case to rule all the integers. You could just put that in a function and never think about it again, right?

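    /// Reads a `T` from `pointer` starting at byte offset `index`; the offset need not be aligned.
    func integerFromBuffer<T: FixedWidthInteger>(_ pointer: UnsafeRawBufferPointer, index: Int) -> T {
        // The whole integer must lie inside the buffer.
        precondition(index >= 0)
        precondition(index <= pointer.count - MemoryLayout<T>.size)

        // Copy the bytes into a properly aligned local variable.
        var value = T()
        withUnsafeMutableBytes(of: &value) { valuePtr in
            valuePtr.copyBytes(from: UnsafeRawBufferPointer(start: pointer.baseAddress!.advanced(by: index),
                                                            count: MemoryLayout<T>.size))
        }
        return value // still in the byte order of the buffer
    }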

should work (untested). Also you might need to handle endianness.

Regarding types that are not integers, what types are you thinking of? For normal Swift types the layout isn't guaranteed, so you can't just read the bytes from somewhere. For C types where the layout is known you can just use the above code and either relax the constraint or specialise it to the very type you need. The only requirement of the type (besides that its layout is defined) is that it has an empty constructor.

-- Johannes

···

On 8 Nov 2017, at 5:40 pm, Kelvin Ma <kelvin13ma@gmail.com> wrote:

For context, the problem I’m trying to solve is efficiently parsing JPEG
chunks. This means reading each chunk of the JPEG from a file into a raw
buffer pointer, and then parsing the chunk according to its expected
layout. For example, a frame header chunk looks like this:

0 1 2 3 4 5 6
[ precision:UInt8 | height:UInt16 | width:UInt16 | Nf:UInt8 | ... ]

what I want to do is to be able to load the height and width into something
I can pass into UInt16.init(bigEndian:) without failing because of
alignment. I’ve thought of several options but none of them seem to be
great.

1 - bind the entire buffer to UInt8.self, and then do UInt16(buffer[1]) <<
UInt8.bitWidth | UInt16(buffer[2]). Probably most straightforward, but doesn’t
generalize well at all to larger Int types.

2 - copy MemoryLayout<UInt16>.size bytes from offset 1 into the beginning
of a new raw buffer, aligned to MemoryLayout<UInt16>.alignment, and do
load(fromByteOffset: 0, as: UInt16.self) from *that*. Seems very inefficient
because you have to allocate a new heap buffer, copy everything over, and
then free it just to hold the bytes in the right alignment.

3 - use withUnsafeMutablePointer(to:_:) on a local variable of type UInt16,
cast it to a raw pointer, and copy MemoryLayout<UInt16>.size bytes into it.
Like 2 it involves declaring a temporary variable which is annoying, and
also, while the default initialization isn’t that big a problem, it’s
introducing a meaningless value into the source code and can be problematic
for non-integer types. Also, wasn’t Swift supposed to be designed so that
Optional is the only thing which has a “default” value? Bool does not
default to false and Int does not default to 0. Default constructors are
evil.
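
For what it's worth, option 1 can be stretched to wider integers with a loop over the bytes. The sketch below is only an illustration: `bigEndianInteger` is a hypothetical helper name, the header bytes are invented, and it assumes the data is big-endian as in JPEG.

```swift
/// Hypothetical helper: assembles a big-endian integer one byte at a time,
/// so no aligned load (and no temporary buffer) is needed.
func bigEndianInteger<T: FixedWidthInteger>(_ buffer: UnsafeRawBufferPointer,
                                            at offset: Int,
                                            as type: T.Type) -> T {
    precondition(offset >= 0 && offset + MemoryLayout<T>.size <= buffer.count)
    var result: T = 0
    for i in 0 ..< MemoryLayout<T>.size {
        // Shift the bytes read so far up by one byte and append the next one.
        result = (result << 8) | T(truncatingIfNeeded: buffer[offset + i])
    }
    return result
}

// Made-up frame header: precision 8, height 480, width 640, Nf 3.
let frameHeader: [UInt8] = [8, 0x01, 0xE0, 0x02, 0x80, 3]
let (height, width) = frameHeader.withUnsafeBytes { buffer in
    (bigEndianInteger(buffer, at: 1, as: UInt16.self),
     bigEndianInteger(buffer, at: 3, as: UInt16.self))
}
print(height, width)
```

Because the value is built arithmetically, the result does not depend on the byte order of the host machine.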

···

On Wed, Nov 8, 2017 at 7:49 PM, Johannes Weiß <johannesweiss@apple.com> wrote:

> Regarding types that are not integers, what types are you thinking of? For
> normal Swift types the layout isn't guaranteed, so you can't just read the
> bytes from somewhere. For C types where the layout is known you can just
> use the above code and either relax the constraint or specialise it to the
> very type you need. The only requirement of the type (besides that its
> layout is defined) is that it has an empty constructor.

This isn’t what I’m trying to do atm but does this mean it’s not possible
to save a memory dump of a Swift struct to a file and then read it back in
from the file to reconstitute it? Also the empty constructor requirement is
problematic as explained before, especially when the type isn’t super
simple like an Int.

It would be reasonable to put this all in extension methods on UnsafeRawPointer for unaligned loads and stores using memcpy(). (The standard library really ought to have these at some point.)

-Joe
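
A minimal sketch of what such extension methods might look like follows. The names `unalignedLoad` and `unalignedStore` are hypothetical, not standard library API, the constraint to FixedWidthInteger is an assumption made to keep the example small, and the header bytes are invented.

```swift
import Foundation // provides memcpy

extension UnsafeRawPointer {
    /// Hypothetical helper: reads a `T` at `offset` without requiring alignment.
    func unalignedLoad<T: FixedWidthInteger>(fromByteOffset offset: Int = 0,
                                             as type: T.Type) -> T {
        var value: T = 0
        withUnsafeMutableBytes(of: &value) { destination in
            _ = memcpy(destination.baseAddress!, self + offset, MemoryLayout<T>.size)
        }
        return value
    }
}

extension UnsafeMutableRawPointer {
    /// Hypothetical helper: writes the bytes of `value` at `offset` without requiring alignment.
    func unalignedStore<T: FixedWidthInteger>(_ value: T, toByteOffset offset: Int = 0) {
        var source = value
        withUnsafeBytes(of: &source) { bytes in
            _ = memcpy(self + offset, bytes.baseAddress!, MemoryLayout<T>.size)
        }
    }
}

// Made-up frame header: precision 8, height 480, width 640, Nf 3.
var header: [UInt8] = [8, 0x01, 0xE0, 0x02, 0x80, 3]

let height = header.withUnsafeBytes { buffer in
    UInt16(bigEndian: buffer.baseAddress!.unalignedLoad(fromByteOffset: 1, as: UInt16.self))
}

// Overwrite the width with 1024, stored big-endian at the odd offset 3.
header.withUnsafeMutableBytes { buffer in
    buffer.baseAddress!.unalignedStore(UInt16(1024).bigEndian, toByteOffset: 3)
}
print(height, header)
```

Neither helper checks bounds, since a bare raw pointer does not know the size of its allocation; callers would need to do that themselves.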

···

On Nov 8, 2017, at 5:41 PM, Kelvin Ma via swift-users <swift-users@swift.org> wrote:

yikes there’s no less verbose way to do that? and if the type isn’t an integer there’s no way to avoid the default initialization? Can this be done with opaques or something?

Hi Kelvin,

> For context, the problem I’m trying to solve is efficiently parsing JPEG chunks. This means reading each chunk of the JPEG from a file into a raw buffer pointer, and then parsing the chunk according to its expected layout. For example, a frame header chunk looks like this:
>
> 0 1 2 3 4 5 6
> [ precision:UInt8 | height:UInt16 | width:UInt16 | Nf:UInt8 | ... ]
>
> what I want to do is to be able to load the height and width into something I can pass into UInt16.init(bigEndian:) without failing because of alignment. I’ve thought of several options but none of them seem to be great.
>
> 1 - bind the entire buffer to UInt8.self, and then do UInt16(buffer[1]) << UInt8.bitWidth | UInt16(buffer[2]). Probably most straightforward, but doesn’t generalize well at all to larger Int types.
>
> 2 - copy MemoryLayout<UInt16>.size bytes from offset 1 into the beginning of a new raw buffer, aligned to MemoryLayout<UInt16>.alignment, and do load(fromByteOffset: 0, as: UInt16.self) from that. Seems very inefficient because you have to allocate a new heap buffer, copy everything over, and then free it just to hold the bytes in the right alignment.
>
> 3 - use withUnsafeMutablePointer(to:_:) on a local variable of type UInt16, cast it to a raw pointer, and copy MemoryLayout<UInt16>.size bytes into it. Like 2 it involves declaring a temporary variable which is annoying, and also, while the default initialization isn’t that big a problem, it’s introducing a meaningless value into the source code and can be problematic for non-integer types. Also, wasn’t Swift supposed to be designed so that Optional is the only thing which has a “default” value? Bool does not default to false and Int does not default to 0. Default constructors are evil.

I agree. However, that meaningless value would just exist very temporarily in a function. I think you'd need a fancier type system to express 'this is an uninitialised value on the stack that can only be read after it has been written to'. Sure you could use a local Int16? but that'd come with some overhead.

With endianness I still think you can use that function below and you'll get it super efficient. The compiler will (likely) inline that whole function anyway.

What's the problem with the local temporary variable? You'd need that in C too. Maybe you can post the C code that you'd like to write? Then we can work from there and create some Swift code that does the same.

    enum Endianness {
        case little
        case big
    }

    func integerFromBuffer<T: FixedWidthInteger>(_ pointer: UnsafeRawBufferPointer, index: Int, endianness: Endianness = .big) -> T {
        precondition(index >= 0)
        precondition(index <= pointer.count - MemoryLayout<T>.size)

        var value = T()
        withUnsafeMutableBytes(of: &value) { valuePtr in
            valuePtr.copyBytes(from: UnsafeRawBufferPointer(start: pointer.baseAddress!.advanced(by: index),
                                                            count: MemoryLayout<T>.size))
        }
        switch endianness {
            case .little:
                return value.littleEndian /* does nothing on little endian, swaps on big */
            case .big:
                return value.bigEndian /* does nothing on big endian, swaps on little */
        }
    }

-- Johannes
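
As a usage sketch, the function from the reply above can be pointed at a frame header like the one described earlier in the thread. The function and enum are repeated so the example stands alone; the header bytes are invented for illustration.

```swift
enum Endianness {
    case little
    case big
}

func integerFromBuffer<T: FixedWidthInteger>(_ pointer: UnsafeRawBufferPointer,
                                             index: Int,
                                             endianness: Endianness = .big) -> T {
    precondition(index >= 0)
    precondition(index <= pointer.count - MemoryLayout<T>.size)

    var value = T()
    withUnsafeMutableBytes(of: &value) { valuePtr in
        valuePtr.copyBytes(from: UnsafeRawBufferPointer(start: pointer.baseAddress!.advanced(by: index),
                                                        count: MemoryLayout<T>.size))
    }
    switch endianness {
    case .little: return value.littleEndian
    case .big:    return value.bigEndian
    }
}

// Made-up frame header: precision 8, height 480, width 640, Nf 3.
let frameHeader: [UInt8] = [8, 0x01, 0xE0, 0x02, 0x80, 3]

// The result type drives the generic parameter, so each call site only names the offset.
let precision: UInt8 = frameHeader.withUnsafeBytes { integerFromBuffer($0, index: 0) }
let height: UInt16 = frameHeader.withUnsafeBytes { integerFromBuffer($0, index: 1) }
let width: UInt16 = frameHeader.withUnsafeBytes { integerFromBuffer($0, index: 3) }
print(precision, height, width)
```

The default of `.big` matches the JPEG case, so the height and width at the odd offsets 1 and 3 come back already converted to host byte order.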

···

On 9 Nov 2017, at 12:30 am, Kelvin Ma <kelvin13ma@gmail.com> wrote:

This solution works for what I’m doing right now. However, a semi-separate
issue comes up when T is a more complex type (or a class type). There
zero-parameter initializers don’t make sense. This sounds like something
that should be supported by Builtin.

···

On Thu, Nov 9, 2017 at 10:37 AM, Johannes Weiß <johannesweiss@apple.com> wrote:

Is it safe to use UnsafeRawPointer instead?

func loadBigEndianInt<I>(from buffer:UnsafeRawBufferPointer, atByteOffset offset:Int)
    -> I where I:FixedWidthInteger
{
    var i = I()
    withUnsafeMutablePointer(to: &i)
    {
        UnsafeMutableRawPointer($0).copyBytes(from: buffer.baseAddress! + offset,
            count: MemoryLayout<I>.size)
    }

    return I(bigEndian: i)
}
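
For comparison, toolchains released after this thread offer a built-in unaligned load. The sketch below assumes Swift 5.7 or newer, where `loadUnaligned(fromByteOffset:as:)` is available on UnsafeRawBufferPointer; it is not something the posters had in 2017, and the header bytes are invented.

```swift
// Made-up frame header: precision 8, height 480, width 640, Nf 3.
let frameHeader: [UInt8] = [8, 0x01, 0xE0, 0x02, 0x80, 3]

let height = frameHeader.withUnsafeBytes { buffer in
    // Offset 1 is not 2-byte aligned; loadUnaligned does not require it to be.
    UInt16(bigEndian: buffer.loadUnaligned(fromByteOffset: 1, as: UInt16.self))
}
let width = frameHeader.withUnsafeBytes { buffer in
    UInt16(bigEndian: buffer.loadUnaligned(fromByteOffset: 3, as: UInt16.self))
}
print(height, width)
```

This needs neither a temporary variable nor a default initializer at the call site, which were the two objections raised against the copy-based versions.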
