SE-0525: Safe loading API for RawSpan

Hi all,

The review of SE-0525: Safe loading API for RawSpan begins now and runs through April 16, 2026.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to the review manager via the forum messaging feature. When contacting the review manager directly, please keep the proposal link at the top of the message.

Try it out

You can try out this feature by downloading an experimental toolchain:

What goes into a review?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift. When writing your review, here are some questions you might want to answer:

  • What is your evaluation of the proposal?
  • Is the problem being addressed significant enough to warrant a change to Swift?
  • Does this proposal fit well with the feel and direction of Swift?
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

More information about the Swift evolution process is available here.

Thank you,

Xiaodi Wu
Review Manager


The proposal says,

Loading and storing in-memory types

The memory layout of many of the types eligible for ConvertibleFromBytes and/or ConvertibleToBytes is not guaranteed to be stable across compiler and library versions. This is not an issue for the use case envisioned for these API, where data is sent among running processes, or stored for later use by the same process. For more elaborate needs such as serializing for network communications or file system storage, the API proposed here can only be considered as a building block.

It then proposes that Int, UInt, Duration, ObjectIdentifier and various Range and Pointer types and even CollectionOfOne should conform to one or both of these protocols.

I think that that's equivalent to defining that the memory layouts of these types are guaranteed to be stable across compiler and library versions, because the proposal cannot exercise any control over how people use these conformances, and they will be used for persistent serialization.

I also think that it's a mistake to define conformances to these protocols for types which differ between architectures (Int, ObjectIdentifier, Pointers), for the same reason; it's a footgun for developers and a constraint on the future evolution of the language.

I think the list of conforming types should be limited to built-in integers, floating point types, inline arrays (and ideally tuples, simd vectors and PoD structs) where the elements conform.
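To make the architecture concern concrete, here is a minimal sketch that uses only existing standard library API, not the proposed protocols. It shows that the byte count of an Int's in-memory representation follows the platform word size, while Int64's does not. The helper name rawBytes is hypothetical.

```swift
// Hypothetical helper: copies the in-memory bytes of a value into an array.
func rawBytes<T>(of value: T) -> [UInt8] {
    withUnsafeBytes(of: value) { Array($0) }
}

let intBytes = rawBytes(of: Int(1))
let int64Bytes = rawBytes(of: Int64(1))

// Int64 is 8 bytes on every target; Int is 8 bytes on 64-bit targets
// and 4 bytes on 32-bit ones, so the two counts can differ.
print(intBytes.count, int64Bytes.count)
```

Bytes written this way on one architecture cannot be assumed to decode to the same Int on another.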


The types we are proposing conformances for all have an @frozen layout, and all have the characteristic that they either have a single stored property, or all their stored properties are of the same type. In earlier proposals (SE-0333, SE-0349, SE-0370, and SE-0107), we established some rules for layout that codify the ways in which such types are compatible with each other. For any type T and a struct defined as struct Wrapper { var wrapped: T } , there is a layout equivalence between the two when Wrapper is frozen. Similarly, any struct that contains many stored Ts is layout-compatible with a homogeneous tuple of the same number of T instances.

In other words, this proposal constrains nothing that hasn't already been constrained. For example, CollectionOfOne is defined exactly as Wrapper above, with protocol conformances layered on top. CollectionOfOne<Int> has the same layout as Int, Range<Int> has the same layout as [2 of Int]. Where possible, layout constraints such as BitwiseCopyable and ConvertibleToBytes should follow layout equivalence.
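As a rough consistency check of the sizes involved, the following sketch compares MemoryLayout values with existing API. Matching sizes do not prove layout equivalence; they only show the examples above are consistent with it. The generic Wrapper here is an illustrative stand-in, and @frozen is omitted because the sketch lives in a single module.

```swift
// Illustrative stand-in for the Wrapper described above.
struct Wrapper<T> { var wrapped: T }

let word = MemoryLayout<Int>.size

// A single stored property: same size as the wrapped type.
let wrapperMatches = MemoryLayout<Wrapper<Int>>.size == word
let collectionOfOneMatches = MemoryLayout<CollectionOfOne<Int>>.size == word

// Two stored properties of the same type: same size as two Ints.
let rangeMatches = MemoryLayout<Range<Int>>.size == 2 * word

print(wrapperMatches, collectionOfOneMatches, rangeMatches)
```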

As for pointer-sized types such as Int and UnsafePointer, whether their conformances could be misused is beyond the scope of this proposal. We merely are trying to codify when transiting to and from untyped memory is or isn't safe. If it is safe to load or store both Int32 and Int64, then it is safe to load or store Int as well.


Off-topic, but this is news to me. Do you have a reference for it?

I searched for the word “struct” in all three SE proposals that you linked, and I did not find any mention of such a layout guarantee.

My recollection from discussions on this forum is that the recommendation has consistently been to define a struct with a single stored property of homogeneous-tuple type.

If that is no longer the case, I’d be delighted to learn about it!

I thought that was stated in SE-0333, but we must have taken that item out from the final version. It is rather implied by the documentation of withMemoryRebound (e.g. the second note here.) In any event, this is true of @frozen aggregates, which I was talking about above. The layout of those being frozen, it is rather likely to remain so. Wrapping a single tuple (or a single InlineArray) remains the most prudent way to achieve the effect, though.

For some reason, in SE-0333 we used the term “layout equivalent”, whereas everywhere else the term has been “layout compatible”, and that’s how 0333 didn’t come up when I originally generated the list of proposals above.


Thanks!

From the linked SE–0333 proposal:

Homogeneous aggregate types (tuples, array storage, and frozen structs) are layout equivalent if they have the same number of layout-equivalent elements.

So it’s guaranteed for frozen structs, but not necessarily non-frozen ones.

And nowadays it’s easy enough to have a single stored InlineArray, which works even for non-frozen structs.
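For readers on toolchains without InlineArray, here is a minimal sketch of the older single-tuple wrapping mentioned earlier in the thread, using only long-standing API. Vec4 is a hypothetical type chosen for illustration.

```swift
struct Vec4 {
    // A single stored property of homogeneous-tuple type.
    var storage: (Float, Float, Float, Float)
}

let v = Vec4(storage: (1, 2, 3, 4))

// Read the third element through the raw bytes. The offset is a multiple
// of Float's stride, so the aligned load is valid.
let third = withUnsafeBytes(of: v) { raw in
    raw.load(fromByteOffset: 2 * MemoryLayout<Float>.stride, as: Float.self)
}
print(third)
```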


Here’s a macOS toolchain to test this proposal:

https://download.swift.org/tmp/pull-request/88304/2284/xcode/swift-PR-88304-2284-osx.tar.gz

Ubuntu 22.04:

https://download.swift.org/tmp/pull-request/88304/1706/ubuntu2204/PR-ubuntu2204.tar.gz

Windows:

https://ci-external.swift.org/job/swift-PR-build-toolchain-windows/6465/artifact/*zip*/archive.zip


Maybe there's a tradeoff here. I do agree that it shouldn't be encouraged to rely on ambiguous, obscure, or platform-variable details about memory layout, which is an argument for limiting ConvertibleToBytes/ConvertibleFromBytes conformances. On the other hand, I think it's desirable to avoid the need for memory-unsafe code whenever possible, which is an argument for broadening ConvertibleToBytes/ConvertibleFromBytes conformances.

Maybe the solution is to document that, even though load(fromByteOffset:as:) and store(fromByteOffset:as:) are memory-safe, they could still cause other issues such as cross-platform portability issues, so they should still be used with care. Maybe they could be given different names that do better at communicating this; for example, maybe they could incorporate a phrase such as "bit pattern" or "reinterpreting bits as", similar to how the phrase "bit pattern" is used for some BinaryInteger APIs.
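For reference, this is roughly how the existing "bit pattern" spellings read today, together with one way the portability issue shows up. It is a sketch using existing standard library API only; nothing here is part of the proposal.

```swift
let unsigned: UInt32 = 0xFFFF_FFFF
let signed = Int32(bitPattern: unsigned)   // same 32 bits, read as two's complement
let one = Float(bitPattern: 0x3F80_0000)   // IEEE 754 encoding of 1.0

// The in-memory byte order of a multi-byte integer depends on the platform.
// Converting to a fixed endianness first makes the bytes platform-independent.
let value: UInt32 = 0x1234_5678
let littleEndianBytes = withUnsafeBytes(of: value.littleEndian) { Array($0) }
print(signed, one, littleEndianBytes)
```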


These preconditions are quite complex. I wonder if it would be better for the initializer to be failable (that is, return an optional value), so the user could choose a fallback behavior if the preconditions don't hold without having to check the preconditions manually, and a force-unwrap could be used if needed.
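A sketch of what the failable shape could look like, written against existing unsafe API. The exact preconditions under discussion are not reproduced in this thread, so this assumes two typical ones: the byte count is a whole number of elements, and the base address is suitably aligned. The helper elementCount is hypothetical.

```swift
// Hypothetical helper, not part of the proposal: returns how many `T` values
// the buffer could be viewed as, or nil when the assumed preconditions fail.
func elementCount<T>(in bytes: UnsafeRawBufferPointer, as type: T.Type) -> Int? {
    let stride = MemoryLayout<T>.stride
    // Assumed precondition 1: the buffer holds a whole number of elements.
    guard bytes.count % stride == 0 else { return nil }
    // Assumed precondition 2: the base address is aligned for T.
    if let base = bytes.baseAddress {
        let address = UInt(bitPattern: base)
        guard address % UInt(MemoryLayout<T>.alignment) == 0 else { return nil }
    }
    return bytes.count / stride
}

let words: [UInt32] = [1, 2, 3, 4]
let whole = words.withUnsafeBytes { elementCount(in: $0, as: UInt32.self) }
let ragged = words.withUnsafeBytes {
    elementCount(in: UnsafeRawBufferPointer(rebasing: $0[0..<7]), as: UInt32.self)
}
print(whole as Any, ragged as Any)
```

With an optional result, the caller can pick a fallback for the nil case or force-unwrap when the preconditions are known to hold.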

I've noticed that some conversions between span types are initializers such as Span.init(viewing:), while others are properties such as MutableSpan.bytes. Maybe a consistent rule should be elaborated regarding which conversions are spelled as initializers, and which conversions are spelled as properties.


I think the initializer should be spelled MutableSpan(viewing:) instead of MutableSpan(mutating:), to be consistent with Span(viewing:), and to explicitly spell out the bit-reinterpretation aspect. The mutation aspect is already implied by the type name, which is (almost) always visible when calling the initializer.

I'm not sure if the overload taking a MutableRawSpan as inout should exist; it would be redundant with the consuming overload if "reborrowing" ever becomes a thing. Without "reborrowing", a workaround would be MutableSpan(mutableRawSpan.extracting(...)).

The consuming overload doesn't require ConvertibleToBytes, even though it should. The assumption seems to be that ending the lifetime of a MutableRawSpan makes the memory it points to become unobservable, so it's okay to store uninitialized bytes in the memory. This is false, because a MutableRawSpan is lifetime-bound to a mutating, not consuming, access of the base container (or sometimes even a mutating access of another MutableRawSpan, by performing a manual "reborrow" with the extracting method); the base container, and thus the memory pointed to by the MutableRawSpan, will become observable again after the lifetime of the MutableRawSpan ends. For example, the following code would invoke undefined behavior by reading uninitialized memory:

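// 50 initialized bytes.
var array = Array(repeating: UInt8(0), count: 50)
// View the array's bytes as a single [50 of Void], which is 50 bytes of padding.
var voidArraySpan = MutableSpan<[50 of Void]>(
    array.mutableSpan.mutableBytes
)
// Storing the value may write 50 uninitialized (padding) bytes into the array's storage.
voidArraySpan[0] = [50 of Void].init(repeating: Void())
// The array is observable again once the span's lifetime ends, so this reads those bytes.
print(array)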

I don't think the load(fromByteOffset:as:) and store(fromByteOffset:as:) functions should default the fromByteOffset parameter to zero. With the absence of an explicit byte offset, I think the expectation would be that the entire memory range is being operated on, not a prefix of it. I think, if there are overloads of these functions that lack an explicit byte offset, they should have the precondition that the span has the same size (or stride) as the type being loaded or stored. This is particularly true, I think, for the integer versions of these APIs; it seems confusing that bytes.load(as: Int.self, .littleEndian) would only read a prefix of bytes.
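The prefix behavior can be seen with the existing UnsafeRawBufferPointer API, which has the same defaulted offset. The sketch below also shows one possible stricter variant; loadWhole is a hypothetical name, not something the proposal defines.

```swift
let bytes: [UInt8] = [1, 2, 3, 4, 5, 6, 7, 8]

// Existing API: the offset defaults to zero, so only the first two of the
// eight bytes are read.
let prefixValue = bytes.withUnsafeBytes {
    UInt16(littleEndian: $0.loadUnaligned(as: UInt16.self))
}

// Hypothetical stricter variant: succeeds only when the buffer is exactly one T.
func loadWhole<T: FixedWidthInteger>(_ raw: UnsafeRawBufferPointer, as type: T.Type) -> T? {
    guard raw.count == MemoryLayout<T>.size else { return nil }
    return raw.loadUnaligned(as: T.self)
}

let strict = bytes.withUnsafeBytes { loadWhole($0, as: UInt16.self) }
print(prefixValue, strict as Any)
```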


I don't think the integer loading and storing APIs should require ConvertibleFromBytes & FixedWidthInteger. It's possible for a custom FixedWidthInteger type to not store the bytes of the integer inline in the platform-native format. I would prefer an API that uses the BinaryInteger protocol requirements to convert values to and from bytes, and thus only requires FixedWidthInteger, instead of making assumptions about the memory layout of such a type.
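A minimal sketch of that alternative, assuming little-endian byte order and a bit width that is a multiple of 8. It uses only operations available through FixedWidthInteger and never looks at the type's in-memory layout. Both function names are hypothetical.

```swift
// Hypothetical helper: assembles an integer from little-endian bytes using
// shifts and ORs only. Returns nil when the byte count does not match T.
func decodeLittleEndian<T: FixedWidthInteger>(_ bytes: [UInt8], as type: T.Type) -> T? {
    guard bytes.count * 8 == T.bitWidth else { return nil }
    var result: T = 0
    for (index, byte) in bytes.enumerated() {
        // The byte is zero-extended into T, then moved to its position.
        result |= T(truncatingIfNeeded: byte) << (8 * index)
    }
    return result
}

// Hypothetical helper: splits an integer into little-endian bytes.
func encodeLittleEndian<T: FixedWidthInteger>(_ value: T) -> [UInt8] {
    (0..<T.bitWidth / 8).map { UInt8(truncatingIfNeeded: value >> (8 * $0)) }
}

let decoded = decodeLittleEndian([0x78, 0x56, 0x34, 0x12], as: UInt32.self)
let encoded = encodeLittleEndian(UInt32(0x1234_5678))
print(decoded as Any, encoded)
```

The likely tradeoff is speed: a direct load can compile to a single instruction, whereas this form depends on the optimizer recognizing the pattern.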


I don't understand how this is true. Presumably, the storeBytes function will be implemented as a memcpy operation. To my knowledge, memcpy also copies the initialization state of each byte, so each destination byte becomes uninitialized if and only if the corresponding source byte is uninitialized. In other words, when an uninitialized byte is loaded from memory, and then written to some other byte in memory, the destination byte in memory becomes uninitialized, even if it was previously initialized. In LLVM terms, it isn't a no-op to store undef or poison to previously-initialized memory. For example, in C, if src is uninitialized and dest is initialized, and then memcpy(&dest, &src, sizeof(dest)) is executed, then dest becomes uninitialized.

Practically speaking, since it likely wouldn't be known which bytes are padding bytes, the uninitialized bytes would likely just be copied into the memory pointed to by the mutable span, and thus become observable in safe code, potentially leading to the kinds of security vulnerabilities associated with reading uninitialized memory. For example, if a value of type [50 of Void], which is just 50 bytes of padding, is passed into storeBytes, in a context where the optimizer doesn't optimize away the copying of padding, then the uninitialized contents of the padding bytes are copied and become observable in safe code.


I think it should be possible for Never to conform to ConvertibleToBytes, even though Never can't conform to ConvertibleFromBytes.

Padding bytes can be dependent on the specific value, not just on the type. That is, different values can have different padding bytes. For example, different cases of an enum with associated values can have different padding bytes. So it's possible for some values of a type to have padding bytes, and other values of the same type to not have padding bytes.

So, we can say that a type can conform to ConvertibleToBytes if and only if, for all values of the type, the value has no padding bytes. This is vacuously true for Never, because Never has no values. Therefore, Never can conform to ConvertibleToBytes.


I don’t disagree. The properties are modifications of the existing ones from SE-0447 and SE-0467; some of the new initializers are the ones those properties require, now made public.

This is interesting. The existing APIs on UnsafeRawBufferPointer have this shape, with a defaulted parameter, and I’ve not seen this criticism of them.

Very simply, MemoryLayout<Never>.stride - MemoryLayout<Never>.size equals 1, ergo there is a padding byte. For now we only allow types for which size and stride are equal to conform to ConvertibleToBytes. In practice this should not matter though; what code can we write when Never conforms that cannot be written if it does not?
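The numbers can be checked with existing API. In this sketch, sizeEqualsStride is a hypothetical helper that mirrors the "size equals stride" rule as stated here, and Padded is an illustrative type; note that such a check can only see trailing padding, not padding between stored properties.

```swift
// Illustrative type with trailing padding.
struct Padded {
    var a: UInt32
    var b: UInt8
}

// Hypothetical helper mirroring the "size equals stride" rule described above.
func sizeEqualsStride<T>(_ type: T.Type) -> Bool {
    MemoryLayout<T>.size == MemoryLayout<T>.stride
}

print(MemoryLayout<Never>.size, MemoryLayout<Never>.stride)
print(sizeEqualsStride(UInt32.self), sizeEqualsStride(Never.self), sizeEqualsStride(Padded.self))
```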

Thanks for the discussion of memcpy; I’ll look some more into the shortcomings.
