[Pitch] BitwiseCopyable marker protocol

nate_chandler · February 9, 2024, 10:44pm

We are proposing a new marker protocol to identify types that can be trivially copied, moved, and destroyed. This supports @glessard’s recent StorageView pitch and fills a long-standing gap in our ability to write certain sorts of performant low-level code.

Introduction

We propose a new marker protocol BitwiseCopyable that can be adopted by types that can be moved or copied with direct calls to memcpy and that require no special destroy operation.
When compiling generic code with such constraints, the compiler can emit these efficient operations directly, only requiring minimal overhead to look up the size of the value at runtime.
Alternatively, developers can use this constraint to selectively provide high-performance variations of specific operations, such as bulk copying of a container.

Motivation

Swift can compile generic code into an unspecialized form in which the compiled function receives a value and type information about that value.
Basic operations are implemented by the compiler as calls to a table of "value witness functions."

This approach is flexible, but can represent significant overhead.
For example, using this approach to copy a buffer with a large number of Int values requires a function call for each value.

Constraining the types in generic functions to BitwiseCopyable allows the compiler (and in some cases, the developer) to instead use highly efficient direct memory operations in such cases.
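As a rough sketch of the developer-side case, a generic function constrained to BitwiseCopyable can copy a whole array with one bulk memory operation instead of one value-witness call per element. The name bulkCopy is hypothetical and not part of the proposal, and the sketch assumes a toolchain in which the protocol is available (Swift 6 or later).

```swift
// Hypothetical helper, not part of the proposal: copies an array of
// BitwiseCopyable elements with a single bulk memory copy.
func bulkCopy<T: BitwiseCopyable>(_ source: [T]) -> [T] {
    return [T](unsafeUninitializedCapacity: source.count) { destination, initializedCount in
        source.withUnsafeBytes { sourceBytes in
            // One memcpy-style copy of the whole buffer; no per-element calls.
            UnsafeMutableRawBufferPointer(destination).copyMemory(from: sourceBytes)
        }
        // Tell the array how many elements are now initialized.
        initializedCount = source.count
    }
}

let copied = bulkCopy([1, 2, 3, 4])
```

Calling bulkCopy with an array of String would be rejected at compile time, because String does not satisfy the constraint.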

The standard library already contains many examples of functions that could benefit from such a concept, and more are being proposed:

The UnsafeMutablePointer.initialize(to:count:) function introduced in SE-0370 could use a bulk memory copy whenever it statically knew that its argument was BitwiseCopyable.

The proposal for StorageView includes the ability to copy items to or from potentially-unaligned storage, which requires that it be safe to use bulk memory operations:

public func loadUnaligned<T: BitwiseCopyable>(
  fromByteOffset: Int = 0, as: T.Type
) -> T

public func loadUnaligned<T: BitwiseCopyable>(
  from index: Index, as: T.Type
) -> T
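For comparison, the standard library's raw buffer pointers already offer a loadUnaligned(fromByteOffset:as:) method (added by SE-0349). The sketch below uses that existing API, not the pitched StorageView one, to read a UInt32 starting at an odd byte offset. The byte values are made up for illustration.

```swift
let bytes: [UInt8] = [0xFF, 0x01, 0x00, 0x00, 0x00]

let word = bytes.withUnsafeBytes { raw in
    // Offset 1 is not 4-byte aligned, so an aligned load(as:) would be invalid here.
    UInt32(littleEndian: raw.loadUnaligned(fromByteOffset: 1, as: UInt32.self))
}
// Read as little-endian, the bytes 01 00 00 00 decode to 1.
```

Wrapping the result in UInt32(littleEndian:) keeps the decoded value independent of the host's byte order.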

Proposed solution

We add a new protocol BitwiseCopyable to the standard library:

@_marker public protocol BitwiseCopyable {}

Many basic types in the standard library will conform to this protocol.

Developers' own types may conform to the protocol as well.
The compiler will check any such conformance and emit a diagnostic if the type contains elements that are not BitwiseCopyable.

Furthermore, when building a module, the compiler will infer conformance to BitwiseCopyable for any non-resilient struct or enum defined within the module whose stored members are all BitwiseCopyable.

Developers cannot conform types defined in other modules to the protocol.
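A minimal sketch of how these rules might look in practice, assuming a toolchain in which BitwiseCopyable is available. The types Point and Segment and the function byteSize are hypothetical names used only for illustration.

```swift
// Explicit conformance: the compiler checks that every stored member qualifies.
struct Point: BitwiseCopyable {
    var x: Int
    var y: Int
}

// No conformance written: per the pitch, it is inferred for this internal
// struct because both stored members are BitwiseCopyable.
struct Segment {
    var start: Point
    var end: Point
}

// Hypothetical generic function that only accepts BitwiseCopyable types.
func byteSize<T: BitwiseCopyable>(of type: T.Type) -> Int {
    return MemoryLayout<T>.size
}

let segmentSize = byteSize(of: Segment.self)

// By contrast, this would be diagnosed, since String is not BitwiseCopyable:
// struct Named: BitwiseCopyable { var name: String }
```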

The full proposal can be found here:
BitwiseCopyable

Joe_Groff · February 9, 2024, 11:14pm

Thanks for writing this up!

We need to have a reckoning about the term "marker protocol", because nothing besides Sendable that we've added to the core language so far really strictly satisfies its properties. Whether a type is BitwiseCopyable or not is dynamically discoverable (we could support x is BitwiseCopyable if we wanted to), and it's an intrinsic property of the type: as the proposal notes, it can't be retroactively imposed on external types, and whether a type satisfies the constraint or not is strictly determinable from its layout.

I think that whether an instantiation of a generic type is BitwiseCopyable should be determinable without requiring an explicit conditional conformance, since it would be too big a burden on existing code to have to add explicit conditional conformances to existing types, and developers won't be able to retroactively assert BitwiseCopyable on existing libraries. Rather than thinking of this as "inference" of a conditional conformance, I think we should check whether each concrete instantiation of a type individually satisfies the bitwise copyability requirements when it's used in a context that requires BitwiseCopyable. Developers could of course state the explicit conditional conformance if they want the compiler to check that it holds, or to promise it as public API.
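For reference, the explicit conditional conformance that this paragraph argues should be optional would look roughly like the following. Box and isAccepted are hypothetical names, and the sketch assumes a toolchain in which BitwiseCopyable is available. Whether the extension ends up being required for generic types is exactly what is under discussion here.

```swift
// Hypothetical generic wrapper used only for illustration.
struct Box<Value> {
    var value: Value
}

// The explicit form: Box is BitwiseCopyable whenever its payload is.
extension Box: BitwiseCopyable where Value: BitwiseCopyable {}

// Hypothetical function standing in for any API with this constraint.
func isAccepted<T: BitwiseCopyable>(_ value: T) -> Bool {
    return true
}

let accepted = isAccepted(Box(value: 42))   // Box<Int> satisfies the constraint
// isAccepted(Box(value: "hi"))             // rejected: Box<String> does not
```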

Whether a module is built with library evolution or not should not affect its outward behavior. (The cases where we already do have different behavior we've generally realized are mistakes after the fact.) A public type from another module should not be automatically usable as a BitwiseCopyable type unless it's either explicitly @frozen or explicitly publicly states the BitwiseCopyable conformance, regardless of library evolution mode.

jrose · February 9, 2024, 11:44pm

I’m going to suggest that “Copyable” shouldn’t be in the name of this protocol; a type can be non-Copyable but still bitwise-movable and trivial to destroy. It doesn’t come up a lot (nearly all non-Copyable types have deinits), but when it does it seems like such types should be eligible for whatever optimizations someone might hinge on this protocol.

At the same time I understand the desire to avoid the loaded terms “trivial” and “POD”, and because Swift has types with non-memcpy move operations, “deinit-less” or similar doesn’t cut it.

glessard · February 9, 2024, 11:55pm

fwiw we expect something like BitwiseMovable to eventually exist too. Many things would be BitwiseMovable without being BitwiseCopyable, such as AnyObject.

allevato · February 9, 2024, 11:55pm

I didn't see the syntax mentioned in the proposal, but should a type be allowed to explicitly suppress its implicit BitwiseCopyable conformance like this?

struct Foo: ~BitwiseCopyable {
  var x: Int
  var y: Int
}

I can't think of a great reason to want to do this, though. Maybe if you don't want to promise that future changes to the type would continue to have that property, so people don't use it in generic contexts that require a T: BitwiseCopyable that could break later? Would that be useful, though?

I do wish we had a better way to, perhaps syntactically, distinguish these kinds of layout constraints or intrinsic traits from other protocols. As the set of these grows—AnyObject, Copyable, BitwiseCopyable, Escapable...—I fear that it's going to be harder to explain that there's some set of "protocols" that are synthesized by default and some protocols that are only synthesized if you ask for it, and it's not obvious which are which because they share the same namespace and syntax. Especially since we've consciously made the decision to shift away from implicit synthesis for behavior-providing protocols like Equatable, Hashable, Codable, and so forth.

If we're talking about supporting dynamic casts for these constraints, then maybe they're already "protocol-like" enough that it would be too awkward to distinguish them further. But it feels like we're moving in a direction that's less-than-ideal for user education.

Joe_Groff · February 10, 2024, 12:12am

allevato:

I do wish we had a better way to, perhaps syntactically, distinguish these kinds of layout constraints or intrinsic traits from other protocols. As the set of these grows—AnyObject, Copyable, BitwiseCopyable, Escapable...—I fear that it's going to be harder to explain that there's some set of "protocols" that are synthesized by default and some protocols that are only synthesized if you ask for it, and it's not obvious which are which because they share the same namespace and syntax. Especially since we've consciously made the decision to shift away from implicit synthesis for behavior-providing protocols like Equatable, Hashable, Codable, and so forth.

If we're talking about supporting dynamic casts for these constraints, then maybe they're already "protocol-like" enough that it would be too awkward to distinguish them further. But it feels like we're moving in a direction that's less-than-ideal for user education.

I agree. This is part of why I've been trying to shoo people away from calling them "marker protocols" even if that happens to be a convenient way to bootstrap their implementations. Internally, the compiler calls AnyObject a "layout constraint", and it also has an internal _Trivial layout constraint that's currently only used for partial specialization but otherwise represents the same intrinsic layout requirements as BitwiseCopyable; I'm not sure whether that's the best user-facing term though. With dynamic discoverability, I think the biggest inherent difference between true protocols and these layout constraints is that protocol conformances are extrinsic to the type: the conformance has an independent identity from the type itself; a conformance can be retroactively added to a type, and technically there can be multiple retroactive conformances of the same protocol to the same type (even though we discourage that). These layout constraints on the other hand are intrinsic to the type; they arise from fundamental properties of its definition. A type can't be retroactively made BitwiseCopyable or AnyObject if it wasn't originally defined to satisfy those constraints. And while we can support declaring explicit "conformances" to a layout constraint, there's no way to satisfy that constraint except by its definition meeting the inherent requirements of the layout constraint.

allevato · February 10, 2024, 12:26am

Another question that just occurred to me around suppression; how does BitwiseCopyable interact with Copyable? If I write the following type:

struct Foo: ~Copyable {
  var x: Int
  var y: Int
}

does it still get an implicit conformance to BitwiseCopyable? Does it make sense today in Swift to have a type that's BitwiseCopyable but not Copyable, or vice versa?

I say "today" above because I can imagine a future where we let value types supply their own custom implementation of copying—I have a C library with custom allocate/free/copy functions that I wish I could wrap as a simple struct in Swift without introducing the overhead of a class internally. It would make sense for such a type to be Copyable & ~BitwiseCopyable.

BitwiseCopyable & ~Copyable is harder for me to wrap my head around, but maybe I'm not being sufficiently imaginative.

Dante-Broggi · February 10, 2024, 12:34am

Perhaps call the intrinsic protocols "Traits", as like Rust traits only the module which declares the type can declare conformances to them?

Joe_Groff · February 10, 2024, 1:29am

allevato:

Another question that just occurred to me around suppression; how does BitwiseCopyable interact with Copyable? If I write the following type:
struct Foo: ~Copyable {
  var x: Int
  var y: Int
}
does it still get an implicit conformance to BitwiseCopyable? Does it make sense today in Swift to have a type that's BitwiseCopyable but not Copyable, or vice versa?

I would take BitwiseCopyable to require Copyable as a prerequisite. To @jrose's point, we could possibly narrow it down into a NoOpDeinit requirement, that says that no work has to be done to destroy a value of the type. Then BitwiseCopyable might be the composition of NoOpDeinit & BitwiseMovable & Copyable. (A type could plausibly hold interior pointers to its own representation but not require cleanup, which would render it as having a trivial deinit while still not being bitwise-movable or copyable since those interior pointers would then need fixup during a move or copy.)

taylorswift · February 10, 2024, 1:33am

Joe_Groff:

I'm not sure whether that's the best user-facing term though. With dynamic discoverability, I think the biggest inherent difference between true protocols and these layout constraints is that protocol conformances are extrinsic to the type: the conformance has an independent identity from the type itself; a conformance can be retroactively added to a type, and technically there can be multiple retroactive conformances of the same protocol to the same type (even though we discourage that). These layout constraints on the other hand are intrinsic to the type; they arise from fundamental properties of its definition. A type can't be retroactively made BitwiseCopyable or AnyObject if it wasn't originally defined to satisfy those constraints. And while we can support declaring explicit "conformances" to a layout constraint, there's no way to satisfy that constraint except by its definition meeting the inherent requirements of the layout constraint.

just wanted to chime in to say i have absolutely no problem with surfacing the term layout constraint. i have found it frustrating that official documentation continually insists on calling things like AnyObject “protocols” when they behave so differently from actual protocols.

there’s tons of compiler words i have come across that i would consider “jargon” (what on earth is a ValueDecl?) but layout constraints aren’t one of them.

Nobody1707 · February 10, 2024, 6:20am

Yes, please. I've wanted this for a long time.

I don't suppose we'd ever want something like C++23's std::is_implicit_lifetime or bytemuck's AnyBitPattern trait? i.e. a type that can be created from any appropriately sized set of initialized bytes (no invalid bit patterns, trivially destructible, etc.)

wadetregaskis · February 10, 2024, 3:33pm

How important is it for this to be in the type system? e.g. is this something that needs programmer-visible semantics, or is it truly just about runtime performance?

I would have thought the cost of a bit in the metatype (to say whether the type is trivially copyable) and the branch on that bit (at relevant points of copying) would be virtually free (in the grand scheme of things).

I too have an uneasy gut feeling about this increasing proliferation of new general protocols (markers or otherwise), in terms of the complexity it brings to the language. So it'd be wise to be sure it's unavoidably necessary in this case (as in all others).

ksluder · February 10, 2024, 4:50pm

It needs programmer-visible semantics. See the various APIs on UnsafeMutablePointer that are restricted to trivial (bitwise-copiable) types.

glessard · February 10, 2024, 5:36pm

Furthermore, the standard library is implemented in Swift, so it needs access to knowledge about characteristics that can be exploited for optimization. It’s not clear there’s a better way than the type system to obtain that knowledge. We have a bit in the metatype, and it’s not enough.
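To make the "bit in the metatype" concrete: the standard library exposes that runtime flag through the underscored function _isPOD, which is not a supported public API and is used here only as an illustration. A function can branch on it at run time, but the result cannot appear in a generic signature, which is the gap being described. The names Node and copyStrategy are hypothetical.

```swift
final class Node {}

// Hypothetical helper: reports which copy strategy a type would allow.
func copyStrategy<T>(for type: T.Type) -> String {
    // _isPOD is an underscored standard-library function; treat it as an
    // implementation detail rather than stable API.
    return _isPOD(type) ? "memcpy" : "value witness"
}

let intStrategy = copyStrategy(for: Int.self)
let nodeStrategy = copyStrategy(for: Node.self)
```

Here the check happens per call at run time, whereas a T: BitwiseCopyable constraint would let the compiler reject Node statically and pick the fast path without a branch.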

benlings · February 10, 2024, 10:30pm

You describe them as ‘intrinsic protocols’, and I don’t think that would be a bad name. It makes it clear that conformances can’t be added retroactively. ‘Layout constraint’ isn’t bad; but Swift calls constraints ‘protocols’ and ‘layout’ seems too implementation-focussed rather than based on the semantics.

Karl · February 11, 2024, 12:40am

I like the term layout constraint, and I think we should refine the overall concept of "marker protocols".

Swift already has more than just protocol conformance constraints -- it also has superclass constraints. For example, the following works:

protocol MyProto {}
class MyBaseClass {}

func test(_: any MyBaseClass & MyProto) {}
func test2(_: some MyBaseClass & MyProto) {}

Sendable is kind of interesting. It isn't like a normal protocol; it doesn't have any member or associated type requirements, and the requirements it does have can't be expressed in the language. It's a promise that a thing is thread-safe, and so I would describe it more as a behavioural constraint.

Additionally, it can be applied to things which don't support protocols, like functions and closures, and with the recent KeyPath & Sendable change, we've introduced the idea that whether a particular key-path "conforms" to Sendable doesn't even depend on its type.

Think about that for a minute - we've had conditional protocol conformances before, which said that Array<Element> conforms to Equatable when Element: Equatable but now -- just for Sendable -- we have a KeyPath<Root, Value>, and whether or not it conforms to Sendable is not knowable even if you know KeyPath, Root, and Value. It has become divorced from the type, and can now tag particular values.

We also have ~Copyable and (soon) ~Escapable constraints. Technically I guess these are negations of Copyable and Escapable "protocols", but those protocols are only useful in that they allow you to express inverse constraints. Again, they are not like normal protocols, and nobody should need to write something like <T: Copyable> because it is always implicit, so they won't be used like traditional protocols.

So yeah, I think we should stop calling these things protocols, start referring to them as constraints, and embrace that there are more interesting constraints than just protocol conformances.

Jumhyn · February 11, 2024, 5:59am

I don’t think this is quite so different-in-kind from how keypaths (and class hierarchies more generally) have always behaved. Since subclasses can add conformances, it’s always been the case that if a subclass D of a class C adds a conformance to P, then whether a value conforms to P isn’t knowable even if you know C. So my mental model for sendable keypaths is as though there were a suite of Sendable*KeyPath subclasses which express this additional axis for keypaths (but these subclasses are kept private to avoid exploding the number of classes we actually need to deal with). Then, the only way to utter such keypaths is to note the fact of the additional Sendable conformance in the type separately from the keypath type. And it’s always been the case that the particular keypath subclass you’re allowed to form depends on specifics of the keypath value itself (e.g. composed mutability along the member chain), so it’s not that surprising that this would extend to sendability too.

Karl · February 11, 2024, 9:51am

Grammatically it might look like you're expressing "some Sendable subclass of KeyPath", but once you look more closely it becomes clear that that mental model can only be a macroscopic approximation, and that reality involves quantum effects:

It isn't actually possible to express this relationship using class sub-typing: WritableKeyPath & Sendable is a subtype of WritableKeyPath, KeyPath, and KeyPath & Sendable. However, KeyPath & Sendable and WritableKeyPath do not have any sub-typing relationship between them. There is no line which connects those dots if you view them as concrete Sendable{X}KeyPath types.
The dynamic type is discoverable at runtime, regardless of whether it is private. So we can just directly ask and see that sendable and non-sendable key paths in fact have the same type.

So really the only explanation I can see is that, when we're talking about keypaths, the & Sendable constraint refers to the value, not the type. It's not "some Sendable subclass of KeyPath", it's "some subclass of KeyPath, and the value is Sendable".

Regex has a similar problem to KeyPath. But since Sendable constraints can be applied orthogonally to class sub-typing, there is the possibility that we could use a similar Regex & Sendable construct there.

Anyway, it's not directly relevant to this thread to get into the details. Suffice to say, Sendable is pretty weird as far as protocols go. If we dropped the "protocol" terminology and instead called it a "sendability constraint" or something along those lines, it's unlikely to hurt (and quite possibly may help) understanding.

Jumhyn · February 11, 2024, 3:11pm

Yeah, Sendable is totally a departure from ‘normal’ protocol behavior any way you slice it. I’m totally on board with the idea that we should crystallize how we present and talk about these pseudo-protocols in the language.

dnadoba · February 11, 2024, 4:42pm

Why is it called BitwiseCopyable instead of BytewiseCopyable? Every proposed API can only handle full bytes.

I'm asking because some networking protocols define messages that contain bit-sized fields which are not full bytes, e.g. IPv4.

It would be great if we had safe APIs in the future that allowed reading and writing these kinds of fields, and BitwiseCopyable would be a fitting name for such types, e.g. Int4. BytewiseCopyable could then also inherit from BitwiseCopyable, as every type that can be copied bytewise can also be copied bitwise, but not the other way around.
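As a concrete instance of such sub-byte fields: the first byte of an IPv4 header packs the 4-bit version and the 4-bit header length (IHL). Without a dedicated API, they are typically pulled apart by hand with shifts and masks, roughly as below. The sample byte is illustrative.

```swift
// First byte of a typical IPv4 header: version 4, IHL 5 (five 32-bit words).
let firstByte: UInt8 = 0x45

let version = firstByte >> 4          // upper four bits
let headerLength = firstByte & 0x0F   // lower four bits
```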