What is ~Copyable for?

I find myself surprised to be asking this question, but it’s serious. What is the right way to think about when to make a ~Copyable type?

  • Since it confers non-copyability on anything that contains it (and there are no copy constructors) its utility would seem extremely limited. You can’t use it as a low-level resource manager to be embedded in a struct, for example, unless you want that struct to be ~Copyable too (which begs the question).

  • You could use it for resource management in a class instance, e.g. to manage an additional dynamic allocation (extremely limited utility when you could, e.g. use ManagedBuffer).

  • You could use it for resource management as a local variable; defer is generally a better tool though.

  • You can use it to represent a relationship such that an invariant would be broken due to the copy, e.g. a Foo stores the id of a related Bar and it's supposed to be a 1-1 relationship. But those kinds of invariants need to be maintained in a higher-level abstraction that owns the Foos and Bars. Not being able to temporarily have a copy of a Foo exist inside that abstraction seems like an inconvenient limitation with not much benefit.

The one real use case I know of is as a mitigation for the fact that Swift makes too many copies. But adding a language feature instead of fixing the underlying problem is not the Swift way, so I must be missing something.

[I know there are planned improvements like making it possible for Array to store ~Copyable types, but features like that don’t expand the motivations for ~Copyable types; they only reduce friction once you do have them.]

Thoughts?

4 Likes

One important use-case is when you need a deinit. It's not just about (or even primarily about?) avoiding copies.

Linear types were originally invented for resource management, yeah (well, originally it was a curiosity from formal logic, until people realized it could be applied to resource management). Not sure why you see the infectious nature as a downside -- it's sort of the whole point.

Not every resource lifetime is scoped to the dynamic extent of a single function call. You might want to initialize a resource in one function and "hand it off" somewhere else, or store it somewhere.
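
A minimal sketch of that hand-off, assuming a hypothetical `Connection` resource and a `Log` class that only exists to record what happens (neither comes from a real library):

```swift
// Hypothetical resource; names are illustrative, not from any real library.
final class Log { var events: [String] = [] }

struct Connection: ~Copyable {
    let id: Int
    let log: Log

    init(id: Int, log: Log) {
        self.id = id
        self.log = log
        log.events.append("open \(id)")
    }

    // Runs once, when the single owner's lifetime ends.
    deinit {
        log.events.append("close \(id)")
    }
}

// Creates the resource and hands ownership to the caller.
func makeConnection(log: Log) -> Connection {
    Connection(id: 1, log: log)
}

// Takes ownership; the connection is destroyed inside this function.
func finish(_ connection: consuming Connection) {
    connection.log.events.append("use \(connection.id)")
}

func handOffDemo() -> [String] {
    let log = Log()
    let connection = makeConnection(log: log)
    log.events.append("created")
    finish(connection)
    // Touching `connection` here would be a compile-time error.
    log.events.append("done")
    return log.events
}
```

The cleanup follows the value from `makeConnection` into `finish`, which is the part a `defer` in the creating function cannot express.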

I'm generally a "language feature skeptic" too and there are other people better positioned to defend this specific feature than I am, but generally, it seems to me that it's better from a language design standpoint if you can explicitly declare that something has a certain property and have this property be enforced at the type system level, instead of relying on a sufficiently smart optimizer. Optimizations can be hard to reason about, especially across function boundaries, and are prone to being defeated by small code changes.

It is true that SILGen emits too many copies and the SIL optimizer is not able to eliminate some of them, and that should be addressed independently of ~Copyable, because in any case, most types are not expected to be ~Copyable. However, "too many" copies is one thing, but in some applications even one unnecessary copy is too much.

[Edit: One more thing is that while Swift doesn't have guaranteed call-once closures (yet?), those are another obvious application for noncopyable types which comes up repeatedly.]

16 Likes

it’s very difficult to create ~Copyable things that compose. a lot of times i have an idea, like “make a noncopyable structure that yields a view of itself through some accessor” and then i go out on a limb trying to implement it, only to find out it doesn’t compile.

many times, i conceive of the noncopyable abstraction as a technique to prevent bugs, but the ceremony needed to make noncopyable things actually compile actually ends up being a much larger bug surface for mistakes to creep in. for example, accidentally dropping a field when reconstructing self from its constituent components.

i often feel at a loss for where i am supposed to learn how to get the most out of this system, or what the actual gaps are in my understanding that i need to fill in, or what the right patterns i need to learn actually are. what few resources exist are usually unhelpful, with common tropes including:

you just haven’t internalized the key concepts yet
you’re just not using them correctly
you need to make a mental leap to grasp this completely new (and superior!) architectural paradigm

but in the end, your team will spend a lot of time trying to grasp superior architectural paradigms while coming away with little to show for it in terms of skills acquired.

4 Likes

I think “~Copyable is for deinit without shared ownership” is close to a sufficient summary most of the time. If you then ask “what’s wrong with shared ownership”, you don’t need it.

Unfortunately, anyone writing sufficiently generic types or functions may eventually have to think about it, just like async and Sendable, because eventually someone is going to ask you “hey, why doesn’t your thing work with my non-Copyable type”. Progressive disclosure, but only to a point.

(If you ask “what about using it for performance”, you’re not wrong, but the answer is longer and has a lot more caveats.)

8 Likes

So far I've used it mostly to enable statically enforcing certain semantics by defining consuming methods. For example, in some LZ4 compression code I've written this allows me to enforce that after encoding the final segment the encoder can't be used any further:

extension LZ4 {
    struct FrameEncoder: ~Copyable {
        // ...stored properties...

        mutating func compress(_ inputData: Data) throws -> LZ4.Frame.Segment {
            try compress(inputData, isFinalInput: false)
        }
        
        consuming func compressFinal(_ inputData: Data) throws -> LZ4.Frame.Segment {
            try compress(inputData, isFinalInput: true)
        }

        private mutating func compress(
            _ inputData: Data,
            isFinalInput: Bool,
        ) throws -> LZ4.Frame.Segment {
            // ...
        }
    }
}

(my understanding is that this is exactly an API that would benefit both in terms of performance and caller-side flexibility if it used Span (or RawSpan?) instead of Data, but I haven't yet taken the time to familiarize myself with the right way to work with those types)

2 Likes

It's quite dispiriting to still see posts like this when such a large percentage of the discussion on this forum is from Swift team members discussing features related to ~Copyable.

https://forums.swift.org/t/when-are-you-supposed-to-use-ownership-modifiers/

1 Like

How so? (genuine!) Swift is ten years old, it makes sense that some areas of new development are going to be more niche. I did say “ownership” without thinking about the overlap in terminology; I was trying to avoid “ARC” when Swift doesn’t (for the most part) expose the reference counting as part of its model.

6 Likes

+1. class is shared ownership, and shared ownership comes with costs. It has a cost in terms of your ability to write correct code, since you need to make sure the type behaves reasonably in the face of multiple owners (or audit your code manually to make sure it has a single owner). It also has a cost in runtime latency to do reference counting.

Longer ramble:

There are undoubtedly many existing Swift types that have been written as class instead of struct because they need a deinit, but for which a shared ownership model does not make much sense.

This includes any stateful object which holds non-trivial resources, and for which having multiple owners is almost never what you want. Even a normal dynamic array generally falls into this category. It is stateful, it usually doesn’t guard its state for performance reasons (insert caveats about the optimizer hoisting/eliminating uniqueness checks), and in most situations it doesn’t make semantic sense to have 2 owners referring to the same instance.

The general Swift-y way to get around this is to give the type value semantics by making it a CoW struct. This strategy in general is awesome when it makes sense for the type/domain. However, there are some types (a FileHandle being an easy example) for which CoW doesn’t make sense (files have identity, and you would probably never intentionally copy one through CoW), and some performance-critical domains for which working with the harder ergonomics of noncopyable types is preferable to paying for runtime reference counting + uniqueness checks.
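
For reference, a rough sketch of what the CoW pattern costs at each mutation. `Storage` and `Counters` are made-up names, and this is not how the standard library implements Array:

```swift
// Hypothetical CoW container; a sketch, not the standard library's design.
final class Storage {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

struct Counters {
    private var storage = Storage([0, 0, 0])

    init() {}

    subscript(index: Int) -> Int {
        get { storage.values[index] }
        set {
            // The runtime uniqueness check, paid on every mutation.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(storage.values)
            }
            storage.values[index] = newValue
        }
    }
}

func cowDemo() -> (Int, Int) {
    var a = Counters()
    let b = a      // shares storage and bumps a reference count
    a[0] = 42      // storage is shared, so it is copied before the write
    return (a[0], b[0])
}
```

A noncopyable owner has no `let b = a` copy to defend against, so both the reference count and the check in the setter go away.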

A rough checklist in my mind for making a type ~Copyable is if it meets one or more of these criteria:

  • Has mutable state and doesn’t semantically make sense to have multiple owners
  • Has a meaningful identity such that CoW doesn’t fit well
  • Is intended for performance critical domains where reference counting + uniqueness checks may not be tolerable

Also a quicker code smell: I think a class that is not Sendable is a signal that the type would be better represented as a noncopyable struct.

3 Likes

One use case I’ve found for this, which wasn’t mentioned in the proposals, is protecting data in a pipeline from being accidentally copied. The goal is to prevent data from being unintentionally saved in places where it shouldn't be, according to the design of the architecture.

While these accidental copies don’t usually cause major issues, they can add up over time and lead to an implementation that doesn’t align with the intended design. These small mistakes can result in an inconsistent implementation, subtly incorrect behavior, and poor adherence to the architecture.

To address this, I wrap certain data structures (and their components) in a ~Copyable wrapper. This helps enforce a stricter data flow. The ~Copyable wrapper (which often includes an underlying copyable value-semantics instance) allows the data to be borrowed for specific logic during the pipeline’s execution. The data is partially consumed strictly in the places where it is expected and is fully consumed at the end of the pipeline. This approach helps ensure that the data flow stays consistent with the design.
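
A small sketch of the idea, with hypothetical `Payload` and `Guarded` types standing in for the real pipeline data. It shows only borrowing and the final consume, not partial consumption:

```swift
// Hypothetical pipeline payload; all names here are illustrative.
struct Payload {
    var header: String
    var body: [UInt8]
}

struct Guarded: ~Copyable {
    private var payload: Payload

    init(_ payload: Payload) { self.payload = payload }

    // Stages that only inspect the data borrow the wrapper.
    borrowing func bodySize() -> Int { payload.body.count }

    // The final stage consumes the wrapper and releases the copyable value.
    consuming func finish() -> Payload { payload }
}

func validate(_ data: borrowing Guarded) -> Bool {
    data.bodySize() > 0
}

func runPipeline() -> String {
    let data = Guarded(Payload(header: "v1", body: [1, 2, 3]))
    guard validate(data) else { return "rejected" }
    // `let stash = data` would move the value rather than copy it,
    // so the use of `data` on the next line would no longer compile.
    let result = data.finish()
    return "\(result.header):\(result.body.count)"
}
```

Stashing the value somewhere along the way becomes a visible ownership transfer that the compiler rejects if the pipeline still needs the data afterwards.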

3 Likes

Another thing I've recently tried is to reimplement DisposeBag from RxSwift for Combine cancellables. This now looks like:

@_staticExclusiveOnly
struct CancellablesBag: ~Copyable {
  ...
}

// For cases when the bag needs to be recreated, destroying the current subscriptions and creating new ones.
// (A class, since a copyable struct cannot store a ~Copyable property.)
final class CancellablesBagClass {
  private let bag: CancellablesBag
}

1 Like

I can tell you what it isn't good for. It can't protect C structs from being moved. (Some C structs, like locks, require the memory address to be constant between usages)

import Foundation

struct MyLock: ~Copyable {
	private var lock = pthread_rwlock_t()

	mutating func test() {
		withUnsafePointer(to: &lock) { lock in
			print("lock: \(lock)")
		}
	}
}

func test(_ a: consuming MyLock) {
	a.test()
}

@main
enum App {
	static func main() {
		var a = MyLock()
		a.test()
		a.test() // this currently prints the same but it's documented to be undefined behavior, so you still shouldn't depend on it: https://github.com/swiftlang/swift/blob/af3e7e765549c0397288e60983c96d81639287ed/stdlib/public/core/UnsafePointer.swift#L197-L205
		test(a) // prints a different memory address
	}
}

You still want to use UnsafeMutablePointer for those, like this: TypedNotificationCenter/Sources/TypedNotificationCenter/Core/ReadWriteLock.swift at 4053077181018d0d11249c9411a2c0c54200cbd0 · Cyberbeni/TypedNotificationCenter · GitHub

3 Likes

I’ve used that to represent the expected state of the app under test in UI tests. Each navigation action is modeled by a consuming method on a state that returns the new expected state. And each state has a method to assert the current state is indeed the expected state.

For example I have a DemoLoggedOutState ~Copyable struct, with a consuming method login(…) returning a MainState, and a consuming activate(…) returning an ActivationFormState. (The app is multi-user and needs to be activated before production use, but can be used with demo data before that).

I can then share the navigation steps between tests, and compose them easily, and the consistency is statically checked.
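
Roughly, the shape is something like the sketch below. The state names follow the description above, but the method bodies and the `verify()` name are placeholders; a real suite would drive and inspect the UI there:

```swift
// Hypothetical states; a real suite would drive the UI inside each method.
struct MainState: ~Copyable {
    let user: String
    borrowing func verify() -> String { "main screen for \(user)" }
}

struct ActivationFormState: ~Copyable {
    borrowing func verify() -> String { "activation form" }
}

struct DemoLoggedOutState: ~Copyable {
    borrowing func verify() -> String { "demo login screen" }

    // Each navigation step consumes the old state and returns the new one.
    consuming func login(as user: String) -> MainState {
        MainState(user: user)
    }

    consuming func activate() -> ActivationFormState {
        ActivationFormState()
    }
}

func loginFlow() -> [String] {
    let start = DemoLoggedOutState()
    var seen = [start.verify()]
    let main = start.login(as: "demo")
    // start.activate()   // would not compile: `start` was already consumed
    seen.append(main.verify())
    return seen
}
```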

2 Likes

Here is how I see this (I know, few will agree):

When we had classes, we had a single source of truth, referenced by one or more object variables pointing to a class instance. Then Swift Concurrency came, encouraging the use of structs instead, and the single source of truth was lost. When a struct is passed anywhere, a copy of the struct is made, and you don't know anymore which copy is the source of truth. IMO ~Copyable is an attempt (dare I say "kludge") to get the single source of truth back in the brave new world of structs-first.

2 Likes

Yeah, the only way to get a stable address for inline storage currently is to use undocumented underscored compiler features meant for the standard library. Or define your storage as an imported C++ type.

1 Like

Note that even those underscored attributes give you stable storage only for the duration of a borrow, and don't prevent the owner from moving the value in between accesses. This is sufficient for atomics, which only need a fixed address when multiple observers can see them, and futex-style lightweight locks, which only really exist while locked. But pthread primitives in full portability generally require not being moved at all, so would still need to be placed in a separate allocation.
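
A sketch of the separate-allocation approach for a pthread lock, assuming a platform where Foundation re-exports the pthread API. `ReadWriteLock` here is an illustrative type, not the one from the linked repository:

```swift
import Foundation

// A sketch only: the lock lives in its own heap allocation, so its address
// stays fixed even when the owning struct is moved.
struct ReadWriteLock: ~Copyable {
    private let lock: UnsafeMutablePointer<pthread_rwlock_t>

    init() {
        lock = .allocate(capacity: 1)
        lock.initialize(to: pthread_rwlock_t())
        let status = pthread_rwlock_init(lock, nil)
        precondition(status == 0, "pthread_rwlock_init failed")
    }

    deinit {
        pthread_rwlock_destroy(lock)
        lock.deinitialize(count: 1)
        lock.deallocate()
    }

    borrowing func withWriteLock<R>(_ body: () throws -> R) rethrows -> R {
        pthread_rwlock_wrlock(lock)
        defer { pthread_rwlock_unlock(lock) }
        return try body()
    }

    // Exposed only so the demo can compare addresses.
    var address: UnsafeRawPointer { UnsafeRawPointer(lock) }
}

func take(_ lock: consuming ReadWriteLock) -> UnsafeRawPointer {
    lock.address
}

func lockDemo() -> Bool {
    let lock = ReadWriteLock()
    let before = lock.address
    var counter = 0
    lock.withWriteLock { counter += 1 }
    let after = take(lock)   // moving the struct does not move the pthread lock
    return before == after && counter == 1
}
```

Here ~Copyable is not what keeps the address stable; the heap allocation does that. What ~Copyable adds is a single owner and a deinit, so the allocation is freed once without a class.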

6 Likes

I have used a ~Copyable type to model a response object in a protobuf-like implementation where messages are handed off to a registered service and the service invokes a method on the response object to return a result. This function is consuming, which allows a (static) guarantee that the service only responds one time.
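
A stripped-down sketch of that shape, with a hypothetical `Responder` and an `Outbox` class standing in for the transport:

```swift
// Hypothetical response handle; names are illustrative.
final class Outbox { var sent: [String] = [] }

struct Responder: ~Copyable {
    private let outbox: Outbox

    init(outbox: Outbox) { self.outbox = outbox }

    // Consuming: after this call the responder no longer exists.
    consuming func respond(_ message: String) {
        outbox.sent.append(message)
    }
}

func handle(request: String, reply: consuming Responder) {
    reply.respond("echo: \(request)")
    // reply.respond("again")   // would not compile: `reply` was consumed
}

func respondDemo() -> [String] {
    let outbox = Outbox()
    handle(request: "ping", reply: Responder(outbox: outbox))
    return outbox.sent
}
```

As written this enforces "at most once": a service could still drop the responder without replying. A deinit that reports the missing reply is one way to catch that case at runtime.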

1 Like

That’s not a use-case in the sense I mean it. What’s an example of a problem where you “need a deinit” but using a class is inappropriate?

Not sure why you see the infectious nature as a downside -- it's sort of the whole point.

It’s the whole point up to a point. At some point I want to be able to embed my resource manager in something copyable and then explicitly determine how that something should deal with it. In C++ (if we must go there) a unique_ptr member doesn’t automatically make everything it is a part of non-copyable all the way to the aggregation root.

Not every resource lifetime is scoped to the dynamic extent of a single function call. You might want to initialize a resource in one function and "hand it off" somewhere else, or store it somewhere.

“Might want to” is not what I’m looking for here. I want to understand what specific real world use-cases are best addressed using ~Copyable and how to describe them generally. Ultimately I also want to understand why having this feature with its attendant complexity is better than not having it. For example, you can handle the pattern above without having linear types; you just have to give up being restricted from copying the resource at compile-time.

It’s awkward, but you can even get some runtime checking:

/// An instance of `T` whose value can be used once.
class Linear<T> {
  /// The instance, or `nil` if it has been used
  private var x: T?

  /// An instance with value `x`.
  public init(_ x: T) { self.x = x }

  /// Returns the value, making it unavailable for further use.
  public func consume() -> T {
    precondition(x != nil, "\(T.self) was already used.")
    defer { x = nil }
    return x!
  }
}

Of course I realize there are low-level applications where the overhead of allocating a class is inappropriate; if I thought runtime checking was valuable enough to use the above in testing I’d replace it with a struct that does no runtime check for release builds.

With regard to “relying on a sufficiently smart optimizer” I agree with you 100%. But there are many extra copies that should never have existed in the first place and thus shouldn’t depend on the optimizer for elimination. “You can’t count on Swift not to generate arbitrary copies” ought to instead be “these are the only scenarios where copies are created.” I still believe borrowing should have been made the default despite the extremely marginal semantic differences it introduces.

3 Likes

One common use case for ~Copyable in my projects is to wrap a pointer from a C library. That way, I don’t have to pay for additional allocations or reference counting but still get a deinit where I can call the library-specific free function.

I used to use defer for this. With that, I needed to remember to use defer every time. And defer is not usable when I return this wrapped pointer from my function since defer can only handle syntactic scope.

Most of these wrapped pointers represent objects that provide some transient interface to the C library. I do not need to store them in an array or keep them around for long. That way, the restrictions of ~Copyable are barely noticeable.
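
As a sketch of the pattern, using malloc/free as a stand-in for a C library's create and free functions (`CBuffer` is a made-up name):

```swift
import Foundation

// A sketch using malloc/free in place of a C library's create/free pair;
// `CBuffer` is a hypothetical name.
struct CBuffer: ~Copyable {
    private let pointer: UnsafeMutableRawPointer
    let size: Int

    init(size: Int) {
        guard let pointer = malloc(size) else { fatalError("out of memory") }
        self.pointer = pointer
        self.size = size
    }

    // Runs once when the owner goes away: no reference count, no extra allocation.
    deinit { free(pointer) }

    borrowing func fill(with byte: UInt8) {
        memset(pointer, Int32(byte), size)
    }

    borrowing func firstByte() -> UInt8 {
        pointer.load(as: UInt8.self)
    }
}

// Unlike `defer`, ownership can leave the function that created the wrapper.
func makeFilledBuffer() -> CBuffer {
    let buffer = CBuffer(size: 16)
    buffer.fill(with: 0xAB)
    return buffer
}

func bufferDemo() -> UInt8 {
    let buffer = makeFilledBuffer()
    return buffer.firstByte()
}
```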

8 Likes

I’m confused by the phrasing “fixed address while locked”. At first I thought you meant “fixed address during the atomic operation,” which would not be sufficient—how could two threads synchronize on an Atomic if one of them decided in between atomic operations that the value lived somewhere else?

But Atomics must be declared let even though they are always mutably borrowed.[1] The unique feature of Atomic is that this mutable borrow is not exclusive. So if I understand correctly, you meant the above statement as “fixed address while mutably borrowed.”

Even then I’m not sure that’s enough given the existence of RawSpan. RawSpan allows us to “smuggle” the Atomic as immutably-borrowed raw memory. To remain useful, the Atomic must be fixed in memory even while all references to it as an Atomic have been extinguished, because one may be “resurrected” from a RawSpan that captured it. Even though RawSpan is ~Escapable, I’m skeptical that the language could reliably track the lifetime of an Atomic through conversion to and from RawSpan, so in general the compiler must never alias an Atomic to multiple memory locations.


  1. I got this wrong at first. This confusion is why I still think the design of Atomic is backward and confusing.

1 Like

Maybe, but I think that's actually more of a core use case than the nice but maybe only nice-to-have semantic behaviors you can get with it. The ability to safely allocate and destroy memory without any reference counting, exclusivity checking, CoW overhead, is transformative for high-performance code.

While there are interesting semantic use cases, the core to my mind is the ability to allocate and destroy memory without any reference counting. The reality is that for some use cases, you cannot use ManagedBuffer because the costs it imposes are just too high.

Those performance needs are, as others have pointed out, fairly niche. The vast majority of programs (apps, servers, daemons) can eat the cost of modest reference counting overhead, extra heap allocations etc. But if your hard goal is to match the performance you could achieve with C, you cannot afford these costs. Cost here covers both compiled binary text size and runtime performance – in practice they often end up being the same thing.

Sometimes very marginal costs, of say 2-5% in runtime perf or text size, that would be noise under normal circumstances, are just too high a cost for these use cases, and these are costs that cannot be realistically eliminated by an optimizing compiler no matter how good it gets. For example, the "is the array unique" tax for copy-on-write types can add up. Yes it can be hoisted if the loop is statically visible to the compiler. But a) hoisting a check out of a loop doesn't mitigate the text size impact, and b) sometimes the loop is data-driven such as in an interpreter or parser, where static optimizations have no traction.

We have been applying Swift to more use cases like this, and ~Copyable types such as UniqueArray are invaluable in replacing C code with Swift, in both low-level performance-sensitive user space, and in "text + stack + heap has to be less than 200k" embedded use cases, without having to accept a regression.

This does lead to further questions, of course.

The primary one is "can those not interested ignore it". For the most part, the answer is yes. The Swift project remains committed to progressive disclosure and that's been at the heart of the design of a lot of the ownership features. For example, the ability for Optional to hold noncopyable types was added without breaking source, and most Swift users remained completely oblivious of this feature. But for those interested in adopting noncopyable types, it unlocked key functionality.

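Jordan is right that

anyone writing sufficiently generic types or functions may eventually have to think about it, just like async and Sendable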

but to my mind this is perfectly in keeping with progressive disclosure. Authors of generic libraries for others are expected in general to be intermediate users that probably do need to keep up with the more exotic parts of the language (they also need to understand things like specialized vs unspecialized generics, the source compatibility consequences of library evolution etc). And they do still have the option of just not adopting new language features like suppressed conformances, at the risk of annoying their users who do want to work with those features.

I also think, at the library level, suppressing default copyability can be valuable in helping you think a little more clearly about your code. Why do I need to make a copy here, and what might the consequences of that be? Did I forget to annotate a function argument as consuming? (We've found plenty of places in the std lib where we did.) And – to kick another hornet's nest – I do think we as a community need to confront the reality that copy-on-write is a footgun for so many users, because it leads to significant hard-to-detect performance problems from code where the copy was expected to be ephemeral, but wasn't (or wasn't enough).

It's also true that Swift remains relatively early in its ownership journey. The feature is usable today (and used heavily in some places) but there remain many many ergonomic wins we can make. Right now we're kind of pursuing a "Formula 1" model. High-end power users can make use of the feature to get C-like performance. Mixing in a little dusting of noncopyable types to speed up the core hot loop of your code sometimes works, but sometimes you hit the limits of the feature when doing this. Over time we need more support in both the language (if let needs to be able to borrow) and standard library (we need off-the-shelf things for e.g. the equivalent of Rust's RefCell without having to build it from a class+some boilerplate helpers every time, that will allow you to bridge from copyable to noncopyable idioms). Hopefully over time this means you will be able to drop in some noncopyable code to speed up your hot loop, while keeping the nice low-but-non-zero cost affordances that boost productivity everywhere else.

Finally, note that ~Copyable is just one part of ownership, alongside things like Span and borrowing vs consuming arguments, and other features.

24 Likes