[Pitch] Noncopyable (or "move-only") structs and enums

Andrew_Trick · December 18, 2022, 12:13am

[EDIT] This is still a strawman. If you have any counter-examples please share!

Summary

consuming values are eager-drop.

borrowing values are associated with a lexical scope.

Noncopyable values have strict lifetimes.

Copyable values have optimized lifetimes.

Strict Lifetimes

Strict eager-drop values give us a last-source-level-use rule.

Strict borrowing values give us a lexical lifetime rule.
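
For a concrete picture of a strict lifetime, here is a minimal sketch. It is written against the `~Copyable` spelling and the `borrowing`/`consuming` parameter modifiers that later shipped in Swift 5.9, not the `@noncopyable` spelling used in this pitch. `EventLog`, `FileToken`, `peek`, `finish`, and `runTokenDemo` are names made up for illustration.

```swift
// Hypothetical helper that records the order of events.
final class EventLog {
    var events: [String] = []
}

// A noncopyable value with a deinit (Swift 5.9+ spelling, an assumption
// relative to the '@noncopyable' attribute used in this pitch).
struct FileToken: ~Copyable {
    let id: Int
    let log: EventLog

    deinit {
        log.events.append("deinit \(id)")
    }
}

// Borrows the token: the caller keeps ownership, so no deinit runs here.
func peek(_ token: borrowing FileToken) -> Int {
    token.id
}

// Consumes the token: it is destroyed before this function returns.
func finish(_ token: consuming FileToken) {
    token.log.events.append("finish \(token.id)")
}

func runTokenDemo() -> [String] {
    let log = EventLog()
    let token = FileToken(id: 1, log: log)
    _ = peek(token)
    finish(token) // ownership moves; 'token' cannot be used after this line
    log.events.append("after finish")
    return log.events
}
```

The sketch only relies on the token being destroyed exactly once, inside `finish`, so "deinit 1" is recorded before "after finish". Whether "deinit 1" lands before or after "finish 1" is the kind of ordering the rules in this pitch are meant to pin down, so the sketch doesn't assume either.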

Optimized Lifetimes

With optimized lifetimes, deinitialization is unordered.

An optimized consuming lifetime is "eager-drop", but is not a "last-source-level-use" rule. The optimizer must be able to freely substitute and delete copyable values. A consuming variable does not keep a value alive if it is copyable.

Optimized borrowing lifetimes follow our current default rules for let variables. An optimized borrowing lifetime is not a "lexical lifetime". The optimizer does track each variable's scope, but that scope is only restricted by deinitialization barriers.

Motivation

Generic lifetimes must be at least as strict as their specialized counterpart.

Copyable generic types cannot have strict lifetimes. ARC would not be optimizable if releases all had the semantics of a virtual C++ destructor defined in another module.

Noncopyable types don't require ARC optimization because retains and releases are only a materialization of copies.

Non-goal: An explicit borrowing or consuming modifier should dictate the ownership and lifetime rules independent of the type. While this is highly desirable, it directly conflicts with other goals.

consuming should eager-drop

We need a lightweight way to tell the compiler it can aggressively drop variables. Otherwise, the compiler is burdened by supporting certain patterns of weak references, unsafe pointers, and deinit side effects. That's a substantial burden for optimizing ownership because it's often impossible to prove that those patterns aren't present. These specific caveats aren't relevant to this thread. What's relevant is that sprinkling another layer of annotations around the code to control lifetimes, in addition to consuming and borrowing, is a bad model.

This has all been shown by experiment.

Example:

struct Container {
  func append(other: consuming Container) {
    push(other) // Do not copy 'other' to keep it alive across 'push'
  }
}
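
To see the caller's side of this with tools that exist today, here is a hedged sketch using the `consume` operator and the `consuming` parameter modifier as they later shipped in Swift 5.9. `Buffer`, `byteCount`, and `consumeDemo` are hypothetical names, and the weak reference is only there to observe the drop point.

```swift
// Hypothetical class standing in for any reference-counted payload.
final class Buffer {
    let bytes: [UInt8]
    init(bytes: [UInt8]) { self.bytes = bytes }
}

// Takes ownership of 'buffer'; the caller gives up its reference.
func byteCount(of buffer: consuming Buffer) -> Int {
    buffer.bytes.count
}

func consumeDemo() -> (count: Int, freedAfterCall: Bool) {
    let buffer = Buffer(bytes: [1, 2, 3])
    weak var observer = buffer
    // 'consume' ends the local binding's lifetime here; using 'buffer'
    // after this line is a compile-time error.
    let count = byteCount(of: consume buffer)
    // Whether the weak reference is already nil is the observable effect
    // of where the value was dropped. This sketch does not rely on it.
    return (count, observer == nil)
}
```

The weak reference is one of the patterns mentioned above that makes drop points observable, which is why the sketch reports `freedAfterCall` instead of assuming a value for it.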

borrowing should not eager-drop

First, here's why we can't eager-drop:

@noncopyable
struct FileWrapper {
  let handle: Handle

  [borrowing] func access() -> Data {
    handle.read() // self *cannot* be destroyed after evaluating 'handle', but before calling 'read'
  }
  consuming func close() {
    handle.close()
    forget self
  }
  deinit {
    close()
  }
}

Borrows need some relationship with their lexical scope.

We still have three viable options:
Option 1: optimized borrow lifetimes (just like let)
Option 2: strict borrow lifetimes
Option 3: optimized copyable and strict noncopyable borrow lifetimes

TL;DR: Only option 3 meets our goals.

The FileWrapper example above is currently safe if borrowing variables inherit optimized let lifetimes. The reason is that external calls to read or close a file handle are considered synchronization points. This does not mean that borrows need strict lexical lifetimes. We still have a choice. The question is whether we want to optimize copies or make struct-deinitialization immune to optimization.

The conundrum is this:

For copyable types, we need to optimize copies.
For noncopyable types, we want predictable deinitialization points.
If borrowing is explicit, then lifetime behavior should (ideally) not depend on copyability.

One of these needs to give.

let lifetime semantics allow optimization of copies, but do not specify "well-defined" deinitialization points. By today's rules, we will optimize the extra copy in this example:

struct Value {
  static var globalCount = 0

  borrowing func borrowMe() -> Value {
    let value = copy self
    Value.globalCount += 1 // static members must be qualified inside instance methods
    return value
  }

  deinit {
    Value.globalCount = 0 // strange I know
  }
}

let value = Value().borrowMe()
//... 'Value.globalCount' might be 0 or 1

Option 1: optimized borrow lifetimes

Ask users to either call a consuming method or use withFixedLifetime or deinitBarrier if they expect deinitialization to occur at a specific point.

To fix the unusual case of Value.globalCount above, the programmer would need to "synchronize" their deinitializer by adding a barrier to the method that accesses the Value:

struct Value {
  static var i = 0

  borrowing func borrowMe() -> Value {
    let value = copy self
    Value.i += 1
    deinitBarrier() // withExtendedLifetime() also works
    return value
  }

  deinit {
    Value.i = 0
  }
}

let value = Value().borrowMe()
//... 'Value.i' might be 0 or 1

Deinit barriers are required for class deinits regardless of how borrowing behaves. Any access to an external function or variable acts as a barrier, as does any concurrency primitive like await. We can trivially provide a deinitBarrier API in the standard library if it's useful. Or we can provide a "keep alive" keyword which behaves like consuming but doesn't actually consume.
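
As a rough illustration of the "keep alive" idea with an API that exists today, the sketch below uses the standard library's `withExtendedLifetime` on a class reference. `Resource` and `staysAliveInsideClosure` are hypothetical names. `deinitBarrier` is only a proposed API and is not used here.

```swift
// Hypothetical class whose lifetime we want to pin.
final class Resource {
    let name: String
    init(name: String) { self.name = name }
}

// Returns true if the weak reference still sees the object while the
// closure passed to withExtendedLifetime is running.
func staysAliveInsideClosure() -> Bool {
    let resource = Resource(name: "r")
    weak var observer = resource
    // withExtendedLifetime guarantees that 'resource' is not destroyed
    // before the closure returns, even though 'resource' has no later use.
    return withExtendedLifetime(resource) {
        observer != nil
    }
}
```

Without the `withExtendedLifetime` call, the optimized let rules would allow `resource` to be released right after the weak reference is formed, so checking `observer` would not be reliable.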

This programming burden is, however, worse for struct deinits. Unlike with automatically managed objects, the programmer naturally anticipates the point at which struct deinitialization occurs. It's also an unnecessary burden because structs with deinits are noncopyable types. Optimizing their lifetime does not eliminate copies.

Option 2: strict borrow lifetimes

With this option, borrow becomes a "keep alive" keyword for any variable.

This option will significantly harm optimization as we migrate toward borrowing as standard practice. This is especially unwelcome because programmers are using borrowing precisely to optimize copies.

There will be quality of implementation issues. For example, we'll need to prevent the optimizer from deleting dead values and substituting equivalent values with different lifetime constraints. I suspect the compiler will never completely get this right.

Option 3: optimized copyable and strict noncopyable borrow lifetimes

With this option, copyable borrows can be optimized just like let variables today. Migrating to borrowing won't prevent ARC optimization.

With noncopyable borrows, there's no serious performance concern. Objects may be freed later than otherwise, but that matches expectations.

This requires optimizer support, but it mostly falls out naturally from our representation of noncopyable values. The problematic optimizations will largely be disabled for noncopyable values.

It might confuse programmers that borrowing is optimized more aggressively for copyable types. But it's natural that struct-deinits have somewhat special lifetime rules. And the optimization impact is only noticeable when unwanted copies are present.