Mutation and consumption in non-`Copyable` type `deinit`s

I'd like to resume the discussion from when we originally introduced non-Copyable types about allowing deinit to mutate and/or consume self. To reduce the scope of the original proposal SE-0390, we currently limit self to act as a borrowed reference inside of a deinit body. This prevents the use of mutating or consuming methods to factor out common cleanup code, or the use of partial consumption to delegate cleanup of a type's components. On the other hand, when a whole self value is passed to another consuming or mutating operation, that other operation has the potential to "resurrect" the value by transferring ownership somewhere else, or it could accidentally induce an infinitely recursive call back into the type's own deinit if the callee lets the value's lifetime end.

I've written an initial draft of the proposal from the perspective of putting no restrictions on consuming or mutating within a deinit, but would like to get feedback about various ways we might mitigate the risks. I've laid out a few possibilities in the "Alternatives considered" section of the proposal draft, and am interested in hearing people's feedback and other ideas. The current proposal is here:

and I'll repost the initial draft below:


Mutation and consumption in non-Copyable type deinits

  • Proposal: SE-ASDF
  • Authors: Joe Groff
  • Review Manager: TBD
  • Status: Awaiting implementation
  • Implementation: TBD
  • Review: (pitch)

Introduction

Non-Copyable types can define a deinit to clean up owned resources at the end of their lifetime; however, up to this point, self has been restricted to an immutable borrow within the body of deinit. We propose to allow deinit to mutate and/or consume self or its parts.

Motivation

Many noncopyable type implementations are composed of other noncopyable values that own resources. It is natural to want to control how those components get consumed during the aggregate's cleanup:

struct File: ~Copyable {
  consuming func close() {...}
}

struct Buffer: ~Copyable {
  borrowing func flush(to file: borrowing File) {...}
  consuming func release() {...}
}

struct BufferedFile: ~Copyable {
  let file: File
  let buffer: Buffer
  
  deinit {
    // Flush, then release the buffer
    buffer.flush(to: file)
    buffer.release()
    // Then close the file
    file.close()
  }
}

Or a type may provide a consuming method for more configurable cleanup, and express its deinit in terms of calling that method with standard parameters:

struct BufferedFile: ~Copyable {
  let file: File
  let buffer: Buffer
  
  consuming func close(flush: Bool) {
    if flush {
      buffer.flush(to: file)
    }
    buffer.release()
    file.close()

    discard self
  }

  deinit {
    // Flush the buffer by default
    close(flush: true)
  }
}

Along similar lines, deinit may want to use code factored into mutating methods as part of the cleanup process.

Proposed solution

We propose that deinits should be allowed to mutate and consume self. This covers mutating or consuming either individual components of the value or the value as a whole.

Detailed design

"Resurrection" and accidental recursion hazards

deinit in a noncopyable type is unique among contexts that have ownership of a value: any other owning context would implicitly destroy the value by invoking deinit, whereas deinit itself of course cannot. deinit only destroys the component stored properties or inhabited enum case of the value.

This creates a wrinkle when deinit is allowed to pass self to a consuming or mutating operation. In the callee, the value is "resurrected", and the callee will invoke deinit again if it ends the value's lifetime. This could make it easy to accidentally induce infinite recursion:

struct Foo: ~Copyable {
  deinit {
    self.foo()
  }

  consuming func foo() {
    // oops, implicitly calls back into `deinit`
  }
}

struct Bar: ~Copyable {
  deinit {
    self.bar()
  }

  mutating func bar() {
    // oops, implicitly calls `deinit` on the old value of `self`
    // before reassigning it
    self = Bar()
  }
}

Generally, a consuming method usable from a deinit would use discard self to prevent the implicit call back into deinit:

struct Foo: ~Copyable {
  deinit {
    self.foo()
  }

  consuming func foo() {
    doCleanup()
    discard self
  }
}

On the other hand, aside from accidental recursion, resurrection of a noncopyable value doesn't create fundamental semantic problems, and there are situations where it would be useful for deinit to transfer ownership of the value. For instance, if cleaning up a value is time-consuming, it may make sense to enqueue a dying value to be cleaned up later rather than immediately during deinit:

let deferredCleanupValues: ConcurrentQueue<DeferredCleanup>

struct DeferredCleanup: ~Copyable {
  deinit {
    // Instead of cleaning up the value immediately, push it into the queue
    // to be cleaned up later
    deferredCleanupValues.push(self)
  }

  consuming func runTimeConsumingCleanup() async { ... }
}

func runDeferredCleanups() async {
  while let value = deferredCleanupValues.pop() {
    await value.runTimeConsumingCleanup()
  }
}

Rather than foreclose on potentially useful expressivity in the hope of making mistakes impossible, this proposal chooses not to impose any restrictions on performing mutating or consuming operations from deinit.

Remaining restrictions

Capturing self in a closure during deinit is still not allowed.

Cleanup of partially-consumed self

If any components of self have not been consumed at the point deinit returns, those remaining components are implicitly destroyed. This includes running deinit of any non-Copyable components.
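The deinit rule above is only proposed, so it cannot be demonstrated directly. As a rough analogy, the sketch below shows the same kind of implicit cleanup in a consuming method, assuming a toolchain that implements partial consumption of noncopyable values (SE-0429). The names Log, Resource, Pair, takeFirst, and demo are hypothetical and exist only for illustration.

```swift
// Hypothetical event log so that destruction is observable.
final class Log {
  var entries: [String] = []
}

// Hypothetical resource with its own deinit.
struct Resource: ~Copyable {
  let name: String
  let log: Log

  deinit {
    log.entries.append("deinit \(name)")
  }
}

// No deinit here: today, partial consumption is only allowed for
// types that do not define one.
struct Pair: ~Copyable {
  let first: Resource
  let second: Resource

  consuming func takeFirst() -> Resource {
    // Moves `first` out of `self`. `second` is never consumed explicitly,
    // so it is destroyed implicitly, which runs its deinit.
    return first
  }
}

func demo() -> [String] {
  let log = Log()
  let pair = Pair(
    first: Resource(name: "first", log: log),
    second: Resource(name: "second", log: log))
  do {
    let kept = pair.takeFirst()
    log.entries.append("kept \(kept.name)")
  }  // `kept` is destroyed no later than here
  return log.entries
}
```

Calling demo() should record one deinit entry for each Resource: the one for "second" comes from the implicit cleanup of the partially consumed Pair, and the one for "first" comes from the ordinary end of the lifetime of kept.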

Source compatibility

This proposal changes the behavior of self so that it behaves like an owned mutable binding (like a consuming function parameter), where it previously behaved like an immutable borrowing parameter. This could affect overload resolution in rare situations where an extension provides a mutating variation of a name that was previously borrowing. We expect this sort of situation to be unlikely in practice.

ABI compatibility

This proposal has no impact on ABI.

Implications on adoption

This feature can be freely adopted and un-adopted in source code with no deployment constraints and without affecting source or ABI compatibility.

Alternatives considered

There are various restrictions we could impose on operations inside of a non-Copyable deinit to prevent or reduce the likelihood of resurrection or recursion into deinit:

Only allow partial mutation and consumption

An easy way to prevent resurrection or deinit recursion would be to allow mutation and consumption of the stored properties or cases of a value, but not of the value as a whole. However, this would completely preclude the ability to factor cleanup logic into utility methods, which is a major motivation for allowing mutation or consumption in a deinit to begin with.

Annotate "deinit-safe" methods

We could limit what operations a deinit is allowed to apply to a whole value to methods that opt into being "deinit-safe" in some fashion. consuming methods so annotated would be required to discard self, and mutating methods would be prevented from fully reassigning self.

Limit deinit to invoking locally-defined methods on self

Instead of an explicit annotation, we could limit deinit to only be able to mutate or consume self via methods defined in the original type definition alongside deinit, or within the same module. This would make it possible for file- or module-local analysis to detect places where methods invoked from deinit potentially call back into deinit.

Acknowledgments

Kavon Farvardin originally noted the potential problems of resurrection and accidental recursion if deinit were allowed to arbitrarily mutate or consume self.


In "normal" consuming methods of a noncopyable type, there is an implicit deinit/consume self that runs unless it is explicitly suppressed using discard self. Conversely, we can think of a deinit method as having an implicit discard self. In a way, deinit/consume self and discard switch roles within a deinit. I think it would make sense, then, to have to explicitly suppress the implicit discard self within a deinit. One way we could do this is by requiring consume self to be written explicitly in order to transfer ownership of self.

This might be a bit philosophical and pedantic, but I think it's worth noting that a value, after it's mutated, is in some sense a different value. Most obviously, self could be replaced by a different value entirely, such as through the swap function.[1] Additionally, a property of self that is in some sense fundamental to its identity might change; for example, perhaps a mutating method on BufferedFile could cause it to open a new file reusing the existing buffer. This raises a question of whether we should require discard self to be explicit in code paths where self has been mutated, since the self that would be discarded is now different from the self at the beginning of the deinit. Additionally, it could be worthwhile to require any mutation of self to be explicit, such as by writing &self when calling a mutating method.


I think those restrictions would help form a predictable set of rules for tracking where a noncopyable value is created, and where it is destroyed:

  1. A noncopyable value is created when an init initializes all of its stored properties.
  2. A noncopyable value is destroyed when a discard self statement executes.
  3. A noncopyable value is destroyed when a deinit finishes executing, unless it is explicitly transferred using consume self or &self.
    • The noncopyable value that is implicitly destroyed at the end of a deinit is always the same value that was originally passed in. Therefore, if self is mutated in a deinit, it cannot be implicitly destroyed using rule 3, so (if not transferred) it must be explicitly destroyed using rule 2.

Having such a predictable set of rules is important for ensuring that manually-managed resources encapsulated in a noncopyable type are handled correctly. For example, suppose File is a noncopyable struct encapsulating a file descriptor. If a consuming method closes the file descriptor, it needs to use discard self to ensure that self is truly destroyed instead of transferred (either to the deinit, or to another consuming or mutating method). If the deinit closes the file descriptor, it similarly needs to ensure that self is truly destroyed instead of transferred. Allowing implicit transfers of self would make that difficult.
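To make the File example concrete, here is a minimal sketch that compiles with the discard self behavior that exists today, not with anything from the pitch. It assumes a compiler with SE-0390 support; the closeCount pointer, closeDescriptor, and demo are hypothetical stand-ins, and no real file descriptor is opened or closed. As far as I know, current compilers only accept discard self in types whose stored properties are trivially destroyed, which is why the counter is a raw pointer.

```swift
// Hypothetical file-descriptor wrapper; no real file is opened or closed.
struct File: ~Copyable {
  let descriptor: Int32
  // Counts simulated close calls so that a double close would be observable.
  // A raw pointer keeps the stored properties trivially destroyed.
  let closeCount: UnsafeMutablePointer<Int>

  // Stand-in for a real close call on the descriptor.
  private func closeDescriptor() {
    closeCount.pointee += 1
  }

  consuming func close() {
    closeDescriptor()
    // Destroy `self` here without running `deinit`, which would
    // otherwise close the descriptor a second time.
    discard self
  }

  deinit {
    closeDescriptor()
  }
}

func demo() -> (viaClose: Int, viaDeinit: Int) {
  let closeCounter = UnsafeMutablePointer<Int>.allocate(capacity: 1)
  closeCounter.initialize(to: 0)
  defer { closeCounter.deallocate() }

  let deinitCounter = UnsafeMutablePointer<Int>.allocate(capacity: 1)
  deinitCounter.initialize(to: 0)
  defer { deinitCounter.deallocate() }

  // Explicit cleanup: `close()` consumes the value and discards it.
  let explicitlyClosed = File(descriptor: 3, closeCount: closeCounter)
  explicitlyClosed.close()

  // Implicit cleanup: the value dies inside this scope and `deinit` runs.
  do {
    let implicitlyClosed = File(descriptor: 4, closeCount: deinitCounter)
    _ = implicitlyClosed.descriptor
  }

  return (closeCounter.pointee, deinitCounter.pointee)
}
```

In both paths the descriptor is closed exactly once. Removing discard self from close() would let the implicit deinit run as well, so the first counter would end at 2, which is the kind of double cleanup the rules above are meant to rule out.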


  1. Since assigning directly to self within a deinit, or within a function called by deinit, would almost never happen in correct code, I didn't use it as an example. Using swap might be contrived, but I think it could conceivably happen in correct code.