[Pitch] Non-Escapable Types and Lifetime Dependency

Andrew_Trick · May 29, 2024, 8:43pm

It seems that you're not disagreeing, given that you just restated my point.

If an API has a lifetime dependence, then anyone using that API needs to be aware of that. That's why it's best to communicate that information directly to the client in the form of a lifetime dependence. My hypothesis is that programmers will never need to think in terms of lifetime variables to understand lifetime dependence. It may only be the programmers with a Rust background who do the mental translation. So, most Swift programmers will need to understand lifetime dependence, but most Swift programmers will not need to understand lifetime variables.

Andrew_Trick · May 29, 2024, 8:51pm

rauhul:

func bufferForThunderboltDMARing(
  descriptor: UInt32
) -> dependsOn(descriptor) BufferWriter {
  // Yield back a writable buffer to fill thunderbolt buffer with usb4 packets.
}
Also (IIRC) the Pointer types are now BitwiseCopyable. This statement implies you cannot write a function with a lifetime based on that of a pointer argument which seems like a huge miss.

This is the most basic use case that nonescapable types are built on top of. Take a look at "Depending on an escapable BitwiseCopyable value".

ksluder · May 29, 2024, 10:12pm

I think it’s worth noting that Swift’s generics system has first-class type variables, rather than requiring developers to specify all types in terms of values that carry those types.

Andrew_Trick · May 29, 2024, 10:38pm

It's true that "borrowing a BitwiseCopyable value" is meaningless in terms of Swift's current language rules: it simply doesn't change the program semantics. But we do have a coherent strategy for depending on BitwiseCopyable values. In that respect, the April post that you're quoting from is out of date now, and I can't edit it.

That really is the point. Only you, the programmer, know the meaning of that integer. The compiler does not know, for example, that destroying a "file" object invalidates your descriptor.

Many integers have important lifetimes which the programmer may absolutely want to base another value's lifetime on.

The integer itself does not have a lifetime. The compiler only knows the scope of the variable holding the integer. It's on you to make sure the integer is valid over the entire scope of that variable.

rauhul:

For example: a function which returns a writeable buffer for a thunderbolt ring whose lifetime is limited to however long the descriptor is owned by caller.
func bufferForThunderboltDMARing(
  descriptor: UInt32
) -> dependsOn(descriptor) BufferWriter {
  // Yield back a writable buffer to fill thunderbolt buffer with usb4 packets.
}

It is important to enforce dependencies on BitwiseCopyable values, and your function above is valid Swift. I just want to be clear that depending on a BitwiseCopyable value is potentially unsafe, so you need to design such APIs carefully. A dependency "a dependsOn b" means that a must be destroyed before b. If b is BitwiseCopyable, then its destruction has no semantics and no effects. So the programmer must actually have wanted a dependency on some other effect, which the compiler can't figure out.

All the compiler can do is say that the result of bufferForThunderboltDMARing must not escape the scope of any variable passed as the argument to descriptor. The programmer needs to separately guarantee the lifetime, or exclusive access, of whatever thing the integer actually represents:

let descriptor = ...
let buffer = bufferForThunderboltDMARing(descriptor: descriptor)
// You better make sure descriptor is valid for the remainder of this scope.
// If descriptor comes from a managed object, you can use `withExtendedLifetime()`

jcavar · May 30, 2024, 2:28pm

This was a source of confusion for me as well.

For example, this code snippet:

struct Large {
    let a1: Int
    // a2, a3, ...
    let a10000: Int
}

func first(large: borrowing Large) -> Int {
    large.a1
}

My understanding is that this is semantically the same as:

func first(large: Large) -> Int {
    large.a1
}

Is that correct? If so, isn't this a trap for a programmer who is reading this code?

Andrew_Trick · May 31, 2024, 1:33am

You're right. Copying large structs is a serious issue. But there are different aspects of that problem which make it impossible to give you a simple answer.

Language semantics

It's always legal for the compiler to copy BitwiseCopyable values without affecting program semantics. That's mainly what I was saying. Of course, the compiler's choice of codegen impacts performance.

Practical optimizer improvements

Very recently @Arnold tracked down and fixed the most common places where the optimizer copied large structs. This is a messy problem though that cross-cuts all layers of the compiler.

Current source workarounds

~Copyable with a clone() method.
Boxing the struct in a ref-counted class.
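
A rough sketch of the two workarounds listed above, assuming a small stand-in `LargeStruct`. The names `UniqueLarge`, `LargeBox`, and `workaroundDemo` are made up for illustration.

```swift
// Stand-in for a struct with many stored properties (hypothetical).
struct LargeStruct {
    var a1 = 0
    var a2 = 0
}

// Workaround 1: a noncopyable wrapper. Implicit copies are rejected by the
// compiler, so every duplication goes through an explicit clone().
struct UniqueLarge: ~Copyable {
    var payload: LargeStruct

    init(payload: LargeStruct) {
        self.payload = payload
    }

    func clone() -> UniqueLarge {
        UniqueLarge(payload: payload)
    }
}

// Workaround 2: box the struct in a class. Passing the box around copies
// only a reference (plus reference counting), not the payload.
final class LargeBox {
    var payload: LargeStruct

    init(_ payload: LargeStruct) {
        self.payload = payload
    }
}

func workaroundDemo() -> (Int, Int, Int) {
    var original = UniqueLarge(payload: LargeStruct(a1: 1, a2: 2))
    let duplicate = original.clone()  // explicit, visible copy
    original.payload.a1 = 10          // does not affect `duplicate`

    let box = LargeBox(LargeStruct(a1: 1, a2: 2))
    let alias = box                   // shares storage with `box`
    alias.payload.a1 = 5

    return (original.payload.a1, duplicate.payload.a1, box.payload.a1)
}
```

The trade-off is that the first form changes how the type can be used generically, while the second introduces reference semantics and a heap allocation.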

Incorrect source workaround: borrowing

This is the other thing I was getting at... The borrowing parameter convention is supposed to guard against accidental source-level copies by making sure the value is only "consumed" once. This partially works for non-BitwiseCopyable values (it issues some diagnostics). But the diagnostics are wrong for BitwiseCopyable values. "Consuming" a BitwiseCopyable by definition has no effect. These two function signatures are literally identical from the caller's perspective:
func f(s: borrowing some BitwiseCopyable) ≡ func f(s: consuming some BitwiseCopyable).

The current diagnostics issue an error here, even though the compiler will not actually emit any copies of the struct:

func bar(s: consuming LargeStruct) {}

func foo(s: borrowing LargeStruct) {
  // 🛑 error: 's' is borrowed and cannot be consumed
  bar(s: s)
  _ = s
}

And the current diagnostics fail to issue an error here, even though the compiler will copy the entire struct:

func bar(s: borrowing LargeStruct) {}

func foo(s: inout LargeStruct) {
  bar(s: s) // Copy 's' into the callee's "borrowed" argument.
}

So, the borrowing diagnostics should be disabled for BitwiseCopyable because they have no value in any case and are backward in most cases.

jcavar:

For example, this code snippet:
struct Large {
    let a1: Int
    // a2, a3, ...
    let a10000: Int
}

func first(large: borrowing Large) -> Int {
    large.a1
}
My understanding is that this is semantically the same as:
func first(large: Large) -> Int {
    large.a1
}
Is that correct? If so, isn't this a trap for a programmer who is reading this code?

Yes. The borrowing parameter modifier tends to be misleading because people often want it to have some effect on the caller's semantics for copyable types, but that would break source compatibility. We need to keep communicating that those ownership parameter modifiers only determine whether the callee takes ownership of its argument and no other aspect of argument passing. For copyable types, that affects where reference counting operations are required. For noncopyable types it does determine the lifetime of the argument. It doesn't mean a thing for BitwiseCopyable.
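
To make that distinction concrete, here is a small sketch with hypothetical types `Pair` and `Handle`. For the copyable, bitwise-copyable `Pair`, the call sites of the two functions are indistinguishable. For the noncopyable `Handle`, the modifier decides whether the argument is still usable after the call.

```swift
// Copyable (and trivially bitwise-copyable) value.
struct Pair {
    var a: Int
    var b: Int
}

func sumBorrowing(_ p: borrowing Pair) -> Int { p.a + p.b }
func sumDefault(_ p: Pair) -> Int { p.a + p.b }

// Noncopyable value: here ownership modifiers do determine lifetime.
struct Handle: ~Copyable {
    let id: Int

    init(id: Int) {
        self.id = id
    }
}

func peek(_ h: borrowing Handle) -> Int { h.id }
func finish(_ h: consuming Handle) -> Int { h.id }

func modifierDemo() -> (Int, Int, Int, Int) {
    let pair = Pair(a: 1, b: 2)
    let x = sumBorrowing(pair)  // same call syntax...
    let y = sumDefault(pair)    // ...and same observable result

    let handle = Handle(id: 3)
    let p = peek(handle)        // `handle` is still usable afterwards
    let c = finish(handle)      // `handle` may not be used after this line
    return (x, y, p, c)
}
```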

Planned partial source non-workaround

Once we have a borrow operand, we can pass a mutable struct with an exclusivity check that avoids the current copy-by-default convention:

func bar(s: borrowing LargeStruct) {}

func foo(s: inout LargeStruct) {
  bar(s: borrow s) // ✅ OK: no copy needed.
}

Future optimizer guarantees

Programmers mainly want predictable behavior rather than relying on a best effort optimizer. In my opinion, we should provide a formal guarantee for certain types that the optimizer will not introduce any new copies that don't directly fall out of source-level semantics. This should apply to both non-BitwiseCopyable types and "large" BitwiseCopyable types, where "large" is defined by the argument passing ABI. We need some architectural compiler improvements to get to this point.

ABI

The compiler will always need to copy values to uphold the ABI requirements on the value's physical representation. The argument passing convention is optimized to avoid this for large values. But the struct will be copied as soon as you need to store it in another type. We do plan to introduce "borrowed properties" to avoid logical copies:

struct Ref<T> /*: ~Escapable */ {
  borrow value: T // strawman syntax
}

But this won't help with large structs. The borrowed value will still hold a bitwise copy of the original.
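
The layout point can be observed with `MemoryLayout` today. This is only an illustration with made-up types (`Payload`, `InlineHolder`, `PayloadBox`). It shows that storing a struct in another struct embeds its full representation, whereas a class reference is pointer-sized.

```swift
// Hypothetical "large" struct: eight machine words of inline storage.
struct Payload {
    var words: (Int, Int, Int, Int, Int, Int, Int, Int) = (0, 0, 0, 0, 0, 0, 0, 0)
}

// Storing a Payload as a property embeds a full bitwise copy of it.
struct InlineHolder {
    var value: Payload
}

// A class instance is referenced through a single pointer.
final class PayloadBox {
    var value = Payload()
}

// True when the holder is exactly as large as the payload it embeds.
func holderEmbedsPayload() -> Bool {
    MemoryLayout<InlineHolder>.size == MemoryLayout<Payload>.size
}

// True when a reference to the box occupies one pointer.
func boxIsPointerSized() -> Bool {
    MemoryLayout<PayloadBox>.size == MemoryLayout<UnsafeRawPointer>.size
}
```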

Safe pointers

A Span (or BufferView) type has been proposed. We'll also want to introduce a type that's equivalent to a single-element span, let's say SafePointer. This will finally, actually let you borrow a large struct without physically copying it:

struct Ref<T> /*: ~Escapable */ {
  pointer: SafePointer<T>
}

Alejandro_Martinez · June 3, 2024, 6:45pm

I'm a bit confused with the requirement on initializers:

Since nonescapable values cannot be returned without a lifetime dependency, initializers for such types must specify a lifetime dependency on one or more arguments.

But given the immortal description:

Once the escapable instance is constructed, it is limited in scope to the caller's function body since the caller only sees the static nonescapable type.

Wouldn't that just work for inits? That is, you construct a nonescapable instance and it can just be used in the scope of the caller, just as in the Optional.none example.

I guess I'm asking, would this work?

struct S {
  init() -> dependsOn(immortal) Self
}

and if so, wouldn't it be a better default?

I don't have much experience with this sort of code so maybe this doesn't make sense, but it's a question that came to mind after reading the proposal. Other than that, it is a great description of the problem and the solution, even for non-experts like me, so kudos to the writers.

Andrew_Trick · June 3, 2024, 9:15pm

Great question. It will probably work, but depends on the implementation of the initializer.

Remember that a nonescapable value always depends on something. So, when you initialize one, you normally need to pass in the thing that the new value will depend on as an argument to the initializer.

If you have a concrete example of why you want an initializer with no arguments, that might be more illuminating. This will usually be needed when the nonescapable type holds an Optional value and we need a way to represent that empty state (I would generally discourage this, but I know people will want to do it). For example:

struct OptionalDependent: ~Escapable {
  let dependent: AnyObject?

  init() dependsOn(immortal) {
    dependent = nil
  }
}

Or, for a more interesting case:

struct IntBuffer: ~Escapable {
  var buffer: UnsafeBufferPointer<Int>

  init() dependsOn(immortal) {
    buffer = UnsafeBufferPointer<Int>(start: nil, count: 0)
  }
}

The question is whether we should add a special case for initializers that take no arguments so that immortal dependence is inferred. On one hand, it's the only thing that would actually compile. On the other hand, initializing a nonescapable thing without providing the parent is not an encouraged pattern, and may indicate a design error. The initializer's implementation will be diagnosed to ensure that, at the end of the initializer, self does not actually depend on some other nonescapable (and nonimmortal) value. But Self likely only has escapable stored properties, in which case, the compiler just assumes it is safe.

I lean toward making this special case explicit to ensure that the library author really intended an immortal dependence. If this is too onerous, we could add the inference later. But the diagnostic that tells the library author that they need to pass in the source of the dependence is direct and useful in communicating what the compiler expects.

Michael_Ilseman · June 5, 2024, 3:26pm

"Immortal dependence" seems more like a "non-dependence", i.e. dependsOn(nothing) reads closer to the programming model than dependsOn(immortal).

let a: Array<Int>
let ref1 = a.span() // ref1 cannot outlive a
let ref2 = ref1.drop(4) // ref2 also cannot outlive a

After ref1.drop(4), the lifetime of ref2 does not depend on ref1. Rather, ref2 has inherited or copied ref1’s dependency on the lifetime of a.

I think it would be helpful to mention that ref1 is killed by the call to drop. That helps to explain why copied dependency is necessary and why a scoped dependency on a consuming argument is illegal. (While I am familiar with the consuming modifier, I did have to puzzle through this reasoning a little bit myself.)

I think it would also be helpful to mention why one would ever want scoped instead of copied lifetime.

init(arg: <parameter-convention> ArgType) -> dependsOn(arg) Self {
  ...
}

This syntax seems odd, as we syntactically assign to self and never syntactically return self (even if that's semantically equivalent). Would this work instead?

init(arg: <parameter-convention> ArgType) dependsOn(arg) {
  ...
}

func mayReassign(span: dependsOn(a) inout [Int], to a: [Int]) {
  span = a.span()
}

Should the type of span be [Int] or Span<Int>?

The new function argument dependence is additive, because the call does not guarantee reassignment. Instead, passing the 'inout' argument is like a conditional reassignment. After the function call, the dependent argument carries both lifetime dependencies.

It would be helpful to explain a little more what it means to carry both lifetime dependencies. I think the meaning is that the dependent cannot outlive either dependee, and from this perspective its lifetime is the intersection of the dependees' lifetimes. However, I think what would happen is that each dependee's lifetime would be extended if necessary (or possible) and thus the dependent's lifetime would be more like the union of the dependees' lifetimes.

    struct Container<Element>: ~Escapable {

Do you need to say Element: ~Escapable, that is Element may or may not be escapable?

extension Storage {
  public func withUnsafeBufferPointer<R>(
    _ body: (UnsafeBufferPointer<Element>) throws -> R
  ) rethrows -> R {
    withExtendedLifetime (self) { ... }
  }
}

let storage = Storage(...)
storage.withUnsafeBufferPointer { buffer in
  let span = Span(unsafeBaseAddress: buffer.baseAddress!, count: buffer.count)
  decode(span!) // ✅ Safe: 'buffer' is always valid within the closure.
}

It is a little unfortunate that "unsafe" needs to appear in the argument label in this particular use, since this is safe for any code that follows Swift's strong precedent of ensuring closure pointer argument validity until the end of scope. That is, the Span safely depends on the closure scope in which it is constructed (alternatively: the syntactic scope of the value passed in).

However, since this API precedent is not statically enforced, and this is specifically for adapter code between the previous unsafe world and the new safe world, I think the proposal is acceptable.

MemoryLayout will suppress the escapable constraint on its generic parameter.

What about Custom(Debug)StringConvertible, so that non-escapable types can pretty-print themselves to the console?

struct OwnedSpan<T>: ~Copyable & ~Escapable {
  let owner: any ~Copyable
  let span: dependsOn(scoped owner) Span<T>

  init(owner: consuming any ~Copyable, span: dependsOn(scoped owner) Span<T>) -> dependsOn(scoped owner) Self {
    self.owner = owner
    self.span = span
  }
}

func arrayToOwnedSpan<T>(a: consuming [T]) -> OwnedSpan<T> {
  OwnedSpan(owner: a, span: a.span())
}

This is an interesting future direction. Wouldn't OwnedSpan be escapable but non-copyable, because its dependent-lifetime member is coupled with the dependee and thus they can be moved around together?

tbkka · June 5, 2024, 4:26pm

Note that our model here does not specify a particular lifetime. Rather, @dependsOn specifies a bound or constraint on the lifetime of an object.

This answers both of your concerns above:

Copying a lifetime constraint requires that there already be a constraint. So when obtaining a span from an array, for instance, you must create a new constraint (the span cannot outlive the array) since the array does not already carry any constraint that can be copied. A "scoped lifetime" is precisely a new constraint on the lifetime of an object.
In working through this design, we realized that in most cases, there is no real choice about whether to create a new constraint or copy an existing one, so the notation does not usually require you to specify. There is precisely one case with any possible ambiguity, and we've provided the option to specify in this case in order to be sure we're covering every possible option. But I've only managed to come up with one somewhat tortured example where this might be appropriate: a Span-like construct that stores partial copies of the backing data may require a sub-span to carry a new constraint on the parent span rather than inheriting (copying) the constraint on the original collection.
Having multiple lifetime constraints means that your lifetime is bound by more than one other object. That is, you cannot outlive either one.

Again, the @dependsOn notations do not specify the lifetime of an object -- they specify constraints that the optimizer must obey as it rearranges the code.
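
One way to picture "a bound rather than a lifetime" is a toy model, which is not how the compiler represents anything. Here a scope is a closed range of abstract program points, and a value with several constraints may only be used where all of those ranges overlap. `Scope` and `allowedRegion` are invented for this sketch.

```swift
// Toy model: a scope is a closed range of abstract program points.
typealias Scope = ClosedRange<Int>

/// Returns the region in which a value constrained by every scope in
/// `dependees` may be used, or nil if the scopes share no program point.
func allowedRegion(for dependees: [Scope]) -> Scope? {
    guard var region = dependees.first else { return nil }
    for scope in dependees.dropFirst() {
        guard region.overlaps(scope) else { return nil }
        // Intersect: the value cannot outlive either constraint.
        region = region.clamped(to: scope)
    }
    return region
}
```

In this model, constraints `0...10` and `4...20` leave `4...10` as the usable region. The constraints only bound where the dependent may be used and say nothing about extending either dependee.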

Michael_Ilseman · June 5, 2024, 6:19pm

extension Span {
  mutating selfDependsOn(other) func reassign(other: Span<T>) { ... }
}

Would this be better written as:

extension Span {
  mutating func reassign(other: Span<T>) dependsOn(other) { ... }
}

As mutating functions, similar to inits, mutate/reassign self?

ksluder · June 5, 2024, 8:22pm

How does the dependsOn syntax work for tuple returns? Does the entire tuple depend on a single lifetime, or can each element of the tuple have its own lifetime? What about a tuple that combines a ~Escapable type with a BitwiseCopyable type, like func dup(_ handle: FileHandle) -> (FileHandle, Error)?

Andrew_Trick · June 9, 2024, 3:10pm

See the "Immortal requirements" update.

It's true that the return value has no lifetime dependence in the calling function.

The implementation of a function that returns something with dependsOn(immortal) requires the programmer to compose the result from something that, in fact, has an immortal lifetime:

init() dependsOn(immortal) {
  self.value = <global constant>
}

It is up to the programmer to ensure that <global constant> is valid over the entire program.

Either syntax matches the implementation of Optional. A nil literal could just as easily be considered a global constant or an empty value.

dependsOn(immortal) is not a way to suppress dependence in cases where the source value has unknown lifetime. Composing the result from a transient value, such as an UnsafePointer, is incorrect:

init(pointer: UnsafePointer<T>) dependsOn(immortal) {
  self.value = pointer // 🛑 incorrect
}

We could run into the same problem with any transient value, like a file descriptor, or even a class object:

init() dependsOn(immortal) {
  self.value = Object() // 🛑 incorrect
}

We're relying on the syntax to make this distinction, regardless of whether the compiler can diagnose these errors.

As long as these semantics are clear, I'm happy to pick any keyword that people are more comfortable with.

We do need to decide whether the burden is on the compiler to prove immortality.

The dependsOn(immortal) handles cases in which, previously, we used an unsafe annotation. Without more compiler work, the safety burden will be on the programmer, and gradually diagnostics can improve to catch more obvious errors.

We could place the burden on the compiler from the start, but that will severely limit the programming model. You won't be able to write:

init() dependsOn(immortal) {
  self.value = getGlobalConstant() // 🛑 ERROR?
}

getGlobalConstant returns a regular, escapable value, and the compiler has no way of identifying it as immortal.

Andrew_Trick · June 9, 2024, 3:13pm

Michael_Ilseman:

Would this be better written as:
extension Span {
  mutating func reassign(other: Span<T>) dependsOn(other) { ... }
}
As mutating functions, similar to inits, mutate/reassign self?

The general form of the dependsOn syntax should be thought of as:

dependsOn(target.component: source.component)

target can always be inferred from context:

Parameter modifiers go before the parameter type
Result modifiers go before the result type (after the -> sigil)
self modifiers always go in front of the func declaration.

Example using a future syntax where we have component lifetimes:

  struct S: ~Escapable {
    @lifetime
    let a: T

    dependsOn(self.a: arg1) func foo(arg1: dependsOn(a: arg2) S, arg2: T) -> dependsOn(a: arg2) S
  }

I propose that we require an explicit self label even though it can be inferred. Otherwise, it's too easy to get confused with the far more common (but still rare) case of a result dependence.

The typical case:

  func foo<T, R>(arg1: T) -> dependsOn(arg1) R

is too similar to the extremely rare case:

  dependsOn(arg1) func foo<T, R>(arg1: T) -> R

The programming model for initializers is that they return self (with an implicit return statement):

init(arg: ArgType) -> dependsOn(arg) Self

But this seems to confuse most people who prefer to think of an initializer as mutating self, which would be spelled:

dependsOn(self: arg) init(arg: ArgType)

See "Initializer syntax: result vs. inout syntax".

Andrew_Trick · June 9, 2024, 3:19pm

Every nonescapable element in the tuple depends on the source. Any escapable element in the tuple has no lifetime dependence. The same is true for struct properties. BitwiseCopyable types are normally escapable, but it's possible to declare BitwiseCopyable & ~Escapable, in which case they can be either the source or target of a lifetime dependence like any other nonescapable type.

In future directions:

It should be possible to return a tuple where one part has a lifetime dependency.
For example:
```swift
func f(a: A, b: B) -> (dependsOn(a) C, B)
```

We expect to address this in the near future in a separate proposal.

jlukas · June 9, 2024, 4:20pm

Can we write it as immortal(unsafe) or something when the compiler can’t prove that it’s safe, so that immortal can be compiler-checked (now or in the future)?

Andrew_Trick · June 10, 2024, 9:55pm

We could provide a function annotation to disable checking. That deserves to be an alternative considered. See "dependsOn(unchecked) to disable lifetime dependence checking".

If we want strict safety in our initial design, even for the immortal annotation, then we should probably rely on the compiler to fully check all the components for the nonescapable result. That will require the function's implementation to use unsafeLifetime in some cases:

init() dependsOn(immortal) {
  self.value = getGlobalConstant() // OK: unchecked dependence.
  self = unsafeLifetime(dependent: self, dependsOn: ())
}

I think it makes sense to confine the unsafe code to the implementation. The client only sees that they're getting an immortal value. There's no need to annotate the API as unchecked.

Andrew_Trick · June 22, 2024, 2:03am

Clarification: we do plan to allow this without any unsafe annotation:

struct NE: ~Escapable {
  init() dependsOn(immortal) {
    self.value = getGlobalConstant() // ✅ makes a copy of the constant value. No need to check its lifetime.
  }
}

More generally, when you assign into an escapable property of a nonescapable type, the compiler makes a copy of the escapable value. That copy's lifetime is considered independent from the lifetime of the value returned by the right-hand side of the assignment. You will see the same behavior with mutating methods:

struct NE: ~Escapable {
  // For mutating methods, the outgoing value of 'self' implicitly depends on the incoming value of 'self'.
  // There is no dependence on the 'value' parameter.
  mutating func setValue(value: T) {
    self.value = value // ✅ makes a copy of 'value' with no dependence on the parameter.
  }
}

grynspan · June 22, 2024, 6:55pm

This makes the second name of an argument part of the ABI (or at least API) for a function, when previously it was invisible at the call site.

Andrew_Trick · June 22, 2024, 8:58pm

The local parameter name does need to be printed in the API's interface. But I think that's already true. And, unless I'm mistaken, changing the local parameter name still does not affect the API or ABI. The only thing that should matter to the client is the position of the argument that the dependence is on.