`borrow` and `take` parameter ownership modifiers

Right. This non-escaping property is essentially a simplified form of what Rust expresses with lifetime qualifiers.

That's an interesting question. The short answer is yes, because set parameters can describe what happens in subscripts and because they are more expressive than return values.

We wanted a way to express the convention of a receiver in a subscript that just assigns a new value without having to produce one. Below, degrees is a computed property with three different accessors; in the last one, radians effectively plays the role of a set parameter.

type Angle {
  var radians: Double

  property degrees: Double {
    let {
      // Read-only access: project a value without exposing storage.
      radians * 180.0 / Double.pi
    }
    inout {
      // In-place mutation: the value goes out to the caller and comes back.
      var d = radians * 180.0 / Double.pi
      yield &d
      radians = d * Double.pi / 180.0
    }
    set(new_value) {
      // Assignment: only a new value comes in; nothing has to be produced.
      radians = new_value * Double.pi / 180.0
    }
  }
}

Note that Val also allows subscripts to be defined without a specific receiver, in which case you may need multiple "out" parameters.

Another, more theoretically motivated argument is that set parameters complete the language's calculus. They provide a way to express initializers as what they truly are: functions that accept a piece of uninitialized memory and initialize it. So you can separate allocation from initialization.

fun main() {
  let p = MutablePointer<T>.allocate(count: 1)  // allocate raw storage only
  T.init(&p.unsafe[0])                          // initialize it in place
  p.unsafe[0].deinit()                          // destroy the value...
  p.deallocate()                                // ...then free the storage
}

Your description of set suggests that the parameter transitions from uninitialized to initialized, but wouldn't radians already have a valid value when you set degrees?

I don't think you need a new kind of thing for that. In Swift, every function, initializer or not, accepts one or more pieces of uninitialized memory and initializes them with its return value(s). Initializers differ only because they name the return slot self, and have some additional knowledge of the physical layout of the type so that they can ensure that all stored properties of the type are initialized before returning. When you write var foo = T() in Swift, the memory for foo is allocated separately by the caller, and then the pointer to foo is passed to T() to be initialized in place. (For smaller types that fit in a handful of registers, the value gets returned in registers and stored in foo's memory by the caller, but that's not visible at the language level, and foo will likely be promoted to registers in the caller itself.)
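
That separation is already visible at the unsafe-pointer level in today's Swift; a minimal sketch using the standard-library APIs (the Pair struct is just for illustration):

struct Pair { var a, b: Int }

let slot = UnsafeMutablePointer<Pair>.allocate(capacity: 1)  // allocation only; the memory is uninitialized
slot.initialize(to: Pair(a: 1, b: 2))                        // initialization writes into the caller's slot
print(slot.pointee)
slot.deinitialize(count: 1)                                  // deinitialization, separate from...
slot.deallocate()                                            // ...deallocation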

3 Likes

I spoke with the language working group to get a feel for their opinion on modifying the ownership convention of self, and how we might approach that, possibly via a more general way of declaring self as an explicit parameter and using standard parameter modifiers. The language working group was interested in exploring the design space of explicit self declarations, but felt that even if we had that functionality, it would still be worthwhile to have method-level modifiers to specify the ownership of self. To start discussion there, how about taking func and borrowing func?
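
As a rough sketch of what those spellings would look like on a method (hypothetical Connection type; modifiers as pitched):

struct Connection {
  var isOpen = true

  borrowing func ping() -> Bool {   // self is borrowed: read-only access, caller keeps ownership
    isOpen
  }

  taking func close() {             // self is taken: the call consumes the value
    // release resources; self is destroyed at the end
  }
}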

I'm trying to follow here. I'll put my reasoning in bullets; please correct me if I go astray. IIUC:

  • You are saying that if borrowed values can be copied, a value of copyable type can always escape: just store/return a copy. A borrow in such a system would not prevent the value from escaping. I agree (and I don't think we want to consider a system where borrowed values can't be copied).
  • Closure values with inout captures mustn't escape the scope of their declaration (i.e. they are non-escapable), and the existence of an inout capture is not represented in the type system, so you can't statically prevent such a closure from being copied (see the sketch after this list).
  • Therefore, I'm guessing, the logic is that one must not allow a closure with inout captures to be passed by borrow, just like they can't currently be passed by inout.
  • Therefore, if you want to be able to pass such a closure to a function at all (which we obviously need to be able to do) it has to be by-value, and the by-value convention needs to be distinct from borrow, and not allow closure parameters to escape (by default).
  • Finally, you need @escaping annotations so that escapable passed-by-value closures can escape.
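
A minimal sketch of the constraint from the second bullet, in today's Swift (the function names are made up):

func mutateTwice(_ value: inout Int, using body: (() -> Void) -> Void) {
  // Fine: the closure captures `value` by reference, but body's parameter is
  // non-escaping, so the capture cannot outlive `value`.
  body { value += 1 }
  body { value += 1 }
}

// Returning such a closure is rejected today:
// func deferred(_ value: inout Int) -> () -> Void {
//   return { value += 1 }   // error: escaping closure captures 'inout' parameter 'value'
// }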

I hope I got that right. I understand the logic, but IMO there are other approaches that allow you to end up with a much simpler design overall, one where there's no distinction between borrow and by-value, and @escaping disappears. You "just" need to define the copy semantics of closures with inout captures. In order to do that, you probably can't allow inout captures of non-copyable types in plain copyable closures (I'd argue that conferring reference semantics on a non-copyable type is almost always problematic anyway). I have a copy semantics in mind, but I don't want to dive into that yet.

That makes a lot more sense to me than a special syntax with an explicit self: take Self parameter. It's also obvious that they're related to take and borrow, and it sort of follows the precedent of adding -ing, the way inout becomes mutating when applied to self.


To get back to an older topic, I think it would make sense for take parameters to be mutable inside the function. It would encourage the correct use of take.

I feel that linking local mutability to take/taking encourages the best ownership choice for the function's implementation: if you need to mutate the value only inside the function, use take; if the caller needs to see the mutation, use inout; if there's no mutation, use borrow. That might be a bit simplified, but it helps in understanding the difference between the three and choosing the right one for what you're doing. In addition, it aligns the need to make a copy with the need to rebind to a new variable when you want to mutate a borrowed value, which feels right.
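
A hedged sketch of that rule of thumb, using the pitched modifiers and assuming (as suggested here) that take parameters are mutable bindings; Item and registry are made-up names:

struct Item { var label: String }
var registry: [Item] = []

func describe(_ item: borrow Item) -> String {      // no mutation: borrow
  item.label
}

func rename(_ item: inout Item, to name: String) {  // the caller must see the mutation: inout
  item.label = name
}

func archive(_ item: take Item) {                   // local mutation, and the value escapes: take
  item.label += " (archived)"
  registry.append(item)
}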

The downside is that people might be hesitant to use take if they have no need to mutate the value locally, even though other circumstances would make it beneficial (like storing the value somewhere so it outlives the current scope). We could add another modifier to differentiate between take (var take) and escape (let take) to alleviate that hesitancy, but I feel that would be overcomplicating things.

3 Likes

take seems a little generic; on a first read, the intent of the keyword is not clear. As @allenh said, functions always take arguments.

We might say keep instead:

let f: (keep Foo) -> Void = { a in a.foo() }

It emphasizes that the function will now own it and use it as it pleases.

Sorry, I am failing to make the larger point about the idea behind set parameters. Let's back up for a second and think about inout. In a language with value semantics, inout parameters allow us to express in-place mutation across function boundaries.

inout is a simple yet very powerful concept: a value goes in and then it goes out. So it makes sense to apply the same principle to other features. For instance, when we revisited subscripts in Val, we thought it made sense to use inout to represent _modify accessors: the value goes out to the caller and then comes back to the callee. Strictly speaking it still goes "in and out" if we consider the inversion of control.
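
In Swift terms, that is exactly the shape of the underscored _modify accessor; a minimal sketch (using the unofficial _modify feature, mirroring the Angle example above):

struct Box {
  private var storage = [0]

  var first: Int {
    get { storage[0] }
    _modify {
      var value = storage[0]
      yield &value          // the value goes out to the caller...
      storage[0] = value    // ...and comes back to the accessor
    }
  }
}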

Now we can draw a parallel with what I'm trying to achieve with set parameters. I want to describe assignment across function boundaries as a language-level concept. A set parameter is like inout, except that no value needs to go in.
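
In Swift-like pseudocode (set as a parameter modifier is Val's feature, not Swift syntax), the contrast looks roughly like this:

func double(_ x: inout Int) {    // inout: a value comes in and goes back out
  x *= 2
}

func initialize(_ x: set Int) {  // set (hypothetical here): nothing comes in;
  x = 42                         // the callee must initialize x before returning
}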

Assignment and initialization are closely related concepts. At a high-level, initialization is just assignment where there isn't a value yet (there's a catch with that reasoning, but I'll get to it). That is why I argue it makes sense to also use set parameters to represent initialization across function boundaries.

Again, we can generalize set parameters as a concept and see where else they can apply. It turns out that a set accessor in Swift is a good match, as the value to be set only goes in (again, considering the inversion of control). That has an obvious advantage in terms of performance: we do not have to synthesize the value that goes out.

I concede that this whole reasoning is largely motivated by theory. While set accessors in subscripts will probably be very common, set parameters in functions will surely be less so. That said, if there are use cases for placement new in C++, I believe there will be use cases for set parameters in Val. What's very interesting to me is that set parameters complete the calculus in a coherent way. They let us express assignment and initialization across function boundaries as language-level concepts, without having to know about the underlying implementation.

With that perspective in mind, let's get to your comments.

You are correct; my description was lacking. In a subscript, set parameters relate to assignment rather than initialization because self must necessarily be initialized before the subscript is called.

You put your finger on a point that I'm still trying to iron out in Val. When do set parameters relate to assignment rather than initialization? The answer is clear for subscripts as their value must already exist, as you pointed out, or be synthesizable. But what about functions?

A simple escape hatch is to say that the compiler outright rejects programs that attempt to pass initialized arguments to set parameters. But that takes an optimization opportunity away from us: what if we want to reuse part of the storage of the notional LHS of the assignment?

We have started exploring a way to handle that specific optimization. An assignment a = b is translated by the compiler as b.move(into: &a), where move(into:) is a customizable method of the LHS. I think it would be interesting to generalize that translation so that it also applies to set parameters; a rough sketch of what that customization point might look like is below. But that's really a Val design question, and I don't want to pollute this thread with it.
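
A hedged, Swift-flavored sketch (move(into:) is the hypothetical customization point described above; a real version would presumably consume its receiver):

extension Array {
  // The compiler would rewrite `a = b` as `b.move(into: &a)`, so a type can
  // reuse the existing storage of the assignment's LHS.
  func move(into target: inout Array) {
    target.removeAll(keepingCapacity: true)   // keep target's allocation...
    target.append(contentsOf: self)           // ...and refill it from self
  }
}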

I believe set parameters express at the language level what Swift does under the covers. I do not claim it is strictly necessary to surface the mechanism that initializes return values; it is quite a low-level feature with niche use cases. Another argument is that giving the compiler full control over initialization yields more opportunities for optimization.

I expect both of these observations to hold in Val. We'll write let x = T() in 99% of the cases and we'll let the compiler implement x's initialization as it sees fit. But I think set parameters are valuable nonetheless from practical, educational, and theoretical points of view. I base that belief on the fact that there are use cases for placement new in C++, that set parameters provide an interesting way to understand the type of an initializer, and that they seem to fit very well into the calculus we have designed for Val.

1 Like

Voilà: Template for a possible future object model

6 Likes

I've made a few updates to the draft PR in response to discussion so far:

  • Use taking and borrowing func modifiers to control self's convention
  • Add "alternative considered" for whether taken parameters should be bound
    as mutable in the callee

Here is the diff:

and the most recent revision:

2 Likes

+1 This seems good to me. taking and borrowing match our current conventions, and explicit self is purely additive, so we can always revisit that if self modifiers start getting out of hand. borrow is a little verbose as an operator, but a @NoImplicitCopies attribute would probably be used in most places where borrow would come up a lot.

I was going to save this for a later thread, but it dovetails with the notion of sink functions from the Val thread, and I've talked about it privately with a few people, so here we go:

There are three things you might want to do in a taking method:

  1. pass self along to another taking function (e.g. MockableNetwork.install could move self into a static shared variable)
  2. let self be destroyed normally (an explicit OpenFile.close could just be equivalent to _ = take self)
  3. take an alternate path in destroying self (OwnedBuffer.intoUnsafe would need to not deallocate its buffer after being called)

(1) and (2) are effectively equivalent to how local move-only values would be handled anyway (destroy the value unless it is taken), but (3) is special. In that case the function needs access to the stored properties of self, and may want to take them individually.

In C++, move operations must always leave the object in a valid state for the destructor to be called, which is not ideal. In Rust you can do this by destructuring the object with irrefutable pattern matching, which works but which I find somewhat subtle. But we already have a kind of function that behaves differently depending on whether you act field by field or on the whole object: initializers. We could do the same thing for taking functions by having the compiler check whether any fields have been taken; if any have, you must explicitly take all of them (or perhaps all the non-copyable ones). To make this safer, we should require you to explicitly say take self if you still want the normal deinitializer to run; otherwise it's too easy to mess this up (especially for a type that happens to have no non-copyable fields, or perhaps no fields at all).
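
As a hedged sketch of use case (3) under that rule (hypothetical type; taking and take as pitched, and move-only checking assumed):

struct OwnedBuffer /* assume move-only */ {
  var storage: UnsafeMutableRawBufferPointer

  // The normal deinit would deallocate `storage`.

  taking func intoUnsafe() -> UnsafeMutableRawBufferPointer {
    return take self.storage   // a field is taken, so the normal deinit must not run;
                               // the compiler would check that every field is accounted for
  }
}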

I've been calling this deinit func in my head, but given use case (1) I don't actually think that's a great spelling. I had also previously been thinking that you could have taking func for (1) and (2) and deinit func for (3), but that really doesn't convey the right impression; on the caller's side they're equivalent. But I do think it's important to support all three use cases; otherwise we'll be back in C++ land with "valid for destruction" state.

1 Like

I recall having a discussion with the Rust developers about their experience exploring this field, and they found that implicitly running the destructor ("affine" types rather than "linear" types that require explicit consumption) was the most ergonomic choice, with mem::forget for the cases where you have a consuming operation that supersedes the normal destructor. It'd be interesting to look at how common forget is in practice to see how well their design holds up.

Another tack someone might take here is to allow for a private deinit {} that is inaccessible outside the type, which would require every use of the type to pick an explicit consuming operation to dispose of it, if the type doesn't have a single canonical destructor that makes sense as the default.
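
A hedged sketch of that idea (hypothetical throughout: move-only structs, struct deinit, and the pitched taking modifier):

struct Transaction /* assume move-only */ {
  private deinit {}            // inaccessible outside the type, so implicit
                               // destruction anywhere else is an error

  taking func commit() { /* apply the changes, then discard self */ }
  taking func rollback() { /* revert the changes, then discard self */ }
}

// Every Transaction must end in an explicit commit() or rollback().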

5 Likes

That makes sense as a default, but you can’t combine mem::forget with destructuring. My point is that Swift needs to have the equivalent to Rust’s destructuring in order to implement the equivalent of OwnedBuffer.intoUnsafe without relying on UnsafeMutableBuffer being copyable. Admittedly I can’t think of a real-world example for that right now, so maybe it can be deferred if we have a mem::forget equivalent, but…

Would we allow functions to have taking overloads? I can't find anything about overloads in the proposal.

The use-case I'm thinking of is something like Collection.map, where self is an Array:

let x: [Int] = ...
let y = x.map { ... }

If there are no deinit barriers on x after the call to map (i.e. x is no longer used), the compiler would be able to select a taking func overload instead - so rather than self being a regular borrow, we would get an owned value. If the buffer is uniquely referenced within the taking func, we can infer that it is about to be discarded (there is one reference, and we own it), and could re-use the allocation.

This might be clearer if we used the term "deinit func" for this, as Jordan suggested. So this would be "a map function which also deinits self" - which sounds like a strange mix, but considering what it does, it makes sense.

Why not just make the regular map function a taking func? We could (assuming the ABI change could be worked around), but if there are any deinit barriers (i.e. x is used again later in the function), the caller would need to perform a copy (retain) before calling map in order to keep x alive, which would defeat the allocation reuse.

Since we know about those deinit barriers at compile time, we could instead just select a regular +0 (borrowing) overload that doesn't even attempt it.
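
To make the two entry points concrete, here is a hedged sketch with a non-generic stand-in for map (overloading on parameter convention isn't currently allowed, and taking is the spelling pitched in this thread):

struct IntBuffer {
  var elements: [Int]

  // Borrowing variant: self must stay valid for the caller, so the result
  // needs fresh storage.
  func incremented() -> IntBuffer {
    IntBuffer(elements: elements.map { $0 + 1 })
  }

  // Consuming variant: self is owned by the callee, so if its storage is
  // uniquely referenced it can be mutated in place and handed to the result.
  taking func incremented() -> IntBuffer {
    var result = self
    for i in result.elements.indices { result.elements[i] += 1 }
    return result
  }
}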

2 Likes

Under this model, taking/consuming functions would be available and deinit would just be a mandatory consuming ‘method’. I think this makes a lot of sense, but it doesn’t really match what initializers do. Thus, I think we could do what Val did with set bindings and have initializers be set functions of self. Other use cases of set are also presented in the Val thread, so I think adopting all of Val’s bindings (inout, sink, set, and let) would be great for Swift too.

We currently mangle "__owned" and "__shared" distinctly, but we don't allow overloading, since there would be no way to disambiguate which one to favor at a call site. Overloading, along with use of @usableFromInline internal and @_disfavoredOverload, could however be a way for an ABI-stable library to adjust the favored calling convention for new clients, while maintaining an entry point for existing clients with the old calling convention.
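
A hedged sketch of that arrangement (the attributes are real; take and the Image type are placeholders for illustration):

public struct Image { /* ... */ }

// New, favored entry point using the new convention, for newly compiled clients.
public func display(_ image: take Image) { /* ... */ }

// Old entry point kept around for already-compiled clients; @usableFromInline
// keeps it as an ABI entry point, while internal plus @_disfavoredOverload
// steer newly compiled code away from it.
@usableFromInline @_disfavoredOverload
internal func display(_ image: Image) { /* ... */ }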

Initializers don't really have any outwardly special behavior in Swift; a static func that returns Self is outwardly identical to an initializer except for syntax. And similarly, a deinitializer isn't really outwardly different from a taking func or what Jordan calls a deinit func. The special ability both initializers and deinitializers have is visibility into the layout of the type, and the corresponding ability to initialize or deinitialize fields one at a time. Even with that internal power, initializers still externally go from zero to a fully-initialized value, like all functions returning Self do, and struct deinitializers would go from a fully-initialized value back to zero, like all taking funcs do.
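
A small illustration of that outward equivalence in today's Swift:

struct Temperature {
  var celsius: Double

  init(fahrenheit: Double) {
    celsius = (fahrenheit - 32) / 1.8       // may fill in stored properties one at a
  }                                         // time, but still goes from zero to a value

  static func freezing() -> Temperature {   // outwardly does the same job as an init
    Temperature(fahrenheit: 32)
  }
}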

5 Likes

We should be able to deinitialize one field at a time anywhere we have ownership of the aggregate:

func taking(ab: take AB) -> (A, B) {
  return (take ab.a, take ab.b)
}

As long as the aggregate isn't accessed in between the "partial takes", I don't see a problem. We can add pattern-matching destructures later.

This only gets interesting in a future with default deinit methods on aggregate types. If a type defines a default deinit, then partial takes should only be allowed within some method designated as a deinitializer. Those designated deinits could have return values for destructuring the value.

I don't see a way to force programmers to explicitly call a deinit method though. In a generic context, you'll need to fall back to the default deinit.

I was thinking every non-copyable type has to be treated as “defining a default deinit”, but sure, if you have a frozen non-copyable type you could allow arbitrary destructuring.

1 Like

You could choose which overload to apply:

// Always uses the 'taking' overload, since 'x' cannot be used again.
(take x).map { ... }

// Always uses the default/borrowing overload, since we add a use of 'x'.
x.map { ... }
withExtendedLifetime(x) { _ in }

I did wonder about ABI-stable libraries being able to change the calling convention. The proposal seems to indicate that it isn't possible; it only says:

take or borrow affects the ABI-level calling convention and cannot be changed without breaking ABI-stable libraries.

So perhaps it would be worth mentioning that libraries could transition from one to the other.

Beyond that, my point was that there can be value in having both conventions available, with neither being disfavoured, and in letting overload resolution statically pick one based on context. Of course, the compiler can perform an "implicit copy" to call a taking function even when the caller is not actually ready to relinquish ownership, but I think the goal of these proposals is to allow libraries and their clients to avoid those copies. By having both variants available, I think we can do that.

A future direction, perhaps?

2 Likes