SE-0366: Move Function + "Use After Move" Diagnostic

ksluder · July 27, 2022, 11:31pm

One benefit to a move/__moved parameter modifier is that it provides an obvious spelling for initializers for move-only wrapper types:

/// A stdlib-provided replacement for all the ad hoc @unchecked Sendable wrappers people are writing
struct Envelope<T> : MoveOnly, Sendable {
  var wrappedValue: T
  init(wrapping value: move T) {
    wrappedValue = move(value)
  }
}

(certain other details of this implementation assumed/omitted)

ksluder · July 28, 2022, 12:01am

I’m unclear on how __owned and __consuming relate to move(_:). Do they all deal with the same kind of ownership? Or do __owned and __consuming refer to ARC ownership, while move(_:) refers to a new kind of ownership that’s tracked by flow analysis? If they’re the same thing, then I guess my move parameter modifier suggestion above is just a formalization of __consuming?

My brain keeps coming back to how move(_:) interacts with captures. My understanding at this point is that this is disallowed:

func f() {
  let x: Int = 0
  DispatchQueue.main.async {
    print(x) // error: cannot capture 'x' because it is moved from later
  }
  if Bool.random() {
    let _ = move(x)
  }
}

which is safer and easier to understand than C++. But I can also imagine a situation in which being able to capture a moved value might be useful: Task cancellation.

func doSomething(with arg: Arg) async {
  withTaskCancellationHandler {
    return algorithmGuts(move(&arg))
  } onCancel: {
    // Presumably the compiler prohibits this to avoid a race with `move(arg)` above?
    print("Cancelled request \(arg)")
  }
}

If we had a way to move a value but leave a valid marker behind, we could do this safely:

extension Optional {
  /// Returns the wrapped value, if any, replacing it with `nil`.
  ///
  /// Returns `nil` if this optional is already `nil`.
  mutating func moveOut() -> Wrapped?

  /// Like `moveOut()`, but the swap with `nil` is done atomically.
  ///
  /// This version is slower and only available on platforms with atomic swap instructions.
  mutating func atomicMoveOut() -> Wrapped?
}
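For comparison, the non-atomic `moveOut()` needs no new language feature. A minimal sketch in current Swift follows; the body is an assumption, since the post gives only the signature, and the `slot` variable is just for illustration.

```swift
extension Optional {
    /// Returns the wrapped value, if any, and leaves `nil` in its place.
    /// Returns `nil` if the optional is already `nil`.
    mutating func moveOut() -> Wrapped? {
        let value = self   // read the current contents
        self = nil         // leave a valid "moved-from" marker behind
        return value
    }
}

var slot: String? = "payload"
let first = slot.moveOut()    // takes the value out
let second = slot.moveOut()   // nothing left to take
```

This only covers the single-threaded case; it does nothing about the race with a cancellation handler that `atomicMoveOut()` is meant to address.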

func doSomething(with arg: Arg) async {
  var localCopy: Arg? = arg
  withTaskCancellationHandler {
    // The cancellation handler doesn’t move from `localCopy`, so we know it’s always non-nil.
    let interiorCopy = localCopy.atomicMoveOut().unsafelyUnwrapped
    return algorithmGuts(move(interiorCopy))
  } onCancel: {
    if let arg = localCopy {
      print("Cancelled request \(arg)")
    } else {
      print("Cancelled too late!")
    }
  }
}
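The "take it exactly once, but let others look" idea can be approximated today with a lock instead of an atomic swap. The sketch below is not the `atomicMoveOut()` proposed above: `Slot`, `takeOut()` and `peek()` are hypothetical names, and `NSLock` is swapped in for the atomic instruction.

```swift
import Foundation

/// Hypothetical helper: a slot whose contents can be taken out exactly once.
final class Slot<Value>: @unchecked Sendable {
    private let lock = NSLock()
    private var value: Value?

    init(_ value: Value) { self.value = value }

    /// Removes and returns the stored value, or `nil` if it was already taken.
    func takeOut() -> Value? {
        lock.lock()
        defer { lock.unlock() }
        let taken = value
        value = nil
        return taken
    }

    /// Returns the stored value without removing it.
    func peek() -> Value? {
        lock.lock()
        defer { lock.unlock() }
        return value
    }
}

let request = Slot("request-42")
let beforeTake = request.peek()    // the value is still in the slot
let taken = request.takeOut()      // the one and only successful take
let afterTake = request.peek()     // the slot is now empty
```

A cancellation handler would call `peek()` and report "too late" on `nil`, while the main operation calls `takeOut()`.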

beccadax · July 28, 2022, 12:24am

To start:

__consuming just means that self is __owned, in much the same way that mutating means self is inout.
When I talk about “copying” below, think “retain” for an object; when I talk about “destroying”, think “release”.

__owned means that the callee will destroy the value, so the caller should consider it unusable once the callee returns. The caller can handle this by either copying the value before passing it (the default) or ending its lifetime so it can no longer be used (what happens if you use move).

The alternative is that the callee will not destroy the value, so the caller can keep using it once the callee returns. This means that, if there’s a copy, it will be in the callee, not the caller. The caller can still use move, but it won’t actually eliminate the copy; the callee will still copy the original, but the caller will destroy the original once the callee returns.

So basically, __owned means that if you do move the value in, that will truly eliminate a copy. But using __owned by itself doesn’t eliminate the copy; it just allows the caller to eliminate it.
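A small sketch of the two caller-side choices. It assumes the spellings that later shipped in Swift 5.9 (`consuming` for `__owned`, and the `consume` operator for `move`), not the syntax under review in this thread; `Buffer`, `store`, and the two caller functions are hypothetical.

```swift
final class Buffer {
    let bytes: [UInt8]
    init(bytes: [UInt8]) { self.bytes = bytes }
}

/// The callee owns `buffer` and is responsible for destroying it.
func store(_ buffer: consuming Buffer) -> Int {
    return buffer.bytes.count
}

func copyThenKeepUsing() -> (Int, Int) {
    let buffer = Buffer(bytes: [1, 2, 3])
    let stored = store(buffer)           // caller copies (retains) before the call
    return (stored, buffer.bytes.count)  // so `buffer` is still usable here
}

func handOff() -> Int {
    let buffer = Buffer(bytes: [1, 2, 3])
    // The caller ends `buffer`'s lifetime instead of copying it.
    // Any later use of `buffer` in this function would be a compile-time error.
    return store(consume buffer)
}
```

Both callers compile against the same callee; only the second one gives up its own reference, which is what lets the copy be eliminated.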

tera · July 28, 2022, 12:54am

Sorry to bump this again (link). Perhaps I'm just missing some killer example that shows the superiority of "goto-style approach" compared to more structural alternatives (like nesting).

ksluder · July 28, 2022, 1:01am

Thanks, this is extremely helpful clarification.

Extrapolation to explicit-copy and move-only types

My argument is that by giving the power to a parameter modifier, we can have our cake and eat it too. move(_:) can be an honest-to-goodness function whose argument is tagged move.

And in that vein, I’ve started to think about how a copy parameter would also help. arg: copy T could do for copy(_:) what arg: move T does for move(_:). Both of them are new syntax to effectively support one function, but if we have to introduce highly specialized syntax somewhere, why not confine it to an attribute and have the top-level syntax fall out naturally?

Here’s where I’m at so far on move and copy:

Parameter attribute	Effect
`arg: T`	Argument is copied into parameter, unless T is move-only, in which case it is moved. If T is explicit-copy, the caller must call `copy(_:)` or `move(_:)`.
`arg: copy T`	Argument is copied into parameter, unless it is the result of calling a function with a `move` return type, in which case it is moved. If T is explicit-copy, it is copied via invoking its `copy` method. Cannot be used if T is move-only.
`arg: move T`	Argument is moved into parameter. Argument must either be the result of calling a function with a `move` return value, or it must be a variable prefixed with `&`.

The move keyword could also decorate a function return type, in which case it means the function returns its value via a new placement-return ABI, in which the caller allocates storage for the return value. This is akin to returning via an inout parameter, and is intended for two situations: tight loops and immediately passing the returned value to another function.

The Standard Library would use these new attributes to implement canonical move(_:) and copy(_:) functions:

/// Explicitly copies a value, returning the copy.
///
/// If T is explicit-copy, this function calls T.copy() to create the copy. T cannot be move-only.
func copy<T>(_ value: copy T) -> move T {
  // The `copy` attribute on the parameter does all the real work, effectively doing the following:
  // let value = T.self is ExplicitCopy.Type ? (value as! ExplicitCopy).copy() : value
  return value
}

/// Explicitly moves a value.
func move<T>(_ value: move T) -> move T {
  // The `move` attribute on the parameter does all the real work.
  return value
}

The interaction between move in return position and move or copy in argument position is what leads to the above table of behaviors.

Type of `func g()`	Type of `func f(_ arg: T)`	Result of `f(g())`
`() -> T`	`(T) -> Void`	`g` returns value by normal ABI, then caller prepares value to be passed to `f`.
`() -> T`	`(move T) -> Void`	Not allowed.
`() -> T`	`(copy T) -> Void`	Not allowed.
`() -> move T`	`(T) -> Void`	If `f`’s argument is not passed in registers, caller first prepares storage for passing argument to `f`, then calls `g` via placement-return ABI. `g` places its return value in the prepared storage. Caller immediately invokes `f`.
`() -> move T`	`(move T) -> Void`	Same as above.
`() -> move T`	`(copy T) -> Void`	Since the lifetime of the return value ends when the temporary is discarded, this effectively acts the same as the above. This is what allows `f(copy(g()))` to work without recursively demanding that `copy(g())` be wrapped in `copy(_:)`.

I mentioned ExplicitCopy in the pseudo-implementation of copy(_:) above. I think ExplicitCopy and MoveOnly should be true marker protocols, not @attributes. They would be mutually exclusive—a type could be ExplicitCopy, MoveOnly, or neither. ExplicitCopy would carry one requirement which would be auto-synthesized by default:

/// Marker protocol for types that cannot be copied.
///
/// A type cannot conform to both MoveOnly and ExplicitCopy.
protocol MoveOnly { }

/// Marker protocol for types that can be copied, but must be copied explicitly.
protocol ExplicitCopy {
    /// Implement this method if your type has any MoveOnly properties or if you need to modify either the source or the result.
    ///
    /// Swift synthesizes the implementation of this method for you if your type has no properties that conform to MoveOnly.
    /// The synthesized implementation allocates a new instance, and then initializes its properties by copying this object’s properties.
    /// If any of the properties conform to ExplicitCopy, this will result in their `copy` methods being called.
    ///
    /// If your type has any properties that conform to MoveOnly, you must implement this method yourself to initialize a copy with suitable values.
    ///
    /// Your implementation of this method can mutate self. For example, your type might store its value inline until it’s copied, at which point it moves the value to a location shared with the copy.
    func copy() -> move Self
}
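To make the shape of the `copy()` requirement concrete, here is a rough approximation in present-day Swift. It is only a sketch: `ExplicitCopy` is an ordinary protocol here, `-> move Self` is written as plain `-> Self`, nothing stops the implicit copy on the `alias` line, and `Storage` and `SharedList` are hypothetical types.

```swift
protocol ExplicitCopy {
    func copy() -> Self
}

final class Storage {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

struct SharedList: ExplicitCopy {
    let storage: Storage

    /// Hand-written copy: allocates fresh storage instead of sharing it.
    func copy() -> SharedList {
        return SharedList(storage: Storage(storage.values))
    }
}

let original = SharedList(storage: Storage([1, 2]))
let alias = original              // ordinary assignment: shares the same storage
let duplicate = original.copy()   // explicit copy: independent storage
duplicate.storage.values.append(3)
```

Under the idea sketched in this post, the `alias` line would be rejected for an explicit-copy type and only the `copy()` line would be allowed.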

ksluder · July 28, 2022, 1:11am

To be clear, in this example:

let other = do {
	let x = ...
	useX(x)
	return x
} // `x` lifetime ended
// `x`  is moved to `other`

You are proposing new behavior for return inside a do block, right? That’s a pretty big change to existing syntax.

Jumhyn · July 28, 2022, 1:27am

This is more or less what I expected, and it’s not entirely satisfying to me—much of the motivation for move is that it aspires to ‘lock in’ lifetime behavior regardless of optimizer choices, but without making this transformation invalid, it seems like we’re still relying on the optimizer being ‘smart’.

I take your point that this seems like a silly optimization in isolation, but are we so sure that in the full complexity of a real-world program, such behavior wouldn’t emerge? Granted, we can’t guarantee that bugs won’t ever arise in the implementation of the compiler, but it sounds like this wouldn’t even be a bug in any formal sense. What would be the impact of a harder rule such as “a uniqueness check can’t be hoisted above deinitialization of a potentially aliasing reference which has been explicitly moved”? Do we think that would have too many false positives to be worth it?

ebg · July 28, 2022, 3:16pm

What does move solve that we can't address with allowing explicit retain/release calls for performance critical code? Am I missing something here? The solution seems very obvious from reading the proposal's motivation:

... in performance sensitive code, developers want to be able to control the uniqueness of COW data structures and reduce retain/release calls...

John_McCall · July 28, 2022, 3:27pm

Moving values around and requiring copies to be explicit is essentially as close as you can get to making retain/release calls explicit without completely giving up on memory safety.

Joe_Groff · July 28, 2022, 4:30pm

The proposal uses explicit __owned arguments to illustrate the interactions of passing values by move with a calling convention that consumes its arguments, but __owned isn't essential to the proposal. It is still useful to be able to shorten the lifetime of local values independent of function arguments, and to be able to move out of inout arguments and reinitialize them.

However, even though __owned is not an official language feature, Swift still uses the consuming calling convention automatically in various situations. The default convention for initializers and setters is to consume their arguments, on the expectation that they are likely to use their argument in order to form part of the result value. Also, an argument that is not annotated __shared, __owned, or inout can have its convention manipulated by the optimizer, if it sees that consuming or borrowing the argument contrary to the default convention opens up further ARC optimization opportunities.

If it helps, we can amend the proposal to avoid jumping ahead and remove references to __owned, leaving the interaction with move to be discussed when we formally propose shared and owned as part of the language. I think it's also worth discussing whether the proposed constraint disallowing move of non-__owned-annotated arguments is a good one, given that the convention of unannotated arguments is usually indefinite.
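As an illustration of moving out of an inout argument and reinitializing it, here is a minimal sketch. It assumes the `consume` spelling that shipped in Swift 5.9 in place of the `move` spelling discussed here, and `drain` and `queue` are hypothetical names.

```swift
/// Takes the current contents of `items` and leaves an empty array behind.
func drain(_ items: inout [Int]) -> [Int] {
    let taken = consume items   // ends the lifetime of the current value of `items`
    items = []                  // an inout binding must be reinitialized before returning
    return taken
}

var queue = [1, 2, 3]
let drained = drain(&queue)
```

Leaving out the reassignment is diagnosed at compile time, because the caller expects `queue` to hold a valid value when `drain` returns.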

Paul_Cantrell · July 28, 2022, 5:57pm

I’m +1 on the behavior as proposed. Makes sense. Captures some subtle things in a way that’s relatively easier to get one’s head around than alternatives I’ve seen.

On the sticky question of naming and syntax, I found that @beccadax’s post, including what she called the “rambly bit,” matched my own thinking.

Spelling it as a function is troublesome, agreed. But I don’t find the operator spelling particularly better. And I don’t see other good alternatives.

A thought in favor of using the word “move,” as opposed to symbols: together with the words “create” and “copy,” it forms a consistent metaphor. In this metaphor:

Values are physical objects
Variables are fixed locations in space
Variables are containers which can “hold” values ~~(although we don’t say a value is “in” a variable, so there’s a limit to how far the metaphor works)~~ Edit: OK, we don’t say “5 is in y,” but we do say “5 is stored in y,” which is pretty darned close.

I always like when terms of art keep metaphorical consistency: it aids learning, and forms a handhold of casual reasoning for those who don’t want to be neck-deep in implementation details.

(Note that in this metaphorical schema, “assignment” is the odd one out. If we could get a do-over on PL history, perhaps using the term “copy” instead of “assign” would have been better.)

Avi · July 28, 2022, 6:01pm

I disagree. I've heard and read variations of "the value stored in x" or "the number 5 is stored in y" for almost 30 years.

Joe_Groff · July 28, 2022, 6:16pm

I suspect there's a cultural break here between functional and imperative traditions, where a variable is its value in the mathematical tradition, but more like a storage location you put things into in the imperative programming tradition.

Michael_Gottesman · July 28, 2022, 6:29pm

Just to provide an FYI, we are looking into posting a new version of this where we do the contextual keyword with move. @Joe_Groff is doing some editing/etc of the proposal with this in mind. With that in mind, let's focus the review on the semantics and less on the move function syntax for now. Joe will post here when the update is up.

tera · July 28, 2022, 9:52pm

For me it feels a smaller and simpler change (both to understand and to implement) compared to the pitch proposal.

Besides, there are a couple of options available today that allow ending a variable's lifetime early.

// 1. not so nice, but available now:
let other: T
do {
	let x = ...
	useX(x)
	other = x
} // `x` lifetime ended
// `x`  is moved to `other`

// 2. quite ugly, but available now:
let x = ...
useX(x)
let other = x  // `x`  is moved to `other`
guard let x = () as Void? else { fatalError() }  // previous `x` lifetime ended

// 3. available today, not so bad:
var other = { () -> T in
    let x = ...
    useX(x)
    return x
}()   // `x` lifetime ended
// `x`  is moved to `other`

// 4. might be available in the future (e.g. https://github.com/apple/swift/blob/main/userdocs/diagnostics/complex-closure-inference.md):

var other = {
    let x = ...
    useX(x)
    return x
}()  // `x` lifetime ended
// `x`  is moved to `other`

// 5. future ideal, listing for completeness:
var other = do {
	let x = ...
	useX(x)
	return x
} // `x` lifetime ended
// `x`  is moved to `other`

By way of analogy, it feels like having a programming language that only has a higher level "forEach" statement, and as we sometimes need a lower level alternative we are now having a discussion on introducing a "goto" statement without considering "while" and "switch" statements first, which might be just enough for a typical task at hand. (And remember, even if goto can do much more compared to what "while"/"switch" can - goto is still considered evil and we don't have it in modern programming languages.)

I'd be very happy to be proven wrong and see some killer use cases that show superiority of "move" approach compared to nesting, just what I've seen so far (e.g. in the pitch description) doesn't look like a killer use case example.
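The scope-based variants can be checked with a small runnable sketch in the style of option 3. `Log`, `Tracked`, and `demo` are hypothetical helpers, and unlike the snippets above the closure returns a derived value rather than `x` itself, so that the object's destruction is observable.

```swift
final class Log {
    var entries: [String] = []
}

final class Tracked {
    let name: String
    let log: Log
    init(name: String, log: Log) {
        self.name = name
        self.log = log
    }
    deinit { log.entries.append("deinit \(name)") }
}

func demo() -> [String] {
    let log = Log()
    // `x` exists only inside the immediately-called closure.
    let count = { () -> Int in
        let x = Tracked(name: "x", log: log)
        return x.name.count
    }()
    log.entries.append("after closure, count = \(count)")
    return log.entries
}

let entries = demo()
```

Here the object is destroyed before the line after the closure runs, because nothing outside the closure can refer to `x`. What nesting alone cannot express is ending a lifetime in the middle of a scope while later statements in that same scope keep running.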

Joe_Groff · July 29, 2022, 3:53am

I'd like to offer the following revisions in response to the discussion so far:

github.com/apple/swift-evolution

Revisions in response to review feedback

apple:main ← jckarter:move-review-revisions

opened 03:52AM - 29 Jul 22 UTC

jckarter

+142 -97

move x is now proposed as a contextual keyword, instead of a magic function move(x).
The proposal no longer mentions __owned or __shared parameters, which are currently an experimental language feature, and leaves discussion of them as a future direction. move x is allowed to be used on all function parameters.
move x is allowed as a statement on its own, ignoring the return value, to release the current value of x without forwarding ownership, and without explicitly assigning _ = move x.

ksluder · July 29, 2022, 5:00am

Does this result in a parse ambiguity? What’s the type of { x in move x }?

If the only reason for this concession is to avoid _ = move, I don’t think that’s a strong motivation. _ = «expr» is idiomatic Swift. If there’s a more fundamental reason a non-expression version of move is needed, then I suggest the spelling drop x.

Nikolozi · July 29, 2022, 5:27am

I'm happy to see this change. I think, a contextual keyword makes more sense. In the roadmap post, yield is written as yield _x and not as yield (_x). So, to me, move x looks consistent with it.

Also, in the roadmap, there was this bit of code:

I'm wondering about the copy() function. Is this going to be a feature, complementary to move, or is this just for the sample code, to indicate that x will be explicitly copied before passing it into the function? If it's going to be a feature, I guess it would make sense for it to be also spelled as copy x?

xwu · July 29, 2022, 12:32pm

We already encounter this scenario with @discardableResult functions used with implicit return.

I'd expect { x in move x } to have the same type as { x in foo(x) } where the function is declared @discardableResult func foo<T>(_: T) -> T.
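The existing behavior being referred to can be checked directly. In this sketch `foo` is declared as in the post, while the body of `foo` and the names `closure`, `typed`, and `result` are assumptions added for illustration.

```swift
@discardableResult
func foo<T>(_ value: T) -> T {
    return value
}

// Single-expression closure: the call's value becomes the implicit return value.
let closure = { (x: Int) in foo(x) }

// This assignment compiles only if the inferred type is (Int) -> Int.
let typed: (Int) -> Int = closure
let result = typed(5)
```

By the same reasoning, `{ x in move x }` would be inferred to return the moved value rather than `Void` when no other type context is given.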

xwu · July 29, 2022, 12:45pm

The revision states:

move x + y // Parses as (move x) + y

I wonder if there could be some more justification of this choice, as it behaves differently from try, etc.

Might it be preferred to have move x + y parse as move (x + y), particularly since there would be issues using explicit parens due to ambiguity with hypothetical functions named move? Users could specify (move x) + y explicitly if that's what they want.

Alternatively, is there room to make move have undefined precedence with standard operators and therefore always require parens?
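For reference, this is how `try` behaves today: a single `try` covers everything to its right in the expression. The functions `one`, `two`, and `sum` and the `Failure` type are hypothetical.

```swift
struct Failure: Error {}

func one() throws -> Int { return 1 }

func two(fail: Bool) throws -> Int {
    if fail { throw Failure() }
    return 2
}

func sum(fail: Bool) throws -> Int {
    // One `try` covers both throwing calls: it applies to `one() + two(fail: fail)` as a whole.
    return try one() + two(fail: fail)
}

let ok = try? sum(fail: false)
let failed = try? sum(fail: true)
```

If `try` bound only to `one()`, the call to `two(fail:)` would be rejected as an unmarked throwing call, which is the contrast with the proposed `(move x) + y` parse.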