Design concerns for borrowing and inout pattern matching

Joe_Groff · November 10, 2023, 11:44pm

Hi everyone. SE-0390 left pattern matching as a consuming operation. For noncopyable enums, that makes it impossible to access their payloads without also destroying the value, which is obviously insufficient. Even for copyable types, we’ve also long lacked the ability to modify an enum’s payload in place. It’s time to start talking about how to generalize pattern matching to support borrowing and inout pattern matches, allowing for values to be matched without copying or consuming them, and also allowing for in-place mutation of enums during pattern matching. Before getting too deep into the design and implementation, I wanted to get some feedback about a few design points:

the introduction of new binding patterns, borrowing x and inout x, for borrowing and mutating the matched part of the value respectively
establishing the ownership behavior of various patterns
how to syntactically distinguish borrowing, consuming, and mutating pattern matches (if at all)
how to extend related conditionals, such as if let and if case

Thank you for reading, and for offering your feedback!

`borrowing` and `inout` pattern bindings

We want to introduce borrowing and inout bindings as a general language feature, and they are also an essential feature for noncopyable switch patterns. We want a binding syntax for patterns that can be consistently applied to freestanding local binding declarations. As a starting point, I’ll use borrowing x and inout x as the syntax for borrowing and inout pattern bindings respectively, since those keywords are consistent with what we currently use for parameter ownership modifiers.

Ownership behavior of patterns

Let’s look over all the different kinds of pattern Swift currently supports and work through the ownership behavior they require:

Binding patterns in their current forms, let a or var a, take part of the matched value and bind a new variable to it. The new variable has independent ownership, so in the general case, it has to consume the matched value. The new binding forms, borrowing a and inout a, would be able to access the part of the value they match by borrowing or mutating it, respectively.
Wildcard patterns _ discard the matched part of the value. This is ownership agnostic and can be considered to borrow, “mutate”, or consume the matched value if necessary.
Tuple patterns (_, _) break a tuple down into its elements, and then match each element against the corresponding subpattern. The tuple destructuring itself is ownership agnostic, since we can consume a tuple to allow the elements to be consumed, borrow a tuple to provide borrows of the elements, or exclusively access the tuple to allow exclusive access to each of the elements. As such, the ownership behavior of the tuple pattern itself can come from the needed behavior of its subpatterns.
Enum patterns .case(_, _) match when an enum contains a value of the specified case, and then match the element(s) of the associated value, if any, to the corresponding subpattern(s). This is also ownership agnostic, and the ownership behavior can arise from that needed by the subpatterns.
Optional unwrapping patterns _? are essentially sugar for the enum pattern Optional.some(_), and so are also ownership agnostic.
Boolean patterns true and false can test the boolean value while borrowing it.
Dynamic cast patterns is T or _ as T dynamically cast the matched value to T and, if the cast succeeds, test the cast result against the subpattern (or succeed immediately in the case of is T). Dynamic casting isn’t currently supported for noncopyable types, but if it were, many forms of cast would need to transfer ownership from the cast operand to the result (for instance, to wrap it in an existential in the case of an as P cast), so in the general case a dynamic cast would have to consume the value being matched. We may be able to relax this in the future for certain kinds of cast where the result can always be borrowed out of part of the original.
Expression patterns take the value of an arbitrary expression and match it against the value being matched using the ~= operator. The ownership behavior of an expression pattern has to depend on the ownership of the parameter to the ~= overload chosen for the match. Most of the standard library’s ~= implementations only need borrowing access in practice, and this is likely to be the common case.
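
To make the expression-pattern case concrete, here is a minimal sketch in today's Swift of a custom ~= overload. The Multiple type and the classify function are hypothetical names invented for this illustration. The overload only reads its operands, which is the "borrowing is enough" situation described above.

```swift
// Hypothetical pattern type, invented for this sketch.
struct Multiple {
    let divisor: Int
}

// Expression-pattern operator: the pattern is the left operand and the
// matched value is the right operand. Both are only read, never stored.
func ~= (pattern: Multiple, value: Int) -> Bool {
    value % pattern.divisor == 0
}

func classify(_ n: Int) -> String {
    switch n {
    case Multiple(divisor: 15): return "fizzbuzz"
    case Multiple(divisor: 3): return "fizz"
    case Multiple(divisor: 5): return "buzz"
    default: return String(n)
    }
}
```

Each case here is an expression pattern, so the switch calls the ~= overload with the case expression and the subject until one returns true.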

The aggregate patterns (tuple and enum) are ownership agnostic themselves, but can contain zero, one, or many subpatterns, so we also have to consider the composed ownership behavior of compound patterns. Luckily, there is a strict ordering of capabilities among the three ownership behaviors: any valid value the code has access to can be borrowed (assuming there are no exclusive accesses in action for the duration of the borrow). On the other hand, a value can only be exclusively accessed if the value’s exclusivity can be proven, but code that does have exclusive access can provide shared borrows to the value too, temporarily giving up exclusivity. And finally, a value can only be consumed from a context with full ownership of the value, though with full ownership, code can give out either exclusive or shared borrows. Therefore, we can say that the ownership behavior of an aggregate pattern is the strictest ownership behavior of its components: if all of the component patterns can borrow, then the pattern as a whole borrows. If any component pattern requires exclusive access, but no components need to consume, then the aggregate pattern is mutating. And finally, if any component pattern consumes, then the aggregate pattern consumes.

Some examples:

case _: // borrowing
case let a: // consuming
case borrowing a: // borrowing
case inout a: // mutating
case (borrowing a, borrowing b): // borrowing
case (inout a, borrowing b): // mutating
case (borrowing a, let b): // consuming
case (inout a, let b): // consuming
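
The "strictest component wins" rule can be modeled in a few lines of ordinary Swift. This is only an illustrative sketch, not how the compiler represents patterns: the Ownership and Pattern types are hypothetical, and only wildcard, binding, and tuple patterns are modeled.

```swift
// Hypothetical model of the three ownership behaviors, declared from least
// to most demanding so the synthesized Comparable follows that order.
enum Ownership: Comparable {
    case borrowing, mutating, consuming
}

// Hypothetical, simplified pattern tree.
enum Pattern {
    case wildcard
    case binding(Ownership)   // let/var: .consuming, borrowing: .borrowing, inout: .mutating
    case tuple([Pattern])

    // An aggregate takes the strictest ownership of its components.
    var ownership: Ownership {
        switch self {
        case .wildcard:
            return .borrowing
        case .binding(let required):
            return required
        case .tuple(let elements):
            return elements.map(\.ownership).max() ?? .borrowing
        }
    }
}
```

With this model, Pattern.tuple([.binding(.mutating), .binding(.borrowing)]).ownership evaluates to .mutating, and adding any .binding(.consuming) component makes the whole tuple .consuming, matching the examples above.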

Determining the ownership behavior of a pattern match

To determine the overall effect of a switch on its subject, we can choose to:

require a syntactic marker on the switch subject itself, or
infer the necessary ownership from the patterns applied

or some combination of the two. From surveying the pattern forms above, it seems to me that the ownership requirements for a switch should be determinable from the patterns in the switch during type checking, so a syntactic signifier isn’t strictly necessary. Nonetheless, for mutating pattern matches, we may at least want to require the & marker like we do for inout arguments in function calls:

switch &x {
case .foo(inout foo):
  modify(&foo)
}
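
For comparison, here is a minimal sketch of what modifying a payload looks like in today's Swift, without inout bindings: the payload is copied into a new variable, the copy is changed, and the enum is rebuilt from it. The Shape type and doubleRadius function are hypothetical names used only for this illustration.

```swift
// Hypothetical copyable enum, invented for this sketch.
enum Shape {
    case circle(Double)
    case square(Double)
}

func doubleRadius(_ shape: inout Shape) {
    // `var radius` binds an independent copy of the payload...
    if case .circle(var radius) = shape {
        radius *= 2
        // ...so the change only takes effect once the case is reassigned.
        shape = .circle(radius)
    }
}
```

Under the pitched syntax, the binding would instead refer to the payload in place and the reassignment would not be needed.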

SE-0390 imposed the requirement that switching over a noncopyable local variable be written with the consume operator, switch consume x { … }, as a way of future-proofing in case we did need to drive a syntactic wedge between borrowing and consuming switches, but we could choose to relax this requirement. We currently don’t require any syntactic distinction between borrowing and consuming parameters in function calls, so it would be consistent to say that there is no syntactic distinction necessary between borrowing and consuming pattern matches.
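
As a reminder of what the SE-0390 spelling looks like, here is a small sketch assuming a toolchain with noncopyable type support (Swift 5.9 or later). The Token type and describe function are hypothetical names made up for this example.

```swift
// Hypothetical noncopyable enum, invented for this sketch.
enum Token: ~Copyable {
    case open(Int)
    case closed
}

func describe(_ token: consuming Token) -> String {
    // The switch takes ownership of `token`; it cannot be used afterwards.
    switch consume token {
    case .open(let descriptor):
        return "open(\(descriptor))"
    case .closed:
        return "closed"
    }
}
```

The let binding here takes the payload out of the enum, which is why the whole value has to be given up to the switch.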

Ownership control in `if let` and `if case`

We also allow forms of pattern matching in if, while, and guard conditionals, using the let/var and case pattern forms (often called if let and if case colloquially, even though they can also be used with while and guard). if let and if var can be looked at as a shorthand for pattern-matching an Optional, as if by if case .some([let|var] x), so if we introduce borrowing and inout pattern bindings, then it’s reasonable to expect these new binding forms to be usable for Optional unwrapping, as if borrowing x = optional and if inout x = &optional.
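
The equivalence between if let and an Optional pattern match can be checked in today's Swift. In this sketch the function name unwrapAllWays is hypothetical; the three conditions are different spellings of the same unwrap.

```swift
func unwrapAllWays(_ optional: Int?) -> [Int] {
    var results: [Int] = []
    // Shorthand form.
    if let x = optional { results.append(x) }
    // Explicit enum pattern.
    if case .some(let x) = optional { results.append(x) }
    // Optional-unwrapping pattern, sugar for the enum pattern.
    if case let x? = optional { results.append(x) }
    return results
}
```

Every condition succeeds for a non-nil input and fails for nil, so the result holds either three copies of the wrapped value or nothing.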

Meanwhile, if case is like a simplified switch against a single pattern, so the ownership behavior of an if case can be determined from that one pattern’s ownership requirements. The right-hand side of the = would behave like a switch subject, needing a & when the pattern is mutating but otherwise accepting a bare value:

if case .foo(let x) = value { ... } // consuming match

if case .foo(borrowing x) = value { ... } // borrowing match

if case .foo(inout x) = &value { ... } // mutating match

Slava_Pestov · November 11, 2023, 12:11am

Does this mean indirect enum payloads now become copy-on-write?

Jumhyn · November 11, 2023, 12:13am

Are there still benefits to be gained for the user in a situation like this, or would it end up virtually identical to a situation where all component patterns were consuming? I.e. should we potentially error/warn if one component pattern consumes but not all do?

bbrk24 · November 11, 2023, 1:05am

Similarly, I think there should be a warning for switch &foo if all branches borrow.

ellie20 · November 13, 2023, 7:14am

I think it might be beneficial to go further and not require syntactic distinction between borrowing and consuming variable bindings. A while ago I made a post suggesting an ownership inference system instead, where variable bindings are ownership-agnostic. And since borrowing is strictly less capable than consuming, as far as I know, this would be completely backwards compatible.

I'm concerned about an unnecessary cognitive burden, since when a user binds a value to a variable, they just intend to use the value later. They may not have completely thought through whether or not that later usage will require consuming. I don't think binding a value to a variable is a particularly important place to explicitly spell out ownership, compared to the point where the value is actually consumed, which for function call arguments is already elided.

Dmitriy_Ignatyev · November 18, 2023, 4:08pm

I suppose it is better to treat both of these cases as borrowing.
if case .foo(let x) = value { ... } is generally used for read-only, non-mutating access, where no copies of the associated value are needed.
Someone can write if case .foo(consuming x) = value { ... } if needed.

technogen · November 19, 2023, 1:36pm

I was just going to ask the same question!

This feels like the same issue as private versus fileprivate at global scope: they both do the same thing, but one of those is misleading (which is why I always advocate for never using private at global scope).

Except in this case, it's arguably also dangerous: the meaning of an inout pattern binding can change depending on existence of a consuming binding. Granted, the size of the code block that has to be inspected to determine the actual behavior is rather small, but the principle still stands: not only does inout not always mean inout, but it can switch between the two behaviors for reasons outside of the inout declaration itself.

In light of this, I strongly vote for making this an error (not even a warning).

xwu · November 19, 2023, 4:18pm

In what way is private at global scope misleading?

In what way does the inout binding change meaning?

bbrk24 · November 19, 2023, 4:37pm

The more I think about it, the more I think that let should be borrowing and var should be consuming.

technogen · November 19, 2023, 4:52pm

It behaves exactly like fileprivate, yet it looks like it should be more restrictive than fileprivate (which it is in every other case).

If I understood the premise right (please correct me if I'm wrong), the inout binding can shift between "mutate this part of the storage in place" and "move the entire value, leaving the original storage uninitialized, mutate this part of the copy, then initialize the storage with the copy".

And I don't even know what will happen to the other part (the one that is consumed). We can't have a half-initialized enum case, can we? If we can't, then it seems like inout would have to behave like var, given that the other part is consumed and we can no longer re-initialize the storage.

Aside from vastly different performance implications and the potential for confusing behavior (inout behaving like var), it can have more tangible behavioral differences (e.g. in light of C++ interop, where initialization and assignment can do different things).

Regardless, I'm very excited for this feature and I'm very thankful for @Joe_Groff for putting in the effort to make this happen!

wadetregaskis · November 20, 2023, 6:43pm

It'd be interesting to see a pitch that removes private at global scope, with a simple FixIt to replace such use with fileprivate. As a source-breaking change it'd have to be Swift 6, but that's not too far away.

While I don't think the current situation is a big deal by any means, I agree it'd be cleaner if private were more consistent; if it could only be used in localised contexts to mean "private to this context"… where 'this context' is in a nutshell delineated by curly braces. Having private sometimes mean the whole file, when every other time it's used specifically to mean not the whole file, is weird. It's especially weird given a much less ambiguous keyword (fileprivate) already exists specifically for that purpose.

xwu · November 20, 2023, 6:56pm

The core team's intention with private was that fileprivate would be used rarely, if ever; the recommended spelling at the file scope is private. It is essential to the nature of private that it designates a different effective visibility when it is written in a different scope.

jeremyabannister · November 20, 2023, 7:03pm

I’m not at my computer now to verify but I’m pretty sure the following compiles:

func checkIsEven (_ int: Int) -> Bool {
    int.isEven
}

private extension Int {
    var isEven: Bool {
        self % 2 == 0
    }
}

To me this seems unclear because you might think isEven would only be usable from within extensions on Int. I use fileprivate all the time.

xwu · November 20, 2023, 7:13pm

That would be a misunderstanding of private: it refers to the lexical scope in which it's written, which in this case is the file scope. In fact this use of private extension, with the addition of support for stored properties in same-file extensions, is the only way in which the envisioned removal of fileprivate would be possible.

jeremyabannister · November 20, 2023, 7:19pm

I can’t swear I’m 100% on the precise definition of “lexical scope”, but if I’m understanding it correctly then wouldn’t that imply that putting private on the computed property itself instead of on the extension would cause the declaration to be private to that particular extension? (Which of course is not currently how it works)

jrose · November 20, 2023, 8:15pm

Y’all, please stop relitigating fileprivate in the thread for pattern matching. It was an analogy, it didn’t land for everyone, oh well.

xwu · November 20, 2023, 8:33pm

Ah, but I wouldn't characterize it so much as litigating, but clarifying the analogy.
Best continued in its own separate thread, certainly.

fclout · November 27, 2023, 2:29am

Pattern-matching with borrowing and inout has consequences with the law of exclusivity (you shouldn't be able to touch x in the body of switch &x). I like that & makes it clear you're giving up the x binding until the end of the pattern-matching statement and I wish that we had a marker for borrowing too for that reason.
