Allow let-to-pointer conversions

(Thomas Roughton) #1

The rules for conversion from arguments to pointers are somewhat inconsistent today. A var may be passed directly to a function taking an UnsafePointer<T> or UnsafeMutablePointer<T> argument by passing the var inout with &. However, no such capability is provided for lets unless that let is an array, in which case an array-to-pointer conversion takes place provided the argument is an UnsafePointer.
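To make the inconsistency concrete, here is a minimal sketch of the conversions as they behave today. The functions readValue, increment and sum are hypothetical stand-ins for imported C functions and are not part of any real API.

```swift
// Hypothetical callees standing in for C functions that take pointers.
func readValue(_ p: UnsafePointer<UInt32>) -> UInt32 { return p.pointee }
func increment(_ p: UnsafeMutablePointer<UInt32>) { p.pointee += 1 }
func sum(_ p: UnsafePointer<UInt32>, count: Int) -> UInt32 {
    var total: UInt32 = 0
    for i in 0..<count { total += p[i] }
    return total
}

var counter: UInt32 = 41
increment(&counter)              // inout-to-pointer: var to UnsafeMutablePointer
let seen = readValue(&counter)   // inout-to-pointer also accepted for UnsafePointer

let values: [UInt32] = [1, 2, 3]
let total = sum(values, count: values.count)  // array-to-pointer works for a let array

let fixed: UInt32 = 7
// Both of the following are rejected by the compiler today:
// readValue(&fixed)   // a let cannot be passed inout
// readValue(fixed)    // there is no let-to-pointer conversion
```

The last two commented-out lines are the gap this pitch is about: a non-array let has no direct route to an UnsafePointer argument.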

The current workaround is to make an intermediate var copy of any let variable:

func someFunction(arg: UInt32) {
    var arg = arg
    someOtherFunction(argByReference: &arg)
}

func someOtherFunction(argByReference: UnsafePointer<UInt32>) {
}

This can be fairly clunky, particularly when dealing with C APIs that take arguments as const pointers or that use pass-by-reference everywhere.

With the introduction of withUnsafePointer<T>(to: T), let-bound variables can be converted to UnsafePointers without creation of temporaries. To avoid a copy (for larger values), the above could be written as:

func someFunction(arg: UInt32) {
    withUnsafePointer(to: arg) { arg in
        someOtherFunction(argByReference: arg)
    }
}

I suggest extending the language to implicitly convert from let variables to UnsafePointer arguments, with behaviour equivalent to withUnsafePointer<T>(to: T) in the same way that inout-to-pointer maps to withUnsafeMutablePointer<T>(to: inout T). With this conversion in place, the above code could be written simply as:

func someFunction(arg: UInt32) {
    someOtherFunction(argByReference: arg)
}

One alternative would be to still require the call to pass the argument with &. However, I think that would potentially be more confusing, since in Swift & means "take this variable inout".

Reasons not to do this might be if it impacts type-checker performance in a meaningful way or if it makes code using it significantly more difficult to reason about. In general, though, I don't think it should matter to the caller whether the callee takes its arguments by value or by reference; what the caller cares about is only whether the argument may be modified at the end of the call.

(Slava Pestov) #2

I think your request is reasonable, but I would suggest looking into whether it's possible to make the conversion explicit (and the existing conversion explicit, too). The implicit behavior here actually makes the type system hard to reason about in the type checker, because you can have inout expressions in positions where the argument itself is not inout. If this were not the case, the type checker could filter out certain overloads sooner than it can now.

(Thomas Roughton) #3

In terms of making it explicit, the first thing that comes to mind is something like:

someOtherFunction(argByReference: ^arg)

where ^ converts its argument to an UnsafePointer (maybe with ^& for UnsafeMutablePointer), or

someOtherFunction(argByReference: borrowPointer(arg))

Both the operator and free-function solution would need special semantics to ensure the lifetime was for the entirety of the function call.

I certainly wouldn’t be opposed to either of these solutions, including as a successor to inout-to-pointer conversions. The key from a usability perspective is avoiding the nesting from repeated withUnsafePointer calls - although it may be that a separate language construct could better address that need.
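As a rough illustration of the nesting concern, here is a sketch of what passing three let values by const pointer looks like with withUnsafePointer today. The function combine is hypothetical and stands in for a C API taking several const pointers.

```swift
// Hypothetical C-style function taking three const pointers.
func combine(_ a: UnsafePointer<Int32>,
             _ b: UnsafePointer<Int32>,
             _ c: UnsafePointer<Int32>) -> Int32 {
    return a.pointee + b.pointee + c.pointee
}

func caller(x: Int32, y: Int32, z: Int32) -> Int32 {
    // One level of nesting per let-bound argument.
    return withUnsafePointer(to: x) { px in
        withUnsafePointer(to: y) { py in
            withUnsafePointer(to: z) { pz in
                combine(px, py, pz)
            }
        }
    }
}

let nested = caller(x: 1, y: 2, z: 3)
```

Each additional pointer argument adds another closure level, which is the overhead an operator or conversion would remove.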


FWIW, you can also use Array in place of UnsafePointer, like so:

someOtherFunction(argByReference: [arg])

Also, since one normally shouldn’t need to interact with UnsafePointer (except when interop-ing), I don’t think we need another syntax.
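A minimal sketch of the array-literal workaround, assuming for illustration that someOtherFunction returns the pointee so the result can be checked (the version earlier in the thread returns nothing):

```swift
// Assumed variant of someOtherFunction that returns what it reads.
func someOtherFunction(argByReference: UnsafePointer<UInt32>) -> UInt32 {
    return argByReference.pointee
}

func someFunction(arg: UInt32) -> UInt32 {
    // The single-element array literal is converted to a pointer to its
    // first element; the pointer is only valid for the duration of the call.
    return someOtherFunction(argByReference: [arg])
}

let result = someFunction(arg: 5)
```

Note that this builds a temporary array around a copy of arg, so the callee sees the copy rather than the original storage; whether the temporary's allocation is optimised away is up to the compiler.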

(John McCall) #5

There’s already precedent for allowing a non-mutable pointer to storage to be passed with &; we just require the storage to be mutable despite not actually mutating it. (This is actually one of the only ways to make a non-instantaneous read access in Swift today.)

I think we should just generalize that to allow non-mutable storage rather than inventing a new operator. If we do invent a new operator, we should merge the old behavior into it. An implicit conversion does not seem like a good idea.

(Thomas Roughton) #6

In this situation, isn’t the fact that the storage is not mutated more an optimisation than the actual semantics of the language? My understanding is that &, today in Swift (or at least released versions), semantically means “pass inout” rather than “pass by reference”. withUnsafePointer<T>(to: T), for example, forms a reference without requiring &.

If accessors change that meaning so that & can also mean “pass/return as a readonly reference”, then I agree having & support let declarations is a natural extension of that. My concern is that if I saw that in Swift code today I’d be confused by the semantics.

(John McCall) #7

I wouldn’t call it an optimization, no. It’s part of the semantics.

Like I said, it’s not abstractly unreasonable to re-syntax that. I just think it ought to be treated the same way as what you’re proposing, as it’s basically the same thing.

(Thomas Roughton) #8

It seems like there are contrary aims for the operator that would enable this:

  • Syntactically, we want the type checker to be able to disambiguate between inout arguments and to-pointer conversions, meaning passing something as an inout parameter would use different syntax than passing a pointer.
  • Semantically, the useful distinction is between read-only reference and write-back reference. In this case, it might make sense for inout and UnsafeMutablePointer arguments to use the same symbol (&, since I don't think changing the inout operator is under consideration) and for read-only references (e.g. yields in _read accessors or UnsafePointer arguments) to use a different symbol.
  • Alternatively, we use & for both meanings, in which case the type checker situation remains as it currently is.

(John McCall) #9

Well-summarized, except that I don’t think there’s a significant typechecker performance issue with this as long as it’s syntactically explicit in some way. Being syntactically ambiguous with inout is not a problem implementation-wise.

(Thomas Roughton) #10

If syntactically disambiguating between inout and UnsafePointer arguments isn't a concern, I don't really personally see a reason to introduce a separate symbol, since & meaning "form a reference" is consistent with C/C++, even if it isn't completely consistent with the sole use of & for inout/write-back references in Swift so far.

Edit: Thinking about it more, one caveat of this approach is it may imply that no & means the argument is not passed by reference, which is not the case when the variable itself is a reference (e.g. for class instances).

In terms of implementation for a formal evolution proposal, would you have any indication for how much work extending & to allow references to lets would be, and what parts of the compiler would need adjustment?

(John McCall) #11

The main thing, I think, is that the type-checker needs to be taught to treat &-arguments specially instead of trying to work them into the normal conversion path — which is to say, we need to get away from InOutType. We can then say that X is &-passable as an UnsafePointer<T> if X is convertible to T, and X is &-passable as an UnsafeMutablePointer<T> if X is an l-value of the exact type T. This is work we want to do already, I think.

(Joe Groff) #12

One argument for a different sigil would be that currently in Swift, &x in an argument always means "inout": even if you're passing the argument by Unsafe*Pointer, the callee is still constrained in what it can do to the pointee by the exclusivity and scoping rules of inout arguments.

Part of the motivation for the original inout-to-pointer conversion was that, although it's effectively sugar for:

withUnsafeMutablePointer(to: &variable) { p in /* ... */ }
array.withUnsafeMutableBufferPointer { bp in /* ... */ }

we judged that the withUnsafeMutablePointer block was an unreasonable amount of syntactic overhead when dealing with pointer-heavy C and ObjC APIs. If we had a way to return a pointer from an operation with the correct lifetime and writeback semantics but without the nesting overhead, that might be a better, more general alternative to adding more implicit conversion rules.
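To show the sugar side by side with its explicit form, here is a small sketch. The function bump is hypothetical and stands in for a C or ObjC API taking a mutable pointer.

```swift
// Hypothetical pointer-taking API.
func bump(_ p: UnsafeMutablePointer<Int>) { p.pointee += 1 }

var variable = 0
bump(&variable)                                           // implicit inout-to-pointer
withUnsafeMutablePointer(to: &variable) { p in bump(p) }  // explicit equivalent

var array = [1, 2, 3]
bump(&array)                                              // pointer to the first element
array.withUnsafeMutableBufferPointer { bp in
    bump(bp.baseAddress!)                                 // explicit equivalent
}
```

In both pairs the pointer is only valid inside the call or closure, and any mutation is written back to the variable when it ends.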