Declaration-Like Argument Blocks

pthariensflame · April 13, 2020, 12:34pm

There have been a fair number of proposals for making multiple closure arguments to a function look nicer and be more readable, most recently SE-0279. These have all come up against ergonomic issues and lack of community consensus because they assume something close to the trailing closure syntax is what’s needed. I’d like to propose a very different approach, inspired directly by a hitherto unique feature of the language Ceylon, which counterintuitively makes a complicated call site easier to read by increasing its length and verbosity (but also its expressiveness and formattability). I call these Declaration-Like Argument Blocks (or DLABs for short), and the intuitive idea is to make a complicated call site look a lot like a class or struct declaration.

For example, let’s say we have a function of this signature:

// Does some complicated iterative calculation,
// using initial bounds and updating them along the way
func messy(count: Int,
           bounds: inout (Double, Double),
           onStart: (Int, Double) -> Double?,
           onCrossBound: (Int, Double, Double) -> (Double?, Double),
           onEnd: (Int, Double) -> Bool,
           initial: Double,
           morphIntermediates: (Double) -> Double?)
    -> (Double?, [Double])

Traditionally, we’d need to call it like this, and even the previous multiple trailing closure proposals couldn’t save us because initial is in the way:

var bounds: (Double, Double) = (0.0, 1.0)
messy(count: 200,
      bounds: &bounds,
      onStart: { numDiscs, guess in
          // give a better initial estimate based on the guess
      },
      onCrossBound: { numDiscs, estimate, boundCrossed in
          // give a better estimate and a new bound
      },
      onEnd: { numDiscs, estimate in
          // decide whether the final estimate is good enough
      },
      initial: 0.5,
      morphIntermediates: { estimate in
          // fix up the estimate between each iteration
      })

With DLABs, it instead looks like this:

messy() where {
    let count: Int = 200

    public var bounds: (Double, Double) = (100, 200)

    func onStart(numDiscs: Int, guess: Double) -> Double? {
        // give a better initial estimate based on the guess
    }

    func onCrossBound(numDiscs: Int,
                      estimate: Double,
                      boundCrossed: Double)
        -> (Double?, Double)
    {
        // give a better estimate and a new bound
    }

    func onEnd(numDiscs: Int, estimate: Double) -> Bool {
        // decide whether the final estimate is good enough
    }

    let initial: Double = 0.5

    func morphIntermediates(estimate: Double) -> Double? {
        // fix up the estimate between each iteration
    }
}

This syntax not only allows for multiple “trailing closures”, but allows all arguments to be passed this way. You could also opt to pass some of the arguments the “normal” way too, like so:

var bounds: (Double, Double) = (0.0,1.0)
messy(count: 200, bounds: &bounds) where {
    func onStart(numDiscs: Int, guess: Double) -> Double? {
        // give a better initial estimate based on the guess
    }

    func onCrossBound(numDiscs: Int,
                      estimate: Double,
                      boundCrossed: Double)
        -> (Double?, Double)
    {
        // give a better estimate and a new bound
    }

    func onEnd(numDiscs: Int, estimate: Double) -> Bool {
        // decide whether the final estimate is good enough
    }

    let initial: Double = 0.5

    func morphIntermediates(estimate: Double) -> Double? {
        // fix up the estimate between each iteration
    }
}

You could even have both a (single) trailing closure and a DLAB:

var bounds: (Double, Double) = (0.0,1.0)
messy(count: 200, bounds: &bounds) { numDiscs, guess in
        // give a better initial estimate based on the guess
    } where {
    func onCrossBound(numDiscs: Int,
                      estimate: Double,
                      boundCrossed: Double)
        -> (Double?, Double)
    {
        // give a better estimate and a new bound
    }

    func onEnd(numDiscs: Int, estimate: Double) -> Bool {
        // decide whether the final estimate is good enough
    }

    let initial: Double = 0.5

    func morphIntermediates(estimate: Double) -> Double? {
        // fix up the estimate between each iteration
    }
}

This gives a lot of options for expressive call sites, and consequently for DSLs. The use of the keyword where is pretty bikesheddable, but something has to be used to distinguish a DLAB from a standard single trailing closure and that choice of keyword seems the most readable to me in context.

Note that we can support pretty much all of the basic struct/class member definition syntax other than init, deinit, and subscript. Properties that we care about using afterwards we can use public for, properties that we don’t care about using afterwards we can use no visibility for, and we can define computed properties of both read-only and read-write forms and have them work just fine, even making use of observers if we want. Function-typed parameters can be satisfied using func declaration syntax or using a property of the appropriate type. Type parameters, when we aren’t letting them be inferred, can be specified with typealias declarations in the DLAB just as easily as in the traditional way before the parameter list. Autoclosure parameters fulfilled by a DLAB-supplied computed property have appropriately deferred evaluation. Function builders can be used by explicitly declaring them in exactly the same way you would in an actual struct, class, or extension declaration. Local function and property declarations that aren’t used as arguments to the call can be declared in a DLAB as private items.
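Both ways of satisfying a function-typed parameter already have counterparts at ordinary call sites today: a local func or a constant of function type can be passed by name. A minimal sketch in current Swift, where `applyTwice`, `addThree`, and `timesTwo` are hypothetical names made up for illustration and not part of the pitch:

```swift
// Hypothetical function with a function-typed parameter.
func applyTwice(_ value: Int, transform: (Int) -> Int) -> Int {
    return transform(transform(value))
}

func demoFunctionArguments() -> (Int, Int) {
    // Satisfied by a func declaration...
    func addThree(_ n: Int) -> Int { return n + 3 }
    // ...or by a property (here a local constant) of the appropriate type.
    let timesTwo: (Int) -> Int = { $0 * 2 }
    return (applyTwice(1, transform: addThree),
            applyTwice(1, transform: timesTwo))
}

let functionArgumentResults = demoFunctionArguments()
print(functionArgumentResults)
```

A DLAB would presumably just move those two declarations inside the block and match them to parameters by name.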

We have a whole world of existing functionality available for use, at no syntactic cost; our call sites can look the way large parts of Swift have always looked!

I’d love to hear what the community thinks of this, and if I should go ahead and make this a proper proposal.

karim · April 13, 2020, 2:04pm

Not sure what I think about this yet, but there are a couple of trailing commas in your proposed syntax just after the “closure declarations” that I think are unintentional? Just wanted to point those out so you could either remove them, or... further explain why they’re there?

pthariensflame · April 13, 2020, 2:07pm

They were indeed unintentional and have now been removed! Thank you!

technogen · April 13, 2020, 2:55pm

I can’t say I love it exactly the way it’s described, but I definitely love the general direction of it! It surely looks much more well-rounded than multiple trailing closures! I wonder if the solution to the general problem of “passing too many things” will end up manifested as an “anonymous struct” feature where you’ll be able to declare an anonymous struct and pass it to a function which will decompose that struct into its parameters...

David_Catmull · April 13, 2020, 3:09pm

My first thought as I look at the example is that you could look at the middle of the argument block and easily fail to recognize it as such because it looks the same as the middle of a class. It is appealing to not have the blocks awkwardly crammed into a comma-separated list, but I think it can be easy to lose track of the context of it being a list of arguments.

If this were adopted, some people might decide to always use these blocks, even when none of the arguments are closures. Would this be a good thing? I'm not sure either way, but it's something to consider.

Also, what does it mean for bounds to be declared public in the context of the argument block?

pthariensflame · April 13, 2020, 3:13pm

A public modifier on a declaration in a DLAB would mean that that declaration is preserved outside the scope of the DLAB—so, in the case of bounds in the example, it’s preserving the ability to refer to bounds after the call (and presumably make use of the information imparted by the updated bounds after the iterative process terminated). Anything could be declared public in a DLAB, with the general effect being that it isn’t “just” a parameter anymore, it’s also a property/function/whatever declaration that’s accessible after that point in the surrounding block of statements.

pthariensflame · April 13, 2020, 3:15pm

private is similar, with the effect in that case being that a declaration isn’t accessible outside the DLAB and also isn’t being passed as a parameter. Instead it’s presumably being used as an auxiliary helper (possibly indirectly) by the actual argument-corresponding declarations.

anandabits · April 13, 2020, 3:23pm

pthariensflame · April 13, 2020, 3:31pm

That seems like an almost orthogonal proposal, and one that would play nicely with this one: nameless structs a la struct { /* fulfill requirements here */ } could be used as declarations in a DLAB in the same way as object declaration parameters work in Ceylon.

pthariensflame · April 13, 2020, 3:33pm

It won’t solve things that this proposal does, anyway, because it only treats functions that already take a single bundled parameter, whereas this proposal treats functions with many separate parameters. Given that functions of both sorts are already in stable APIs and ABIs, both sorts of solution are needed.

wowbagger · April 13, 2020, 9:03pm

I like this idea, except for the use of public and private keywords. They are misleading, and will likely cause people to falsely assume an in-out parameter is accessible across the entire project on a less-than-thorough look. As far as I can see, only in-out parameters are what you classify as those "that we care about using afterwards", so why not just use inout instead of public?

Another question, how would this work with autoclosures?

pthariensflame · April 13, 2020, 9:46pm

We can use any keywords rather than private and public if that’s preferred; bikeshed welcome! We can’t combine them into one thing, though, because they don’t mean the same thing. Neither of them corresponds 1-to-1 with inout parameters; the only reason the sole public declaration in the example happens to be the sole inout parameter’s argument is that it’s preserving the existing visibilities from the original call site, where that variable was necessarily in scope afterwards. If we didn’t care about the value of bounds after the function returned, we could freely make it not public; likewise, we could freely make onStart(numDiscs:guess:) public if we wanted it to be in scope after the call. private declarations aren’t even arguments to the call at all; the whole point of their existence is to be able to provide internal helper definitions whose scope is limited to the call site, exactly the way you would in a normal class/struct/extension.

Interaction with autoclosures is straightforward; if they’re explicitly fulfilled by a (parameterless) function declaration or property of function type, as they can be today by an explicit closure, then that function will get passed in unchanged. If they’re fulfilled by a declaration whose type is the result of the autoclosure, then further evaluation of that declaration is deferred until, and duplicated whenever, the implied closure is called, just as normal—but any evaluation inherent in the form of the declaration would still be performed once on the spot; specifically, this means stored property declarations would be evaluated eagerly, computed property declarations would be deferred and duplicated, and lazy property declarations would be deferred and not duplicated (but reused instead). All of these could freely be public or private without issues.
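These three evaluation behaviours mirror what already happens in today's Swift when a stored, computed, or lazy property is passed to an @autoclosure parameter. Here is a rough sketch under that assumption; `Counter`, `callTwice`, `Args`, and `evaluationCounts` are hypothetical names invented for illustration, and a class stands in for the declarations a DLAB would contain:

```swift
// Hypothetical counter used to observe how often the argument is evaluated.
final class Counter {
    private(set) var count = 0
    func expensive() -> Int {
        count += 1
        return 42
    }
}

// Hypothetical function that invokes its autoclosure argument twice.
func callTwice(_ value: @autoclosure () -> Int) -> Int {
    return value() + value()
}

// Stand-in for the declarations a DLAB might contain.
final class Args {
    private let counter: Counter
    // Stored property: evaluated eagerly, once, at initialization.
    let stored: Int
    // Computed property: re-evaluated every time the autoclosure runs.
    var computed: Int { counter.expensive() }
    // Lazy property: evaluated on first use, then reused.
    lazy var cached: Int = counter.expensive()

    init(counter: Counter) {
        self.counter = counter
        self.stored = counter.expensive()
    }
}

// Records the evaluation count after each step.
func evaluationCounts() -> [Int] {
    let counter = Counter()
    let args = Args(counter: counter)
    var counts = [counter.count]      // after the eager stored evaluation
    _ = callTwice(args.stored)
    counts.append(counter.count)      // no further evaluation
    _ = callTwice(args.computed)
    counts.append(counter.count)      // evaluated once per call
    _ = callTwice(args.cached)
    counts.append(counter.count)      // evaluated once, then reused
    return counts
}

let counts = evaluationCounts()
print(counts)
```

The recorded counts rise by zero for the stored property, by two for the computed property, and by one for the lazy property, which is the eager / deferred-and-duplicated / deferred-and-reused split described above.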

wowbagger · April 14, 2020, 7:00am

How can the onStart parameter be in scope outside of messy? It's not an in-out parameter.

Can you give an example where a non in-out parameter is in scope outside of the function?

I might have not been very clear in my previous reply. If inout replaces public, then private just becomes unnecessary, similar to how in-out parameters are annotated with @inout, but non-in-out parameters aren't annotated with @noninout. I didn't mean that they should be combined, but rather that one of them should be dropped.

pthariensflame · April 14, 2020, 7:19am

Here’s what the example call and a hypothetical surrounding context look like with bounds non-public and onStart and count public instead:

// do stuff beforehand
// ...
let (endVal, approxSeq) = messy() where {
    public let count: Int = 200

    var bounds: (Double, Double) = (100, 200)

    public func onStart(numDiscs: Int, guess: Double) -> Double? {
        // give a better initial estimate based on the guess
    }

    func onCrossBound(numDiscs: Int,
                      estimate: Double,
                      boundCrossed: Double)
        -> (Double?, Double)
    {
        // give a better estimate and a new bound
    }

    func onEnd(numDiscs: Int, estimate: Double) -> Bool {
        // decide whether the final estimate is good enough
    }

    let initial: Double = 0.5

    func morphIntermediates(estimate: Double) -> Double? {
        // fix up the estimate between each iteration
    }
}
// do more stuff; we don’t have access to `bounds` here any longer,
// but we _do_ have access to `count` and `onStart`
// ...
// redo the initializing calculation with the final value to check stability
guard let v = endVal,
      let check = onStart(numDiscs: count / approxSeq.count, guess: v),
      check == v else {
    // report instability error
}

We can’t get rid of private unless we think its functionality isn’t worth including at all. It’s not the same as no visibility modifier; private declarations don’t correspond to parameters.

Avi · April 14, 2020, 7:19am

IIUC, the idea is that the compiler would treat the declaration as if a temporary variable had been assigned before the method call, and this temporary variable would still be in scope after the call site.

If that's correct, I would say that I don't think the compiler should support this. Today, whether you are using inout or regular parameters, if you want a parameter to be in scope after the call site, you have to manually assign a local variable. I don't think this pitch should change that.

pthariensflame · April 14, 2020, 7:34am

That’s correct, and it’s a consequence of the only sensible desugaring of this syntax: every DLAB declaration becomes a predeclared value or local function either right before or right at the start of a new enclosing scope that ends with the desugared call itself, and the differences between public, private, and no visibility are then precisely the differences between where the items are declared and whether they participate as arguments or not:

No visibility means declared inside the scope and used as an argument
public means declared outside the scope and used as an argument
private means declared inside the scope and not used as an argument

(The fourth combination, declared outside the scope and not used as an argument, doesn’t seem worth including, since if you want something visible outside the scope but not used as an argument you really can just move it outside the DLAB entirely without any issues.)
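To make the three cases concrete, here is a rough sketch in today's Swift of what such a desugaring might look like. The function `refine` and its parameters are hypothetical stand-ins for `messy` (not part of the pitch), the imagined DLAB is shown only in a comment, and a `do` block plays the role of the new enclosing scope:

```swift
// Hypothetical function standing in for `messy`.
func refine(bounds: inout (Double, Double),
            initial: Double,
            step: (Double) -> Double) -> Double {
    var estimate = initial
    for _ in 0..<3 {
        estimate = step(estimate)
    }
    // Update the bounds around the final estimate.
    bounds = (estimate - 1, estimate + 1)
    return estimate
}

// Imagined DLAB call (hypothetical syntax, not valid Swift today):
//
//     let result = refine() where {
//         public var bounds: (Double, Double) = (0, 10)
//         private let factor = 0.5
//         let initial: Double = 8
//         func step(_ x: Double) -> Double { x * factor }
//     }
func desugaredCall() -> (result: Double, bounds: (Double, Double)) {
    // `public`: declared outside the scope and used as an argument.
    var bounds: (Double, Double) = (0, 10)
    let result: Double
    do {
        // `private`: declared inside the scope and not used as an argument.
        let factor = 0.5
        // No visibility: declared inside the scope and used as an argument.
        let initial: Double = 8
        func step(_ x: Double) -> Double { return x * factor }
        result = refine(bounds: &bounds, initial: initial, step: step)
    }
    // `bounds` and `result` are still in scope here;
    // `factor`, `initial`, and `step` are not.
    return (result, bounds)
}

let outcome = desugaredCall()
print(outcome.result, outcome.bounds)
```

Whether the real desugaring would use a `do` block or some other scoping construct is an open detail; the sketch only illustrates where each kind of declaration would end up.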

Avi · April 14, 2020, 7:56am

I generally like the idea you are proposing, but I don't think you should include this scoping mechanism. It makes for a semantic change, whereas the DLAB idea itself is purely syntactical.

pthariensflame · April 14, 2020, 8:04am

That’s a fair complaint; the reason it’s included at all is that there’s no simple way AFAICS to have an outside-declared variable used like that in a declaration. What would the DLAB look like?

func f(y: inout T)
var x: T = /* ... */
f where {
    // this seems unreasonably verbose for such a common case:
    var y: T {
        get { x }
        set(v) { x = v }
    }
}
}
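For comparison, the forwarding getter/setter pair has a direct analogue in today's Swift, since a local computed variable can already be passed as an inout argument. A small sketch, with a hypothetical function `bump` standing in for `f` and `Int` in place of `T`:

```swift
// Hypothetical stand-in for `f`, with Int in place of T.
func bump(y: inout Int) {
    y += 1
}

func forwardThroughComputedVariable() -> Int {
    var x = 41
    // Local computed variable that forwards reads and writes to `x`.
    var y: Int {
        get { return x }
        set { x = newValue }
    }
    // Reads through the getter, calls `bump`, then writes back through the setter.
    bump(y: &y)
    return x
}

let forwarded = forwardThroughComputedVariable()
print(forwarded)
```

So the semantics are not new; the open question is only how much of this boilerplate a DLAB should require for the common case.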

Avi · April 14, 2020, 8:33am

I would think one wouldn't bother with a DLAB in this case. It seems overly verbose for a single parameter anyway, even if one is not trying to capture a local variable as an inout parameter.

Perhaps we could support something like this:

func f(y: inout T)
var x: T = /* ... */
f where {
    var y = &x // this parameter is inout, and refers to the local variable 'x'
}

pthariensflame · April 14, 2020, 8:36am

Do you have some idea of how that might plausibly generalize to classes and structs? Not claiming that we should allow that, just that an “obvious” meaning for it seems like a prerequisite for it to be well-behaved intuitively.