Should labeled variadic parameters accept 0 arguments?

rxwei · May 9, 2018, 1:01am

Background

Currently, Swift accepts 0 or more arguments for a variadic parameter. This makes sense for some functions, but not necessarily for functions that have a label on the variadic parameter.

TensorFlow has a method called sum(alongAxes:). This method performs a reduction on Self along specified axes. It's defined as follows:

public extension Tensor where Scalar : Numeric {
  /// Returns the sum along the specified axes. The reduced
  /// dimensions are retained with value 1.
  /// - Parameter axes: The dimensions to reduce.
  /// - Precondition: Each value in `axes` must be in the range `-rank..<rank`.
  func sum(alongAxes axes: Int...) -> Tensor {
    ...
  }
}

We expect this function to be called with 1 or more arguments, for example:

let x: Tensor<Float> = [[[1, 2]], [[3, 4]]]
x.sum(alongAxes: 2) // [[[3.0]], [[7.0]]], the sum along axis 2
x.sum(alongAxes: 0, 1) // [[[4.0, 6.0]]], the sum along axes 0 and 1

However, when this function is called with no arguments, the result becomes completely confusing.

x.sum() // [[[1, 2]], [[3, 4]]]. Reduced along **no** axis!

This is unexpected to the user, because the call site x.sum() without argument label alongAxes: directly implies that this is a summation of all elements. Currently there's no way to make this method reject 0 arguments!
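A minimal sketch of the behavior in plain Swift, using a hypothetical free function `describeReduction(alongAxes:)` in place of the real Tensor method:

```swift
// Hypothetical stand-in for a method with a labeled variadic parameter.
func describeReduction(alongAxes axes: Int...) -> String {
    // Inside the body, `axes` is an ordinary [Int]. It is empty when the
    // caller omits the argument (and its label) entirely.
    return axes.isEmpty ? "reduce along no axes" : "reduce along axes \(axes)"
}

print(describeReduction(alongAxes: 0, 1)) // reduce along axes [0, 1]
print(describeReduction())                // reduce along no axes
```

The second call compiles even though the label never appears at the call site.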

To resolve this, we had an overloaded method in the TensorFlow library:

func sum() -> Scalar {
  // Reduce along all axes (all elements).
  // Reshape to a scalar.
  // Return the scalar.
}

In most cases when users call x.sum(), it refers to the no-argument sum(). However, when there's a contextual type Tensor, calling x.sum() would still refer to sum(alongAxes:) and make it a no-op!
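The overload behavior can be sketched with a hypothetical `Vector` type standing in for Tensor (the type and its members are assumptions for illustration, not the TensorFlow API):

```swift
// Hypothetical miniature of Tensor, only to show overload resolution.
struct Vector {
    var elements: [Double]

    // Stand-in for sum(alongAxes:): with no axes it returns self unchanged.
    func sum(alongAxes axes: Int...) -> Vector {
        if axes.isEmpty { return self }
        return Vector(elements: [elements.reduce(0, +)])
    }

    // Stand-in for the scalar-returning overload.
    func sum() -> Double {
        return elements.reduce(0, +)
    }
}

let v = Vector(elements: [1, 2, 3])
let total = v.sum()           // no contextual type: sum() -> Double is chosen
let unchanged: Vector = v.sum() // contextual type Vector: sum(alongAxes:) with zero axes
```

Here `total` is the scalar sum, while `unchanged` is the original vector, because only the variadic overload can produce a `Vector`.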

Note: While making the no-argument sum() return a Tensor<Scalar> instead of Scalar can completely shadow sum(alongAxes:) at call sites where there are no arguments, we don't want to do that because the result shape of "sum of all elements" is guaranteed to be a scalar. There should be a more systematic solution than requiring library designers to overload and shadow things.

Possible Solutions

1. Reject zero arguments when the parameter has a label

func mean(alongAxes: Int...) -> Tensor { ... }

x.mean()
  ~~~~^ Variadic parameter with an argument label requires at least 1 argument

2. Introduce a parameter attribute to specify one-or-more arity

func foo(alongAxes: Int...) -> Tensor { ... }
func bar(alongAxes: @oneOrMore Int...) -> Tensor { ... }

x.foo() // ok, same as the current behavior

x.bar()
  ~~~~^ Variadic parameter requires one or more arguments

Nevin · May 9, 2018, 1:03am

Is it not sufficient to write it like this?

func sum(alongAxes firstAxis: Int, _ otherAxes: Int...) -> Tensor {
  ...
}

griotspeak · May 9, 2018, 1:04am

I'm not always pleased by this solution, but you can label the first argument, make it a single element, and follow it with an unlabelled variadic parameter for the effect that you are after.

func mean(alongAxes head: Int, _ tail: Int...) -> Tensor { ... }
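A minimal sketch of this head-and-tail pattern, with a hypothetical free function `collectAxes` that only reassembles the axis list:

```swift
// Hypothetical free function; the real method would live on Tensor.
func collectAxes(alongAxes head: Int, _ tail: Int...) -> [Int] {
    // Rebuild the full list at run time by concatenating head and tail.
    return [head] + tail
}

let single = collectAxes(alongAxes: 2)    // [2]
let pair = collectAxes(alongAxes: 0, 1)   // [0, 1]
// collectAxes() does not compile: the first argument is required.
```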

rxwei · May 9, 2018, 1:09am

That would work for the type checker, but is unfortunately very complicated with TensorFlow. We need to be able to turn the argument list to a constant array to be passed to the #tfop syntax as an "array attribute". For example:

#tfop("Mean", someAxesAttribute: [1, 2, 3])

If they are passed separately, there's no way we can concatenate the first argument with the tail to make a constant array, until Swift has a constant expression model.

rxwei · May 9, 2018, 1:47am

The bigger problem is: this doesn't have performance guarantees. If the callee wants to use all arguments (first + rest) as a single array, this will require a really inefficient concatenation.

jrose · May 9, 2018, 2:55am

I'd be happier with changing this rule if we had an array splat/spread operator to go with variadics (which we keep talking about but not doing). Then you could still explicitly pass zero arguments to the variadic parameter, just not implicitly. (We'd also probably have to continue allowing the zero-arguments-by-omitting-the-label syntax in Swift 4 mode.)

Tino · May 9, 2018, 2:29pm

Two unrelated features at the top of my wishlist would solve most of your issue as a byproduct:
Replacing variadics ([Discussion] Variadics as an Attribute) and constants as generic parameters (Proposal: Compile-time parameters).
With those, your signature would be

func mean(_ alongAxes: @variadic MinimalSizeArray<Int, 1>) -> Tensor

I think it's better than adding yet another special case (even if it takes longer to be implemented).

rxwei · May 9, 2018, 6:28pm

Thanks for the pointers. This seems to add a lot of complexity to solve a simple problem, even though generality is great. Plus, this is going to require lots of complex things all at once: a constant expression model, constant generics and fixed-size arrays.

Apart from compiler complexity, @variadic MinimalSizeArray<Int, 1> (or any nominal type with generic params to represent variadics) is very very heavyweight for users to understand when compared to a simple syntax like Int....

Tino · May 9, 2018, 8:37pm

I don't think Int... is simple at all:
It is a type with a special syntax, you can't use it anywhere but in method signatures, and when you do so, the parameter magically turns into [Int].
For users (those who call the method), nothing would change under Haravikk's proposal, and the author of a variadic method would get much more power and a less special syntax.
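To illustrate the point about the parameter turning into [Int], a small sketch with a hypothetical helper `collect`:

```swift
// Hypothetical helper that returns its variadic parameter unchanged.
func collect(_ xs: Int...) -> [Int] {
    // `xs` already has type [Int] here, so it can be returned directly.
    return xs
}

let values = collect(1, 2, 3)
print(type(of: values)) // Array<Int>
// `Int...` itself cannot be written as the type of a variable or a return value.
```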

It would require all those things, but not necessarily all at once:
One after another would be just as fine, because every part is useful on its own.
I don't think the easy, impatient way is better in the long run, because a system built on myriads of small special cases sooner or later will be more complex and less manageable than a design with a small number of universal features that can be combined freely (even if those features are tough to implement).

rxwei · May 9, 2018, 8:49pm

I disagree. From a usability perspective, Int... is a widely accepted syntax for variadics and has precedents in other familiar languages. How is a user supposed to easily understand that @variadic MinimalSizeArray<Int, 1> means "variadic with one or more Ints" instead of "variadic with multiple values each having type MinimalSizeArray<Int, 1>"? This is complexity and generality at the cost of confusing users.

Tino · May 9, 2018, 9:42pm

And that makes it simple?
Imho variadics are one of the most complex features in C (and they still aren't easy in C++ :-), and as much as I'd like Swift to steal more good concepts from other languages, I wish we would also try harder to be better.
C#, for example, seems to have learned the lesson (just read about params — I thought the concept was completely new).

Afaics, the three dots are the most common way to indicate variadics, but when you refer to precedents:
Is there any language where you can specify that a variadic argument has at least one value?
And if so: What if someone needs two or three values?

rxwei · May 9, 2018, 9:58pm

And that makes it simple?

Yes, Int... is simple! It's lightweight. It is the most common and pragmatic way today to indicate variadics. It is immediately understandable to the user (even if it doesn't specify how many args). It follows the progressive disclosure of complexity. IMO, these are the first principles to begin with, and then we can talk about extended expressivity on top of these principles. Dropping a heavyweight, fully general syntax directly onto the user is not what Swift has been doing.

huon · May 9, 2018, 10:53pm

Doesn't taking an arbitrary Int... also fundamentally not work with constants? The user could have their own head and tail and pass x.mean(alongAxes: [head] + tail) and so be equally non-constant. Could you expand?

This is a "worse-is-better" point, but what is the typical number of axes that these tensors will be summed across? I can't imagine it'll be more than a dozen in almost all cases, in which case the concatenation is of course slower than just appending at the end, but it's not ridiculous. (Taking any arguments as an allocated Array seems like a more fundamental performance problem here.)

rxwei · May 9, 2018, 11:20pm

This goes into the implementation detail. The compiler inlines everything and rejects non-constants through data flow analysis.

With a single variadic parameter:

func foo(_ xs: Int...) {
   #tfop("SomeOp", someAttribute: xs)
}
foo(1, 2, 3) // ok, all elements are constants and they are used directly as an array

With first-and-rest parameters:

func foo(_ x: Int, _ xs: Int...) {
   #tfop("SomeOp", someAttribute: [x] + xs)
}
foo(1, 2, 3) // no, the compiler hasn't been taught to recursively handle Array.+ and Array.init

In theory, we can still hard-code the compiler to handle concatenations like this. But the principled approach would be to introduce a constant expression model so that the compiler can fold things like constant array concatenations. In any case, this is trying to answer the "how we reject non-constant arguments" question, which is not directly on topic.

A very small number of axes for sure. A constant expression model would make our problem go away and we'll be able to use [x] + xs. But I guess it's still off topic. Other TF folks are preparing a constant expression pitch.

Chris_Lattner3 · May 10, 2018, 3:57pm

We don't have a model developed for this yet, but I'd rather see this written someday as:

 func sum(alongAxes axes: Int...) -> Tensor 
   precondition(axes.count > 0) {

-Chris

allevato · May 10, 2018, 4:07pm

Assuming the constant expression model and generics support, could it be tied even more closely into the existing constraints syntax?

func sum(alongAxes axes: Int...) -> Tensor where axes.count > 0 {
  // ...
}

Chris_Lattner3 · May 11, 2018, 4:21am

It depends on the design, but if the general case of pre/post conditions has inherently dynamic semantics (as I expect it would) then I don't think it makes sense to merge them into where clauses.

nh7a · May 11, 2018, 5:57am

I think it can make more sense if it would reject zero arguments unless a default value was given, just like a non variadic parameter, and allow a variadic parameter to have a default value.

func foo(alongAxes: Int... = []) -> Tensor { ... }
func bar(alongAxes: Int...) -> Tensor { ... }

x.foo()  // ok: alongAxes will be [] by default
x.bar()  // error: missing argument label 'alongAxes:' in call
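For comparison, array parameters in current Swift already follow this rule. A sketch with hypothetical functions `fooArray` and `barArray` that take [Int] rather than Int... (an analogy only, not the proposed variadic syntax):

```swift
// Hypothetical functions taking arrays instead of variadics.
func fooArray(alongAxes axes: [Int] = []) -> Int { return axes.count }
func barArray(alongAxes axes: [Int]) -> Int { return axes.count }

let defaulted = fooArray()                // ok: axes defaults to []
let explicit = barArray(alongAxes: [0, 1]) // ok: argument supplied
// barArray() does not compile: the argument for 'alongAxes' is missing.
```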

Letan · May 11, 2018, 9:48am

Forgive me if I'm misunderstanding, but if the conditions were to have dynamic semantics, what would be the difference between having the precondition in the function signature and having it in the body?

michelf · May 11, 2018, 10:36am

I assume preconditions as part of the function signature would make them enforced by the caller. This leaves room for the compiler to check them statically with constant folding. When not successful in reducing the precondition to a constant, it stays as a dynamic check on the caller's side.
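For reference, the body-level form that can be written today runs in the callee. A minimal sketch, with a hypothetical function `checkedAxes`:

```swift
// Hypothetical free function standing in for a method like sum(alongAxes:).
func checkedAxes(alongAxes axes: Int...) -> [Int] {
    // Run-time check inside the callee: traps if no axes were passed.
    precondition(!axes.isEmpty, "at least one axis is required")
    return axes
}

let accepted = checkedAxes(alongAxes: 0, 1) // passes the check
// checkedAxes() still compiles, and only traps when it is executed.
```

With this form the caller gets no compile-time feedback, which is the gap a signature-level precondition would presumably close.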