SE-0483: `InlineArray` Literal Syntax

... not to mention that every expression of form
myValue = (n of T)( ... init arguments ... ) will require parentheses around n of T anyway, making the "less visual noise in code/more compact" claims invalid. :man_shrugging:


After being parsed, it still needs to infer the types, doesn't it?
If you have a (potentially) overloaded function (operator) for InlineArray and Array, wouldn't it require checking even more cases, making the "sudoku" part even more complicated and prone to breaking?

When inferring something like var x: Double = // innocent looking expression with numbers fails, does it fail during parsing or type checking?

If an array literal can be "even more things" than just array, especially if it is another built in basic type used in similar scenarios, wouldn't it make inferring the type even harder? If there is a different syntax for InlineArray literals, shouldn't it make type checking easier after parsing stage?

The Future Direction section proposes extending this syntax to initializers, but, as the proposal itself acknowledges, the ASCII character x is very similar to the multiplication sign ×. Consequently, [5 x 99] reads like [495], making the symbol feel more like an operator than syntactic sugar. Using x for array-size sugar is therefore confusing, and the ambiguity only deepens once value literals are introduced:

// type inferred to be [5 x Int]
let fiveInts = [5 x 99]

Anyway, I’m in favor of putting this on hold until we gain real-world experience, since InlineArray has yet to ship publicly and any sugar now would be premature.


Conceptually, it is an operator that can loosely be interpreted as "repeat the value on the right side", where the right value can be either a type or a number, and it has a different meaning than *.
I doubt anyone will confuse * with x, just as no one confuses × with ⋅ when doing math.


If you have two overloaded functions where one accepts an Array and the other accepts an InlineArray, then yes, that will affect type checking. The effect would be the same regardless of whether a sugared syntax is used or not. That being said, there will likely be little reason to overload a function in this way once we have the new collection protocols and Span.

It fails during type checking. The error actually tells you this:

error: the compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions

I assume, from the nature of this thread, that by "array literal" you mean the Array type sugar and by "InlineArray literal", you mean the proposed InlineArray type sugar. (Generally, "array literal" is used to refer to the [a, b, c] syntax for expressing a list of values.)

If so, then no, it wouldn't make any difference at the type checking stage. Because the Array type sugar looks like [T] and the InlineArray type sugar looks like [w x T], the parser can easily tell the difference between the sugar for Array and the sugar for InlineArray.

This proposal should probably be renamed. It’s not proposing an InlineArray literal syntax; it’s proposing syntax sugar for the InlineArray type. A literal syntax for InlineArray is discussed within the proposal, but it’s a future direction.


That's what I mean. If InlineArray and Array literals have the same form, the expression [a, b, c] can mean a lot of things, putting more stress on the type checker. It might become even worse, if something like ExpressibleByInlineArrayLiteral is introduced in the future.
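To make the concern concrete, here is a minimal sketch of how many types a single bracketed literal can already stand for in current Swift. `Bag` is a hypothetical type invented for this illustration; only `Array`, `Set`, and `ExpressibleByArrayLiteral` are standard library names.

```swift
// Hypothetical type, used only to show a third meaning for the same literal.
struct Bag: ExpressibleByArrayLiteral {
    var items: [Int]

    init(arrayLiteral elements: Int...) {
        items = elements
    }
}

let inferred = [1, 2, 3]            // no other context: defaults to Array<Int>
let asSet: Set<Int> = [1, 2, 3]     // same literal, now a Set<Int>
let asBag: Bag = [1, 2, 3]          // same literal, now the custom Bag type

print(type(of: inferred), type(of: asSet), type(of: asBag))
```

Each additional type that accepts the same literal form is one more candidate the type checker may have to consider when the surrounding context does not pin the type down.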

Basically, almost all the code written for Array is translatable ad hoc to InlineArray. So these functions will probably get overloaded a lot, but generics would be the proper way to handle them. The question is whether generics make type inference easier than explicitly overloaded functions. I was under the impression that they work like syntactic sugar for generating overloaded functions. If I am wrong, I would be grateful for an explanation if you have some time (or a link). Protocols should make it easier to write generic code, but would they make type inference easier? If it works just like introducing more overloads (only my uninformed hypothesis), why would it make type inference easier?
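As a rough sketch of the difference: a generic function in Swift is a single declaration that is type-checked once against its constraints, not a family of generated overloads (specialization for concrete types is an optimization applied later). The function name `total` is made up for this example.

```swift
// One generic declaration instead of one overload per collection type.
func total<C: Collection>(_ values: C) -> Int where C.Element == Int {
    values.reduce(0, +)
}

let fromArray = total([1, 2, 3])                 // C is Array<Int>
let fromSet = total(Set([4, 5, 6]))              // C is Set<Int>
let fromSlice = total([10, 20, 30].dropFirst())  // C is ArraySlice<Int>
```

At each call site there is one candidate function whose generic parameter has to be solved, rather than several overloads to choose between. Whether that is cheaper for the solver in a given expression is a separate question that this sketch does not settle.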

I’m fine with the x, but I’m a bit uneasy about the use of [ and ].

The name InlineArray was chosen to communicate that the array is precisely that: stored inline. However, nothing about the form [100 x Int] suggests inline storage. In fact, one could argue that it implies the opposite! It would be perfectly reasonable for a newcomer to Swift to assume that any collection type written with [] is stored indirectly—since that’s true for Array and Dictionary, which are the bread and butter of collection types.

It absolutely feels like some developers, especially beginners, will write [100 x Int] without realizing the performance implications. That risk seems lower if the sugar were (1024 x Int) or, as @michelf has suggested, just 1024 x Int.


@hborla @Ben_Cohen
Just a strange idea; I am not sure how plausible or hard to implement it is. Perhaps the operator should be changed to ×, the Unicode character, while still allowing x to be typed instead? Then x could be automatically converted to × during the parsing stage, and I assume that most editors that work with Swift could do it automatically, so the final result would look the same as in the editor, making it less confusing.
It solves the keyword/identifier problem, since the operator is explicitly × and not x, and it makes typing trivial. Other than that, it has the same virtues as the x keyword (readability, etc.).

This is something easily avoided with proper documentation. The moment they are introduced (and ideally they should be introduced together in any beginner teaching material), the difference should be clearly explained and emphasized. And the difference should be documented in any resource targeting experienced users who are familiar with regular arrays.

"Indirectly" is an implementation detail. They all have value semantics after all. For users advanced enough to understand the implications of COW, explaining the difference should not take more than a few minutes.

Are you referring to the cross-product versus dot-product distinction? That understanding may be culture- or locale-specific, but for me, when we’re talking about scalars, ×, ⋅, and * are simply interchangeable notations for ordinary multiplication. And [5 x 99] is certainly not a vector multiplication. As supporting evidence, the README of the mathematics package Euler, which has about 1.2K stars, shows exactly this with the example 3 × 4 // 12.
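For reference, × (U+00D7) is a legal operator character in Swift, so a library can declare it as ordinary multiplication. The sketch below is not Euler's actual source; it is just a minimal way to get the quoted 3 × 4 behaviour, under the assumption that the operator is given multiplication precedence.

```swift
// U+00D7 (×) declared as a user-defined infix operator.
infix operator ×: MultiplicationPrecedence

// Scalar multiplication spelled with ×.
func × (lhs: Int, rhs: Int) -> Int {
    lhs * rhs
}

let product = 3 × 4   // 12
```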


For value syntax ideas, just in the interest of sharing something I thought was quite interesting, I was looking at what Python programmers might expect from a "fixed size array" when I found this stackoverflow question.

Basically, the poster wants to know how to create an empty list with a defined size, and the answer turns out to be:

>>> lst = [None] * 5
>>> lst
[None, None, None, None, None]

I think the [None] * 5 syntax is kind of interesting. I'm sure it'd be controversial but I kind of don't mind it. I wouldn't advocate for it as such (I don't care that much so long as what we end up with is clear), but there is something about it. It may be that it works in Python better than it would in Swift.
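For comparison, the closest spelling in current Swift for a regular Array is the `init(repeating:count:)` initializer. This sketch only covers `Array`, not `InlineArray`, and the element type `Int?` is an arbitrary choice for the example.

```swift
// Swift counterpart of Python's `[None] * 5` for a heap-allocated Array.
let lst = [Int?](repeating: nil, count: 5)
print(lst)  // [nil, nil, nil, nil, nil]
```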

Well, actually this is multiplication of two one-dimensional vectors...


It wouldn't be the first time that we decided to add sugar to a feature at a later stage. The if let <optional> as a shorthand for if let shadow = <optional> comes to mind. And I'm pretty sure there are more.

And I do understand that there is a cost to postponing a decision (also, it's a decision too). Yet, I don't really see the opportunity cost if we decide to implement this sugar at a later date.

However, if we should answer the question as: "is the problem being solved big enough to warrant this change/feature?". I would say I'm not convinced it is. So I'd say no.


While it's true that "5 x 10" is a common shorthand for "5 by 10", seeing that by is already quite terse at one letter more, picking x to use in its place is apparently only for:

  1. cuteness,
  2. for easy confusion with symbols * and × (nice reference above to the Euler package, useful!),
  3. to introduce ambiguity when trying to use a value x when, oh, "x" is only the poster child of variable names in this minor field called mathematics, plus pervasively used as the first spatial coordinate name in all of math and computer programming. :roll_eyes:

Okay, 3 is a bit argumentative, so enough ragging on x being chosen instead of by (except to add that I believe it would be nearly unprecedented to add a single-letter keyword to a non-codegolf programming language, couldn't help myself).

But mainly I would like to concur with a notion mentioned above but skipped over without much comment: both x or by are bad choices if seeking inspiration from common parlance because they're used when measuring dimensions of similar units, not counts of things. You use it to say "5 by 10 inches of countertop", but not "5 by Eggs"!

This is why [5 of Egg] is the obvious choice for this syntax in my opinion.

And while I like some of the thinking behind omitting the square brackets, I think it's too radical a departure. FixedArray are array-like and the syntax should reflect that.

(No /s)


However let's pretend it's a little over a month ago and I'll make two more suggestions:

Modest proposal 1: If we're looking to natural (English) language for influences, how would we really say "5 of Egg"? It would of course be "5 Eggs".

So let's introduce a syntax for defining the plural of a type name, something like

struct Egg @plural(Eggs) {
    ...
}

extension Element @plural(Elements)

Then to use the new syntax to declare a FixedArray of a type you'd need to use the plural form of that type with the syntax [<constantvalue> <pluraltypename>] with no operator or keyword, simply [5 Eggs].

Future bikeshedding is needed for what to do with singular-plural names like Buffalo and Moose (I'm sure there are more examples that aren't animals), and also when the constant value is 1. (1 Eggs, or make exception and allow the non-plural type name here [1 Egg]?)

Modest proposal 2: take a page from early in Swift language design where for closure syntax they chose to re-use the keyword in, simply do the same here!

[5 in Egg] where "in" is shorthand for "instances of".

(Very much /s)

An inline array is similar to both an array and a tuple. Like a tuple, the storage is inline and passing a copy around implies copying all the elements. Like an array, it is homogenous and its elements can be accessed with a subscript.

Whether to use brackets or not would depend on which part you want to emphasize. Personally I'd rather emphasize the like-a-tuple inline storage part because it's a hidden cost. The like-an-array part is how you use the type: this is going to be self-evident in usage and will have no hidden cost.

To me, copying 300 of Int feels like copying 300 Ints, whereas copying [300 of Int] could appear to be the same cost as copying an array of 300 Ints unless you know better.
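The tuple analogy can be checked with `MemoryLayout` in current Swift. This sketch uses a homogeneous tuple as a stand-in for inline storage (an assumption made so that the code runs without `InlineArray`) and compares it with `Array`, whose value is a single reference to a heap buffer.

```swift
// A tuple stores its elements inline, so its size grows with the element count.
let inlineSize = MemoryLayout<(Int, Int, Int, Int)>.size

// An Array value is just a reference to heap storage, regardless of its count.
let arraySize = MemoryLayout<[Int]>.size

print("4-Int tuple: \(inlineSize) bytes, [Int]: \(arraySize) bytes")
```

Copying the tuple copies every element, while copying the array copies one reference (plus a retain), which is the hidden cost difference the bracket-free spelling is meant to signal.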


It would be right-associative, because only the last one is going to be a type, so it doesn't make sense to parse it any way other than:

3 of (3 of (3 of Int))

Since people have been talking about Unicode characters here, I want to point out once again, as I mentioned in the pitch thread, that the proposal uses the wrong Unicode symbol for multiplication:

I’m sure the proposal author @Ben_Cohen simply missed my comment there, since it was a very lengthy and busy thread full of vibrant discussion, but I do want to make sure everyone is on the same page about Unicode multiplication symbols.

To put it simply, “⨉” (n-ary multiplication) is to “×” (times) as “∑” (n-ary summation) is to “+” (plus).


Isn't ∏ the counterpart?

Anyway, I don't think it's realistic we would ever have non ascii symbols as or × in Swift (aside from user defined operators).


This is very much off topic, but every n-ary operator is a counterpart to “∑”. There are at least 17 such characters in Unicode, and LaTeX allows any symbol to become an n-ary operator.

“∏” means “product” and is used for repeated multiplication. It is fairly common.

“⨉” means repeated “×”, and is extremely rare. For example, it might get used for an n-ary direct product, cross product, or Cartesian product.

My point is that the proposal should not mention any n-ary operator at all, because it is only proposing a binary operator. An equivalent situation would be if the proposal had listed “∏” instead of “*” as an alternative: 5 ∏ Int is obviously wrong, for the exact same reason that the n-ary “⨉” is wrong.
