[Pitch] InlineArray type sugar

ibex10 · April 20, 2025, 12:40am

That can be simplified further:

let x: [5 Int] = [5 99]

// exchanges the first and third rows of a 3x3 matrix
let op: [3 [3 Int]] = [0, 0, 1, 0, 1, 0, 1, 0, 0]

1-877-547-7272 · April 20, 2025, 12:50am

IMO this is something worth seriously considering.

A major issue I have with [w of T], [w x T], [w * T], and other square-bracketed notations is that they look like sugar for a fixed-size array, not specifically an inline array. This is problematic since, even for basic operations, the performance characteristics of Array and InlineArray are hugely different:

Operation	Cost for `Array`	Cost for `InlineArray`
Move	Moving a pointer	Moving each element
Copy	Doing a retain and moving a pointer	Copying each element
Destroy	Doing a release; destroying each element if unique	Destroying each element
Mutating an element	Doing a uniqueness check; copying each element if not unique; mutating the element	Mutating the element

Since these two constructs behave so differently, I think the sugar we use for them should communicate these differences.

To me, w of T effectively communicates that the type behaves like w instances of T grouped together. It doesn't imply a relation to Array, so people are less likely to have mistaken assumptions about the type's similarity to Array.

[w of/x/*/# T], on the other hand, does imply a misleading similarity with Array.
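
The layout difference behind the table above can be observed on any recent toolchain. This is only a rough sketch: a homogeneous tuple stands in for inline storage so that it compiles without InlineArray, and the names `Inline4` and `baseAddress(of:)` are hypothetical, introduced purely for illustration.

```swift
// Assumption: a homogeneous tuple stands in for inline storage so the
// sketch compiles on toolchains that do not ship InlineArray.
typealias Inline4 = (Float, Float, Float, Float)

// An Array value is a single reference to heap storage, so its size
// does not depend on how many elements it holds.
let arrayHandleSize = MemoryLayout<[Float]>.size
let pointerSize = MemoryLayout<UnsafeRawPointer>.size

// Inline storage is the elements themselves.
let inlineSize = MemoryLayout<Inline4>.size
let elementsSize = 4 * MemoryLayout<Float>.size

// Hypothetical helper: the address of an array's element storage.
// The pointer is only compared, never dereferenced, after the call.
func baseAddress(of array: [Float]) -> UnsafeRawPointer? {
    array.withUnsafeBufferPointer { UnsafeRawPointer($0.baseAddress) }
}

let original: [Float] = [1, 2, 3, 4]
var copy = original   // retain plus pointer copy; elements are shared
let sharedBeforeWrite = baseAddress(of: original) == baseAddress(of: copy)

copy[0] = 99          // storage is not unique, so the elements are copied now
let sharedAfterWrite = baseAddress(of: original) == baseAddress(of: copy)
```

Copying the tuple, by contrast, always copies all four elements, which is the behavior the InlineArray column describes.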

(This post was edited to improve the readability of the code examples. Thank you @ibex10 for the suggestion!)

toph42 · April 20, 2025, 1:13am

Oh! This thread was my introduction to InlineArray and I didn't realize those are not the same thing.

ibex10 · April 20, 2025, 2:21am

Please consider using different sounding symbols, to make your post easier to read.

[w of T], [w x T], [w * T]

austintatious · April 20, 2025, 3:07am

Interestingly enough, and I can’t be sure, but I think there is great poetry and a hidden point in using

instead of

as the example.

In the former, we at least see quickly that x shouldn’t be used as the operator/keyword as it is very likely used as a variable name making x x X kinda funny to read (yes it would probably be x x T). So better and easier would be x of X to read.

x also occupies this strange linguistic crossroads where it is a symbol (used in mathematical notation) but doesn’t occupy the Unicode range for math symbols that can be used as operators, instead it is a standard letter (or ASCII) that would make it a keyword in Swift. So would it be colored like a keyword or an operator in an IDE? Just an interesting question.

of doesn’t have this ambiguity and it is unlikely that of will be used as a variable.

ibex10 · April 20, 2025, 12:51pm

Going xless and ofless.

[m x T]
[m x [n x T]]

[m of T]
[m of [n of T]]

Eliding them yields something that looks even simpler.

[m T]
[m [n T]]

Would this be a good compromise?

austintatious · April 20, 2025, 2:12pm

I (kinda) advocate for this but there should be no requirement for the square brackets. Instead they should be grouped by parentheses.

Unless there is a chance that number Type syntax could ever mean anything other than InlineArray<number, Type> and square brackets could disambiguate that.

My only opposition is that it’s odd not having an operator there. It feels like a divergence from Swift. You also lose some signals to the code reader. Seeing of really helps show what is going on.

In summary, if not of, then no operator would be my next vote, as x is not viable.

Karl · April 20, 2025, 2:30pm

Not necessarily.

The fixed-size nature of the type is a semantic attribute. It allows me to guarantee that the Array has a particular number of elements (including guaranteeing that it is not empty, which has a huge impact on its usability as it tells me I can safely force-unwrap queries such as the first/last/min/max/etc elements). It might be useful for optimisations, but not necessarily.

COW-protected heap storage is completely orthogonal to the fixed-size nature of the array. It gives you very cheap copies, which we found to be vital to making value semantics work with Swift's implied copyability model.

You can imagine developers wanting to use these fixed-size array literals for their semantic meaning and getting burned quite easily by the way it makes copies so much more expensive.

For example, in general property accesses may be a lot more expensive:

final class HoldsSomeData {
  var someData: [100 x Float]
}

// -- (there may be a module boundary here) --

func processData(_ x: HoldsSomeData) {
  let data = x.someData // Copies 100 floats.
}

Returning values from functions also generally needs to copy.

func processData(_ x: HoldsSomeData) -> [100 x Float] {
  x.someData // Copies 100 floats.
}

// Also: imagine the [100 x Float] was hidden behind a typealias...

Sometimes optimiser heroics can eliminate these copies, but that doesn't provide me much reassurance. Somebody could modularise their code and accidentally introduce an inlining barrier which makes these copies expensive again, and even if we could fix all of them, we'd still need to consider the performance of debug builds.

The cost of these copies was considered to be so significant that the proposal authors withdrew InlineArray's Sequence conformance and compatibility with for loops. But now they're... insignificant? Not worth drawing attention to? Something we expect people to 'just know'?

If you're enough of an expert, no doubt you can work around all of these performance traps. But I have found that most Swift developers have a difficult time understanding the copying cost of large value types. This isn't unique to inline arrays, it's a general problem in the language, but inline arrays make it so easy to define enormous value types that it becomes even harder to spot unless you have extremely detailed knowledge of the standard library. Especially when you consider how close these look to regular Array literals, and how Array has an entirely different storage model.

toph42 · April 20, 2025, 7:47pm

Karl:

final class HoldsSomeData {
  var someData: [100 x Float]
}

// -- (there may be a module boundary here) --

func processData(_ x: HoldsSomeData) {
  let data = x.someData // Copies 100 floats.
}

I don’t understand this. I thought the whole point of Copy on Write was to allow this kind of operation to be “free.”

Nobody1707 · April 20, 2025, 8:04pm

But as he's saying, InlineArray doesn't do copy-on-write. It's like a big tuple, it copies eagerly. That's why he's advocating for a copy-on-write box to put the InlineArray in.
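
For readers unfamiliar with the idea, a copy-on-write box might look roughly like the sketch below. `CowBox`, `Storage`, `sharesStorage(with:)` and `Payload` are hypothetical names and not anything from the standard library or the proposal; the tuple payload is just a stand-in for a large inline value.

```swift
// Hypothetical copy-on-write box; a sketch, not a standard library type.
struct CowBox<Value> {
    private final class Storage {
        var value: Value
        init(_ value: Value) { self.value = value }
    }

    private var storage: Storage

    init(_ value: Value) { storage = Storage(value) }

    var value: Value {
        get { storage.value }
        set {
            // Allocate fresh storage only when the current one is shared.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(newValue)
            } else {
                storage.value = newValue
            }
        }
    }

    // Exposed only so the sketch can show when storage is shared.
    func sharesStorage(with other: CowBox) -> Bool {
        storage === other.storage
    }
}

// Stand-in for a large inline value.
typealias Payload = (Float, Float, Float, Float)

let first = CowBox<Payload>((1, 2, 3, 4))
var second = first                       // copies one reference, not the payload
let sharedBefore = first.sharesStorage(with: second)

second.value.0 = 99                      // shared storage is replaced on write
let sharedAfter = first.sharesStorage(with: second)
```

Note that the plain get/set pair still copies the payload out and back on every mutation; a real implementation would likely want in-place access, which this sketch does not attempt.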

toph42 · April 20, 2025, 8:08pm

Yeah, I’m reading [Accepted with modifications] SE-0453: InlineArray (formerly: Vector, a fixed-size array) right now and learning that the Inline part of the type name holds more importance than the Array part, but perhaps doesn’t as effectively convey implicit knowledge of the type’s semantics to those of us without Computer Science degrees.

Karl · April 21, 2025, 2:52am

We're just talking about syntax here - this is a pure sugar proposal, and this community is full of experts who will know all the implications of changing your data's type from [Float] to [100 * Float]. But there are a lot of developers who won't make the connection that this fundamentally changes how you should handle variables of this type - the examples I gave previously all would be completely harmless if the data was typed [Float] rather than [100 * Float]. I want to make sure their interests are considered as well - we're not just looking for brevity, we also want a syntax that is clear in the context of our existing language.

So I'm not trying to advocate specifically that the fixed-size-in-a-COW-box necessarily needs a shorter spelling (even though I think it is a more useful construct than it has been given credit for). It just needs to be clear what you're getting.

I think @xwu's suggestion of |...| to denote inline storage is interesting; it would serve to indicate that this is a different kind of Array.

var x: [Float]          // regular, COW Array
var y: |[100 x Float]|  // inline array

I think that's not bad. I mean, I don't have any better suggestions. I appreciate that, as well as the newcomers, we do also want to appeal to experts with some sugar for inline arrays that is actually convenient. I think it strikes a good balance.

hisekaldma · April 21, 2025, 7:12am

Would using () instead of [] help make the inline nature of the array clearer?

var x: [Float]
var y: (100 x Float)

To me that looks like a large tuple, which it essentially is.

ibex10 · April 21, 2025, 7:28am

The presence of a size specifier clearly indicates an inline array type; there is no ambiguity. So, really, there is no need to complicate the syntax by putting on additional | decorations.

var x: [Float]       // regular, COW Array
var y: [100 x Float] // inline array

Also, |...| feels like the norm or absolute value of something.

michelf · April 21, 2025, 12:24pm

But why add () when it could just be this:

var x: [Float]
var y: 100 of Float

Nothing makes it feel more inline than not wrapping it in some kind of bracket or parens:

 // tuple
var z: (100 of Float, 50 of Double)

 // function signature
func foo(components: 16 of Int)

// generic arguments
var s: Set<16 of Int>

// mixed with existing collection sugar:
var d: [String: 16 of Int] // Dictionary<String, 16 of Int>
var a: [16 of Int] // Array<16 of Int>

I wonder if that last one will be controversial.

allevato · April 21, 2025, 12:51pm

I think there's a real risk here of not making a lot of progress because a lot of suggestions are just folks zeroing in on their particular preferred syntax and then reasoning backwards from that, or by doing a random walk from one token to a different one, trying to find something that either looks good or doesn't have parsing ambiguities (or ideally both, of course).

Instead, we should think about what core design principles we want to uphold with this syntax, and reason forward from there. For example,

- What, if any, characteristics should this sugar share with other type syntax sugar, and why?
- What, if any, characteristics should this sugar not share with other type syntax sugar, and why?
- How does this sugar relate back to the unsugared type InlineArray<let n, T>?

So if we look at it through those lenses, my thinking is:

- There should absolutely be square bracket delimiters around whatever form is used to express the count-and-type product. InlineArray, like Array, is a (lowercase-c) collection of elements.
  - Someone could counter-argue that because it's not a (capital-C) Collection, then it should not share the same delimiters. I wouldn't be convinced by that because I think the type sugar should represent the conceptual nature of the type, not its specific conformances. Furthermore, one day there may be a different hierarchy of collection protocols that InlineArray could conform to in some fashion.
- For the same reason, using parentheses as the delimiter strikes me as a nonstarter. Parentheses are used to delimit tuples, and despite having a similar fixed size and layout, InlineArrays are not tuples: they are dynamically indexed and homogeneous. I would expect a syntax like (4 of Int) (using whatever connector word you prefer here) to produce (Int, Int, Int, Int), not InlineArray<4, Int>.

Within those constraints, an idea just occurred to me that I'm not sure is 100% serious or not, but I'll throw it out there for the sake of discussing something specific.

We introduced raw strings with # opening and closing delimiters. What if we made the claim that unlike a "well done" Array that can be grown and shrunk, an InlineArray is "raw" because it holds just N unprocessed elements and nothing more/less? Thus, if we extrapolate the raw string delimiters to array sugar, we could write

let x: #[4, Int]# = [1, 2, 3, 4]

and if we ever did want to support the sugar for repeated values, this would still support that:

let x = #[4, "some value"]#

I don't 100% like it, but I don't hate it either.

We could just do #[ on the opening delimiter and drop the closing # since we don't need to disambiguate regular embedded quotes from closing the string, but that loses a bit of the raw-literal symmetry, and doing that starts to look macro-ish instead.

xwu · April 21, 2025, 1:09pm

allevato:

Thus, if we extrapolate the raw string delimiters to array sugar, we could write
let x: #[4, Int]# = [1, 2, 3, 4]
and if we ever did want to support the sugar for repeated values, this would still support that:
let x = #[4, "some value"]#
I don't 100% like it, but I don't hate it either.

Very nice, in my opinion. For higher dimensions, would we need more #s?

allevato · April 21, 2025, 1:16pm

So, the natural nesting for multiple dimensions would get sort of ugly:

let matrix: #[2, #[3, Double]#]#

But the syntax #[n, T]# could extend naturally to multiple dimensions by just adding more commas, because the final comma-separated clause would unambiguously be the type (or the value, in the hypothetical value-repeating extension):

let matrix: #[2, 3, Double]#  // InlineArray<2, InlineArray<3, Double>>

There's an interesting appeal to the idea of ##[2, 3, Double]## because you'd be able to quickly scan it and see the number of dimensions based on the number of #, but since there's no syntactic reason to force multiple #, I don't think I'd go in that direction.
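
For reference, the unsugared nested type that the comma form would stand for can be written today. This sketch assumes a Swift 6.2 or later toolchain (and, on Apple platforms, a deployment target whose runtime ships InlineArray); the variable names are made up for illustration.

```swift
// Assumes Swift 6.2+, where InlineArray is available in the standard library.
// This is the type the hypothetical #[2, 3, Double]# spelling would denote.
var matrix = InlineArray<2, InlineArray<3, Double>>(
    repeating: InlineArray<3, Double>(repeating: 0)
)

// Dynamic indexing works at each level of nesting.
matrix[1][2] = 6

let outerCount = matrix.count      // number of rows
let innerCount = matrix[0].count   // number of columns per row
```

Both counts are part of the type, so a mismatch between the declared dimensions and the initializer is a compile-time concern and not a runtime one.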

tera · April 21, 2025, 6:44pm

tera:

let x: [5]Int = [1, 2, 3, 4, 5]
let x: [_]Int = [1, 2, 3, 4, 5] // count inferred
let x: [5]_ = [1, 2, 3, 4, 5]   // type inferred
let x: [_]_ = [1, 2, 3, 4, 5]   // both type and count inferred

And multidimensional case nests nicely †:

[3][4][5]Int   // InlineArray<3, InlineArray<4, InlineArray<5, Int>>>

(† although that particular aspect could be a "no-goal").

I guess the Swift parser could be taught to wait a bit and not consider [3][4] as a subscript operation on the array of one element (3) until it encounters more context (the type "Int" at the end).

Could this fly?

Nobody1707 · April 21, 2025, 6:50pm

This particular argument isn't very convincing to me. Homogeneous tuples are already used for this exact concept today; the big problem with them is that they get slower to write and type-check the more elements you add. You can already write a dynamically indexed subscript for a homogeneous tuple using withUnsafePointer(to:) or withUnsafeMutablePointer(to:) as well, so that isn't particularly meaningful either. InlineArray is so close to being a homogeneous tuple that earlier proposals for it actually wanted to just make it an easier-to-typecheck sugar over homogeneous tuples, but were forced by ABI constraints to create a new type instead.

This isn't an argument against your #[ ]# syntax, just an argument against preemptively ruling parentheses out.
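
To make the withUnsafePointer(to:) point concrete, here is a minimal sketch of dynamic indexing into a homogeneous tuple. `Float4` and `element(of:at:)` are hypothetical names, and the sketch relies on the elements of a homogeneous tuple being laid out contiguously, which is how such tuples are used for imported C arrays in practice.

```swift
typealias Float4 = (Float, Float, Float, Float)

// Hypothetical helper: read one element of a homogeneous tuple by a
// runtime index instead of the static .0/.1/.2/.3 accessors.
func element(of tuple: Float4, at index: Int) -> Float {
    precondition((0..<4).contains(index), "index out of range")
    return withUnsafePointer(to: tuple) { tuplePointer in
        // Treat the tuple as raw bytes and load the Float at the
        // offset of the requested element.
        UnsafeRawPointer(tuplePointer).load(
            fromByteOffset: index * MemoryLayout<Float>.stride,
            as: Float.self
        )
    }
}

let values: Float4 = (10, 20, 30, 40)
let third = element(of: values, at: 2)
```

It works, but every such helper has to be written per arity and carries its own bounds check, which is part of why a dedicated type is more pleasant to use.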