Fixed-size, scoped-storage array manifesto, Version 3

CTMacUser · March 10, 2020, 2:43am

After spending more months refining the idea, and seeing pop up occasionally here, I posted a new version of a plan to add fixed-sized arrays to Swift.

Version 2: "Another fixed-sized scoped-storage array manifesto spitball session"
Proto-version 1: "[Pitch] Array full proposal"

There's a bunch of links from the Version 2 post. Here's some more I missed, a couple from me, and some offered by the forum's suggestions.

Like my other versions, this version is part formal, part informal, and part rantings. It's a lot more focused on the user interface than the witness-table/SIL/IRGen/LLVM aspects. Those aspects can come later after we get a hold of the interface. Of course, I'd appreciate warnings about features that are not implementable.

I guess the work involved would be like the Swift TensorFlow stuff, which also did first-class compiler tweaks to the language. We just have a lot less people interested, and I'm still trying to grok how the compiler works.

Quick Tour

The syntax rips off from how Rust maps from flexible arrays to fixed-sized arrays.

let a: [3 ; Int]

I just switched the length from after the element type to before. (Without that, using nested arrays would have the declaration order be the reverse of the dereference order.)

Make multi-dimensional arrays by separating the extents with commas:

let a: [2, 2 ; Int]

You can initialize on declaration with an array literal:

let a: [3 ; Int] = [1, 2, 3]
let b: [2, 2 ; [2 ; Int]] = [[0, 0], [0, 1], [1, 0], [1, 1]]

Multi-dimensional arrays are flattened in row-major order. I decided to be cute and the above example shows the mapping. Note that the mapping is always in row-major order, even if the storage will be in some other order (probably overridden with an "@"-attribute).

If you don't initialize on declaration, and don't assign with a single expression, you initialize with a for-inout statement:

let a: [5, 7 ; Bool]
for e inout a {
    e = #indexOf(e).reduce(0, +).isMultiple(of: 2)
}

After initialization, for-inout statements can be used for read or read-write access to the elements. For reading only, the regular for-in also works. Use the #indexOf expression to get the index of the current element.

If you don't like counting terms, you can use a placeholder:

let a: [_ ; Int] = [1, 2, 3]
let b: [_, 2 ; Int] = [1, 2, 3, 4, 5, 6]

The compiler substitutes 3 for the missing extent in both cases. For multi-dimensional arrays, the number of terms has to be divisible by the product of the other extents. The placeholder extent technically declares a run-time sized array, but the compiler can elide that back to a compile-time sized array if the array doesn't escape its code block.

Placeholder extents can be used for struct and class instance-level stored properties (but not enum payloads). For a struct, all designated initializers and the code branches within must all agree on the final type (and the compiler just replaces the type). For a class, the final type can differ between designated initializers and the code branches within.

A run-time sized array can be used for a function argument (but not return type):

func readThis<T>(array: [_ ; T]) -> Int { /*...*/ }

Like the relation between U and U?, all compile-time sized arrays are subtypes of run-time sized arrays. When doing C compatibility, a run-time sized array works as the equivalent to both array segment (i.e. "int a[]") and C99 VLA arguments for functions and "extern" globals of unknown-sized arrays (which the homogenous tuple syntax can't do).

Placeholder can be used for declarations without giving an initial value:

let a: [_ ; Int]

But initialization can't take place without an intermediate step declaring the final size of the array:

reify a as [3 ; Int]
for e inout a { /*...*/ }

For code blocks and class, but not struct, the final type can use a runtime-based value and/or differ between code branches. It's an error to use a negative value.

There are grid array templates, which work like protocols:

func readSomeArray<T: [?.. ; some Any]>(_ array: T) { /*...*/ }

Templates that use a non-opaque type for their element type can act as existentials. You can declare a grid array as an opaque instance of a grid array template, but you must reify it before actually using it.

There's a lot more in the manifesto, including some questions on what to keep or work on.

Edit: Add working link to manifesto.

griotspeak · March 13, 2020, 9:33pm

This seems nice and well thought out. Thanks for all of the time and effort.

The placeholder idea seems like it would make for easy mistakes and awkward diagnostics but I will admit that I haven't used a system with the feature so maybe I'm just being pessimistic.