Approaches for fixed-size arrays

It would be unfortunate to couple those two together, as they are orthogonal. e.g. I might want a fixed-size array with a billion elements in it. That's not going to fit on the stack.

I might also want a fixed-size Set, or tree, or any other data structure. The ability to specify a fixed size for a collection should be universal, not special-cased to some specific data structure (e.g. a thin shim over alloca in your implication - which is a valid and useful case as well, just not the only one).

4 Likes

Indeed. The current threshold happens to be 4096 items.

One of the benefits of "inline" data structures (that don't allocate memory or take locks) is that we'll be able using them in realtime or other performance critical contexts... I agree that "fixed" in FixedArray doesn't carry that important meaning. Maybe FixedArray is not the right name then... InlineArray?

Ideal world - yes. In reality I'd tolerate having just InlineArray that's known to "be a truly inline data type" even if it is small (up to a certain prescribed limit like 4K) and/or if it is the only container thing available of that kind. However I agree that Set of Bool or Set of enum constants is another important use case worth optimising.

1 Like

My current posture is that this FixedSizedCollection is intended to be an API on a bag of bytes service that will be able make a solid commitment to compiler that those bytes all exist and have a value of type Element and the size of that memory will not change for the lifetime of that FixedSizedCollection. That BytesService could be owned and managed by the FixedArrayCollection, or by something reliable a client offers up on initialization (a BufferView, e.g,).

The FixedSizedCollection itself can then vend Views or Copies (stack or guaranteed deallocated at the end closures for a stack-like experience for the bigger bags of bytes) as needed to its full self or SubSequences .

If it ends up being better as a Protocol. That's doable too.

Something that a truly FixedSizedCollection will need to deal with that simply an inline view would not necessarily is... Default Values.

A fixed size array needs to know what value to use if it's stuck with a location that it's been told to create (on init only) or clear without instruction as to a new value. Should that value simply be a required parameter on any function that could result in the ambiguity or a stored value for the instance? Or should the default value be a product of the associated type instead (zero for things that have one, nil for optionals...)

Sugar is completely made up, please offer preferred alternatives

//Comes with type
[Int](7) //  => a FSC of 7 .zero
var myFSC:[Int](7) = [3,2,1]. //=>  [3,2,1,0,0,0,0]
[Int?](7) // => to a FSC of 7 nil
[MyType](7) //=> leads to 7 what? Is there a protocol Elements need to conform to? 

//set at init
var myFSC:[Int](7, default: 5) = [3,2,1] // =>  [3,2,1,5,5,5,5]
myFSC.insert(12) => [3,2,1,12,5,5,5]
myFSC.clear() => [5,5,5,5,5,5,5]

//require per function
var myFSC:[Int](7, default: 5) = [3,2,1] // =>  [3,2,1,5,5,5,5]
myFSC.insert(12, atFirstLocationOf: 5) //=> [3,2,1,12,5,5,5]
myFSC.setAll(to:5) //=> [5,5,5,5,5,5,5]

Maybe this?

var x: Int[7] = [0 ...]
var x: Int[7] = [3, 2, 1, 0 ...]
var x: Int[] = [3, 2, 1] // †

† Two options about the "Int[]" syntax notation:

  1. the type is inferred as Int[3] based on the subsequent initializing expression.
    (this is how "int x[] = {1, 2, 3};" is inferred to be "int x[3]" in C).

  2. this notation could be used as a synonym for the "normal" [Int] array.


Loosely following a semi-similar syntax for variable argument functions:

func foo(_ firstParam: Int ...) {
}

Speaking of which, the variable argument parameter list could be passed as the fixed array as well.

2 Likes

Succinct! I like it and added it to the README.md which should be pushed again soon, I think it's relatively clear in the first one that 0 is the implied default, but in the seconds case I read that as a request for a pattern fill is that what you meant?

1 Like

No, just repeat the last element:

var x: Int[7] = [3, 2, 1, 0 ...]
// same as
var x: Int[7] = [3, 2, 1, 0, 0, 0, 0]

Similar to how in the:

func foo(x: Int, y: Double ...) {  }

the "..." is applied to the last "Double" parameter only.

Also see my edit about the "Int[]" notation.

Gotcha. Easier to implement, too.

You had expressed the same concerns earlier in the thread and replaced your pitch for Int[ ] with Int[ _ ] so I had left out just Int [ ], I'll put it in, but with the footnote for the record since it has come up again.

Ah, I already forgot about that :slight_smile:. So we do have few notations at our disposal:

var x: Int[4] = [1, 2, 2, 2]
var x: Int[4] = [1, 2 ...]   // same
var x: Int[_] = [1, 2, 2, 2] // same
var x: Int[_] = [1, 2 ...]   // 🛑

var x: [Int]
var x: Int[] // same? or we could keep this for the future
1 Like

It'd be good to find a different syntax for that, than the ellipsis. In mathematics an ellipsis used like that typically indicates "and so on", i.e. following the pattern established by the values so far. Though some people apparently use it to mean "etc" in the sense of "there are more values [but they could be anything]".

And in Swift 0... currently means a half-open range of all integers starting from zero and counting up.

It could actually be convenient to have the "and so on" functionality in Swift. However, I worry that (a) it's probably intractably difficult for the compiler to correctly deduce every possible sequence's pattern, and (b) I worry about the potential abuse of it where a Range or stride(from:to:by:) would be more appropriate (at least for performance).

For "repeat this for all remaining entries"… the conventional mathematical symbols for related concepts are things like the vinculum or a dot above the repeating element. Wikipedia says there's other conventions in other locales. Alas none of these are readily available on an 'ASCII' keyboard, so I expect they'll be rejected without consideration from Swift syntax.

Perhaps something like:

var x: Int[7] = [3, 2, 1, rest: 0]
3 Likes

That's good too. Added. The collection currently has two variadic initializers:

  public init(defaultsTo d: Element? = nil, _ values:Element...) 
   public init(_ count: Int, defaultsTo d: Element, _ values:Element...) 

that can lead to

//breifest
FixedSizeCollection(1,2,3,4) // = > count:4
                                defaultTo:TBD (throws when needs a default or .zero)
                                  values: [1,2,3,4]
//most verbose
FixedSizeCollection<Int?>(5, defaultsTo: nil, 1,2,3)// = > count:5
                                                       defaultTo: Element's nil
                                                          values: [1,2,3,nil, nil]

A) what feedback on those place holder initializers to you have to make them not clash with this possible future? /make them more ergonomic on their own.
B) Do you think having values after the variadic is too prone to be missed if the variadic list is too long?

1 Like

Putting the 'filler' before the explicit values - and calling it "default" - isn't as clear as it could otherwise be, because it's not so clear that it means to pad the array to the end. e.g. a reader might assume it pads the array from the front, given the argument order.

Agreed, actually. I like this "The Rest" vibe a lot. Do you think though a "the rest" provides the right tone for "and now this gets used by this variable as the default forever?" (the answer of course can be yes)

The syntax sugar question is both fun and important but there's a real structure of the type fundamental here, too:

A) the type has no stored default value and every method that could end up needing one must have it as a parameter.
B) default value is client defined at init, and client must define it explicitly in any init
C) client fails to define it and parts of the API won't work for that instance
- (throw if called, and therefore throws must be caught even by folks who did define the default. )
- those functions are uncallable to those instances? can that be done?
D) client fails to define it and the initializer infers it from the type, or returns a nil init
E) client fails to define it and the initializer returns a nil init, no implicit defaults because that could be confusing. this is kind of B

I'm leaning towards A or D

EDIT: Have changed the inits from _ values: to just values: to be less muddy in the meantime.

Ah, if the intent is to fill in values beyond just initialisation - e.g. mimicking some kind of sparse array with an automatic default for 'missing' entries - then indeed "default" or somesuch terminology makes more sense, than "[the] rest".

I glanced over your implementation and AFAIU you are using a tuple for array storage, so Array2 has 2 elements, Array3 has 3 elements and so on. Is Array0 and Array1 possible?

How do I actually use it, preferably without specifying the exact array-count-suffix?

Could be something like this?

func FixedArray<T>(_ x0: T, _ x1: T) -> Array2 { ... }
func FixedArray<T>(_ x0: T, _ x1: T, _ x2: T) -> Array3 { ... }

to get to this use sites:

let x = FixedArray(1, 2)    // inferred as Array2
let y = FixedArray(1, 2, 3) // inferred as Array3

It's not exactly an init (FixedArray.init(1, 2) spelling would fail), but quite close.

could this work somehow?

let x: FixedArray = [1, 2]    // the type is inferred as Array2
let x: FixedArray = [1, 2, 3] // the type is inferred as Array3

Do you think it's ok to generate, say, 10K of all those ArrayX between 0 and 10K and let (or hope) linker stripping the unused ones?


I think going forward we'd need these things to move things forward:

  • an actual pitch
  • compiler and/or standard library support as part of the pitch proposal

Realistically it seems we are still quite far from that, e.g. people having different ideas what FixedArray should be (a fixed version of Array of arbitrary size, an inline limited size data structure, etc).

My storage is Data, for the moment. And all the initializers require a pass through an array so I can feel comfortable I can handle their layouts the way I need to.

FixedArray(1,2...whatever) is a FixedArray that has a _Storage type that's a blob of UInt8s that I loaded in with a memory layout that I can predict. The count is determined on init and then it will never change the number of items or the amount of data it uses. I love a bag of bytes.

Tuples have a file called TupleLove.swift at the moment and I can COPY them in from C and COPY out to them back to C, because no one should trust a random C pointer to be there for them always, but when StorageView gets here, that could change things.*

I agree this is not yet pitch ready, but I need this for myself, so I will keep moving :)

(*ETA for clarity: If you want to pass C a UnsafePointer reference... all of withUnsafe{} closures have been implemented already as well)

That could be one possible application, yes. But is could be a regular dense array that can be wiped, or wiped at sub ranges.

But that may not be a valued feature and therefore not worth storing the default value, which is why I ask :)

I like that a lot! I'd like to add a fun fact: Int[] used to be the array syntax before [Int] was introduced to replace it. Funny how your suggestion brought it back full circle!

I agree that the [1, 2, 3, 0...] syntax is both intuitive and not but the suggestions made afterwards make a lot of sense!

Do we want to have a limit on the number of elements? Using 123456 below as a placeholder for the limit if such exists:

    func foo() {
        var x: UInt8[123456]     // ✅
        var y: UInt8[123456 + 1] // 🛑 Too many elements
    func foo() {
        var x: UInt8[123456]     // ✅
        var y: UInt8[123456 + 1] // ✅
        var z: UInt8[Int.max]    // ✅ but will likely crash or throw at runtime when actually used

Or, if we have a limit, maybe that's a limit on the memory being used?

    struct S { /* a struct of 100 bytes */ }
    func foo() {
        var x: S[1234]           // ✅
        var y: S[1234 + 1]       // 🛑 Too much memory
    func foo() {
        var x: S[1234]           // ✅
        var y: S[1234 + 1]       // ✅
        var z: S[Int.max]        // ✅ but will likely crash or throw at runtime when actually used

I am biased towards (3).

With all my respect for all the ideas expressed in this thread, I suggest a return to the fundamental motivations expressed by Joe_Groff and the primary need for interop with C.

  • Don't start to wide, focus on FixedArray for C interop to avoid the actual problems with Tuples.
  • Keep it simple. Simple is beautiful
  • Often less is better than more
  • Don't re-invent the wheel
  • Look at how Fixed Array specified in other languages
  • Find common denominator
  • I think C# Array is a good starting point.

Thanks

2 Likes

That's good although that version was pitched 3 years ago and given it's not in Swift now, something is still missing.

(Personally I like that pitch except for one little thing: the syntax chosen (5 x Int) – I'd stick to the established precedent of using "*" and put the arguments in the opposite order (Int * 5).

1 Like