Approaches for fixed-size arrays

There has been so much movement on making the hard stuff talked about in this thread become real since it was started. ALL of these represent movement.

I'm starting on writing a FixedSizeCollection package for myself to make working with C comfy now, because in the future it will be able to take advantage of all this great work, which is the "compiler compatibility work", in all the proposal and pitches below. An ideation of what will it feel like to work with a fixed size collection when all of this is real. I think maybe this not the thread for that after all because it's giving the impression that less has already happened than has.

So I leave it with this bank of links:

1 Like
Side-track, I think the motivation for `x` went like this…
  1. We want multi-dimensional indexing to read in the same order as array shape.

  2. If we had it like var array: ((Int * 3) * 4) then confusingly the last element of the last triple would be found at array[3][2]!

  3. So it's better to place the multiplicity on the left-hand side like var array: (4 * (3 * Int)). Read it like "four triples of Int".

  4. And next, it would be convenient if we could omit the inner parentheses as var array: (4 * 3 * Int). But that would require the type-level * operator to be right associative, which is again confusing because the value-level * is left-associative.

  5. So that gets us to consider another operator. The × symbol would be cool, but it'd be the first standard non-ASCII character in Swift syntax, so… :man_shrugging:

4 Likes

Very insightful, thank you.

I'd say this is one more reason for the declaration to be: v1. var array: (Int[4][3]) then it will match the use site: array[3][2]. Or, alternatively to follow the current tuple syntax on the use site: "tuple.3.2" we could do the same on the declaration site: v2. "var tuple: (Int.4.3)"

Brainstorming homogeneous tuple declaration syntax further:

    var tuple: Int(4)    // v3. single dimension
    var tuple: Int(4)(3) // v4a. multidimensional case
    var tuple: Int(4, 3) // v4b. multidimensional case

As you mentioned this looks and feels like an "operator".... "x" would be the first "operator" which doesn't follow the operator naming rules!

1 Like

Since we're bikesheding anyway, I prefer the Rust syntax of [Type; dimensions] e.g. [Int; 6], [Float; 4, 4], etc.

5 Likes

That of course wasn't very useful. This however could be actually used in practice:

typealias X1234<T> = (X1000<T>, X200<T>, X30<T>, X4<T>)

to represent 1234 element array.

A minimal sketch implementation of the FixedArray type:

struct FixedArray<Element, Zone> {
    typealias Index = Int
    private var zone: Zone?
    static var count: Int {
        precondition(MemoryLayout<Zone>.size >= MemoryLayout<Element>.size)
        // NOTE: a naïve MemoryLayout<Zone>.size / MemoryLayout<Element>.stride is not correct.
        return 1 + (MemoryLayout<Zone>.size - MemoryLayout<Element>.size) / MemoryLayout<Element>.stride
    }
    var count: Int { Self.count }
    
    init(repeating element: Element, in _: Zone.Type) {
        precondition(_isPOD(Element.self))
        precondition(_isPOD(Zone.self))
        // hack ahead: this magically makes zone to be `.some`
        withUnsafeMutableBytes(of: &zone) { p in
            _ = memset(p.baseAddress!, 0, MemoryLayout<Zone?>.size)
        }
        precondition(zone != nil, "redo the above hack")
        for i in 0 ..< count {
            self[i] = element
        }
    }
    
    subscript(_ index: Index) -> Element {
        get {
            precondition(index >= 0 && index < count)
            return withUnsafeBytes(of: zone) {
                $0.baseAddress!.assumingMemoryBound(to: Element.self)[index]
            }
        }
        set {
            precondition(index >= 0 && index < count)
            withUnsafeMutableBytes(of: &zone) {
                $0.baseAddress!.assumingMemoryBound(to: Element.self)[index] = newValue
            }
        }
    }
}
Some internals
struct  X1<T> { var x: (T) }
struct  X2<T> { var x: (T, T) }
struct  X3<T> { var x: (T, T, T) }
struct  X4<T> { var x: (T, T, T, T) }
struct  X5<T> { var x: (T, T, T, T, T) }
struct  X6<T> { var x: (T, T, T, T, T, T) }
struct  X7<T> { var x: (T, T, T, T, T, T, T) }
struct  X8<T> { var x: (T, T, T, T, T, T, T, T) }
struct  X9<T> { var x: (T, T, T, T, T, T, T, T, T) }
struct X10<T> { var x: (T, T, T, T, T, T, T, T, T, T) }

typealias   X20<T> =   X2<X10<T>>
typealias   X30<T> =   X3<X10<T>>
typealias   X40<T> =   X4<X10<T>>
typealias   X50<T> =   X5<X10<T>>
typealias   X60<T> =   X6<X10<T>>
typealias   X70<T> =   X7<X10<T>>
typealias   X80<T> =   X8<X10<T>>
typealias   X90<T> =   X9<X10<T>>
typealias  X100<T> =  X10<X10<T>>
typealias  X200<T> =  X2<X100<T>>
typealias  X300<T> =  X3<X100<T>>
typealias  X400<T> =  X4<X100<T>>
typealias  X500<T> =  X5<X100<T>>
typealias  X600<T> =  X6<X100<T>>
typealias  X700<T> =  X7<X100<T>>
typealias  X800<T> =  X8<X100<T>>
typealias  X900<T> =  X9<X100<T>>
typealias X1000<T> = X10<X100<T>>

Usage example:

// Usage example:

struct S {
    var x: Int16 = 1111
    var y: UInt8 = 22
}

let elementCount = 1243
typealias X1243<T> = (X1000<T>, X200<T>, X40<T>, X3<T>)
var array = FixedArray(repeating: S(), in: X1243<S>.self)
print(MemoryLayout<S>.size, MemoryLayout<S>.stride, MemoryLayout<S>.alignment) // 3 4 2
let expectedZoneSize = (elementCount - 1) * MemoryLayout<S>.stride + MemoryLayout<S>.size
precondition(MemoryLayout<X1243<S>>.size == expectedZoneSize)
precondition(MemoryLayout<FixedArray<S, X1243<S>>>.size == expectedZoneSize + 1)
precondition(array.count == elementCount)
print(array[elementCount - 1]) // S(x: 1111, y: 22)
array[elementCount - 1] = S(x: 3333, y: 44)
print(array[elementCount - 1]) // S(x: 3333, y: 44)

The size of FixedArray is one byte more than the size of the actual payload, this is because I'm using an optional:

    private var zone: Zone?

I haven't figured out how to not have this optional (and thus the extra byte) in an ergonomic way.


Edited the code above.

1 Like

Found this ingenious hack to materialise simple things "out of thin air" – e.g. it could be used in the above code to keep the fixed array size equal to its payload size without adding an extra byte for optionality. Highly dangerous code so use on your own judgement.

// DO NOT USE IN PRODUCTION!
// This hack assumes:
// - "simple" (POD) value type
// - the size of optional<thing> is one byte more than the size of the thing
//   (won't work with underpopulated enums for example).
// - the optional marker is at the end
// - it is 0x01 to represent `none` and 0x00 to represent `some`
// - "var thing: Thing?" has <00 00 ... 00 01> bit pattern initially
// - the "all zeroes" bit pattern makes a valid thing
func makeSimpleThingOutOfThinAir<Thing>() -> Thing {
    precondition(MemoryLayout<Thing?>.size == MemoryLayout<Thing>.size + 1)
    var thing: Thing?
    withUnsafeMutableBytes(of: &thing) { p in
        precondition( p[p.count - 1] == 1)
        p[p.count - 1] = 0
    }
    precondition(thing != nil)
    return thing!
}

Usage example:

struct FixedArray<Payload> {
    private var payload: Payload
    init() {
        payload = makeSimpleThingOutOfThinAir()
    }
    ....
}

I enhanced the code of the previous post to support arbitrary value types (they should be POD's still). The usage is a bit cleaner as you could specify a default value (for the initial array contents):

typealias X1243<T> = (X1000<T>, X200<T>, X40<T>, X3<T>)
var array = FixedArray(repeating: S(), in: X1243<S>.self)

This will create an array of 1243 x S() elements.

Note that in this implementation arrays can't shrink or grow (within the fixed limit), this is easy to do (add a separate count variable and rename the previous count to capacity).