Approaches for fixed-size arrays

Reminded me of an ugly way of parametrising generics with numbers.

You can't do this directly in current swift (pseudocode):

struct S<n: Int, T> { ... }

But if you need it badly...

protocol Bit {
    static var val: Int { get }
}
struct O: Bit {
    static var val: Int { 0 }
}
struct I: Bit {
    static var val: Int { 1 }
}

struct Size<BF:Bit, BE:Bit, BD:Bit, BC:Bit, BB:Bit, BA:Bit, B9:Bit, B8:Bit, B7:Bit, B6:Bit, B5:Bit, B4:Bit, B3:Bit, B2:Bit, B1:Bit, B0:Bit> {
    static var val: Int {
        // too complex expression for the compiler, let's split it into several.
        let w1 = B0.val <<  0 + B1.val <<  1 + B2.val <<  2 + B3.val <<  3
        let w2 = B4.val <<  4 + B5.val <<  5 + B6.val <<  6 + B7.val <<  7
        let w3 = B8.val <<  8 + B9.val <<  9 + BA.val << 10 + BB.val << 11
        let w4 = BC.val << 12 + BD.val << 13 + BE.val << 14 + BF.val << 15
        return w1 + w2 + w3 + w4
    }
    var val: Int { Self.val }
}

Usage example:

let type = Size<O,O,O,I,  O,O,I,O,  O,O,I,I,  O,I,O,O>()
print(String(format: "0x%0X", type.val)) // 0x1234

Extending this to 64 bit numbers is left as an exercise to the reader :rofl:

6 Likes

Shoot, lol, this is pretty genius! Remembered that we used nested types to represent fixed sized arrays, like ArrayN<ArrayN<ArrayN<Array1>>> for a 3-D array...

Seriously though, how do you get the type to be statically stored? (if val is the desired size).

Finally, I really wanted to see a real, usable fixed sized array in Swift!

1 Like

I recently found fixed-size arrays as a useful counterpart when playing around the ownership system. In Swift, typical Arrays are not considered as trivially destroyable, so we cannot use discard self when a non-copyable type holds an array. Using a fixed-capacity (more specially a fixed-size one) array should solve the problem.

For me nominal type using new generics features approach seems the best in aspects of ergonomics, compiler-checks safety and usfullness for other types:

  • compile-time bounds check for cases like let array: FixedArray<Int, 5>array[4] is valid and array[5] is a compiler error.
    BufferView has no such compile-time checks if I understand the idea correctly.
  • extension FixedArray where n >= 1 opens the door to NonEmpty collections.
    Personally I use NonEmpty often in current projects (along with SwiftCollections and SwiftAlgorithms libraries), which helps to make API better, implementations cleaner and simpler and elide some problems at all.
  • Matrix<Int, 1024, 1024, 3> can be useful in areas like ML or 3D. In case for ML I think we need more a fixed-capacity array rather than a fixed-size one. But array size visible in source code can be also helpful though.
  • might be useful in embedded Swift if it is proven to be stack allocated. But a workaround is needed for preventing unnecessary copies made automatically by compiler, like passing by reference / &inout / borrowing.

I see at least two general ways to initialize FixedArray:

  • FixedArray<String, 4>(repeating: "")
  • FixedArray<String, 4> { uninitializedBuffer in ... }

It might be useful to convert Array to FixedArray:

if array.count == 4 {
  let fixedArray: FixedArray<String, 4> = .init(array) 
  // statically proven that size is 4
}

It might be useful to convert C tuples to FixedArray.
May be to wrap C tuples by something like FixedArray.View / FixedArrayBuffer and keep original C tuple as underlying buffer. In this case C-array-tuple elements count is needed to be visible and representable in Swift, making Tuples as fixed-size arrays approach also useful but for other purposes.

3 Likes

Any progress on FixedArray?
Coming to Swift soon?

1 Like

Here's a sketch implementation that shows how we could provide "subscript" type operation on homogeneous tuples today:

func tupleGetElement_viaUnsafe<T>(_ tuple: UnsafeMutableRawPointer, at index: Int) -> T {
    tuple.withMemoryRebound(to: T.self, capacity: index + 1) { pointer in
        pointer[index]
    }
}
func tupleSetElement_viaUnsafe<T>(_ tuple: UnsafeMutableRawPointer, at index: Int, value: T) {
    tuple.withMemoryRebound(to: T.self, capacity: index + 1) { pointer in
        pointer[index] = value
    }
}
var x: (Int, Int, Int) = (11, 22, 33)
print(tupleGetElement_viaUnsafe(&x, at: 2) as Int) // 33
tupleSetElement_viaUnsafe(&x, at: 2, value: 333)
print(tupleGetElement_viaUnsafe(&x, at: 2) as Int) // 333

However it is very unsafe on different levels.


From my POV until we have a way to type and initialise big homogeneous tuples other than:

var x: (Int, Int, Int) = (11, 22, 33)

which is non scaleable we can't use big tuples in Swift as a foundation for "fixed size arrays". 256 tuple elements is probably tolerable, but thousands is very non-ergonomic and too slow to compile.

Something like this could work syntax wise:

var x: (Int, count: 1000) = .init(repeating: 0)
var x: (1000*Int) = .init(repeating: 0)
var x: (Int*1000) = .init(repeating: 0)

What's the current sentiment?

  1. we don't need fixed arrays in Swift
  2. we could extend tuples to be usable as fixed arrays
  3. we'd rather do fixed arrays without tuples

I'd say (2) or (3) but not sure which is the best.

1 Like

I think fixed arrays should stand on their own. My use-cases always go beyond tuple capabilities, usually going in the thousands.

Since the beginner's point of view is often brought up:
You get introduced to tuples as heterogenous, like a lightweight anonymous struct. I feel like introducing the (completely different) concept of fixed homogenous tuples is no harder than introducing the (also completely different) concept of fixed arrays.

I also don't see why a new syntax would be required, the C syntax Int[1024] would work just fine. In case people get mixed up and write Int[], the fix-it would suggest using a regular array or fixing its size. Currently this is the diagnostic: "Array types are now written with the brackets around the element type".

I don't think we'll ever want to support flexible arrays (or if we do, provide such a shortcut for such a niche use-case) so I don't think it'll be ambiguous, since fixed arrays won't be mutable collections and things like myFixedArray.append(5) will error.

2 Likes

how would you initialize such an array, if append is unavailable?

Good point, I forgot Swift doesn't have a concept of default values, I think tera's .init(repeating: 0) would be a good starting point.

Oh wait I think I see where I wasn't clear.

I said they weren't mutable but what I meant is that they can be modified via myFixedArray[5] = /*...*/ but that it doesn't make sense to append or insert because the size is fixed.

A fixed capacity list can easily be built on a fixed size array, that one would support doing appends or inserts.

I think Swift's Array should've been called a list because its size can grow.

"List" in some popular languages is a linked list, which is very different from Swift's Array.

I can't recall any language where "array" specifically means a fixed-size array…?

C, Java, Rust, etc. Though there's also Swift's value semantics to consider, since unlike Java for example, appending an element to an array is semantically equivalent to creating an entirely new array with the added element.

3 Likes

Arrays aren't necessarily fixed size in C, e.g. char foo[] as a parameter or the last member of a struct. And they can be resized at runtime, it's just clumsy (e.g. realloc).

I guess it depends on what you consider 'intrinsically resizable'.

1 Like

Look at C# Array vs List.

Swift Array is like C# List.
I hope Swift will have also Fixed Array like C#.
When you interop with C and C++, FixedArray is required.

Some examples of inits from top performers in that Benchmark repo

C:         int cache[10];
C++:       int cache[10];
fortran:   integer, dimension(10) :: cache
Go:        [10]int
Java.      new int[10];
Julia:     ntuple(i -> i^i, 9)...  # Tuple{Vararg{T,N}}
Kotlin:    Array(10) { }
nim:       array[10, int]
Ruby:      [i32; 10]

Scheme:

  (list->vector
    (cons
      0
      (let loop ((i 1))
        (if (> i 10)
          '()
          (cons (expt i i)
                (loop (+ i 1))))))))

Would it make sense that a fixed width array that isn't fully defined on init is really an array of optional? So

var myArray:[Int, 10]  = [0,1,2] //or [Int](10) 

Would really be an [Int?] for the missing values

But

var myArray:[Int](10)  = Array(repeating: 10, count: 10)

Would be an [Int]

Or would it be better just to require initial values? [Int](10, default:0)

1 Like

For denoting n values of type T, [n T] would be easier to read than [T, n], especially for higher dimensions :slight_smile:

For example:

// 5 Ints
var v:[5 Int]  = ...
var w:[Int, 5] = ...

// 3 rows of 5 Ints
var v:[3 [5 Int]]   = ...
var w:[[Int, 5], 3] = ...

// 3 rows of [5 [7 Int]]
var v:[3 [5 [7 Int]]]    = ...
var w:[[[Int, 7], 5], 3] = ...
1 Like

Whelp, in the mean time here's a FixedSizeArray(4, default: 12) { [1, 2, 3] }

Very naive implementation.

Subscript is unchecked.

TODO: Add Tuple-bridge? UnsafeWrapCSampler/Sources/Swift/TupleBridge.swift at main · carlynorama/UnsafeWrapCSampler · GitHub

//
//  FixedWidth.swift
//  TestingTwo
//
//  Created by Carlyn Maw on 2/2/24.
//  Inspired by 25:52 of WWDC 2020 "Safely Manage Pointers in Swift."

import Foundation

struct FixedSizeArray<Element> : RandomAccessCollection {
    let count:Int
    var startIndex: Int { 0 }
    var endIndex: Int { count }
    
    //What is the best storage type? Data is really easy to work with.
    private var dataBlob:Data
    
}
extension FixedSizeArray {
    init(_ count:Int, default d:Element, initializer:() -> [Element] = { [] }) {
        self.count = count
        var result = initializer().prefix(count)
        //if result.count > count { return nil }
        for _ in 0...(count - result.count) {
            result.append(d)
        }
        self.dataBlob = result.withUnsafeMutableBufferPointer { pointer in
            Data(buffer: pointer)
        }
    }
    
    init(dataBlob: Data, as: Element.Type) {
        self.dataBlob = dataBlob
        self.count = dataBlob.withUnsafeBytes { bytes in
            let tmpCount = bytes.count / MemoryLayout<Element>.stride
            precondition(tmpCount * MemoryLayout<Element>.stride == bytes.count)
            precondition(Int(bitPattern: bytes.baseAddress).isMultiple(of: MemoryLayout<Element>.alignment))
            return tmpCount
        }
    }
}

extension FixedSizeArray {
    subscript(position: Int) -> Element {
        get {
            dataBlob.withUnsafeBytes { rawPointer in
                let bufferPointer = rawPointer.assumingMemoryBound(to: Element.self)
                return bufferPointer[position]
            }
        }
        set {
            let startIndex = dataBlob.startIndex + position * MemoryLayout<Element>.stride
            let endIndex = startIndex + MemoryLayout<Element>.stride
            withUnsafePointer(to: newValue) { sourceValuePointer in
                dataBlob.replaceSubrange(startIndex..<endIndex, with: sourceValuePointer, count: MemoryLayout<Element>.stride)
            }
        }
    }
    
    var all:[Element] {
        loadFixedSizeCArray(source: dataBlob, ofType: Element.self) ?? []
    }
    
    //Okay to use assumingMemoryBound here IF using type ACTUALLY bound to.
    //Else see UnsafeBufferView struct example using .loadBytes to recast read values
    private func fetchFixedSizeCArray<T, R>(source:T, boundToType:R.Type) -> [R] {
        withUnsafeBytes(of: source) { (rawPointer) -> [R] in
            let bufferPointer = rawPointer.assumingMemoryBound(to: boundToType)
            return [R](bufferPointer)
        }
    }
    
    //TODO: Test non-numerics
    private func loadFixedSizeCArray<T, R>(source:T, ofType:R.Type) -> [R]? {
        withUnsafeBytes(of: source) { (rawPointer) -> [R]? in
            rawPointer.baseAddress?.load(as: [R].self)
        }
    }
}
1 Like

Will be working on summarizing this thread here:

Let me know what I miss.

1 Like

I'm pretty sure Data uses an allocation under the hood (like any other implementation I've seen which doesn't rely on tuples). The whole point of fixed arrays (at least for me) is to ensure the elements are part of their scope's memory.

That means either the stack if it's a function parameter or local; embedded into a struct/class if it's a member; or static program memory if it's a global.

An example use-case is my voxel game which uses an ECS, all components store their data on globals and many of those components want to embed 32768 elements since that's the chunk size. Since the whole point is to have all that laid out in memory for cache-friendly access, having any indirection in a component defeats the purpose.

It could also easily replace ManagedBuffer which would be a win in and of itself in my humble opinion.

Interop would of course be happy, since at the moment large arrays aren't imported from clang targets and you need to write C functions to access them.

Also I'm pretty sure SIMD is an Apple framework? Fixed arrays could easily be used and optimized as vector types in a platform-agnostic way.

Completely agreed. And honestly ensuring that's done correctly is probably out of my league. The real work is being done over on BufferView.

What I probably can do with my current knowledge is write a library that looks on the surface as much as possible like a bonafide Swift Collection with a prototype spelling and UI for the community to react to that which can have all its underlying guts ripped out and replaced by more experienced hands. I'm willing to give that a crack if no one else has the time, but I'd need guidance because looking at the Array code... whew, there is a lot of stuff in there that I have never seen before. (And TDB for me if FixedSizedArray needs to be allocated and unallocated or just unallocated and essentially just BufferView)

It's not entirely altruistic, I want an ergonomic way to handing fixed sized C arrays for myself. And if this isn't useful when I get there I'll stop :)

Thank you for the other the examples. I've added them to the TODO.md under a new section called NonC Interop Targets, to keep my eye on those as well. Anyone else want to get things on that list or provide code examples that are currently painful that they'd like to be easier I will absolutely add those too.

Speaking of things to react to:

Default Values - A fixed size array needs to know what value to use if its stuck with a location that its been told to create or clear without instruction as to a new value. Should that value simply be a required parameter on any function that could result in the ambiguity or a stored value for the instance? Or should the default value be a product of the associated type instead (zero for things that have one, nil for optionals...)

Edit: If using Data alone even while prototyping is going to to be a distraction I can write a custom storage type based around an OpaquePointer that I've done for my own stuff when I know that my swift code owns the allocation, but its not safe to go around trusting random pointers people hand you and that's the point of BufferView so I was going to wait... or I could just have it backed... by a C array... instead of Data. But I understood that new stuff for swift should be written in Swift.

Not extensively tested etc. I’ve since been working on and using a (unreleased) StaticArray library; FixedSizeArray+RangeReplaceableCollection basically.

1 Like