Safe memory optimization: explicit stack allocation, static arrays, ASCII

Doesn't this totally break abstraction?

This is one of many reasons that it’s considered bad practice to store game assets (textures, models, etc.) as individual files on the filesystem. One of the benefits of storing assets in an archive format is that the format can specify constraints for asset names—whether that’s a pre-agreed Unicode normalization scheme or even ASCII-only.

My files are stored using the Bundle feature of Swift packages, and I know all of them have ASCII names.
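
An assumption like that is cheap to verify once at load time instead of trusting it. A minimal sketch, assuming a hypothetical helper named `isASCIIName` (it is not part of Bundle or Foundation) and made-up asset names:

```swift
/// Returns true when every UTF-8 code unit of `name` is in the ASCII range.
/// `isASCIIName` is a hypothetical helper, not an existing API.
func isASCIIName(_ name: String) -> Bool {
    name.utf8.allSatisfy { $0 < 0x80 }
}

// Made-up asset names; the last one contains U+00E9.
let assetNames = ["grass.png", "stone_wall.png", "caf\u{E9}.png"]

// Collect the names that would violate the ASCII-only assumption.
let nonASCII = assetNames.filter { !isASCIIName($0) }
```

Running a check like this when the bundle is scanned turns a silent assumption into an explicit failure.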

Using [CChar] is painful. How about just fixing the problem like this:

let cString: [CChar] = "Hello, world!"

It works in C; why should Swift make it more difficult?

But still, this wouldn't be that useful for C interoperability because it's not a static array :eyes:
And what about C functions that want a C array? There's currently nothing to deal with this other than manually using tuples, which is even worse.

Ignoring the keywords and everything, just having a StaticArray<Type, Count> type with the memory layout of a C array should be enough. Not as nice to use, but at least the feature would be there. And since it's a very lightweight type it would likely be stored on the stack most of the time anyway.
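
For reference, the tuple workaround mentioned above looks roughly like this. It is only a sketch: `Int32x4` and `sumValues` are hypothetical names, and `sumValues` merely stands in for a C function that takes a `const int32_t *` and a count.

```swift
// A homogeneous tuple is how Swift imports a C `int32_t[4]` today.
typealias Int32x4 = (Int32, Int32, Int32, Int32)

/// Stand-in for a C function taking a pointer and a count (hypothetical).
func sumValues(_ pointer: UnsafePointer<Int32>, _ count: Int) -> Int32 {
    UnsafeBufferPointer(start: pointer, count: count).reduce(0, +)
}

var values: Int32x4 = (1, 2, 3, 4)

let total = withUnsafePointer(to: &values) { tuplePointer in
    // Reinterpret the tuple as its four contiguous Int32 elements.
    tuplePointer.withMemoryRebound(to: Int32.self, capacity: 4) { elements in
        sumValues(elements, 4)
    }
}
```

It works, but the element count is repeated by hand in three places, which is the pain point a StaticArray<Type, Count> type would remove.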

Just because nobody has mentioned it so far: a related discussion about fixed-size arrays went on here.
So this is apparently an area people care about, and things are (hopefully) being done to improve it. stack may be too coarse and may not fit the goal of getting more predictably performing code, but the linked thread and several others give me the impression that work is certainly being done or planned to improve Swift in this area. Movable types, etc. also come to mind.

You’re imposing stricter requirements than the file systems which Bundle models. It’s not unreasonable that your code can make assumptions that can’t be reflected in the Bundle API.

What encoding does it use? There are several reasonable options, depending on the target triple and other build parameters:

  • UTF-8
  • ASCII
  • Codepage 437
  • Codepage 1252
  • UCS-2
  • UTF-16
  • Whatever the user’s locale is at runtime
  • Whatever the user’s locale is at build time
  • Whatever the compiler thinks the encoding of the source file is

The choice is implementation-defined in C.

The problem with “just make CChar work” is that C strings come from a time where encoding was defined by the platform. That world was long gone by the time Swift was invented. We’re getting close to a world where UTF-8 is the universal encoding, but we’re not quite there yet.
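
To make the encoding point concrete, here is a small sketch of what Swift itself produces for a non-ASCII string. The sample word is made up; the byte values follow from UTF-8.

```swift
let word = "caf\u{E9}"   // "café", written with a precomposed U+00E9

// Swift strings expose UTF-8 here, so the non-ASCII scalar becomes two bytes.
let utf8Bytes = Array(word.utf8)   // 0x63 0x61 0x66 0xC3 0xA9

// utf8CString is the same bytes as CChar, plus a trailing null terminator.
let cChars = Array(word.utf8CString)

// One Character can therefore occupy more than one CChar.
let characterCount = word.count
```

A C compiler targeting Codepage 1252 would instead store that last character as the single byte 0xE9, so the "same" literal has a different length depending on the execution character set.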

I think this specific point could be addressed by expression macros. Maybe something like this?

let cString: [CChar] = #asciiArray("Hello, world!")

I just found this:

let cString: ContiguousArray<CChar> = "Hello, world!".utf8CString

(it works without manually declaring the type too)

The Swift array still does reference counting when declaring this, so I'm not sure if it would help much over just using the string.
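
For anyone trying this, a short sketch of what utf8CString contains and how to hand it to something expecting a C string pointer. It uses only the standard library; `roundTripped` is just an illustrative name.

```swift
// ContiguousArray<CChar>; the trailing null terminator is included.
let cString = "Hello, world!".utf8CString

// Borrow the storage as a pointer for the duration of the closure, the way a
// C function taking `const char *` would receive it.
let roundTripped = cString.withUnsafeBufferPointer { buffer in
    String(cString: buffer.baseAddress!)
}
```

The array is still a reference-counted heap buffer, so this is a convenience for interop rather than a way to avoid allocation.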

It seems like this issue:

I assume printing the cString would mean Swift can't put the array on the stack. Not printing it would let the compiler optimize it out completely, since it's not used. I'm not really sure when Swift would ever put an array on the stack, then, since that would apparently require not passing it to any functions?

edit: would it work with inout?
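
One way to ask for scratch storage explicitly, instead of hoping the optimizer stack-promotes an Array, is the standard library's withUnsafeTemporaryAllocation (Swift 5.6 and later). The documentation says the buffer is stack-allocated when possible, so treat this as a sketch rather than a guarantee; `sumOfSquares` is a made-up example function.

```swift
/// Sums the squares 1...n using scratch storage that never escapes the closure.
func sumOfSquares(upTo n: Int) -> Int {
    withUnsafeTemporaryAllocation(of: Int.self, capacity: n) { scratch -> Int in
        // The buffer starts uninitialized; fill every slot before reading.
        scratch.initialize(repeating: 0)
        for i in 0..<n {
            scratch[i] = (i + 1) * (i + 1)
        }
        // Int is a trivial type, so no deinitialization is needed here.
        return scratch.reduce(0, +)
    }
}

let result = sumOfSquares(upTo: 4)
```

The buffer pointer is only valid inside the closure, which is what lets the runtime avoid the heap for small sizes.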

Most of the time Rust can infer the lifetimes; I never found it that difficult.
From what I've read, it used to require lifetimes everywhere.

Can we not still allocate xyz on the stack even when the parameter's non-escapingness is unspecified?


    let xyz = [x, y, z] // let's stack allocate it
    foo(xyz) // let's pass it as is

....

func foo(_ x: [Int]) {
    // case 1. we are not escaping it - nothing to worry about

    // case 2. we are escaping it:
    globalVariable = x 
    // at this point heap promotion machinery similar to COW converts x from stack to heap.
}

It seems to me the problem of fast lookup by string in a dictionary could be addressed with a mix of String and StaticString. When loading textures from disk you create a dictionary with String keys, but at lookup you use StaticString. To accommodate this, the actual key type would need to support two representations:

var textures: [FastStringKey: AnyObject] = [:]

enum FastStringKey: Hashable, ExpressibleByStringLiteral {
	case string(String)
	case staticString(StaticString)

	func withUTF8Buffer<R>(_ body: (UnsafeBufferPointer<UInt8>) -> R) -> R {
		switch self {
		case .string(var str): return str.withUTF8(body)
		case .staticString(let str): return str.withUTF8Buffer(body)
		}
	}

	static func == (a: FastStringKey, b: FastStringKey) -> Bool {
		a.withUTF8Buffer { a in
			b.withUTF8Buffer { b in
				a.count == b.count && memcmp(a.baseAddress, b.baseAddress, a.count) == 0
			}
		}
	}
	func hash(into hasher: inout Hasher) {
		withUTF8Buffer { bytes in
			for byte in bytes {
				byte.hash(into: &hasher)
			}
		}
	}

	init(_ string: String) {
		var string = string
		string.makeContiguousUTF8()
		self = .string(string)
	}
	init(stringLiteral value: StaticString) {
		self = .staticString(value)
	}
	init(unicodeScalarLiteral value: StaticString) {
		self = .staticString(value)
	}
	init(extendedGraphemeClusterLiteral value: StaticString) {
		self = .staticString(value)
	}
}

Lookups are done like this:

let texture = textures["hello"]

and there is no heap allocation or reference counting in this operation, because all you're storing in a FastStringKey created with a literal is a StaticString. It also compares strings byte-wise (UTF-8 code units), so it's fast.

Great idea. Although if you are always using that textures dictionary with constant keys, you may as well use them as textures[.hello], where:

enum TextureKey: Hashable {
    case hello
    case world
    ...
}

which avoids all the complications above, is faster, and at the same time is safer, as the compiler checks for typos (.hallo vs .hello).

My understanding was that the goal was to avoid having to maintain such a list in code. But otherwise you're right: an enum would be perfect.

Yeah, if for example I were to add 100 textures, or remove a few of them, at some point I could miss something and the code would crash.

This way there's a different potential issue, misspelling a string, but the error will happen inline so it's easier to catch.
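
If the enum route is taken, the "I could miss something" risk can be reduced by checking the enum against the loaded assets once at startup. A sketch under assumptions: the raw values are taken to be the asset names, and `missingTextures` is a hypothetical helper.

```swift
/// Hypothetical key list; each raw value is assumed to be the asset's name.
enum TextureKey: String, CaseIterable {
    case hello
    case world
}

/// Returns the keys that have no matching loaded asset, so a mismatch is
/// reported once at startup instead of at the first failed lookup.
func missingTextures(among loadedNames: Set<String>) -> [TextureKey] {
    TextureKey.allCases.filter { !loadedNames.contains($0.rawValue) }
}

// Pretend only "hello" and an unrelated "grass" texture were found on disk.
let missing = missingTextures(among: ["hello", "grass"])
```

This still means maintaining the list by hand, but a forgotten or removed texture shows up immediately rather than deep inside the game.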

This would probably be easier with some kind of advanced macro that just generates the enum :slight_smile:

Speaking again from a practical gamedev standpoint, you’ll rarely see an explicit reference to a specific texture or resource in code. These things tend to be data-driven: technical artists use tools to define materials, which appear in the tools that environment artists use to make models, which appear in the tools that level designers use to make the game world.

I didn't realise this before, but the problem of easily declaring [CChar] can be solved with only a few lines of code by adding ExpressibleByStringLiteral conformance to an array of CChar:

extension Array: ExpressibleByStringLiteral where Element == CChar {
    @inlinable
    @inline(__always)
    public init(stringLiteral value: StaticString) {
        precondition(value.isASCII, "You can only initialize a CChar array from an ASCII string.")
        self = Array<CChar>(unsafeUninitializedCapacity: value.utf8CodeUnitCount,
                     initializingWith: { buffer, initializedCount in

            var stringPointer = value.utf8Start

            for i in 0..<value.utf8CodeUnitCount {
                buffer[i] = CChar(stringPointer.pointee)
                stringPointer += 1
            }

            initializedCount = value.utf8CodeUnitCount
        })
    }
}

extension CChar: ExpressibleByUnicodeScalarLiteral {
    @inlinable
    @inline(__always)
    public init(unicodeScalarLiteral value: Character) {
        self = CChar(value.asciiValue!)
    }
}

extension Array: ExpressibleByUnicodeScalarLiteral where Element == CChar {
    //
}

extension Array: ExpressibleByExtendedGraphemeClusterLiteral where Element == CChar {
    //
}

This is of course a terrible implementation but it works :slightly_smiling_face:

I tested it on a Markov chain, generating text into a buffer and printing out around 100,000 characters. After only replacing String with [CChar] everywhere, the run time went from 0.8s to 0.04s, so it's a significant performance improvement.

Testing the code mentioned above with the Xcode memory profiling tools still shows a lot of calls to malloc() (if I remember correctly, from something named ContiguousArrayStorage), even when all arrays ask for enough capacity ahead of time. At this point I'm really confused about what is happening here. If everything is heap allocated anyway, why is String so much slower?

I will try to optimise this further, maybe I could pass an unsafe mutable pointer instead of using inout, and explicitly tell the compiler nothing is escaping with some 'strongly discouraged' attributes, but this

while slow { debug(); optimize() }

style of programming is really annoying. I want to write good code from the start; I don't want to fight the compiler all the time because of hidden memory allocation.
I feel like I've learned far more about the Swift compiler than should be necessary just to write somewhat efficient code.

I did see that there are plans to add "ownership" to Swift. Is that going to solve the problem of having to repeatedly debug memory to understand what the code is doing?

Do you need (the power of) Array at all? Maybe you need just a fraction of what Array does?

Consider this simple fixed array implementation:

class FixedArray<T> {
    private let items: UnsafeMutablePointer<T>
    private let capacity: Int
    private(set) var count: Int

    init(capacity: Int) {
        precondition(capacity >= 0)
        self.capacity = capacity
        self.count = 0
        // allocate(capacity:) accounts for stride and alignment and never returns nil
        items = UnsafeMutablePointer<T>.allocate(capacity: capacity)
    }
    
    convenience init(repeating value: T, count: Int, capacity: Int) {
        precondition(count >= 0 && capacity >= 0 && count <= capacity)
        self.init(capacity: capacity)
        for _ in 0 ..< count {
            append(value)
        }
    }
    
    deinit {
        // Destroy the live elements before releasing the storage
        items.deinitialize(count: count)
        items.deallocate()
    }
    
    func append(_ value: T) {
        precondition(count < capacity)
        // The next slot is uninitialized, so initialize it instead of assigning to it
        (items + count).initialize(to: value)
        count += 1
    }
    
    subscript (index: Int) -> T {
        get {
            precondition(index >= 0 && index < count)
            return items[index]
        }
        set {
            precondition(index >= 0 && index < count)
            // Slots below count are initialized, so plain assignment is safe here
            items[index] = newValue
        }
    }
}

Does your test program call into Foundation? I wonder if you’re paying a penalty to temporarily convert native UTF-8 Swift Strings into UTF-16 NSStrings.

I definitely do not need a dynamic array. That's why most languages have both, like the basic C array, and std::array and std::vector in C++.

I can write my own array implementation, but it's not useful to me: it's still not a true C array.
If a language implicitly calls malloc(), it's basically disqualified for writing real-time applications. It also relies on the operating system, which is not helpful if there isn't one.

@main
struct Main {
    static func main() {

        var buffer: [CChar] = [] ; buffer.reserveCapacity(101000)

        while buffer.count < 100000 {
            let text = generateText()
            buffer.append(contentsOf: text)
            buffer.append(" ")
        }
        buffer.append(0)
        print(String(cString: buffer))
    }
}

I was thinking it's allocating memory repeatedly because I'm returning the value and then appending it to the buffer. However, passing &buffer as a pointer, not inout, and appending characters to the array as they're generated had no effect on performance.

print(String(cString: buffer)) is also not where the allocations are happening, so writing the output manually likely wouldn't help much.

I am not importing Foundation.

Ideally the only malloc() call should be for the buffer itself, it's a large array; then I should be able to "borrow" a reference to it and use it without further allocations :slight_smile:
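
A sketch of that pattern, for checking whether the buffer itself is the culprit: reserve once, append through inout, and compare the capacity before and after. `appendWord` is a hypothetical stand-in for generateText(), and the sizes are arbitrary.

```swift
/// Appends the bytes of `word` to a buffer owned by the caller.
/// `appendWord` is a hypothetical stand-in for the thread's generateText().
func appendWord(_ word: StaticString, to buffer: inout [CChar]) {
    word.withUTF8Buffer { bytes in
        for byte in bytes {
            buffer.append(CChar(truncatingIfNeeded: byte))
        }
    }
}

var buffer: [CChar] = []
buffer.reserveCapacity(64)
let capacityBefore = buffer.capacity

appendWord("hello", to: &buffer)
appendWord("world", to: &buffer)

// If the capacity is unchanged, the appends reused the reserved storage.
let reallocated = buffer.capacity != capacityBefore
```

If `reallocated` stays false in the real program, the remaining malloc() calls are coming from somewhere other than the output buffer.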

I have one idea left: how are constants stored in Swift? The tables for the chain are just a number of top-level arrays.

let table: [Table] = [ ... ]

Does Swift only load constants just before they're about to be used? That would be really bad for performance, since it would constantly have to load and unload everything.
Would using a static struct help? Are those always present?
Or do I have to create a struct instance and pass a reference to it with every single function call?
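
As far as I know, global and static stored properties are initialized lazily on first access, exactly once, and then stay alive for the rest of the process, so nothing is loaded and unloaded repeatedly. A small sketch that checks this by comparing storage addresses across two reads; `Tables` and `storageAddress` are hypothetical names.

```swift
struct Tables {
    // Hypothetical table; the initializer runs once, on first access.
    static let squares: [Int] = (0..<8).map { $0 * $0 }
}

/// Address of the array's element storage at the time of the call.
/// Used only for comparison, never dereferenced.
func storageAddress(of array: [Int]) -> UnsafeRawPointer? {
    array.withUnsafeBufferPointer { UnsafeRawPointer($0.baseAddress) }
}

let firstRead = storageAddress(of: Tables.squares)
let secondRead = storageAddress(of: Tables.squares)

// Both reads see the same storage: the table was built once and kept.
let sameStorage = firstRead == secondRead
```

Reading the table does not copy the elements either; an array read shares the same storage until something mutates it.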

The default implementation of operator new calls malloc.

@Andrew_Trick, is it possible in pure Swift to vend subranges of a large UnsafeMutableRawBufferPointer as heterogeneously-bound buffers?
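
To illustrate what is being asked, here is a sketch of one possible approach, not an authoritative answer: allocate one raw buffer, rebase slices of it, and bind each slice to a different type. The sizes and the `carveArena` name are made up for the example.

```swift
/// Carves one raw allocation into a Float region and an Int32 region.
func carveArena() -> (floatSum: Float, intSum: Int32) {
    let arena = UnsafeMutableRawBufferPointer.allocate(byteCount: 64, alignment: 16)
    defer { arena.deallocate() }

    // First 32 bytes as Float, next 32 bytes as Int32. Both offsets are
    // multiples of each type's alignment, which bindMemory requires.
    let floats = UnsafeMutableRawBufferPointer(rebasing: arena[0..<32])
        .bindMemory(to: Float.self)
    let ints = UnsafeMutableRawBufferPointer(rebasing: arena[32..<64])
        .bindMemory(to: Int32.self)

    // Each region holds 8 elements (32 bytes / 4-byte stride).
    floats.initialize(repeating: 1.5)
    ints.initialize(repeating: 7)

    return (floats.reduce(0, +), ints.reduce(0, +))
}

let sums = carveArena()
```

The caller is responsible for keeping the regions disjoint and for not using the typed buffers after the arena is deallocated.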

I didn't say malloc itself is bad; I meant hidden, impossible to control use of malloc is generally considered bad for real-time code.

I don’t disagree. I’m just saying that C++ defaults to hidden allocations that (on Darwin, at least) take the malloc() lock. You do have control over that, but it’s rather cumbersome to add the allocator argument to every std::vector, and the language doesn’t do anything to prevent you from forgetting.