I am playing with a tiny hand-made neural network and keep wondering whether I could use value types instead of classes in it. At the moment it is a graph of reference-type nodes, each a wrapper around a Double with a few extra bits. The nodes are mutable, but the graph structure itself, once created, is constant.
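For context, the current node is roughly this shape (the field names below are illustrative, the real class has a few more bits):

final class Value {
    var data: Double          // the wrapped Double
    var grad: Double = 0      // one of the "few extra bits" of mutable state
    let inputs: [Value]       // graph structure, fixed once created

    init(_ data: Double, inputs: [Value] = []) {
        self.data = data
        self.inputs = inputs
    }
}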
I can mechanically change class Value to:
struct Value {
    var id: Int
    var data: Double {
        get { allValues[id].data }
        nonmutating set { allValues[id].data = newValue }
    }
    ...
}

struct ValueContent {
    var data: Double
    ... a few other fields
}

var allValues: [ValueContent] = [] // values are allocated from here
But this feels like I've just reinvented reference types.
Is there a good example of a neural network (or just a graph implementation) made out of value types?
Or are value types just not a good fit for this task and I should keep using classes?
Just in case, have you read about adjacency arrays (or adjacency matrices)? I haven't used one myself, but it's often mentioned as a way to implement a graph using value types. Tera's path approach is good for his specific problem domain, but it isn't as general as an adjacency array. I use UUID in my app (it's a financial app where pieces of data reference each other; the relations form an ad-hoc graph). It may sound awkward to use value types in this case, but I worked out a framework specific to my app and it has worked well.
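The rough idea of an adjacency array is something like this (a minimal sketch for illustration; as said, I haven't used this approach myself):

struct Graph {
    struct Node {
        var data: Double
        var edgeStart: Int    // first index into `neighbors`
        var edgeCount: Int    // number of outgoing edges
    }
    var nodes: [Node] = []
    var neighbors: [Int] = [] // flat array of target-node indices

    // The neighbors of a node are a contiguous slice of `neighbors`.
    func neighborIndices(of node: Int) -> ArraySlice<Int> {
        let n = nodes[node]
        return neighbors[n.edgeStart ..< n.edgeStart + n.edgeCount]
    }
}

Everything is a value type; a "node reference" is just an Int index into `nodes`.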
One common issue in value-type programming is how to access (and modify) values. What I learned from my experience is that it usually requires a framework suited to your problem domain. This might not be obvious to people coming from OOP, because it's a non-issue with reference types.
Take SwiftUI as an example: Apple's engineers introduced a lot of SwiftUI-specific mechanisms to access and modify values, like @State and @Environment. We need to work out our own mechanisms to address similar issues when using value types. Our approaches might not be as elegant as Apple's, but the point is that this is a common issue one needs to address in value-type programming.
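As a toy illustration of what such a hand-rolled mechanism might look like (purely hypothetical, and much simpler than what a real app needs): a central store keyed by UUID, with values read and written through it rather than through object references.

import Foundation

struct Store<T> {
    private var storage: [UUID: T] = [:]

    // Inserting returns the id that acts as a value-type "reference".
    mutating func insert(_ value: T) -> UUID {
        let id = UUID()
        storage[id] = value
        return id
    }

    subscript(id: UUID) -> T? {
        get { storage[id] }
        set { storage[id] = newValue }
    }
}

var store = Store<Double>()
let id = store.insert(1.0)
store[id] = 2.0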
I tried a few options with classes and values and got these timing results:
Ref: 92ms
Val + Array: 617ms
Val + malloc: 96ms
In this test a graph of about 2M nodes is traversed a few times, during which some nodes are changed based on other nodes. The node is a POD struct of 80 bytes (or a class holding that struct).
"Ref" - a version that uses classes
"Val + Array" - a version that uses a minimal struct with integer ID and the data itself resides in a side "allValues" array.
"Val + Array" - ditto but this time the data resides in a malloced buffer
The overhead of Array subscript operations is quite high (the measured section didn't contain array insertions or deletions, only getting and setting of array elements). FWIW, there were 836 million getter and 145 million setter calls during this test, in addition to some 200 million math operations.
Interestingly, the class version is still a bit faster than the Val + malloc version, despite the anticipated ARC overhead.
As a side note: a deterministic random number generator helped tremendously during testing / refactoring.
The algorithm depends on random numbers, and to get consistent results across app runs (e.g. to check that no bugs were introduced during refactoring or in a different version of the algorithm), having a deterministic / seedable random number generator is super important. For that I'm using drand48.
The app could have benefited from a "shuffle" call, but that's based on Swift's random number generator, which doesn't give a deterministic sequence, so I had to reimplement shuffle myself.
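Something along these lines (a sketch of a drand48-driven Fisher–Yates shuffle, not necessarily my exact code):

import Foundation

// Deterministic shuffle: srand48(seed) makes the sequence reproducible.
func deterministicShuffle<T>(_ elements: inout [T]) {
    guard elements.count > 1 else { return }
    for i in stride(from: elements.count - 1, to: 0, by: -1) {
        let j = Int(drand48() * Double(i + 1))  // in 0...i, since drand48() < 1
        elements.swapAt(i, j)
    }
}

srand48(42)              // fixed seed -> identical order on every run
var xs = Array(0 ..< 10)
deterministicShuffle(&xs)
print(xs)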
Your suggestion about run-time exclusivity checks is spot on: if I disable them (the -enforce-exclusivity=unchecked compiler flag), Array subscripts are significantly faster (still slower than unsafe-pointer subscripts, but that's probably to be expected, no?).
This is a much simpler version that models what the app is doing. It shows about the same speed difference between the Array and the malloc version (a somewhat bigger difference, because it does so little per array / buffer access compared to the whole app):
// Release build, exclusive access to memory checks: on
// 3263235 µs - Array version
// 417975 µs - malloc version
// 7.8 times slower
// Release build, exclusive access to memory checks: off
// 972422 µs - Array version
// 401312 µs - malloc version
// 2.4 times slower
import Foundation

let n = 1 << 21 // ~2 million elements
let mask = (1 << 21) - 1

var array = [Double](repeating: 0, count: n)
let buffer = malloc(n * MemoryLayout<Double>.stride).assumingMemoryBound(to: Double.self)

func initBuffers() {
    for i in 0 ..< n {
        array[i] = Double(i)
        buffer[i] = Double(i)
    }
}

// Value2 mimics the "tiny struct + external storage" approach: the struct holds
// no data itself and forwards accesses either to the global Array or to the
// malloc'ed buffer.
struct Value2 {
    subscript(array index: Int) -> Double {
        get { array[index] }
        nonmutating set { array[index] = newValue }
    }
    subscript(buffer index: Int) -> Double {
        get { buffer[index] }
        nonmutating set { buffer[index] = newValue }
    }
}

class Test {
    init() {
        initBuffers()
    }

    func test() {
        let mask = n - 1
        var sum = 0.0
        let v = Value2()
        var k = 0

        // Pass 1: read-modify-write through the Array-backed subscript.
        let start1 = Date()
        for _ in 0 ..< 100 {
            for i in 0 ..< n {
                let index = (i * k) & mask
                let val = v[array: index]
                v[array: index] = val * 1.0001
                k += 1
                sum += val
            }
        }
        let elapsed1 = Date().timeIntervalSince(start1)

        // Pass 2: the same work through the malloc-backed subscript.
        let start2 = Date()
        for _ in 0 ..< 100 {
            for i in 0 ..< n {
                let index = (i * k) & mask
                let val = v[buffer: index]
                v[buffer: index] = val * 1.0001
                k += 1
                sum += val
            }
        }
        let elapsed2 = Date().timeIntervalSince(start2)

        print(sum) // keep `sum` alive so the loops aren't optimized away
        print("\(Int(elapsed1 * 1_000_000)) µs")
        print("\(Int(elapsed2 * 1_000_000)) µs")
        print("\(elapsed1 / elapsed2) times slower")
        print()
    }
}

let test = Test()
test.test()
free(buffer)