Functions that accept arrays or array slices

Yes, that why I did it.

I have managed to get working code based on your input.

// Returns a tuple containing the high and low components to UInt addition
 @inline(__always)
func addingFullWidth(_ x: UInt,_ y: UInt) -> (high: UInt, low: UInt) {
  let sum = x.addingReportingOverflow(y)
  return (sum.overflow ? 1 : 0, sum.partialValue)
}

extension RandomAccessCollection
where Self: MutableCollection,
      Self: RangeReplaceableCollection,
      Element == UInt,
      Index == Int {
    
    // in-place add rhs to self
    mutating func Add<R: RandomAccessCollection> (_ rhs: R,_ uselastcarry:Bool = true)
    where R.Element == UInt, R.Index == Int {
        if self.count == 0 || (self.count == 1 && self.startIndex == 0) {
            self = Self(Array(rhs))
            return
        }
        if rhs.count == 0 || (rhs.count == 1 && rhs.startIndex == 0) {
            self = Self(Array(self))
            return
        }
        let (sc, rc) = (self.count, rhs.count)
        let (si, ri) = (self.startIndex, rhs.startIndex)
        let minc = Swift.min(sc, rc)
        var rIndex = 0
        var carry:UInt = 0
        if sc >= rc {
            while rIndex < minc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], rhs[rIndex + ri], carry)
                rIndex += 1
            }
            if sc == rc {
                if carry > 0 && uselastcarry { self.append(1)}
                return
            }
            while carry > 0 && rIndex < rc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], carry)
                rIndex += 1
            }
            if carry > 0 && uselastcarry { self.append(1)}
            self = Self(Array(self))
            return
        } else {
            var result = Array(rhs)
            while rIndex < minc {
                (carry, result[rIndex]) = addingFullWidth(result[rIndex], self[rIndex + si], carry)
                rIndex += 1
            }
            while carry > 0 && rIndex < rc {
                (carry, result[rIndex]) = addingFullWidth(result[rIndex], carry)
                rIndex += 1
            }
            if carry > 0 && uselastcarry { result.append(1)}
            self = Self(result)
            return
        }
    }
}

var u1:[UInt] = [0]
var u2 = u1[1...]
u2.Add(u1)
print(u2)

For simplicity I chose to output self as an array and made the conversions necessary. I could have made the necessary conversions to make self's type remain unchanged but I would have to go this route I think:

if self is Array<UInt> {
    //  do what's necessary to make self an array
} else {
   //  do what's necessary to make self  a slice
}

I am going to have to ponder the output some. To this point I was trying to solve having two different inputs without thinking the output (sigh, two more options). This just gets worse and worse. No wonder I see developers just avoiding dealing with slices and just always converting to array.

The only thing I have not been able to solve is how to get a pointer to self or rhs in the add function. For sure if self is a slice the compiler simply refuses to cooperate no matter what I tried. The only sure way I know is to pass the pointer as an argument as in previous posts. However that complicates self even more.

Anyhow, I hope this post helps others that find themselves trying to merge array and array.

Next up is to do some benchmarking.

what about a += CollectionOfOne.init(b)?

You could eliminate the requirement that Index == Int if you use index(_, offsetBy:).

I have some initial benchmarking for mutating vs non-mutating versions. Code is below:


// Returns a tuple containing the high and low components to UInt addition
 @inline(__always)
func addingFullWidth(_ x: UInt,_ y: UInt) -> (high: UInt, low: UInt) {
  let sum = x.addingReportingOverflow(y)
  return (sum.overflow ? 1 : 0, sum.partialValue)
}

extension RandomAccessCollection
where Self: MutableCollection,
      Self: RangeReplaceableCollection,
      Element == UInt,
      Index == Int {
    
    // in-place add rhs to self
    mutating func Add<R: RandomAccessCollection> (_ rhs: R,_ uselastcarry:Bool = true)
    where R.Element == UInt, R.Index == Int {
        if self.count == 0 || (self.count == 1 && self.startIndex == 0) {
            self = Self(Array(rhs))
            return
        }
        if rhs.count == 0 || (rhs.count == 1 && rhs.startIndex == 0) {
            self = Self(Array(self))
            return
        }
        let (sc, rc) = (self.count, rhs.count)
        let (si, ri) = (self.startIndex, rhs.startIndex)
        let minc = Swift.min(sc, rc)
        var rIndex = 0
        var carry:UInt = 0
        if sc >= rc {
            while rIndex < minc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], rhs[rIndex + ri], carry)
                rIndex += 1
            }
            if sc == rc {
                if carry > 0 && uselastcarry { self.append(1)}
                return
            }
            while carry > 0 && rIndex < rc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], carry)
                rIndex += 1
            }
            if carry > 0 && uselastcarry { self.append(1)}
            self = Self(Array(self))
            return
        } else {
            var result = Array(rhs)
            while rIndex < minc {
                (carry, result[rIndex]) = addingFullWidth(result[rIndex], self[rIndex + si], carry)
                rIndex += 1
            }
            while carry > 0 && rIndex < rc {
                (carry, result[rIndex]) = addingFullWidth(result[rIndex], carry)
                rIndex += 1
            }
            if carry > 0 && uselastcarry { result.append(1)}
            self = Self(result)
            return
        }
    }
}

// add two [UInt] arrays
func array64Add<C1: RandomAccessCollection, C2: RandomAccessCollection>(_ x: C1,_ xPtr: UnsafeMutablePointer<UInt>,_ y: C2,_ yPtr: UnsafeMutablePointer<UInt>,_ uselastcarry:Bool = true) -> [UInt]
    where C1.Element == UInt, C1.Index == Int, C2.Element == UInt, C2.Index == Int {
    if x.count == 0 || (x.count == 1 && x.startIndex == 0) { return Array(y)}
    if y.count == 0 || (y.count == 1 && y.startIndex == 0) { return Array(x)}
    let (xc, yc) = (x.count, y.count)
    let minc = Swift.min(xc, yc)
    var rIndex = 0
    var result:[UInt]
    var carry:UInt = 0
    if xc >= yc {
        result = Array(x)
        while rIndex < minc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], yPtr[rIndex], carry)
            rIndex += 1
        }
        if xc == yc {
            if carry > 0 && uselastcarry { result.append(1)}
            return result
        }
        while carry > 0 && rIndex < xc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], carry)
            rIndex += 1
        }
        if carry > 0 && uselastcarry { result.append(1)}
    } else {
        result = Array(y)
        while rIndex < minc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], xPtr[rIndex], carry)
            rIndex += 1
        }
        while carry > 0 && rIndex < yc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], carry)
            rIndex += 1
        }
        if carry > 0 && uselastcarry { result.append(1)}
    }
    return result
}


public func benchmarkTime(_ iter: Int, f:()->()) -> Double {
    f() // warmup
    let startTime = CFAbsoluteTimeGetCurrent()
    for _ in 0..<iter {f()}
    let elapsedTime = CFAbsoluteTimeGetCurrent() - startTime
    return elapsedTime
}

public func benchmark(title:String,_ iter: Int, f:()->()) {
    let averageTime = benchmarkTime(iter, f:f) / Double(iter)
    if averageTime >= 1.0 {
        print("\(title): \(String(averageTime)) seconds")
    } else if averageTime >= 0.001 {
        print("\(title): \(String(averageTime * 1000.0)) milliseconds")
    } else {
        print("\(title): \(String(averageTime * 1000000.0)) microseconds")
    }
}

for wordLength in [10, 1000, 100000, 1000000] {
    print("wordLength =", wordLength)
    var u1:[UInt] = Array(repeating: UInt.max, count: wordLength)
    var u2 = u1[...]
    benchmark(title: "nun-mutating add", 100) {
        let u3 = array64Add(u1, &u1[u1.startIndex], u2, &u2[u2.startIndex])
    }
    
    benchmark(title: "nun-mutating add reversed", 100) {
        let u3 = array64Add(u2, &u2[u2.startIndex], u1, &u1[u1.startIndex])
    }
    
    u1 = Array(repeating: UInt.max, count: wordLength)
    u2 = u1[...]
    benchmark(title: "mutating add", 100) {
        u1.Add(u2)
    }
    
    u1 = Array(repeating: UInt.max, count: wordLength)
    u2 = u1[...]
    benchmark(title: "mutating add reversed", 100) {
        u2.Add(u1)
    }
    print(" ")
}

Benchmarks performed on 2013 MacBook pro with code in release mode with safety checks disabled and optimized for speed. Results below:

wordLength = 10
nun-mutating add: 0.9202957153320312 microseconds
nun-mutating add reversed: 1.0395050048828125 microseconds
mutating add: 4.9495697021484375 microseconds
mutating add reversed: 4.409551620483398 microseconds
 
wordLength = 1000
nun-mutating add: 6.110668182373047 microseconds
nun-mutating add reversed: 5.890130996704102 microseconds
mutating add: 444.4706439971924 microseconds
mutating add reversed: 441.7300224304199 microseconds
 
wordLength = 100000
nun-mutating add: 557.7099323272705 microseconds
nun-mutating add reversed: 562.0193481445312 microseconds
mutating add: 40.66767930984497 milliseconds
mutating add reversed: 39.13396954536438 milliseconds
 
wordLength = 1000000
nun-mutating add: 7.4018096923828125 milliseconds
nun-mutating add reversed: 7.455489635467529 milliseconds
mutating add: 411.7432498931885 milliseconds
mutating add reversed: 409.24807071685

Quite a dramatic difference. I thought it would be much closer.

Sorry to sound like a stuck record but this still isn't in-place mutation. You've just created a version of the out-of-place function that, once it's done, replaces self with the new Array.

1 Like

No you're right, half of the function did not. I made some changes which should qualify as fully mutating.

// Returns a tuple containing the high and low components to UInt addition
 @inline(__always)
func addingFullWidth(_ x: UInt,_ y: UInt) -> (high: UInt, low: UInt) {
  let sum = x.addingReportingOverflow(y)
  return (sum.overflow ? 1 : 0, sum.partialValue)
}

extension RandomAccessCollection
where Self: MutableCollection,
      Self: RangeReplaceableCollection,
      Element == UInt,
      Index == Int {
    
    // in-place add rhs to self
    mutating func Add<R: RandomAccessCollection> (_ rhs: R,_ uselastcarry:Bool = true)
    where R.Element == UInt, R.Index == Int {
        if self.count == 0 || (self.count == 1 && self.startIndex == 0) {
            self = Self(Array(rhs))
            return
        }
        if rhs.count == 0 || (rhs.count == 1 && rhs.startIndex == 0) {
            self = Self(Array(self))
            return
        }
        let (sc, rc) = (self.count, rhs.count)
        let (si, ri) = (self.startIndex, rhs.startIndex)
        let minc = Swift.min(sc, rc)
        var rIndex = 0
        var carry:UInt = 0
        if sc >= rc {
            while rIndex < minc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], rhs[rIndex + ri], carry)
                rIndex += 1
            }
            if sc == rc {
                if carry > 0 && uselastcarry { self.append(1)}
                return
            }
            while carry > 0 && rIndex < rc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], carry)
                rIndex += 1
            }
            if carry > 0 && uselastcarry { self.append(1)}
            self = Self(Array(self))
            return
        } else {
            while rIndex < minc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], rhs[rIndex + ri], carry)
                rIndex += 1
            }
            self = self + rhs[minc...]
            while carry > 0 && rIndex < rc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], carry)
                rIndex += 1
            }
            if carry > 0 && uselastcarry { self.append(1)}
            self = Self(Array(self))
            return
        }
    }
}

// add two [UInt] arrays
func array64Add<C1: RandomAccessCollection, C2: RandomAccessCollection>(_ x: C1,_ xPtr: UnsafeMutablePointer<UInt>,_ y: C2,_ yPtr: UnsafeMutablePointer<UInt>,_ uselastcarry:Bool = true) -> [UInt]
    where C1.Element == UInt, C1.Index == Int, C2.Element == UInt, C2.Index == Int {
    if x.count == 0 || (x.count == 1 && x.startIndex == 0) { return Array(y)}
    if y.count == 0 || (y.count == 1 && y.startIndex == 0) { return Array(x)}
    let (xc, yc) = (x.count, y.count)
    let minc = Swift.min(xc, yc)
    var rIndex = 0
    var result:[UInt]
    var carry:UInt = 0
    if xc >= yc {
        result = Array(x)
        while rIndex < minc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], yPtr[rIndex], carry)
            rIndex += 1
        }
        if xc == yc {
            if carry > 0 && uselastcarry { result.append(1)}
            return result
        }
        while carry > 0 && rIndex < xc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], carry)
            rIndex += 1
        }
        if carry > 0 && uselastcarry { result.append(1)}
    } else {
        result = Array(y)
        while rIndex < minc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], xPtr[rIndex], carry)
            rIndex += 1
        }
        while carry > 0 && rIndex < yc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], carry)
            rIndex += 1
        }
        if carry > 0 && uselastcarry { result.append(1)}
    }
    return result
}

public func benchmarkTime(_ iter: Int, f:()->()) -> Double {
    f() // warmup
    let startTime = CFAbsoluteTimeGetCurrent()
    for _ in 0..<iter {f()}
    let elapsedTime = CFAbsoluteTimeGetCurrent() - startTime
    return elapsedTime
}

public func benchmark(title:String,_ iter: Int, f:()->()) {
    let averageTime = benchmarkTime(iter, f:f) / Double(iter)
    if averageTime >= 1.0 {
        print("\(title): \(String(averageTime)) seconds")
    } else if averageTime >= 0.001 {
        print("\(title): \(String(averageTime * 1000.0)) milliseconds")
    } else {
        print("\(title): \(String(averageTime * 1000000.0)) microseconds")
    }
}

for wordLength in [4, 10, 1000, 100000, 1000000] {
    print("wordLength =", wordLength)
    var u1:[UInt] = Array(repeating: UInt.max, count: wordLength)
    var u2 = u1[...]
    benchmark(title: "nun-mutating add", 100) {
        let u3 = array64Add(u1, &u1[u1.startIndex], u2, &u2[u2.startIndex])
    }
    
    benchmark(title: "nun-mutating add reversed", 100) {
        let u3 = array64Add(u2, &u2[u2.startIndex], u1, &u1[u1.startIndex])
    }
    
    u1 = Array(repeating: UInt.max, count: wordLength)
    u2 = u1[...]
    benchmark(title: "mutating add", 100) {
        u1.Add(u2)
    }
    
    u1 = Array(repeating: UInt.max, count: wordLength)
    u2 = u1[...]
    benchmark(title: "mutating add reversed", 100) {
        u2.Add(u1)
    }
    print(" ")
}


New benchmarks below with similar results.

wordLength = 4
nun-mutating add: 0.9500980377197266 microseconds
nun-mutating add reversed: 1.0097026824951172 microseconds
mutating add: 2.1898746490478516 microseconds
mutating add reversed: 2.25067138671875 microseconds
 
wordLength = 10
nun-mutating add: 1.119375228881836 microseconds
nun-mutating add reversed: 1.329183578491211 microseconds
mutating add: 4.819631576538086 microseconds
mutating add reversed: 4.649162292480469 microseconds
 
wordLength = 1000
nun-mutating add: 5.730390548706055 microseconds
nun-mutating add reversed: 5.689859390258789 microseconds
mutating add: 458.0700397491455 microseconds
mutating add reversed: 473.87003898620605 microseconds
 
wordLength = 100000
nun-mutating add: 583.1098556518555 microseconds
nun-mutating add reversed: 607.140064239502 microseconds
mutating add: 42.97057032585144 milliseconds
mutating add reversed: 43.00993084907532 milliseconds
 
wordLength = 1000000
nun-mutating add: 7.75020956993103 milliseconds
nun-mutating add reversed: 7.6921093463897705 milliseconds
mutating add: 435.51946997642517 milliseconds
mutating add reversed: 428.4147894382477 milliseconds

This still isn't in-place mutation. The goal is not to write a mutating function – the reason for the mutating version is to avoid allocating new storage, and instead re-use the existing storage. But in your case, you are still allocating a new array and assigning it to self.

That said, there are probably other things at play here as well as that. But it's difficult to experiment with the code because it isn't stand-alone. It looks like it's relying on a 3-argument version of addingFullWidth that takes carry. I'm guessing this is somewhere else in your project, but not posted here?

1 Like

Sorry, Ben, you are right I did miss this function.

// Returns the high and low parts of two seqeuential potentially overflowing additions
 @inline(__always)
func addingFullWidth(_ x: UInt, _ y: UInt, _ z: UInt) -> (high: UInt, low: UInt) {
  let xy = x.addingReportingOverflow(y)
  let xyz = xy.partialValue.addingReportingOverflow(z)
    let high:UInt = (xy.overflow ? 1 : 0) + (xyz.overflow ? 1 : 0)
  return (high, xyz.partialValue)
}

I am sorry but I'm not sure I understand

self = Self(Array(self))

is to ensure that self is an array versus array slice on output

 self = self + rhs[minc...]

is needed if self is the smaller of the two operands.

self = Self(Array(rhs))

is the case of self = 0 added to rhs where the result will be rhs. Are you suggesting replacing this with removing the one element of self and appending rhs instead?? Or would it be a swap??

Stop trying to do that.

1 Like

OK, I see the disconnect, I was trying to have functions that have one or more arguments that are either array or array slice. In this case, I made the mutating and non-mutating result be an array. But I see for performance comparison sake having the mutating function maintain the type of self (either array or array slice) and as Ben stated to attempt to reuse objects and storage. New code attempt below:

// Returns a tuple containing the high and low components to UInt addition
 @inline(__always)
func addingFullWidth(_ x: UInt,_ y: UInt) -> (high: UInt, low: UInt) {
  let sum = x.addingReportingOverflow(y)
  return (sum.overflow ? 1 : 0, sum.partialValue)
}

// Returns the high and low parts of two seqeuential potentially overflowing additions
 @inline(__always)
func addingFullWidth(_ x: UInt, _ y: UInt, _ z: UInt) -> (high: UInt, low: UInt) {
  let xy = x.addingReportingOverflow(y)
  let xyz = xy.partialValue.addingReportingOverflow(z)
    let high:UInt = (xy.overflow ? 1 : 0) + (xyz.overflow ? 1 : 0)
  return (high, xyz.partialValue)
}

extension RandomAccessCollection
where Self: MutableCollection,
      Self: RangeReplaceableCollection,
      Element == UInt,
      Index == Int {
    
    // in-place add rhs to self
    mutating func Add<R: RandomAccessCollection> (_ rhs: R,_ uselastcarry:Bool = true)
    where R.Element == UInt, R.Index == Int {
        // check for one of the inputs == 0
        if self.count == 0 || (self.count == 1 && self.startIndex == 0) {
            if rhs is Array<UInt> {
                self = Self(rhs[...])
            } else {
                self = Self(rhs)
            }
        }
        if rhs.count == 0 || (rhs.count == 1 && rhs.startIndex == 0) {
            return
        }
        let (sc, rc) = (self.count, rhs.count)
        let (si, ri) = (self.startIndex, rhs.startIndex)
        let minc = Swift.min(sc, rc)
        var rIndex = 0
        var carry:UInt = 0
        if sc >= rc {
            while rIndex < minc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], rhs[rIndex + ri], carry)
                rIndex += 1
            }
            if sc == rc {
                if carry > 0 && uselastcarry { self.append(1)}
                return
            }
            while carry > 0 && rIndex < rc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], carry)
                rIndex += 1
            }
            if carry > 0 && uselastcarry { self.append(1)}
            return
        } else {
            while rIndex < minc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], rhs[rIndex + ri], carry)
                rIndex += 1
            }
            self = self + rhs[minc...]
            while carry > 0 && rIndex < rc {
                (carry, self[rIndex + si]) = addingFullWidth(self[rIndex + si], carry)
                rIndex += 1
            }
            if carry > 0 && uselastcarry { self.append(1)}
            return
        }
    }
}

// add two [UInt] arrays
func array64Add<C1: RandomAccessCollection, C2: RandomAccessCollection>(_ x: C1,_ xPtr: UnsafeMutablePointer<UInt>,_ y: C2,_ yPtr: UnsafeMutablePointer<UInt>,_ uselastcarry:Bool = true) -> [UInt]
    where C1.Element == UInt, C1.Index == Int, C2.Element == UInt, C2.Index == Int {
    if x.count == 0 || (x.count == 1 && x.startIndex == 0) { return Array(y)}
    if y.count == 0 || (y.count == 1 && y.startIndex == 0) { return Array(x)}
    let (xc, yc) = (x.count, y.count)
    let minc = Swift.min(xc, yc)
    var rIndex = 0
    var result:[UInt]
    var carry:UInt = 0
    if xc >= yc {
        result = Array(x)
        while rIndex < minc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], yPtr[rIndex], carry)
            rIndex += 1
        }
        if xc == yc {
            if carry > 0 && uselastcarry { result.append(1)}
            return result
        }
        while carry > 0 && rIndex < xc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], carry)
            rIndex += 1
        }
        if carry > 0 && uselastcarry { result.append(1)}
    } else {
        result = Array(y)
        while rIndex < minc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], xPtr[rIndex], carry)
            rIndex += 1
        }
        while carry > 0 && rIndex < yc {
            (carry, result[rIndex]) = addingFullWidth(result[rIndex], carry)
            rIndex += 1
        }
        if carry > 0 && uselastcarry { result.append(1)}
    }
    return result
}

public func benchmarkTime(_ iter: Int, f:()->()) -> Double {
    f() // warmup
    let startTime = CFAbsoluteTimeGetCurrent()
    for _ in 0..<iter {f()}
    let elapsedTime = CFAbsoluteTimeGetCurrent() - startTime
    return elapsedTime
}

public func benchmark(title:String,_ iter: Int, f:()->()) {
    let averageTime = benchmarkTime(iter, f:f) / Double(iter)
    if averageTime >= 1.0 {
        print("\(title): \(String(averageTime)) seconds")
    } else if averageTime >= 0.001 {
        print("\(title): \(String(averageTime * 1000.0)) milliseconds")
    } else {
        print("\(title): \(String(averageTime * 1000000.0)) microseconds")
    }
}


for wordLength in [4, 10, 1000, 100000, 1000000] {
    print("wordLength =", wordLength)
    var u1:[UInt] = Array(repeating: UInt.max, count: wordLength)
    var u2 = u1[...]
    benchmark(title: "nun-mutating add", 100) {
        let u3 = array64Add(u1, &u1[u1.startIndex], u2, &u2[u2.startIndex])
    }
    
    benchmark(title: "nun-mutating add reversed", 100) {
        let u3 = array64Add(u2, &u2[u2.startIndex], u1, &u1[u1.startIndex])
    }
    
    u1 = Array(repeating: UInt.max, count: wordLength)
    u2 = u1[...]
    benchmark(title: "mutating add", 100) {
        u1.Add(u2)
    }
    
    u1 = Array(repeating: UInt.max, count: wordLength)
    u2 = u1[...]
    benchmark(title: "mutating add reversed", 100) {
        u2.Add(u1)
    }
    print(" ")
}

Re-run of benchmarks below:

wordLength = 4
nun-mutating add: 1.0204315185546875 microseconds
nun-mutating add reversed: 0.9500980377197266 microseconds
mutating add: 1.9192695617675783 microseconds
mutating add reversed: 1.9109249114990237 microseconds
 
wordLength = 10
nun-mutating add: 0.9691715240478517 microseconds
nun-mutating add reversed: 1.02996826171875 microseconds
mutating add: 4.3201446533203125 microseconds
mutating add reversed: 4.290342330932617 microseconds
 
wordLength = 1000
nun-mutating add: 6.030797958374023 microseconds
nun-mutating add reversed: 5.639791488647461 microseconds
mutating add: 451.68042182922363 microseconds
mutating add reversed: 405.71093559265137 microseconds
 
wordLength = 100000
nun-mutating add: 572.3798274993896 microseconds
nun-mutating add reversed: 582.6199054718018 microseconds
mutating add: 42.77442932128906 milliseconds
mutating add reversed: 40.085779428482056 milliseconds
 
wordLength = 1000000
nun-mutating add: 7.653210163116455 milliseconds
nun-mutating add reversed: 7.655730247497559 milliseconds
mutating add: 418.16879987716675 milliseconds
mutating add reversed: 424.2865300178528 milliseconds
 

My apologies for not understanding you, I guess I was being a bit slow.

I haven’t studied your implementation nearly as closely as Ben has, but I’m pretty sure this still isn’t in-place. You’re constructing a new instance of Self (the type) and then replacing self (the variable) with it.

2 Likes

if self == 0 then self + rhs is rhs. The compiler will not allow self = rhs because of the array, slice and generics. In-place replacement is probably not possible since rhs is almost certainly larger than self. Still, I did not have it quite right, I made this modification to ensure that self is returned with the same type. If you have an approach that is conformant to mutating functions let me know.

        // check for one of the inputs == 0
        if self.count == 0 || (self.count == 1 && self.startIndex == 0) {
            if self is Array<UInt> {
                if rhs is Array<UInt> {
                    self = Self(rhs)
                } else {
                    self = Self(Array(rhs))
                }
            } else { // self is array<slice>
                if rhs is Array<UInt> {
                    self = Self(rhs[...])
                } else {
                    self = Self(rhs)
                }
            }

Mathematically, yes. Implementation-wise, what this does is allocate a new instance of Self containing a copy of the contents of rhs. It then re-binds self to this new instance.

You’re not even programming generically, though, because you specifically check for is Array<UInt>. The goal of using generics here is to constrain Self to a common supertype of Array and ArraySlice (and any other collection type) that makes available all the methods you need on both self and rhs.

This is why @Ben_Cohen suggested growing self (or some other scratch space) before entering your inner algorithm:

1 Like

As it happens, if the rhs is empty it will just return the lhs unchanged. Here's the implementation.

You also need the optimizer to do its thing even then. In the general case, it's really not good to rely on this sort of thing.

Here's a sketch of what I'd suggest doing. First, create this function:

extension MutableCollection where Element: BinaryInteger {
    // Function that adds each element of another array to self,
    // carrying overflow to the next place.
    // Does not progress beyond the point of the shorter
    // of the two arrays, and returns an index to that point
    // for both arrays, plus the carry remaining to be added.
    mutating func add<RHS>(_ rhs: RHS) -> (Index,RHS.Index,Element)
    where RHS: Collection, RHS.Element == Element {
        var i = startIndex, j = rhs.startIndex
        var carry: Element = 0
        while i < self.endIndex, j < rhs.endIndex {
            // TODO: carrying version
            self[i] += rhs[j]

            self.formIndex(after: &i)
            rhs.formIndex(after: &j)
        }
        return (i,j,carry)
    }
}

This function has a very simple loop and index advancement logic, that should help the optimizer eliminate any bounds checks when indexing types like Array.

The function only does in-place updates to the left-hand side, no allocations will occur (except when triggering COW). It also does not require self be extended, instead leaving that to the caller.

This has the very important property that you can then use it on UnsafeMutableBufferPointer. That is, instead of this:

var a: [UInt] = [1,2,3]
let b: [UInt] = [4,5,6]

let (i,j,carry) = a.add(b)

you can write this:

a.withUnsafeMutableBufferPointer { a in
    b.withUnsafeBufferPointer { b in
        let (i,j,carry) = a.add(b)
    }
}

This lets you microbenchmark this function against both arrays and unsafe buffer pointers, allowing you to directly observe if the compiler is eliminating enough overhead.

Once you have this function, you can then write the outer function that extends the length of self if necessary:

extension RandomAccessCollection
where Self: MutableCollection,
      Self: RangeReplaceableCollection,
      Element: BinaryInteger
{
    mutating func add<RHS>(_ rhs: RHS)
    where RHS: Collection, RHS.Element == Element {
        var (i,j,carry) = self.add(rhs)

        if i == self.endIndex {
            while j < rhs.endIndex {
                // TODO: handle carry
                self.append(rhs[j])
                rhs.formIndex(after: &j)
            }
        } else {
            while i < endIndex {
                // TODO: handle carry
                formIndex(after: &i)
            }
        }
    }
}

There is probably some huge hole in my logic, this is only a sketch I haven't thought too carefully about, but it should give you a direction to go in.

Finally, once you have that you can write a benchmark that re-uses the same array memory for a over and over again. The entire program should only ever need to allocate memory once up front using Array.reserveCapacity(_:), which should be done outside the benchmark measurement. If memory is already reserved for the array, then appending new elements should be very fast. Remember to always re-use the array in the benchmarks not overwrite it, e.g. by doing

array.removeAll(keepingCapacity: true)
array += newElements

Once you've done all this, if it's still slow, you move to the fun part of filing bugs against the optimizer :)

2 Likes

Oh, also once you get this all working and return to the question of using it on ArraySlice, you will discover the infinite sadness of how mutating array slices in-place tends to hit COW problems in really frustrating ways.

4 Likes

In this case, it’s the lhs which might be empty. So my understanding from the preceding discussion is that the expression self = self + rhs will copy the empty self and grow that copy to contain copies of all the elements of rhs.

Yes, unless the optimizer gets very heroic. It's interesting to consider whether checking if the lhs is empty and branching would be profitable – it'd be a teensy regression except when the lhs is empty, which I suspect is probably rare.

1 Like

In cases like this, where there's a common processing type that the algorithm can work on, such as an UnsafeBufferPointer of an array's contents, generics make for bad tradeoffs—either you're monomorphizing the generic for every conforming type, when the body is ultimately nearly identical for every concrete type, or you don't monomorphize, and you get terrible performance. I would suggest limiting the use of generics for an inlinable "funnel" interface that converts the input to a common processing type, which you pass to a non-inlinable implementation function, something like:

@inlinable
public func foo<C: Collection>(_ c: C) where C.Element == Int {
  ArraySlice(c).withUnsafeBufferPointer { fooImpl($0) }
}

@usableFromInline
internal func fooImpl(_ c: UnsafeBufferPointer<Int>) { ... }
4 Likes

Does ArraySlice(c) create a copy of c’s elements if c is not an Array?

It feels like there’s a tension here between the tendency to traffic in concrete types (Array, UnsafeMutableBufferPointer) and the desire to not have to have foreknowledge of all these concrete types.

I believe it can peep through when the input is already an Array or ArraySlice and copy as is. It would be nice if there were a public withUnsafeBufferPointerIfAvailable requirement on Collection.

I think this is an emergent property of languages that rely on type differences to control data layout differences, like C++, Rust, and Swift all do. You want a currency type for processing the same kind of data (a buffer full of ints, I don't care where it came from as long as it stays alive while I use it), but you want to store the data in different ways (a managed Array, a slice of Data, a memory-mapped file, a fixed-sized array inside another aggregate, and so on). UnsafeBufferPointer is the closest analog Swift has today to C++'s array_view or Rust's &[T].

3 Likes