Chaining struct-mutating funcs


(Fritz Anderson) #1

Swift 3 as of Xcode 8.0b4

TL;DR: I have a struct value type backed by a copy-on-write mutable buffer. You use it to perform arithmetic on the buffers. The most expressive way to do this efficiently is to chain the arithmetic operators so each mutates the same buffer. Swift does not like to chain mutating operators — it treats the result of each step as immutable, so you can’t continue the chain. I can’t argue; the syntax apparently can't express anything else.

All the alternatives I see are ugly-to-dangerous.

Have I missed something, I hope? Please make a fool of me.

  — F

The details of my use case or implementation are off-topic; even if mine are ill-considered, surely apt ones exist. Unless you can demonstrate there are none.

The vDSP_* functions in Apple’s Accelerate framework are declared in C to operate on naked float or double pointers. I decided to represent such Float buffers in Swift by a struct (call it ManagedFloatBuffer) containing a reference to a FloatBuffer, which is a final subclass of ManagedBuffer<Int, Float>.

(The names are a work-in-progress. Just remember: ManagedFloatBuffer is a value type that can copy-on-write to a reference to FloatBuffer, a backing store for a bunch of Floats.)
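
In rough outline, the shapes are these (a sketch only; the allocate(count:) helper and the count property are illustrative stand-ins, not the real API):

    // Sketch only: the helper names are stand-ins for whatever the real code does.
    final class FloatBuffer: ManagedBuffer<Int, Float> {
        // The header holds the element count.
        static func allocate(count: Int) -> FloatBuffer {
            return create(minimumCapacity: count) { _ in count } as! FloatBuffer
        }
    }

    struct ManagedFloatBuffer {
        var storage: FloatBuffer                    // copy-on-write backing store
        var count: Int { return storage.header }
    }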

The nonmutating funcs:

    func subtract(_ other: ManagedFloatBuffer) -> ManagedFloatBuffer
    func subtract(_ scalar: Float) -> ManagedFloatBuffer

are straightforward. They return new ManagedFloatBuffer values. You can chain further calls, so even a complex calculation reads as a single expression that is neither intricate nor tied up in temporaries:

    let sum²OfResiduals = speeds
        .subtract(cameraSpeed.mean)
        .multiply(feetToMeters)
        .sumOfSquares

Great. And vDSP gets you about a 40% boost. (The compiler itself seems to do a pretty good job of auto-vectorizing; the unoptimized code is a couple of orders of magnitude slower.) But as you chain the nonmutating calls, you generate a new FloatBuffer to hold each intermediate result. For long chains, you end up allocating new buffers (which turns out to be expensive on the time scale of vectorized math) and filling them with large intermediate values you are about to discard. I want my Swift code to be as performant as C, but safer and more expressive.
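
For concreteness, one nonmutating step looks roughly like this (a sketch built on the placeholder types above; the point is only where the fresh allocation happens on every link of the chain):

    import Accelerate

    extension ManagedFloatBuffer {
        func subtract(_ scalar: Float) -> ManagedFloatBuffer {
            // Every chained step pays for a brand-new FloatBuffer...
            let result = ManagedFloatBuffer(storage: .allocate(count: count))
            var negated = -scalar
            let n = vDSP_Length(count)
            storage.withUnsafeMutablePointerToElements { src in
                result.storage.withUnsafeMutablePointerToElements { dst in
                    // ...filled in one vectorized pass: dst[i] = src[i] + (-scalar)
                    vDSP_vsadd(src, 1, &negated, dst, 1, n)
                }
            }
            return result
        }
    }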

So how about some mutating functions to change a ManagedFloatBuffer’s bytes in-place (copying-on-write as needed so you can preserve intermediate values)?

    mutating func reduce(by other: ManagedFloatBuffer) -> ManagedFloatBuffer
    mutating func reduce(by scalar: Float) -> ManagedFloatBuffer

These return self, because I’d hoped I could chain operators as I did with the non-mutating versions.
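
Inside, each of these would do roughly the following. It is a sketch over the placeholder types above; copyingElements() is a hypothetical deep-copy helper, and the uniqueness test uses the final Swift 3 spelling rather than the isUniquelyReferencedNonObjC(_:) of the betas:

    import Accelerate

    extension ManagedFloatBuffer {
        mutating func reduce(by scalar: Float) -> ManagedFloatBuffer {
            // Copy the backing store only if someone else still holds a reference to it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = storage.copyingElements()   // hypothetical deep-copy helper
            }
            var negated = -scalar
            let n = vDSP_Length(count)
            storage.withUnsafeMutablePointerToElements { ptr in
                // In place: ptr[i] += -scalar
                vDSP_vsadd(ptr, 1, &negated, ptr, 1, n)
            }
            return self
        }
    }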

The compiler doesn’t like this. It says reduce(by:) returns an immutable value, so you can’t chain mutating functions.

(I can see one issue: when the first func’s self is copied out as the return value, and that copy becomes the second func’s self, there are briefly two surviving references to the same buffer, so the second func would trigger a buffer copy when it mutates its self anyway. I’m not sure the compiler has to do it that way, but I can see how it might be hard to account for otherwise. Hey, it’s practically a tail call, right? SMOP, and not source-breaking at all.)

Stack Overflow invites me to eat cake: make the mutable operand an inout parameter of free funcs I call one by one. Something like:

    multiply(perspectiveCorrections, into: &pixelXes)
    sin(of: &pixelXes)
    multiply(pixelXes, into: &speeds)
    multiply(feetToMeters, into: &speeds)
    subtract(cameraSpeed.mean, from: &speeds)
    let sumSquaredOfResiduals = speeds.sumOfSquares

    // grodiness deliberately enhanced for illustration
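
Each of those free funcs would have roughly this shape (a sketch; ensureUniqueStorage() is a hypothetical stand-in for the copy-on-write check, and the storage/count names come from the placeholder types above):

    import Accelerate

    func multiply(_ scalar: Float, into target: inout ManagedFloatBuffer) {
        target.ensureUniqueStorage()          // hypothetical copy-on-write check
        var s = scalar
        let n = vDSP_Length(target.count)
        let buffer = target.storage
        buffer.withUnsafeMutablePointerToElements { ptr in
            // In place: ptr[i] *= scalar
            vDSP_vsmul(ptr, 1, &s, ptr, 1, n)
        }
    }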

I’d rather not. The thing being calculated is named at the bottom of the paragraph. The intermediate steps reuse variable names whose meanings change line by line. You have to study the code to recognize it as a single arithmetic expression.

And by the by, if a vector operand is itself the result of a mutating operation, the dependency graph becomes a nightmare to read — I can’t be sure the illustration even expresses a plausible calculation.

Thinking up more reasons to hate this solution is a fun parlor game you and your family can play at home.

Strictly speaking, the compiler is right: I don’t see any language construct that expresses a returned value that may be mutated by a chained func. Am I correct?

I’m not at all happy with turning ManagedFloatBuffer into a class. Intuitively, this is a value type. Passing a packet of Floats into a func (or into another thread, as one does with math) and finding your Floats had changed in the meantime is… surprising.

I’m not optimistic, but I have to ask: Is there a way to do this — to take mutability down an operator chain?

  — F


(Joe Groff) #2

Since your backing buffer is copy-on-write, you can do the in-place mutation optimization in your immutable implementations, something like this:

    class C {
      var value: Int
      init(value: Int) { self.value = value }
    }

    struct S { var c: C }

    func addInts(x: S, y: S) -> S {
      var tmp = x
      // Don't use x after this point so that it gets forwarded into tmp
      if isKnownUniquelyReferenced(&tmp.c) {
        tmp.c.value += y.c.value
        return tmp
      } else {
        return S(c: C(value: tmp.c.value + y.c.value))
      }
    }

which should let you get similar efficiency to the mutating formulation while using semantically immutable values.
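
Applied to the buffer case, the same shape would be roughly the following (the storage and count names come from the sketches earlier in the thread, and copyingElements() is a hypothetical deep copy):

    import Accelerate

    func subtracting(_ scalar: Float, from x: ManagedFloatBuffer) -> ManagedFloatBuffer {
        var tmp = x
        // Don't use x after this point so that its storage can be forwarded into tmp
        if !isKnownUniquelyReferenced(&tmp.storage) {
            tmp.storage = tmp.storage.copyingElements()   // hypothetical deep copy
        }
        var negated = -scalar
        let n = vDSP_Length(tmp.count)
        tmp.storage.withUnsafeMutablePointerToElements { ptr in
            vDSP_vsadd(ptr, 1, &negated, ptr, 1, n)       // mutate tmp's storage in place
        }
        return tmp
    }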

-Joe



(Dave Abrahams) #3

Yep, that works. The only other trick I know of is to create a composition of operations that you apply at the end with an operator:

    pixelXes <- multiply(perspectiveCorrections).sin
    speeds <- multiply(pixelXes)
              .multiply(feetToMeters)
              .subtract(cameraSpeed.mean)
    let sumSquaredOfResiduals = speeds.sumOfSquares
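
Spelled out, that could look something like the following; every name here is invented for illustration (only the <- spelling comes from the lines above), and ensureUniqueStorage() is a hypothetical copy-on-write helper:

    import Accelerate

    // Record the steps of an expression; nothing runs until it is applied with <-.
    struct FloatBufferExpression {
        var steps: [(inout ManagedFloatBuffer) -> Void] = []

        func multiply(_ scalar: Float) -> FloatBufferExpression {
            var next = self
            next.steps.append { (buf: inout ManagedFloatBuffer) in
                buf.ensureUniqueStorage()             // hypothetical copy-on-write check
                var s = scalar
                let n = vDSP_Length(buf.count)
                let buffer = buf.storage
                buffer.withUnsafeMutablePointerToElements { ptr in
                    vDSP_vsmul(ptr, 1, &s, ptr, 1, n) // in place on the target's storage
                }
            }
            return next
        }
        // subtract(_:), sin, and friends would be recorded the same way.
    }

    infix operator <-

    // Apply the whole composed expression to one buffer, one in-place step at a time.
    func <- (target: inout ManagedFloatBuffer, expression: FloatBufferExpression) {
        for step in expression.steps { step(&target) }
    }

At most the first applied step copies the storage (if it was shared); every later step finds it uniquely referenced and mutates it in place.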

HTH,


--
-Dave


(Fritz Anderson) #4

I’m sending directly to those who took time over my question, because, per Michael’s request, I have a minimal case to attach. Phrases in boldface are for skimmability, not shouting.

Strip out my use case (I’m encouraged that Dave recapitulated exactly what I was asking about). My remaining question is: How do you safely reuse a class reference as the backing store for a struct? Yes, it’s done all the time, but the trick requirement is that I want to chain the funcs that do it; chained value returns are immutable; and as far as I could tell, you can’t get CoW-safe reuse without mutating self. The result is that you are forced to make expensive copies of the backing store every time. Joe’s solution looked promising in that it purported to pass the CoW buffer (if possible) out of the func after doing all the mutation internally.

I couldn’t figure out how the answer Joe gave could work: His code duplicates the struct’s original reference into a copy of the struct, after which he expects the runtime to report (when possible) that no such duplicate exists. I asked how this could be, as my attempt to replicate in a playground showed the reference was never found unique — which is what I had intuitively expected.

He says this fundamental design pattern in Swift works only if you change the semantics of the language by turning on the optimizer. (Sorry to be all Asperger's about it, but nobody corrected me the first time I put it this way.)

I have many, many questions, I might even hazard objections, but they’re moot: Optimized or not, that code never reuses the backing object.

The attached project was built with Xcode 8.0b5. It uses Joe’s code (except that I still must use isUniquelyReferencedNonObjC(_:)). I run addInts(x:y:) and check two ways whether the reference was found unique. Same result, optimized or not.
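
For anyone reading along without the attachment, the check amounts to this (Joe’s code with only the older spelling of the uniqueness test swapped in):

    func addInts(x: S, y: S) -> S {
        var tmp = x
        // Don't use x after this point so that it gets forwarded into tmp
        if isUniquelyReferencedNonObjC(&tmp.c) {
            // Never reached in my tests: the reference is never found unique.
            tmp.c.value += y.c.value
            return tmp
        } else {
            return S(c: C(value: tmp.c.value + y.c.value))
        }
    }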

It occurred to me that the globals s_x and s_y might bump the reference count. I removed them and used this instead:

let s_result = addInts(x: S(c: C(value: 99)), y: S(c: C(value: -98)))

Still no unique references.

I recognize I am taking up a lot of good time mere days before a major release, when the likeliest outcome is that I’m a jackass. My concern runs deeper than what’s in this message, but I shouldn’t muddy the waters. Those deeper things can go back to the public once I understand the issue better (or you boot me back).

structs-and-refs.zip (32.5 KB)

ManagedFloatBuffer.swift (23.2 KB)


---

Context (supplementary, no need to spend time)

If seeing what I’m trying to do helps, I’ve attached my attempt. I’m sure there are defects in API style, safety, and correctness. I’d have done more if I hadn’t suspended the project over this issue. Class FloatBuffer is the backing store; ManagedFloatBuffer is the wrapper struct that does all the operations. None of the operations are declared mutating.

Anything that calls (unary|binary)Operator returns a fresh ManagedFloatBuffer every time; structs and their buffers are created and initialized every time — intentionally. It seems to work well under gentle use.

Callers of mutabilityWrapper are meant to preserve the receiver’s backing buffer whenever possible and mutate it in place. Those funcs always return self. These might be a big win; I can’t tell, because I’ve never been able to get unique references. Since the uniqueness check never succeeds, there’s an allocation (and possibly initialization) every time, so they are no better than the (unary|binary)Operator funcs. malloc and friends take up a significant amount of time on the scale of vector math.

Both flavors can cascade; the only problem is that the “in-place” methods don’t live up to the name.

  — F


(Dave Abrahams) #5

on Tue Aug 09 2016, Fritz Anderson <swift-users-AT-swift.org> wrote:

> I’m sending directly to those who took time over my question, because, per Michael’s request, I have a minimal case to attach. Phrases in boldface are for skimmability, not shouting.
>
> Strip out my use case (I’m encouraged that Dave recapitulated exactly what I was asking about). My remaining question is: How do you safely reuse a class reference as the backing store for a struct? Yes, it’s done all the time, but the trick requirement is that I want to chain the funcs that do it; chained value returns are immutable; and as far as I could tell, you can’t get CoW-safe reuse without mutating self. The result is that you are forced to make expensive copies of the backing store every time. Joe’s solution looked promising in that it purported to pass the CoW buffer (if possible) out of the func after doing all the mutation internally.
>
> I couldn’t figure out how the answer Joe gave could work: His code duplicates the struct’s original reference into a copy of the struct, after which he expects the runtime to report (when possible) that no such duplicate exists. I asked how this could be, as my attempt to replicate in a playground showed the reference was never found unique — which is what I had intuitively expected.

Well, a playground is a terrible way to check this, since lifetimes don't necessarily obey the normal rules, but... you're right, it doesn't seem to be workable today.

> He says this fundamental design pattern in Swift works only if you change the semantics of the language by turning on the optimizer.

When you count performance characteristics as semantics, yes.

> (Sorry to be all Asperger's about it, but nobody corrected me the first time I put it this way.)

Actually I can't even get it to happen with the optimizer on in my tests.

> I have many, many questions, I might even hazard objections, but they’re moot: Optimized or not, that code never reuses the backing object.
>
> The attached project was built with Xcode 8.0b5. It uses Joe’s code (except that I still must use isUniquelyReferencedNonObjC(_:)). I run addInts(x:y:) and check two ways whether the reference was found unique. Same result, optimized or not.
>
> It occurred to me that the globals s_x and s_y might bump the reference count. I removed them and used this instead:
>
> let s_result = addInts(x: S(c: C(value: 99)), y: S(c: C(value: -98)))
>
> Still no unique references.

> I recognize I am taking up a lot of good time mere days before a major release, when the likeliest outcome is that I’m a jackass.

I don't think so. What's more likely is that we could be optimizing better.

> My concern runs deeper than what’s in this message, but I shouldn’t muddy the waters. Those deeper things can go back to the public once I understand the issue better (or you boot me back).
>
> ---
>
> Context (supplementary, no need to spend time)
>
> If seeing what I’m trying to do helps, I’ve attached my attempt. I’m sure there are defects in API style, safety, and correctness. I’d have done more if I hadn’t suspended the project over this issue. Class FloatBuffer is the backing store; ManagedFloatBuffer is the wrapper struct that does all the operations. None of the operations are declared mutating.
>
> Anything that calls (unary|binary)Operator returns a fresh ManagedFloatBuffer every time; structs and their buffers are created and initialized every time — intentionally. It seems to work well under gentle use.
>
> Callers of mutabilityWrapper are meant to preserve the receiver’s backing buffer whenever possible and mutate it in place. Those funcs always return self. These might be a big win; I can’t tell, because I’ve never been able to get unique references. Since the uniqueness check never succeeds, there’s an allocation (and possibly initialization) every time, so they are no better than the (unary|binary)Operator funcs. malloc and friends take up a significant amount of time on the scale of vector math.
>
> Both flavors can cascade; the only problem is that the “in-place” methods don’t live up to the name.

The question of how to optimize the evaluation of non-mutating expressions using in-place operations is an old one. Once upon a time we were pursuing language features for that purpose (https://github.com/apple/swift/blob/master/docs/proposals/Inplace.rst) but that particular approach is... an ex-approach ;-).


--
-Dave