Improved value and move semantics


(Bram Beernink) #1

Hi all,

Would it be possible to improve value and move semantics (performance) in Swift? Consider this possible Swift code in a future release of Swift:

let array1 : [String] = ["Val1", "Val2"]
let array2 = array1.appended("Val3") // Copy of array1 with "Val3" appended. array1 is left untouched. Nothing special yet.
var array3 : [String] = ["Var1"]
array3 = array3.appended("Var2") // array3 can just be mutated to add "Var2", while maintaining value semantics. Swift can recognize that array3's old state is not referenced anywhere in the future.
let array4 = array2.appended("Val4").appended("Val5") // Copy of array2 with both "Val4" and "Val5" appended. In this case, "Val5" can also be appended by mutation.

This example illustrates improved value semantics with a string array, but it would be good if this could work with any struct. Maybe via something similar to isUniquelyReferenced? Or maybe you could even have a “smart” self in a non-mutating func in a struct:
struct Array<T> {
    func appended(e : T) -> Array<T> { // No mutating keyword!
        self.append(e) // self would be mutated in place here if its current ref count is 1 and self is either an "rvalue" or its old state cannot possibly be referenced after this call. Otherwise, "self" would actually be a copy of self.
        return self
    }
}

I think that such support would encourage more use of (immutable) value types without sacrificing performance. Fewer mutations lead to more understandable code, which leads to fewer bugs.
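For comparison, a non-mutating method of this kind can already be written in today's Swift with the copy, mutate, return pattern. The following is only a minimal sketch: `appended` is the hypothetical name used in this post and is not part of the standard library, and whether the intermediate copy is elided is left to the optimizer.

```swift
extension Array {
    // Hypothetical method name taken from the post; not a standard library API.
    func appended(_ newElement: Element) -> [Element] {
        var copy = self          // logical copy; the storage is still shared here
        copy.append(newElement)  // the write forces a real buffer copy if the storage is shared
        return copy
    }
}

let array1 = ["Val1", "Val2"]
let array2 = array1.appended("Val3")   // array1 is left untouched
var array3 = ["Var1"]
array3 = array3.appended("Var2")       // semantically a copy, then a reassignment
```

The semantics match the behaviour described above; what the post asks for is that the second case be as cheap as calling `append` directly.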

In any case, keep up the fast improvements to Swift.

Best regards,
Bram.


(Karl) #2

It’s a known issue. See: https://github.com/apple/swift/blob/master/docs/OptimizationTips.rst#advice-use-inplace-mutation-instead-of-object-reassignment

I’m not a compiler expert, but in theory I’d expect a smart-enough, fast-enough optimiser to be able to replace those with mutating calls.
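The advice in the linked document contrasts two shapes of the same operation, roughly as sketched below. The function names follow the ones mentioned in this thread; the exact code in the document may differ.

```swift
// Reassignment style: the argument is copied into a local, mutated, and returned.
// Unless the optimizer proves the caller's old value is dead, the append copies the buffer.
func append_one(_ a: [Int]) -> [Int] {
    var a = a
    a.append(1)
    return a
}

// In-place style: the caller's array is mutated directly through inout.
func append_one_in_place(a: inout [Int]) {
    a.append(1)
}

var a = [1, 2, 3]
a = append_one(a)          // may copy the buffer
append_one_in_place(a: &a) // mutates the existing buffer when it is uniquely referenced
```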

Karl

···

On 29 Jul 2016, at 18:42, Bram Beernink via swift-evolution <swift-evolution@swift.org> wrote:

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Haravikk) #3

Hi all,

Would it be possible to improve value and move semantics (performance) in Swift? Consider this possible Swift code in a future release of Swift:

let array1 : [String] = ["Val1", "Val2"]
let array2 = array1.appended(“Val3”) // Copy of array1 with “Val3” appended. array1 is left untouched. Nothing special yet.
var array3 : [String] = [“Var1”]
array3 = array3.appended(“Var2”) // array3 can just be mutated to add “Var2”, while maintaining value semantics. Swift can recognize that array3’s old state is not referenced anywhere in the future.
let array4 = array2.appended("Val4").appended("Val5") // Copy of array2 with both "Val4" and "Val5" appended. In this case, “Val5” can also be appended by mutation.

Well, for the array3 = array3.appended("Var2") example this could possibly be addressed by an attribute to indicate to the compiler that .appended() has a mutating variant, as this will allow it to issue a warning when the assignment is to the same variable, which would address that simple case (and provide more awareness of the mutating options and encourage developers to use them).

This example illustrates improved value semantics with a string array. But it would be good if this can work with any struct. Maybe via something similar to isUniquelyReferenced? Or maybe you can even have a “smart” self in a non-mutating func in a struct:
struct Array<T> {
    func appended(e : T) -> Array<T> { // No mutating keyword!
        self.append(e) // self would either be mutated here if the current ref count of self is 1, and self is either a “rvalue” or self’s old state cannot possibly referenced anymore after this call. Otherwise, "self” would actually be a copy of self.
        return self
    }
}

I don't know about allowing mutation of self in non-mutating methods, that seems confusing; however, I'd be surprised if the compiler doesn't already detect variables that only exist to create a copy that is discarded.

The compiler should already be trying to inline very simple methods like the common copy -> mutate -> return style of non-mutating implementations, in which case it should be able to identify that a copy is being created only to overwrite the original anyway, so can be eliminated. Do you believe that this isn't currently being done?

···

On 29 Jul 2016, at 17:42, Bram Beernink via swift-evolution <swift-evolution@swift.org> wrote:


(Bram Beernink) #4

Hi Karl and Haravikk,

Thank you for your replies.

I was assuming that the cases I presented are not always optimized, for several reasons:
1) Swift’s book only talks about optimization in the context of arrays, strings and dictionaries, not in the context of structs in general:
“The description above refers to the “copying” of strings, arrays, and dictionaries. The behavior you see in your code will always be as if a copy took place. However, Swift only performs an actual copy behind the scenes when it is absolutely necessary to do so. Swift manages all value copying to ensure optimal performance, and you should not avoid assignment to try to preempt this optimization.”
Excerpt From: Apple Inc. “The Swift Programming Language (Swift 2.2).” iBooks. https://itun.es/nl/jEUH0.l
2) In https://github.com/apple/swift/tree/eb27bb65a7c17bd9b4255baee5c4e4f9c214bde6/stdlib/public/core I see public mutating func append(_ newElement: Element), line 1268, using _makeUniqueAndReserveCapacityIfNotUnique() at line 1269, leading me to suspect that to have COW, you have to do additional work.
3) Doing some manual tests some time ago, isUniquelyReferenced seemed to return false in a case like append_one from https://github.com/apple/swift/blob/master/docs/OptimizationTips.rst#advice-use-inplace-mutation-instead-of-object-reassignment (mentioned by Karl), meaning that it indeed leads to unnecessary copying.

In any case, https://github.com/apple/swift/blob/master/docs/OptimizationTips.rst#advice-use-inplace-mutation-instead-of-object-reassignment does mention that: “Sometimes COW can introduce additional unexpected copies if the user is not careful.” I would argue that what we need is not only COW, but Copy On Write When Necessary (COWWN). In COWWN, a copy is only made when writing to the shared reference if it is not unique and the shared reference’s old state is still referred to in subsequent statements. So not only is the current reference count taken into account, but also whether the old state is needed afterwards. This requires both runtime and compile-time information.

So my questions would be:
Why does Swift sometimes do additional unnecessary copying, as implied by https://github.com/apple/swift/blob/master/docs/OptimizationTips.rst#advice-use-inplace-mutation-instead-of-object-reassignment in the case of append_one? Is this a problem that cannot be solved? (In C++ you would solve this example using a=a.append_one(std::move(a)). But I would think that since Swift does not have to deal with pointers and manual memory management, it can automatically detect such cases unlike C++?)
If/once structs are COWWN, can Swift introduce immutable functions for the standard library, such as func appended(_ newElement: Element) -> Array<Element>?

Best regards,
Bram.

···

On 30 jul. 2016, at 12:46, Haravikk <swift-evolution@haravikk.me> wrote:



(Dave Abrahams) #5

Hi Karl and Haravikk,

Thank you for your replies.

I was assuming that the cases I represented are not always optimized for several reasons:
Swift’s book only talks about optimization in the context of arrays,
strings and dictionaries. Not in the context of structs in general:
“The description above refers to the “copying” of strings, arrays, and
dictionaries. The behavior you see in your code will always be as if a
copy took place. However, Swift only performs an actual copy behind
the scenes when it is absolutely necessary to do so. Swift manages all
value copying to ensure optimal performance,

If it says that, it's... not quite right. There are things we could do
to make some value copies more optimal. For example, any value type
containing multiple class references—or multiple other value types (such
as arrays or strings or dictionaries) that contain class references—will
cost more to copy than a single class reference does. At the cost of
some allocation and indirection, we could reduce the copying cost of
such values. It's an optimization we've considered making, but haven't
prioritized.

You can put a CoW wrapper around your value to do it manually. I hacked
one up using ManagedBuffer for someone at WWDC but I don't seem to have
saved the code, sadly.
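A manual CoW wrapper of the kind mentioned here could look roughly like the sketch below. This is not the lost WWDC code: it uses a plain final class rather than ManagedBuffer, and the names `Storage`, `CoWBox` and `Big` are hypothetical.

```swift
// Reference-typed storage that holds the wrapped value.
final class Storage<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

// Copying a CoWBox copies a single reference, however large T is.
struct CoWBox<T> {
    private var storage: Storage<T>

    init(_ value: T) { storage = Storage(value) }

    var value: T {
        get { storage.value }
        set {
            if isKnownUniquelyReferenced(&storage) {
                storage.value = newValue          // sole owner: write in place
            } else {
                storage = Storage(newValue)       // shared: allocate fresh storage
            }
        }
    }
}

// A value type with several class references inside (via the collections).
struct Big {
    var names: [String]
    var tags: [String]
    var notes: [String: String]
}

var boxA = CoWBox(Big(names: ["x"], tags: [], notes: [:]))
let boxB = boxA                      // copies one reference instead of three
boxA.value.names.append("y")         // storage is shared with boxB, so boxA gets new storage
```

The trade-off is the one described above: one extra allocation and one level of indirection in exchange for cheaper copies.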

and you should not avoid assignment to try to preempt this
optimization.”

But that's basically still true. The CoW wrapper technique is a good
way to tune things later if you find it necessary, without distorting
your code.

Excerpt From: Apple Inc. “The Swift Programming Language (Swift 2.2).”
iBooks. https://itun.es/nl/jEUH0.l In
https://github.com/apple/swift/tree/eb27bb65a7c17bd9b4255baee5c4e4f9c214bde6/stdlib/public/core
I see public mutating func append(_ newElement: Element) , line 1268,
using _makeUniqueAndReserveCapacityIfNotUnique() at line 1269, leading
me to suspect that to have COW, you have to do additional work. Doing
some manual tests some time ago, isUniquelyReferenced seemed to return
false in a case like append_one as
https://github.com/apple/swift/blob/master/docs/OptimizationTips.rst#advice-use-inplace-mutation-instead-of-object-reassignment
mentioned by Karl, meaning that it indeed leads to unnecessary
copying.

In any case,
https://github.com/apple/swift/blob/master/docs/OptimizationTips.rst#advice-use-inplace-mutation-instead-of-object-reassignment
does mention that: “Sometimes COW can introduce additional unexpected
copies if the user is not careful.” I would argue that what we need is
not only COW, but Copy On Write When Necessary, COWWN. In COWWN copies
are only made when writing to the shared reference if it is not unique
and the shared reference’s old state is still referred to in next
statements.

That's what the standard library CoW types do currently, and yours can
too. See isKnownUniquelyReferenced (née isUniquelyReferenced).
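As a small illustration of the runtime half of that check, the sketch below shows how isKnownUniquelyReferenced responds to a second strong reference. The class `Buffer` and the variable names are hypothetical; the code is written at file scope so that both references are ordinary global variables.

```swift
final class Buffer {
    var items: [Int] = []
}

var first = Buffer()
let uniqueBefore = isKnownUniquelyReferenced(&first)      // only `first` refers to the object

var second = first                                         // a second strong reference
let uniqueWhileShared = isKnownUniquelyReferenced(&first)  // the object is now shared
second.items.append(1)                                     // keeps `second` in use past the check

second = Buffer()                                          // drop the shared reference
let uniqueAfter = isKnownUniquelyReferenced(&first)        // `first` is the sole owner again
```

Whether the old reference has already been released at the point of a write is what depends on the optimizer, which is the compile-time half of the question.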

So not only is the current reference count taken into account, but
also whether the old state is needed afterwards. This is both runtime
as well as compile-time data.

So my questions would be:
Why does Swift sometimes do additional unnecessary copying, as implied
by
https://github.com/apple/swift/blob/master/docs/OptimizationTips.rst#advice-use-inplace-mutation-instead-of-object-reassignment
in the case of append_one?

In this case it could be needlessly copied because the optimizer is either

a) off (-Onone)
b) not smart enough to know that `a` isn't used before being reassigned

Is this a problem that cannot be solved?

I think this particular example can be solved. There are other cases
where avoiding a needless copy is impossible due to the existence of a
separate compilation boundary (e.g. across frameworks, or files if
whole-module-optimization is disabled).

(In C++ you would solve this example using
a=a.append_one(std::move(a)). But I would think that since Swift does
not have to deal with pointers and manual memory management, it can
automatically detect such cases unlike C++?)

In principle, yes. In practice, it depends on visibility through
function call boundaries.

If/once structs are COWWN, can Swift introduce immutable functions for
the standard library, such as func appended(_ newElement: Element) ->
Array<Element>?

That would be “appending.” We could introduce that method today, though
I'm not sure it's useful enough to justify its existence.

···

on Sun Jul 31 2016, Bram Beernink <swift-evolution@swift.org> wrote:


--
-Dave


(Brent Royal-Gordon) #6

Slightly off-topic, but one day I would like to see `indirect` turned into a generalized COW feature:

* `indirect` can only be applied to a value type (or at least to a type with `mutating` members, so reference types would have to gain those).
* The value type is boxed in a reference type.
* Any use of a mutating member (and thus, use of the setter) is guarded with `isKnownUniquelyReferenced` and a copy.
* `indirect` can be applied to an enum case with a payload (the payload is boxed), a stored property (the value is boxed), or a type (the entire type is boxed).

Then you can just slap `indirect` on a struct whose copying is too complicated and let Swift transparently COW it for you. (And it would also permit recursive structs and other such niceties.)
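For context, this is roughly what has to be written by hand today to get a recursive struct, which is the boilerplate the proposed `indirect` would generate. A minimal sketch, with hypothetical names `Box` and `Node`; the box is immutable here, so no uniqueness check is needed and writes simply replace the box.

```swift
// Immutable reference-typed box around a value.
final class Box<T> {
    let value: T
    init(_ value: T) { self.value = value }
}

// A struct cannot store itself directly; going through a reference gives it a finite size.
struct Node {
    var payload: Int
    private var nextBox: Box<Node>?

    init(payload: Int, next: Node? = nil) {
        self.payload = payload
        self.nextBox = next.map { Box($0) }
    }

    // Value-semantic view of the boxed field.
    var next: Node? {
        get { nextBox?.value }
        set { nextBox = newValue.map { Box($0) } }
    }
}

let tail = Node(payload: 2)
let head = Node(payload: 1, next: tail)
var other = head        // copies the payload and one reference
other.next = nil        // replaces other's box; head is unaffected
```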

···

On Aug 2, 2016, at 12:06 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:


--
Brent Royal-Gordon
Architechies


(Dave Abrahams) #7

My vision for this feature is:

a. We indirect automatically based on some heuristic, as an
   optimization.

b. We allow you to indirect manually.

c. We provide an attribute that suppresses automatic indirection to
   whatever depth possible given resilience boundaries.

···

on Tue Aug 02 2016, Brent Royal-Gordon <brent-AT-architechies.com> wrote:


--
-Dave


(Matthew Johnson) #8

If it says that, it's... not quite right. There are things we could do
to make some value copies more optimal. For example, any value type
containing multiple class references—or multiple other value types (such
as arrays or strings or dictionaries) that contain class references—will
cost more to copy than a single class reference does. At the cost of
some allocation and indirection, we could reduce the copying cost of
such values. It's an optimization we've considered making, but haven't
prioritized.

You can put a CoW wrapper around your value to do it manually. I hacked
one up using ManagedBuffer for someone at WWDC but I don't seem to have
saved the code, sadly.

Slightly off-topic, but one day I would like to see `indirect` turned
into a generalized COW feature:

* `indirect` can only be applied to a value type (or at least to a
type with `mutating` members, so reference types would have to gain
those).
* The value type is boxed in a reference type.
* Any use of a mutating member (and thus, use of the setter) is
guarded with `isKnownUniquelyReferenced` and a copy.
* `indirect` can be applied to an enum case with a payload (the
payload is boxed), a stored property (the value is boxed), or a type
(the entire type is boxed).

Then you can just slap `indirect` on a struct whose copying is too
complicated and let Swift transparently COW it for you. (And it would
also permit recursive structs and other such niceties.)

My vision for this feature is:

a. We indirect automatically based on some heuristic, as an
  optimization.

b. We allow you to indirect manually.

c. We provide an attribute that suppresses automatic indirection to
  whatever depth possible given resilience boundaries.

This all sounds great. Does any of this fit into Swift 4 (either phase 1 or phase 2)? It seems like at least the automatic part would have ABI impact.

···

On Aug 2, 2016, at 4:54 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Tue Aug 02 2016, Brent Royal-Gordon <brent-AT-architechies.com> wrote:

On Aug 2, 2016, at 12:06 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

--
-Dave


(Dave Abrahams) #9

Yes. In principle, all of it has the potential to fit in Swift 4. I'm
not sure what will actually happen of course.

···

on Wed Aug 03 2016, Matthew Johnson <swift-evolution@swift.org> wrote:


--
-Dave


(Chris Lattner) #10

Then you can just slap `indirect` on a struct whose copying is too
complicated and let Swift transparently COW it for you. (And it would
also permit recursive structs and other such niceties.)

My vision for this feature is:

a. We indirect automatically based on some heuristic, as an
optimization.

I weakly disagree with this, because it is important that we provide a predictable model. I’d rather the user get what they write, and tell people to write ‘indirect’ as a performance tuning option. “Too magic” is bad.

b. We allow you to indirect manually.

c. We provide an attribute that suppresses automatic indirection to
whatever depth possible given resilience boundaries.

This all sounds great. Does any of this fit into Swift 4 (either phase 1 or phase 2)? It seems like at least the automatic part would have ABI impact.

This is very low priority, because it is generally additive. After the major topics for ABI stability are figured out, we can have the philosophic discussion about the automatic part.

-Chris

···

On Aug 3, 2016, at 6:41 AM, Matthew Johnson via swift-evolution <swift-evolution@swift.org> wrote:


(Matthew Johnson) #11

If it says that, it's... not quite right. There are things we could do
to make some value copies more optimal. For example, any value type
containing multiple class references—or multiple other value types (such
as arrays or strings or dictionaries) that contain class references—will
cost more to copy than a single class reference does. At the cost of
some allocation and indirection, we could reduce the copying cost of
such values. It's an optimization we've considered making, but haven't
prioritized.

You can put a CoW wrapper around your value to do it manually. I hacked
one up using ManagedBuffer for someone at WWDC but I don't seem to have
saved the code, sadly.

Slightly off-topic, but one day I would like to see `indirect` turned
into a generalized COW feature:

* `indirect` can only be applied to a value type (or at least to a
type with `mutating` members, so reference types would have to gain
those).
* The value type is boxed in a reference type.
* Any use of a mutating member (and thus, use of the setter) is
guarded with `isKnownUniquelyReferenced` and a copy.
* `indirect` can be applied to an enum case with a payload (the
payload is boxed), a stored property (the value is boxed), or a type
(the entire type is boxed).

Then you can just slap `indirect` on a struct whose copying is too
complicated and let Swift transparently COW it for you. (And it would
also permit recursive structs and other such niceties.)

My vision for this feature is:

a. We indirect automatically based on some heuristic, as an
optimization.

b. We allow you to indirect manually.

c. We provide an attribute that suppresses automatic indirection to
whatever depth possible given resilience boundaries.

This all sounds great. Does any of this fit into Swift 4 (either
phase 1 or phase 2)? It seems like at least the automatic part would
have ABI impact.

Yes. In principle, all of it has the potential to fit in Swift 4. I'm
not sure what will actually happen of course.

Of course. :)

I asked mostly because I am wondering when it might be appropriate to start discussing these topics in more detail, and specifically if they fit into the first phase of Swift 4 whether we should start a thread now.

···

On Aug 3, 2016, at 3:48 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Wed Aug 03 2016, Matthew Johnson <swift-evolution@swift.org> wrote:

On Aug 2, 2016, at 4:54 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Tue Aug 02 2016, Brent Royal-Gordon <brent-AT-architechies.com> wrote:

On Aug 2, 2016, at 12:06 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

--
-Dave



(Joe Groff) #12

Then you can just slap `indirect` on a struct whose copying is too
complicated and let Swift transparently COW it for you. (And it would
also permit recursive structs and other such niceties.)

My vision for this feature is:

a. We indirect automatically based on some heuristic, as an
optimization.

I weakly disagree with this, because it is important that we provide a predictable model. I’d rather the user get what they write, and tell people to write ‘indirect’ as a performance tuning option. “Too magic” is bad.

I think 'indirect' structs with a heuristic default are important to the way people are writing Swift in practice. We've seen many users fully invest in value semantics types, because they want the benefits of isolated state, without appreciating the code size and performance impacts. Furthermore, implementing 'indirect' by hand is a lot of boilerplate. Putting indirectness entirely in users' hands feels to me a lot like the "value if word sized, const& if struct" heuristics C++ makes you internalize, since there are similar heuristics where 'indirect' is almost always a win in Swift too.

-Joe

···

On Aug 3, 2016, at 4:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
On Aug 3, 2016, at 6:41 AM, Matthew Johnson via swift-evolution <swift-evolution@swift.org> wrote:

b. We allow you to indirect manually.

c. We provide an attribute that suppresses automatic indirection to
whatever depth possible given resilience boundaries.

This all sounds great. Does any of this fit into Swift 4 (either phase 1 or phase 2)? It seems like at least the automatic part would have ABI impact.

This is very low priority, because it is generally additive. After the major topics for ABI stability are figured out, we can have the philosophic discussion about the automatic part.

-Chris


(Chris Lattner) #13

I understand with much of your motivation, but I still disagree with your conclusion. I see this as exactly analogous to the situation and discussion when we added indirect to enums. At the time, some argued for a magic model where the compiler figured out what to do in the most common “obvious” cases.

We agreed to use our current model though because:
1) Better to be explicit about allocations & indirection than implicit.
2) The compiler can guide the user in the “obvious” case to add the keyword with a fixit, preserving the discoverability / ease of use.
3) When indirection is necessary, there are choices to make about where the best place to do it is.
4) In the most common case, the “boilerplate” is a single “indirect” keyword added to the enum decl itself. In the less common case, you want the “boilerplate” so that you know where the indirections are happening.
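For readers who have not used the enum feature referred to here, a minimal sketch of both placements of the `indirect` keyword follows. The type and function names (`Tree`, `List`, `sum`, `length`) are made up for illustration and do not come from the thread.

```swift
// Whole-enum form: every case that carries a payload is stored indirectly.
indirect enum Tree {
    case leaf(Int)
    case node(Tree, Tree)
}

// Per-case form: only the recursive case pays for the indirection,
// so the declaration shows exactly where the boxing happens.
enum List {
    case empty
    indirect case cons(Int, List)
}

// Sum of all leaf values in a tree.
func sum(_ tree: Tree) -> Int {
    switch tree {
    case .leaf(let value):
        return value
    case .node(let left, let right):
        return sum(left) + sum(right)
    }
}

// Number of elements in a list.
func length(_ list: List) -> Int {
    switch list {
    case .empty:
        return 0
    case .cons(_, let rest):
        return 1 + length(rest)
    }
}

let tree = Tree.node(.leaf(1), .node(.leaf(2), .leaf(3)))
let list = List.cons(1, .cons(2, .empty))
```

Without `indirect`, both declarations are rejected because the enum would contain itself by value; that rejection is the "obvious" case in which the compiler can suggest adding the keyword.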

Overall, I think this model has worked well for enums and I’m still very happy with it. If you generalize it to structs, you also have to consider that this should be part of a larger model that includes better support for COW. I think it would be really unfortunate to “magically indirect” structs, when the right answer may actually be to COW them instead. I’d rather have a model where someone can use:

// simple, predictable, always inline, slow in some cases.
struct S1 { … }

And then upgrade to one of:

indirect struct S2 {…}
cow struct S3 { … }

Depending on the structure of their data. In any case, to reiterate, this really isn’t the time to have this debate, since it is clearly outside of stage 1.

-Chris

···

On Aug 3, 2016, at 7:57 PM, Joe Groff <jgroff@apple.com> wrote:



(Chris Lattner) #14

a. We indirect automatically based on some heuristic, as an
optimization.

I weakly disagree with this, because it is important that we provide a predictable model. I’d rather the user get what they write, and tell people to write ‘indirect’ as a performance tuning option. “Too magic” is bad.

I think 'indirect' structs with a heuristic default are important to the way people are writing Swift in practice. We've seen many users fully invest in value semantics types, because they want the benefits of isolated state, without appreciating the code size and performance impacts. Furthermore, implementing 'indirect' by hand is a lot of boilerplate. Putting indirectness entirely in users' hands feels to me a lot like the "value if word sized, const& if struct" heuristics C++ makes you internalize, since there are similar heuristics where 'indirect' is almost always a win in Swift too.

I understand with much of your motivation, but I still disagree with your conclusion.

^I understand and agree with much of your motivation...

-Chris

···

On Aug 3, 2016, at 8:46 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
On Aug 3, 2016, at 7:57 PM, Joe Groff <jgroff@apple.com> wrote:



(Goffredo Marocchi) #15

While I understand both your position and Joe's, I think it would be good if the Swift community at large, beyond this mailing list, gave more thought to the side effects and losses of moving everything from reference types to value types, and to how those can and should be tamed. (One could have argued that, with careful use of copy constructors and other solutions, reference types can be tamed, so why bother with a value type renaissance? People seemed to ship a lot of production code before...)

One example is the impact of value types on performance: copies were advertised as almost free in Dave's talks, but even when CoW is implemented smartly, as you have done with Array, Dictionary, etc., surprisingly large copies may still happen at times when users do not expect them.

Regardless of the outcome, I see debate on this, and putting yesterday's values to the test of today's data and knowledge, as pragmatic; it should not be seen as raining on the language's parade :).

Sorry for the rant-ish nature of my reply.

···

On 4 Aug 2016, at 04:46, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:



(Joe Groff) #16

In my mind, indirect *is* cow. An indirect struct without value semantics is a class, so there would be no reason to implement 'indirect' for structs without providing copy-on-write behavior.

I believe that the situation with structs and enums is also different. Indirecting enums has a bigger impact on interface because they enable recursive data structures, and while there are places where indirecting a struct may make new recursion possible, that's a much rarer reason to introduce indirectness for structs. Performance and code size are the more common reasons, and we've described how to build COW boxes manually to work around performance problems at the last two years' WWDC.

There are pretty good heuristics for when indirection almost always beats inline storage: once you have more than one refcounted field, passing around a box and retaining once becomes cheaper than retaining the fields individually. Once you exceed the fixed-sized buffer threshold of three words, indirecting some or all of your fields becomes necessary to avoid falling off a cliff in unspecialized generic or protocol-type-based code. Considering that we hope to explore other layout optimizations, such as automatically reordering fields to minimize padding, and that, as with padding, there are simple rules for indirecting that can be mechanically followed to get good results in the 99% case, it seems perfectly reasonable to me to automate this.
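To make the "boilerplate" under discussion concrete, here is a minimal sketch of a hand-written COW box. It is an illustration under assumptions, not the pattern from any particular WWDC session: `Storage`, `Person`, and `makeUnique` are hypothetical names, and only the standard library function `isKnownUniquelyReferenced` is a real API.

```swift
// Hypothetical heap storage holding two refcounted fields.
final class Storage {
    var name: String
    var tags: [String]

    init(name: String, tags: [String]) {
        self.name = name
        self.tags = tags
    }

    func copy() -> Storage {
        return Storage(name: name, tags: tags)
    }
}

// A struct with value semantics whose fields live behind a single reference.
struct Person {
    private var storage: Storage

    init(name: String, tags: [String]) {
        storage = Storage(name: name, tags: tags)
    }

    // Copy the storage only if another value still shares it.
    private mutating func makeUnique() {
        if !isKnownUniquelyReferenced(&storage) {
            storage = storage.copy()
        }
    }

    var name: String {
        get { return storage.name }
        set {
            makeUnique()
            storage.name = newValue
        }
    }

    var tags: [String] {
        get { return storage.tags }
        set {
            makeUnique()
            storage.tags = newValue
        }
    }
}

let a = Person(name: "Ann", tags: ["x"])
var b = a        // copies one reference; the fields are not retained individually
b.name = "Bob"   // storage is shared at this point, so it is copied before the write
```

Every stored property needs its own forwarding accessor with a `makeUnique()` call in the setter, which is the repetition an `indirect` struct would remove.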

-Joe

···

On Aug 3, 2016, at 8:46 PM, Chris Lattner <clattner@apple.com> wrote:



(Dave Abrahams) #17

a. We indirect automatically based on some heuristic, as an
optimization.

I weakly disagree with this, because it is important that we
provide a predictable model. I’d rather the user get what they
write, and tell people to write ‘indirect’ as a performance tuning
option. “Too magic” is bad.

I think 'indirect' structs with a heuristic default are important
to the way people are writing Swift in practice. We've seen many
users fully invest in value semantics types, because they want the
benefits of isolated state, without appreciating the code size and
performance impacts. Furthermore, implementing 'indirect' by hand
is a lot of boilerplate. Putting indirectness entirely in users'
hands feels to me a lot like the "value if word sized, const& if
struct" heuristics C++ makes you internalize, since there are
similar heuristics where 'indirect' is almost always a win in Swift
too.

I understand with much of your motivation, but I still disagree with your conclusion.

^I understand and agree with much of your motivation...

-Chris

I see this as exactly analogous to the situation and discussion
when we added indirect to enums. At the time, some argued for a
magic model where the compiler figured out what to do in the most
common “obvious” cases.

We agreed to use our current model though because:
1) Better to be explicit about allocations & indirection than implicit.
2) The compiler can guide the user in the “obvious” case to add the
keyword with a fixit, preserving the discoverability / ease of use.
3) When indirection is necessary, there are choices to make about
where the best place to do it is.
4) In the most common case, the “boilerplate” is a single “indirect”
keyword added to the enum decl itself. In the less common case, you
want the “boilerplate” so that you know where the indirections are
happening.

Overall, I think this model has worked well for enums and I’m still
very happy with it. If you generalize it to structs, you also have
to consider that this should be part of a larger model that includes
better support for COW. I think it would be really unfortunate to
“magically indirect” struct, when the right answer may actually be
to COW them instead.

COW'ing the struct is implied by indirecting it. You're not allowed to
break its value semantics just because it's being stored indirectly, and
we're darned sure not going to introduce an eagerly copied box there; I
think we all agree that the eager boxes we currently have must evolve
into COWs before ABI stability sets in.

I’d rather have a model where someone can use:

// simple, predictable, always inline, slow in some cases.
struct S1 { … }

And then upgrade to one of:

indirect struct S2 {…}
cow struct S3 { … }

Depending on the structure of their data. In any case, to
reiterate, this really isn’t the time to have this debate, since it
is clearly outside of stage 1.

Well, I don't want to draw it out either, but I do want to add one
point: the “predictable performance” argument rings pretty hollow for
me. There are already hard-to-anticipate performance cliffs wherever we
have an inline buffer (e.g. existentials), an opportunity for stack
promotion, or where we mutate a COW data structure that might turn out
to have non-uniquely-referenced storage. All of these effects add up to
code that performs well without too much intervention in most cases. We
should continue to make Swift perform well automatically, and give
people the tools they need to make adjustments when profiling reveals an
issue.

···

on Wed Aug 03 2016, Chris Lattner <clattner-AT-apple.com> wrote:

On Aug 3, 2016, at 8:46 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
On Aug 3, 2016, at 7:57 PM, Joe Groff <jgroff@apple.com> wrote:

--
-Dave



(Matthew Johnson) #19


In my mind, indirect *is* cow. An indirect struct without value semantics is a class, so there would be no reason to implement 'indirect' for structs without providing copy-on-write behavior.

This is my view as well. Chris, what is the distinction in your mind?

I believe that the situation with structs and enums is also different. Indirecting enums has a bigger impact on interface because they enable recursive data structures, and while there are places where indirecting a struct may make new recursion possible, that's much rarer of a reason to introduce indirectness for structs. Performance and code size are the more common reasons, and we've described how to build COW boxes manually to work around performance problems at the last two years' WWDC. There are pretty good heuristics for when indirection almost always beats inline storage: once you have more than one refcounted field, passing around a box and retaining once becomes cheaper than retaining the fields individually. Once you exceed the fixed-sized buffer threshold of three words, indirecting some or all of your fields becomes necessary to avoid falling off a cliff in unspecialized generic or protocol-type-based code. Considering that we hope to explore other layout optimizations, such as automatically reordering fields to minimize padding, and that, as with padding, there are simple rules for indirecting that can be mechanically followed to get good results in the 99% case, it seems perfectly reasonable to me to automate this.

-Joe

I think everyone is making good points in this discussion. Predictability is an important value, but so is default performance. To some degree there is a natural tension between them, but I think it can be mitigated.

Swift relies so heavily on the optimizer for performance that I don’t think the default performance is ever going to be perfectly predictable. But that’s actually a good thing, because it allows the compiler to provide *better* performance for unannotated code than it would otherwise be able to. We should strive to make the default characteristics, behaviors, heuristics, etc. as predictable as possible without compromising the goal of good performance by default. We’re already pretty far down this path. It’s not clear to me why indirect value types would be treated any differently. I don’t think anyone will complain as long as it is very rare for performance to be *worse* than the 100% predictable choice (always inline in this case).

It seems reasonable to me to expect developers who are reasoning about relatively low level performance details (i.e. not Big-O performance) to understand some lower level details of the language defaults. It is also important to offer tools for developers to take direct, manual control when desired to make performance and behavior as predictable as possible.

For example, if we commit to and document the size of the inline existential buffer, it is possible to reason about whether or not a value type is small enough to fit. If the indirection heuristic is relatively simple (such as exceeding the inline buffer size, having more than one ref counted field (including types implemented with CoW), etc.), the default behavior will still be reasonably predictable. These commitments don’t necessarily need to cover *every* case and don’t necessarily need to happen immediately, but hopefully the language will reach a stage of maturity where the core team feels confident in committing to some of the details that are relevant to common use cases.
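As a rough illustration of that kind of reasoning, the sketch below compares a type's size against an assumed inline buffer of three words. The three-word figure is taken from Joe's message in this thread and is treated here purely as an assumption, not a documented guarantee; `assumedInlineBufferSize`, `fitsAssumedInlineBuffer`, `Point3`, and `Rect` are hypothetical names. A real check would also have to consider alignment.

```swift
// Assumption from the thread: the inline buffer holds three machine words.
let assumedInlineBufferSize = 3 * MemoryLayout<Int>.size

struct Point3 {            // three words
    var x: Int
    var y: Int
    var z: Int
}

struct Rect {              // four words
    var x: Int
    var y: Int
    var width: Int
    var height: Int
}

// True when a value of type T is no larger than the assumed inline buffer.
func fitsAssumedInlineBuffer<T>(_ type: T.Type) -> Bool {
    return MemoryLayout<T>.size <= assumedInlineBufferSize
}

let pointFits = fitsAssumedInlineBuffer(Point3.self)
let rectFits = fitsAssumedInlineBuffer(Rect.self)
```

Under this assumption `Point3` would be stored inline and `Rect` would not, so adding a single `Int` field is enough to change how a value is stored in a protocol-typed variable.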

We just need to also support users that want / need complete predictability and optimal performance for their specific use case by allowing opt-in annotations that offer more precise control.

Matthew

···

On Aug 4, 2016, at 9:39 AM, Joe Groff <jgroff@apple.com> wrote:

On Aug 3, 2016, at 8:46 PM, Chris Lattner <clattner@apple.com> wrote:
On Aug 3, 2016, at 7:57 PM, Joe Groff <jgroff@apple.com> wrote:


(Johannes Neubauer) #20

I agree with this. First: IMHO indirect *should be* CoW, but currently it is not. If a value does not fit into the value buffer of an existential container, the value will be put onto the heap. If you store the same value into a second existential container (via an assignment to a variable of protocol type), it will be copied and put onto the heap *as a second indirectly stored value*, although no write has happened at all. Arnold Schwaighofer explained that very well in his talk at WWDC 2016 (if you need a link, just ask me).

If there is to be an automatic mechanism for indirect storage *and* CoW (which I would love), there of course have to be "tradeoff heuristics" for when to store a value directly and when to use indirect storage. Furthermore, there should be a *unique value pool* for each value type, where all (currently used) values of that type are stored (uniquely). I would even prefer that the "tradeoff heuristics" be applied up front by the compiler per type, not per variable. That means Swift would always use a container for value types, but there are two kinds of container: the value container and the existential container. The existential container stays as it is. For small values (at most as big as the value buffer), the value container is as big as it needs to be to store a value of the given type. If the value is bigger than the value buffer (or has more than one association to a reference type), the value container for this type is only as big as a reference, because values of this type will then **always** be stored on the heap with CoW. This way I can always assign a value to a variable typed with a protocol, since the value (or reference) will fit into the value buffer of the existential container. Additionally, CoW is available automatically for all types for which it "makes sense" (of course, annotations should be available to switch back to the current behavior if someone does not like this automatism). Last but not least, using the *unique value pool* for all value types that fall into the category CoW-abonga will be very space efficient.

Of course, if you create a new value of such a CoW-type, you need an *atomic lookup and set operation* in the value pool first checking whether it is already there (therefore a good (default) implementation of equality and hashable is a prerequisite) and either use the available value or in the other case add the new value to the pool.
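A minimal sketch of the lookup-and-set step described above might look like the following. It is only an illustration: `Interned`, `ValuePool`, and `intern` are hypothetical names, a plain `NSLock` stands in for whatever atomic mechanism a runtime would actually use, and the pool keeps strong references, so no eviction at reference count zero is attempted.

```swift
import Foundation

// Hypothetical heap box holding one unique value.
final class Interned<Value: Hashable> {
    let value: Value

    init(_ value: Value) {
        self.value = value
    }
}

// Hypothetical per-type pool: at most one box per distinct value.
final class ValuePool<Value: Hashable> {
    private var boxes: [Value: Interned<Value>] = [:]
    private let lock = NSLock()

    // Look up the value and insert it if missing, as one locked step.
    func intern(_ value: Value) -> Interned<Value> {
        lock.lock()
        defer { lock.unlock() }

        if let existing = boxes[value] {
            return existing          // reuse the box already in the pool
        }
        let box = Interned(value)
        boxes[value] = box           // first occurrence: add it to the pool
        return box
    }
}

let pool = ValuePool<String>()
let first = pool.intern("hello")
let second = pool.intern("hello")
let other = pool.intern("world")
```

The dictionary lookup is where the `Hashable` and equality prerequisite shows up: every creation of a pooled value pays for a hash, a comparison, and a lock acquisition.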

Such a value pool could even be used system-wide (some languages do this for Strings, Ints and other value types). These values have to be evicted if their reference count drops to `0`. For some values permanent storage or storage for some time even if they are currently not referenced like in a cache could be implemented in order to reduce heap allocations (e.g. Java does this for primitive type wrapper instances for boxing and unboxing).

I would really love this. It would affect ABI, so it is a (potential) candidate for Swift 4 Phase 1 right?

All the best
Johannes

···

On 04.08.2016, at 17:26, Matthew Johnson via swift-evolution <swift-evolution@swift.org> wrote:

On Aug 4, 2016, at 9:39 AM, Joe Groff <jgroff@apple.com> wrote:

On Aug 3, 2016, at 8:46 PM, Chris Lattner <clattner@apple.com> wrote:

On Aug 3, 2016, at 7:57 PM, Joe Groff <jgroff@apple.com> wrote:

a. We indirect automatically based on some heuristic, as an
optimization.

I weakly disagree with this, because it is important that we provide a predictable model. I’d rather the user get what they write, and tell people to write ‘indirect’ as a performance tuning option. “Too magic” is bad.

I think 'indirect' structs with a heuristic default are important to the way people are writing Swift in practice. We've seen many users fully invest in value semantics types, because they want the benefits of isolated state, without appreciating the code size and performance impacts. Furthermore, implementing 'indirect' by hand is a lot of boilerplate. Putting indirectness entirely in users' hands feels to me a lot like the "value if word sized, const& if struct" heuristics C++ makes you internalize, since there are similar heuristics where 'indirect' is almost always a win in Swift too.

I understand much of your motivation, but I still disagree with your conclusion. I see this as exactly analogous to the situation and discussion when we added indirect to enums. At the time, some argued for a magic model where the compiler figured out what to do in the most common “obvious” cases.

We agreed to use our current model though because:
1) Better to be explicit about allocations & indirection than implicit.
2) The compiler can guide the user in the “obvious” case to add the keyword with a fixit, preserving the discoverability / ease of use.
3) When indirection is necessary, there are choices to make about where the best place to do it is.
4) In the most common case, the “boilerplate” is a single “indirect” keyword added to the enum decl itself. In the less common case, you want the “boilerplate” so that you know where the indirections are happening.
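
As a small illustration of points 3 and 4, this sketch shows both placements of the keyword on enums. `Expr`, `List`, `evaluate` and `sum` are made-up names for the example.

```swift
// Common case: one `indirect` on the declaration boxes every case.
indirect enum Expr {
    case number(Int)
    case add(Expr, Expr)
    case multiply(Expr, Expr)
}

// Less common case: mark only the recursive case, so it is visible
// exactly where the indirection happens.
enum List {
    case empty
    indirect case node(Int, List)
}

func evaluate(_ expr: Expr) -> Int {
    switch expr {
    case .number(let n):
        return n
    case .add(let lhs, let rhs):
        return evaluate(lhs) + evaluate(rhs)
    case .multiply(let lhs, let rhs):
        return evaluate(lhs) * evaluate(rhs)
    }
}

func sum(_ list: List) -> Int {
    switch list {
    case .empty:
        return 0
    case .node(let head, let tail):
        return head + sum(tail)
    }
}

let expr = Expr.add(.number(2), .multiply(.number(3), .number(4)))
let list = List.node(1, .node(2, .empty))

print(evaluate(expr))   // 14
print(sum(list))        // 3
```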

Overall, I think this model has worked well for enums and I’m still very happy with it. If you generalize it to structs, you also have to consider that this should be part of a larger model that includes better support for COW. I think it would be really unfortunate to “magically indirect” structs, when the right answer may actually be to COW them instead. I’d rather have a model where someone can use:

// simple, predictable, always inline, slow in some cases.
struct S1 { … }

And then upgrade to one of:

indirect struct S2 {…}
cow struct S3 { … }

Depending on the structure of their data. In any case, to reiterate, this really isn’t the time to have this debate, since it is clearly outside of stage 1.

In my mind, indirect *is* cow. An indirect struct without value semantics is a class, so there would be no reason to implement 'indirect' for structs without providing copy-on-write behavior.

This is my view as well. Chris, what is the distinction in your mind?

I believe that the situation with structs and enums is also different. Indirecting enums has a bigger impact on interface because they enable recursive data structures, and while there are places where indirecting a struct may make new recursion possible, that's much rarer of a reason to introduce indirectness for structs. Performance and code size are the more common reasons, and we've described how to build COW boxes manually to work around performance problems at the last two years' WWDC. There are pretty good heuristics for when indirection almost always beats inline storage: once you have more than one refcounted field, passing around a box and retaining once becomes cheaper than retaining the fields individually. Once you exceed the fixed-sized buffer threshold of three words, indirecting some or all of your fields becomes necessary to avoid falling off a cliff in unspecialized generic or protocol-type-based code. Considering that we hope to explore other layout optimizations, such as automatically reordering fields to minimize padding, and that, as with padding, there are simple rules for indirecting that can be mechanically followed to get good results in the 99% case, it seems perfectly reasonable to me to automate this.
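
For readers who have not seen the manual workaround, a COW box can be sketched roughly as follows. `Storage` and `Record` are hypothetical names for illustration; the only standard-library call involved is `isKnownUniquelyReferenced`.

```swift
// Hypothetical heap box holding fields that would otherwise live inline.
// Both fields are refcounted, which is the "more than one refcounted
// field" situation described above.
final class Storage {
    var name: String
    var tags: [String]

    init(name: String, tags: [String]) {
        self.name = name
        self.tags = tags
    }

    func copy() -> Storage {
        return Storage(name: name, tags: tags)
    }
}

struct Record {
    // The struct itself is a single reference wide.
    private var storage: Storage

    init(name: String, tags: [String]) {
        storage = Storage(name: name, tags: tags)
    }

    // Copy the box only when another value still shares it.
    private mutating func makeUnique() {
        if !isKnownUniquelyReferenced(&storage) {
            storage = storage.copy()
        }
    }

    var name: String {
        get { return storage.name }
        set { makeUnique(); storage.name = newValue }
    }

    var tags: [String] {
        get { return storage.tags }
        set { makeUnique(); storage.tags = newValue }
    }
}

let first = Record(name: "a", tags: ["x"])
var second = first      // copies one reference: a single retain
second.name = "b"       // box is shared here, so it is copied first

print(first.name)       // a
print(second.name)      // b
```

This is the boilerplate that an automatic or annotated form of indirection would generate on the user's behalf.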

-Joe

I think everyone is making good points in this discussion. Predictability is an important value, but so is default performance. To some degree there is a natural tension between them, but I think it can be mitigated.

Swift relies so heavily on the optimizer for performance that I don’t think the default performance is ever going to be perfectly predictable. But that’s actually a good thing, because this allows the compiler to provide *better* performance for unannotated code than it would otherwise be able to do. We should strive to make the default characteristics, behaviors, heuristics, etc. as predictable as possible without compromising the goal of good performance by default. We’re already pretty far down this path. It’s not clear to me why indirect value types would be treated any differently. I don’t think anyone will complain as long as it is very rare for performance to be *worse* than the 100% predictable choice (always inline in this case).

It seems reasonable to me to expect developers who are reasoning about relatively low level performance details (i.e. not Big-O performance) to understand some lower level details of the language defaults. It is also important to offer tools for developers to take direct, manual control when desired to make performance and behavior as predictable as possible.

For example, if we commit to and document the size of the inline existential buffer, it is possible to reason about whether or not a value type is small enough to fit. If the indirection heuristic is relatively simple (such as exceeding the inline buffer size, having more than one ref counted field (including types implemented with CoW), etc.), the default behavior will still be reasonably predictable. These commitments don’t necessarily need to cover *every* case and don’t necessarily need to happen immediately, but hopefully the language will reach a stage of maturity where the core team feels confident in committing to some of the details that are relevant to common use cases.
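
As a rough sketch of that kind of reasoning, the size rule alone can be checked with `MemoryLayout`. The three-word figure is the one mentioned earlier in this thread and is an assumption here, not a documented guarantee; `fitsInline`, `Small` and `Large` are hypothetical names, and the refcounted-field rule is not modelled.

```swift
// Assumption from this thread: the inline existential buffer is three words.
let inlineBufferSize = 3 * MemoryLayout<Int>.size

struct Small {
    var x: Int
    var y: Int
}

struct Large {
    var a: Int
    var b: Int
    var c: Int
    var d: Int
}

// Hypothetical check covering only the size rule of the heuristic.
func fitsInline<T>(_ type: T.Type) -> Bool {
    return MemoryLayout<T>.size <= inlineBufferSize
}

print(fitsInline(Small.self))   // true: two words
print(fitsInline(Large.self))   // false: four words
```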

We just need to also support users that want / need complete predictability and optimal performance for their specific use case by allowing opt-in annotations that offer more precise control.