Improved value and move semantics

To my knowledge, no language other than Swift offers this form of custom value types that can implement protocols, and the need for CoW for "big values" is apparent. So why do you think that, just because the tradeoff did not pay off in other languages (like Java), which have only a limited set of fixed value types (plus a completely different memory model and a virtual machine instead of LLVM), it would not be worth evaluating in Swift?

I think this is comparing apples to oranges (and the people from Apple don’t like this ;) ). Furthermore, the Java folks are working on custom value types too (the Valhalla project)…

All the best
Johannes

On 04.08.2016 at 20:21, Joe Groff <jgroff@apple.com> wrote:

On Aug 4, 2016, at 11:20 AM, Johannes Neubauer <neubauer@kingsware.de> wrote:

On 04.08.2016 at 17:26, Matthew Johnson via swift-evolution <swift-evolution@swift.org> wrote:

On Aug 4, 2016, at 9:39 AM, Joe Groff <jgroff@apple.com> wrote:

On Aug 3, 2016, at 8:46 PM, Chris Lattner <clattner@apple.com> wrote:

On Aug 3, 2016, at 7:57 PM, Joe Groff <jgroff@apple.com> wrote:

a. We indirect automatically based on some heuristic, as an optimization.

I weakly disagree with this, because it is important that we provide a predictable model. I’d rather the user get what they write, and tell people to write ‘indirect’ as a performance tuning option. “Too magic” is bad.

I think 'indirect' structs with a heuristic default are important to the way people are writing Swift in practice. We've seen many users fully invest in value semantics types, because they want the benefits of isolated state, without appreciating the code size and performance impacts. Furthermore, implementing 'indirect' by hand is a lot of boilerplate. Putting indirectness entirely in users' hands feels to me a lot like the "value if word sized, const& if struct" heuristics C++ makes you internalize, since there are similar heuristics where 'indirect' is almost always a win in Swift too.

I understand much of your motivation, but I still disagree with your conclusion. I see this as exactly analogous to the situation and discussion when we added indirect to enums. At the time, some argued for a magic model where the compiler figured out what to do in the most common “obvious” cases.

We agreed to use our current model though because:
1) Better to be explicit about allocations & indirection than implicit.
2) The compiler can guide the user in the “obvious” case to add the keyword with a fixit, preserving the discoverability / ease of use.
3) When indirection is necessary, there are choices to make about where the best place to do it is.
4) In the most common case, the “boilerplate” is a single “indirect” keyword added to the enum decl itself. In the less common case, you want the “boilerplate” so that you know where the indirections are happening.
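
For reference, the enum model described in points 1 to 4 looks roughly like this in today's Swift. The type names `Expr` and `Tree` and the function `evaluate` are made up for illustration and do not come from the thread; this is a minimal sketch, not a statement about how the compiler lays these types out.

```swift
// Hypothetical example types; only the `indirect` keyword itself is the
// language feature under discussion.

// Common case: one `indirect` keyword on the enum declaration boxes
// every case that carries a payload.
indirect enum Expr {
    case number(Int)
    case add(Expr, Expr)
    case multiply(Expr, Expr)
}

// Less common case: the author picks where the indirection happens,
// so only the recursive case is marked.
enum Tree {
    case leaf(Int)
    indirect case node(Tree, Tree)
}

// Recursively evaluates an expression tree.
func evaluate(_ expr: Expr) -> Int {
    switch expr {
    case .number(let n):
        return n
    case .add(let lhs, let rhs):
        return evaluate(lhs) + evaluate(rhs)
    case .multiply(let lhs, let rhs):
        return evaluate(lhs) * evaluate(rhs)
    }
}
```

Without `indirect`, both declarations are rejected because a recursive enum would otherwise need infinite inline storage.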

Overall, I think this model has worked well for enums and I’m still very happy with it. If you generalize it to structs, you also have to consider that this should be part of a larger model that includes better support for COW. I think it would be really unfortunate to “magically indirect” structs, when the right answer may actually be to COW them instead. I’d rather have a model where someone can use:

// simple, predictable, always inline, slow in some cases.
struct S1 { … }

And then upgrade to one of:

indirect struct S2 {…}
cow struct S3 { … }

Depending on the structure of their data. In any case, to reiterate, this really isn’t the time to have this debate, since it is clearly outside of stage 1.

In my mind, indirect *is* cow. An indirect struct without value semantics is a class, so there would be no reason to implement 'indirect' for structs without providing copy-on-write behavior.

This is my view as well. Chris, what is the distinction in your mind?

I believe that the situation with structs and enums is also different. Indirecting enums has a bigger impact on interface because they enable recursive data structures, and while there are places where indirecting a struct may make new recursion possible, that's a much rarer reason to introduce indirectness for structs. Performance and code size are the more common reasons, and we've described how to build COW boxes manually to work around performance problems at the last two years' WWDC. There are pretty good heuristics for when indirection almost always beats inline storage: once you have more than one refcounted field, passing around a box and retaining once becomes cheaper than retaining the fields individually. Once you exceed the fixed-sized buffer threshold of three words, indirecting some or all of your fields becomes necessary to avoid falling off a cliff in unspecialized generic or protocol-type-based code. Considering that we hope to explore other layout optimizations, such as automatically reordering fields to minimize padding, and that, as with padding, there are simple rules for indirecting that can be mechanically followed to get good results in the 99% case, it seems perfectly reasonable to me to automate this.
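
A hand-written COW box of the kind mentioned above might look like the following. This is only a sketch under assumptions: the names `Box`, `Payload` and `Big` are hypothetical, and the only standard-library call relied on is `isKnownUniquelyReferenced`. It is not necessarily the exact pattern shown in the WWDC sessions.

```swift
// Hypothetical names (Box, Payload, Big); a sketch of manual copy-on-write.

// Reference-counted heap storage for the wrapped value.
final class Box<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

// Stand-in for a "big" value: four words, more than the three-word
// inline buffer mentioned in the thread.
struct Payload {
    var a = 0, b = 0, c = 0, d = 0
}

// One word wide from the outside; the payload lives behind the box.
struct Big {
    private var box: Box<Payload>

    init(_ payload: Payload) {
        box = Box(payload)
    }

    var payload: Payload {
        get { return box.value }
        set {
            if isKnownUniquelyReferenced(&box) {
                // Nobody else shares the storage: mutate in place.
                box.value = newValue
            } else {
                // Storage is shared: allocate a fresh box for this copy.
                box = Box(newValue)
            }
        }
    }
}
```

Copying a `Big` only retains one reference; the payload itself is duplicated lazily, on the first write through a copy that shares its box. This is the boilerplate that an automatic or keyword-driven `indirect` for structs would presumably generate.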

-Joe

I think everyone is making good points in this discussion. Predictability is an important value, but so is default performance. To some degree there is a natural tension between them, but I think it can be mitigated.

Swift relies so heavily on the optimizer for performance that I don’t think the default performance is ever going to be perfectly predictable. But that’s actually a good thing, as this allows the compiler to provide *better* performance for unannotated code than it would otherwise be able to do. We should strive to make the default characteristics, behaviors, heuristics, etc. as predictable as possible without compromising the goal of good performance by default. We’re already pretty far down this path. It’s not clear to me why indirect value types would be treated any differently. I don’t think anyone will complain as long as it is very rare for performance to be *worse* than the 100% predictable choice (always inline in this case).

It seems reasonable to me to expect developers who are reasoning about relatively low level performance details (i.e. not Big-O performance) to understand some lower level details of the language defaults. It is also important to offer tools for developers to take direct, manual control when desired to make performance and behavior as predictable as possible.

For example, if we commit to and document the size of the inline existential buffer, it is possible to reason about whether or not a value type is small enough to fit. If the indirection heuristic is relatively simple (such as exceeding the inline buffer size, having more than one ref counted field (including types implemented with CoW), etc.), the default behavior will still be reasonably predictable. These commitments don’t necessarily need to cover *every* case and don’t necessarily need to happen immediately, but hopefully the language will reach a stage of maturity where the core team feels confident in committing to some of the details that are relevant to common use cases.
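
To illustrate the kind of reasoning this would enable, here is a rough size check. It assumes the three-word inline buffer mentioned earlier in the thread, which is not a documented guarantee; `inlineBufferSize`, `fitsInline`, `Small` and `Large` are hypothetical names, and the check looks at size only, ignoring alignment and other layout details.

```swift
// Assumption: the inline existential buffer holds three words, as stated
// in the thread. This is not a documented guarantee.
let inlineBufferSize = 3 * MemoryLayout<Int>.size

// Hypothetical example types.
struct Small { var x = 0, y = 0 }                // two words
struct Large { var a = 0, b = 0, c = 0, d = 0 }  // four words

// Rough check: does a value of this type fit in the assumed inline buffer?
func fitsInline<T>(_ type: T.Type) -> Bool {
    return MemoryLayout<T>.size <= inlineBufferSize
}
```

Under that assumption, `fitsInline(Small.self)` is true and `fitsInline(Large.self)` is false, so `Large` would be a candidate for the indirection heuristic.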

We just need to also support users that want / need complete predictability and optimal performance for their specific use case by allowing opt-in annotations that offer more precise control.

I agree with this. First: IMHO indirect *should be* CoW, but currently it is not. If a value does not fit into the value buffer of an existential container, the value will be put onto the heap. If you store the same value into a second existential container (via an assignment to a variable of protocol type), it will be copied and put *as a second indirectly stored value* onto the heap, although no write has happened at all. Arnold Schwaighofer explained that very well in his talk at WWDC 2016 (if you need a link, just ask me).

If there is going to be an automatic mechanism for indirect storage *and* CoW (which I would love), there of course have to be “tradeoff heuristics” for when to store a value directly and when to use indirect storage. Furthermore, there should be a *unique value pool* for each value type, in which all (currently used) values of that type are stored (uniquely). I would even prefer that the “tradeoff heuristics” are applied upfront by the compiler per type, not per variable. That means Swift would always use a container for value types, but there would be two kinds of containers: the value container and the existential container. The existential container stays as it is. For small values (at most as big as the value buffer), the value container is as big as it needs to be to store a value of the given type. If the value is bigger than the value buffer (or has more than one association to a reference type), the value container for this type is only as big as a reference, because values of this type will then **always** be stored on the heap with CoW. This way I can always assign a value to a variable typed with a protocol, since the value (or reference) will fit into the value buffer of the existential container. Additionally, CoW is available automatically for all types for which it “makes sense” (of course, annotations should be available to switch back to the current “behavior” if someone does not like this automatism). Last but not least, using the *unique value pool* for all value types that fall into the category CoW-abonga will be very space efficient.

Of course, if you create a new value of such a CoW-type, you need an *atomic lookup and set operation* in the value pool that first checks whether the value is already there (so a good (default) implementation of equality and hashing is a prerequisite) and then either uses the available value or, otherwise, adds the new value to the pool.
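
As a rough illustration of that lookup-and-set step, a pool could be sketched as below. Everything here is an assumption made for illustration: `PooledBox`, `ValuePool` and `intern` are hypothetical names, the atomicity comes from a plain `NSLock` rather than any particular lock-free scheme, and eviction of unreferenced values is left out entirely.

```swift
import Foundation

// Hypothetical names (PooledBox, ValuePool, intern); a sketch only.

// Immutable heap storage shared by all users of an equal value.
final class PooledBox<Value> {
    let value: Value
    init(_ value: Value) { self.value = value }
}

// Keeps at most one box per distinct value of a Hashable type.
final class ValuePool<Value: Hashable> {
    private var storage: [Value: PooledBox<Value>] = [:]
    private let lock = NSLock()

    // Looks the value up and inserts it if missing, as one step under the lock.
    func intern(_ value: Value) -> PooledBox<Value> {
        lock.lock()
        defer { lock.unlock() }
        if let existing = storage[value] {
            return existing        // Reuse the stored value.
        }
        let box = PooledBox(value)
        storage[value] = box       // First occurrence: add it to the pool.
        return box
    }
}
```

Every creation of a pooled value pays for hashing, an equality check and the lock, which is the cost side of the tradeoff being discussed.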

Such a value pool could even be used system-wide (some languages do this for Strings, Ints and other value types). These values have to be evicted if their reference count drops to `0`. For some values permanent storage or storage for some time even if they are currently not referenced like in a cache could be implemented in order to reduce heap allocations (e.g. Java does this for primitive type wrapper instances for boxing and unboxing).

I would really love this. It would affect ABI, so it is a (potential) candidate for Swift 4 Phase 1, right?

I know some Java VM implementations have attempted global uniquing of strings, but from what I've heard, nobody has done it in a way that's worth the performance and complexity tradeoffs.