Should we rename "class" when referring to protocol conformance?

dabrahams · May 7, 2016, 9:04pm

I've been thinking about this further and can now state my position more clearly
and concisely.

1. If we're going to have reference types with value semantics the boundary of
the value must extend through the reference to the value of the object. Two
instances may have the same logical value so reference equality is not good
enough.

My (radical) position has been that we should decree that if you really
want this thing to have value semantics, it should be a struct. That
is, wrap your reference type in a struct and provide an == that looks at
what's in the instance. This radically simplifies the model because we
can then assume that value types have value semantics and reference
types only have value semantics if you view their identity as their
value.
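
As a rough illustration of the "wrap it in a struct" position, here is a minimal sketch in current Swift syntax (not the Swift 2.2 of this thread). The names `Person` and `PersonStorage` are hypothetical and do not come from the discussion; the storage class is assumed to be immutable.

```swift
// Hypothetical reference type holding the data; immutable by assumption.
final class PersonStorage {
    let name: String
    let age: Int
    init(name: String, age: Int) {
        self.name = name
        self.age = age
    }
}

// The struct is the value-semantic face of the reference type.
struct Person: Equatable {
    private let storage: PersonStorage

    init(name: String, age: Int) {
        storage = PersonStorage(name: name, age: age)
    }

    var name: String { return storage.name }
    var age: Int { return storage.age }

    // Equality looks through the reference at what is in the instance.
    static func == (lhs: Person, rhs: Person) -> Bool {
        return lhs.storage.name == rhs.storage.name
            && lhs.storage.age == rhs.storage.age
    }
}

let a = Person(name: "Ada", age: 36)
let b = Person(name: "Ada", age: 36)
let c = Person(name: "Ada", age: 37)
print(a == b)  // true: distinct storage objects, same logical value
print(a == c)  // false
```

Under this arrangement the class itself never needs a custom `==`; only the wrapper defines what the value is.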

2. Value types are not "pure" values if any part of the aggregate contains a
reference whose type does not have value semantics.

Then Array<Int> is not a “pure” value (the buffer contained in an
Array<Int> is a mutable reference type that on its own, definitely does
*not* have value semantics). I don't think this is what you intend, and
it indicates that you need to keep working on your definition.

Purity must include the entire aggregate. Array<UIView> has value
semantics but it is not a pure value.

In what sense does it have value semantics? Unless we can define
equality for Array<UIView> it's hard to make any claim about its value
semantics.

The primary reasons I can think of for creating reference types with value
semantics are avoiding copying everything all the time or using inheritance. (I
could also list pre-existing types here but am not as concerned with those)

One could argue that you can avoid copying by writing a struct with a handle and
one can simulate inheritance by embedding and forwarding. The problem is that
this involves a lot of boilerplate and makes your code more complex.

The “forwarding boilerplate problem” is something we need to solve in
the language regardless. The fact that we don't have an answer today
shouldn't prevent us from adopting the right model for values and
references.

For something like the standard library these concerns are far
outweighed by the benefit we all gain by having our collections be
value types. However, in application code the benefit may not be worth
the cost thus it may be reasonable to prefer immutable objects.

I think there is a viable path for enhancing the language such that there is
little or no reason to implement a value semantic type as a reference type. If
we were able to declare value types as "indirect" and / or have a compiler
supported Box (probably with syntactic sugar) that automatically forwarded
calls, performed CoW, etc this would allow us much more control over copying
without requiring boilerplate. We could also add something along the lines of
Go's embedding (or a more general forwarding mechanism which is my preference)
which would likely address many of the reasons for using inheritance in a value
semantic reference type.

If we do go down that path I think the case that value semantic types should be
implemented as value types, thus reference equality should be the default
equality for reference types gets much stronger. In that hypothetical future
Swift we might even be able to go so far as saying that reference types with
value semantics are an anti-pattern and "outlaw" them. This would allow us to
simply say "reference types have reference semantics".

We might also be able to get to a place where we can "outlaw" value types that
do not have value semantics. I haven't thought deeply about that so I'm not
certain of the implications, particularly with regards to C interop. IIRC Dave A
indicated he would like to see this happen. If this is possible, we may
eventually have a language where "value types have value semantics", "some value
types are pure values", and "reference types have reference semantics and are
never pure values". If it is achievable it would be a significant step forward
in simplicity and clarity.

So far, I still don't believe that introducing a “pure values” distinction is
adding simplicity and clarity. To me it looks like a needless wrinkle.

···

on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

Matthew


On May 7, 2016, at 11:17 AM, Matthew Johnson <matthew@anandabits.com> wrote:


    On May 7, 2016, at 2:21 AM, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:

            On May 6, 2016, at 5:48 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

                I don’t mean to imply that it is the *only* valuable
                property. However, I (and many others) do believe it is an
                extremely valuable
                property in many cases. Do you disagree?

            I think I do. What is valuable about such a protocol? What generic
            algorithms could you write that work on models of PureValue but
            don't
            work just as well on Array<Int>?

        class Storage {
            var element: Int = 0
        }

        struct Value {
            var storage: Storage
        }

        func amIPure(v: Value) -> Int {
            v.storage.element = 3
            return v.storage.element
        }

        I (the optimizer) want to know if 'amIPure' is a pure function. The
        developer needs to tell me where the boundaries of the value lie. Does
        'storage' lie inside the Value, or outside? If it is inside, then Value
        is a 'PureValue' and 'amIPure' is a pure function. To enforce that, the
        developer will need to implement CoW, or we need to add some language
        features.

    Thank you for this clear exposition of how PureValue relates to pure
    functions. This is the exact intuition I have about it but you have stated
    it much more clearly.

    Language features to help automate CoW would be great. It would eliminate
    boilerplate, but more importantly it would likely provide more information
    to the compiler.
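
For readers unfamiliar with the boilerplate being discussed, the following is a minimal sketch of hand-written copy-on-write for the `Value`/`Storage` pair above. It uses `isKnownUniquelyReferenced` from the current standard library (Swift 2.2 spelled this differently), and the `element` accessor on `Value` is an addition for illustration, not part of Andy's example.

```swift
final class Storage {
    var element: Int
    init(element: Int) { self.element = element }
}

struct Value {
    // The storage is kept inside the value's boundary by hiding it.
    private var storage = Storage(element: 0)

    init() {}

    var element: Int {
        get { return storage.element }
        set {
            // Copy the storage first if another Value still shares it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(element: storage.element)
            }
            storage.element = newValue
        }
    }
}

var a = Value()
let b = a        // a and b share one Storage object
a.element = 3    // triggers the copy, so b is unaffected
print(a.element, b.element)  // 3 0
```

With the storage hidden and every write guarded this way, no mutation made through one `Value` can be observed through another, which is the property an optimizer would want to be told about.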

        If I know about every operation inside 'amIPure', and know where the
        value's boundary is, then I don't really need to know that 'Value' is a
        'PureValue'. For example, I know that this function is pure without
        caring about 'PureValue'.

        func IAmPure(v: Value, s: Storage) -> Int {
            var t = v
            t.storage = s
            return t.storage.element
        }

        However, I might only have summary information. I might know that the
        function only writes to memory reachable from Value. In that case, it
        would be nice to have summary information about the storage type.
        'PureValue' is another way of saying that it does not contain references
        to objects outside the value's boundary (I would add that it cannot have
        a user-defined deinit). The only thing vague about that is that we don't
        have a general way for the developer to define the value's boundary. It
        certainly should be consistent with '==', but implementing '==' doesn't
        tell the optimizer anything.

    I think the ability to define the value's boundary would be wonderful. If we
    added a way to do this it would be a requirement of PureValue.

        Anyway, these are only optimizer concerns, and the programming model
        should take precedence in these discussions. But I thought that might help.

        -Andy


--
-Dave

cloutiertyler · May 7, 2016, 10:08pm

       Swift’s collections also accomplish this through copying, but only when
       the
       elements they contain also have the same property.

   Only if you think mutable class instances are part of the value of the
   array that stores references to those class instances. As I said
   earlier, you can *take* that point of view, but why would you want to?
   Today, we have left that question wide open, which makes the whole
   notion of what is a logical value very indistinct. I am proposing to
   close it.

       On the other hand, it is immediately obvious that non-local mutation
       is quite possible in the elements of a Swift Array<AnyObject> unless
       they are all uniquely referenced.

   If you interpret the elements of the array as being *references* to
   objects, there is no possibility of non-local mutation. If you
   interpret the elements as being objects *themselves*, then you've got
   problems.

This does not make sense, because you’ve got problems either way. You are
arguing, essentially, that everything is a value type because
references/pointers are a value.

I am arguing that every type can be viewed as a value, allowing us to
preserve a sense in which Array<T> has value semantics irrespective of
the details of T.

If that were the case then the *only* valid way to compare the
equality of types would be to compare their values. Overriding the
equality operator would inherently violate the property of
immutability, i.e. two immutable objects can change their equality
even without mutation of their “values”.

Not at all. In my world, you can override equality such that it
includes referenced storage when either:

1. the referenced storage will not be mutated
2. or, the referenced storage will only be mutated when uniquely-referenced.

func ==(lhs, rhs) {
...
}

class MyClass {
    var a: Int
    ...
}

let x = MyClass(a: 5)
let y = MyClass(a: 5)

x == y // true
y.a = 6
x == y // false

I don't understand what point you're trying to make, here. I see that x
and y are immutable. Notwithstanding the fact that the language tries to
hide the difference between a reference and the instance to which it
refers from the user (the difference would be clearer if you had to
write y->a = 6 as in C, but it's still there), that immutability doesn't
extend beyond the variable binding. The class instance to which y
refers, as you've ably demonstrated, is mutable.

The point I’m trying to make is that in the above code, I am able to violate rule 1 of your world insofar as I am including referenced storage in my definition of equality which can be mutated even though my reference is immutable.
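
A runnable version of the snippet above might look like the following sketch. The thread elides the bodies of `==` and `MyClass`; the field-by-field comparison and the initializer here are assumptions filled in only to make the behaviour observable, written in current Swift syntax.

```swift
final class MyClass: Equatable {
    var a: Int
    init(a: Int) { self.a = a }

    // Assumed definition: equality includes mutable referenced storage.
    static func == (lhs: MyClass, rhs: MyClass) -> Bool {
        return lhs.a == rhs.a
    }
}

let x = MyClass(a: 5)
let y = MyClass(a: 5)

let before = (x == y)  // true
y.a = 6                // allowed: `let` fixes the reference, not the instance
let after = (x == y)   // false
print(before, after)   // true false
```

Neither binding was reassigned, yet the result of `x == y` changed, which is the decoupling of equality from immutability being described.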

   Are you arguing that reference types should be equatable by default,
   using
   equality of the reference if the type does not provide a custom
   definition of
   equality?

   Yes!!

Custom definitions of equality, inherently, decouple immutability from
equality,

Not a bit. They certainly *can* do that, if we allow it, but I am
proposing to ban that. There are still useful custom definitions of
equality as I have outlined above.

If you’re proposing to ban that, then I may have misunderstood your position. I think we are in agreement on that, however… (more below)

as shown above. Swift makes it appear as though references and values
are on the same level in a way that C does not.

Yep. It's an illusion that breaks down at the edges and can be really
problematic if users fully embrace it. You can't write a generic
algorithm with well-defined semantics that does mutation-by-part on
instances of T without constraining T to have value or reference semantics.
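
The claim about mutation-by-part can be made concrete with a small sketch. The protocol `Counter` and both conforming types are hypothetical, invented here for illustration; the generic function is identical for both, but what it does depends on whether `T` has value or reference semantics.

```swift
protocol Counter {
    var count: Int { get set }
}

struct ValueCounter: Counter {
    var count = 0
}

final class ReferenceCounter: Counter {
    var count = 0
}

// Copies its argument, mutates one part of the copy, and reports
// whether the original still holds its old value.
func originalSurvivesMutation<T: Counter>(_ original: T) -> Bool {
    var copy = original
    copy.count += 1
    return original.count == 0
}

print(originalSurvivesMutation(ValueCounter()))      // true
print(originalSurvivesMutation(ReferenceCounter()))  // false
```

Without a constraint on `T`, the author of the generic function cannot say which of the two results callers should expect.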

I am not advocating that we require “y->a” for class member access, but
I *am* suggesting that we should accept the fact that reference and
value semantics are fundamentally different and make design choices
(including language rules) accordingly.

let x = MyStruct()
let y = MyClass()

x.myFoo
y.myFoo

vs

my_struct *x = …
my_struct y = …

x.my_foo
y->my_foo

With C it is explicit that you are crossing a reference. Thus there is only
*one* type of equality in C, that the values *are equal*.

Well, C doesn't even *have* struct equality; see “How do you compare
structs for equality in C?” on Stack Overflow.

This is exactly the type of equality you are referring to, but this does
not carry over to Swift for precisely the reason that Swift paves over
the difference between value and reference types, and then allows you
to redefine equality.

Therefore, in essentially no circumstances does it make sense to
compare a type by its reference if it has any associated data in
Swift.

Disagreed. Instances whose *identity* is significant, i.e. basically
everything that actually ought to be a class, can be very usefully
compared by their references. For example, if equality and hashing were
defined for UIViews, based on their references, you could use a
Set<UIView> to keep track of which views had user interaction during a
given time interval.
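
A minimal sketch of that use case, assuming a hypothetical `View` class as a stand-in for UIView so that it runs without UIKit. Equality and hashing are defined purely in terms of the reference, using `ObjectIdentifier` from the standard library.

```swift
// Hypothetical stand-in for UIView.
final class View: Hashable {
    var title: String
    init(title: String) { self.title = title }

    // Equality is identity of the reference.
    static func == (lhs: View, rhs: View) -> Bool {
        return lhs === rhs
    }

    // Hashing is consistent with identity-based equality.
    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(self))
    }
}

let ok = View(title: "OK")
let lookalike = View(title: "OK")  // same contents, different identity

var touched = Set<View>()
touched.insert(ok)
ok.title = "Done"                  // mutation does not disturb membership

print(touched.contains(ok))         // true
print(touched.contains(lookalike))  // false
```

Because the hash depends only on identity, mutating the instance after insertion cannot corrupt the set.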

This use case is exploiting the fact that the reference is a unique identifier for a view. For any distributed application this is no longer true for objects. Equality should be used to uniquely define data.

In a non-distributed application, comparing references is also an implicit comparison of the entire object graph referenced by that reference. When you allow any other definition of equality for reference types, unless that comparison explicitly includes all values in each object’s referenced object graph, it is only *partial* equality. Thus custom equality is a lie that should probably be expressed with something more like ~=. It’s only equal up to the boundary, which is arbitrarily defined.

So yes, I agree that at least equality should be consistent with immutability, but in my opinion the only way to accomplish that is to ban custom equality.

I’m of the opinion that there are only two ways of accomplishing this.

EITHER

One could imagine a definition of equality that did explore the entire object graph comparing values (only using references to find other values, not for comparison) as it went. However, this would not be able to align with the semantics of immutability (maybe by only allowing a single entry point into the graph which was guaranteed to be a unique reference?).

OR

=== should be the only valid equality operator for classes (and you’re right it should be spelled ==), and that if you want to compare classes you should just put all of the data that acts as the “identity” of that class in a value type which can be compared by value. Value types could then have generated equality operators based on the equality of each of their constituent values, some of which could be references (but as I mentioned including references in the identity does not work for distributed applications).

let x = MyClass(…)
let y = MyClass(…)

x.identityStruct == y.identityStruct
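
Spelled out, the second option might look like the sketch below. The thread elides the initializer arguments, so `Account`, `AccountIdentity`, and their fields are hypothetical; the struct's `==` is the compiler-synthesized one available in current Swift, standing in for the "generated equality operators" mentioned above.

```swift
// Hypothetical value type carrying the data that acts as the identity.
struct AccountIdentity: Equatable {
    let number: String
    let holder: String
}

final class Account {
    let identityStruct: AccountIdentity
    var cachedBalance: Int = 0  // mutable state, not part of the identity

    init(number: String, holder: String) {
        identityStruct = AccountIdentity(number: number, holder: holder)
    }
}

let x = Account(number: "42", holder: "Ada")
let y = Account(number: "42", holder: "Ada")
y.cachedBalance = 100

print(x.identityStruct == y.identityStruct)  // true: compared by value
print(x === y)                               // false: different instances
```

The class is only ever compared by reference, and anything that should be compared by value lives in a value type.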

As it stands now in Swift, a class is more than just a reference. It also includes all sorts of assumptions about its associated data based on the fact that a class “pretends” to include the data it’s associated with. Hence the need for custom equality operators.

I really think we’re on the same page here, probably. Or at least in the same book.

There are still useful custom definitions of
equality as I have outlined above.

I think I am missing these. Could you provide an example?

Basically, if it will be commonplace to override the equality operator
to compare the first level of associated values of a reference type,
then the comparison of just the reference has no business being the
default.

If the default equality for reference types was defined as the equality of the
references it would be inconsistent with Swift’s current apparent surfacing
of the first level of associated data for reference types.

Yes, but as I've said, that illusion doesn't work in the presence of
mutability.

       I think perhaps what you mean by “purity” is just, “has value
       semantics.” But I could be wrong.

       No, an array storing instances of reference types that are not immutable
       would
       not be “pure” (or whatever you want to call it).

       is derived from deep value semantics. This is when there is no
       possibility of shared mutable state. This is an extremely important
       property.

       It's the wrong property, IMO.

       Wrong in what sense?

       Wrong in the sense that it rules out using things like Array that are
       logically value types but happen to be implemented with CoW, and if you
       have proper encapsulation there's no way for these types to behave as
       anything other than values, so it would be extremely limiting.

       I’m a big fan of CoW as an implementation detail. We have definitely
       been
       miscommunicating if you thought I was suggesting something that would
       prohibit
       CoW.

   Then what, precisely, are the syntactic and semantic requirements of
   “PureValue?”

I assume what is meant by "PureValue", is any object A, whose own references
form a subgraph, within which a change to any of the values would constitute a
change in the value of A (thus impermissible if A is immutable). Thus structs
would qualify as “PureValues”.

OK, one vote for that interpretation noted.

I also assume that enforcing immutability on an object graph, via CoW
or otherwise, would be unfeasible.

I presume by “enforcing” you mean, “enforcing by the compiler.” It's
very easy to enforce that for particular object graphs in library code,
using encapsulation.

Yes, I do mean by the compiler, but you are right you can enforce this by hiding whatever you’d like behind inaccessible values.

···

On May 7, 2016, at 12:52 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Fri May 06 2016, Tyler Fleming Cloutier <cloutiertyler-AT-aol.com> wrote:

   On May 6, 2016, at 6:54 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
   on Fri May 06 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
       On May 6, 2016, at 7:48 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

You could enforce it on all values accessible by traversing a single
reference for reference types, however.

This is why I don’t really buy the argument that there is no such thing
as deep vs shallow copy. Deep copy means copying the whole “PureValue”
or subgraph, shallow copy means traversing a single reference and
copying all accessible values.

Well, again, “you can look at the world that way, but why would you want
to?” It makes reasoning about code exponentially more difficult if at
every level you have to ask whether a copy is deep or shallow.

--
-Dave

anandabits · May 8, 2016, 4:11am


      Not sure what to think about the enum cases inside a
      protocol (if AnyEnum would even exist). It could be a nice
      addition to the language, but this is its own proposal, I guess.

      We should start by adding an AnyValue protocol to which all value
      types conform.

      Having a way to constrain conformance to things with value semantics
      is
      something I've long wanted. *However*, the approach described is too
      simplistic. It's possible to build classes whose instances have
      value
      semantics (just make them immutable) and it's possible to build
      structs
      whose instances have reference semantics (just put the struct's
      storage
      in a mutable class instance that it holds as a property, and don't
      do
      copy-on-write).

      In order for something like AnyValue to have meaning, we need to
      impose
      greater order. After thinking through many approaches over the
      years, I
      have arrived at the (admittedly rather drastic) opinion that the
      language should effectively outlaw the creation of structs and enums
      that don't have value semantics. (I have no problem with the idea
      that
      immutable classes that want to act as values should be wrapped in a
      struct). The language could then do lots of things much more
      intelligently, such as correctly generating implementations of
      equality.

      That is a drastic solution indeed! How would this impact things like
      Array<UIView>? While Array itself has value semantics, the aggregate
      obviously does not as it contains references which can usually be mutated
      underneath us.

      Value semantics and mutation can only be measured with respect to
      equality. The definition of == for all class types would be equivalent
      to ===. Problem solved.

      Similar considerations apply to simpler wrapper structs such as Weak.

      Same answer.

      Hmm. If those qualify as “value semantic” then what kind of structs and
      enums
      would not? A struct wrapping a mutable reference type certainly doesn’t
      “feel”
      value semantic to me and certainly doesn’t have the guarantees usually
      associated with value semantics (won’t mutate behind your back, thread
      safe,
      etc).

      Sure it does.

      public struct Wrap<T: AnyObject> : Equatable {
        init(_ x: T) { self.x = x }
        // `private` is file-scoped in Swift 2.2, so == below can read x.
        private var x: T
      }

      func == <T>(lhs: Wrap<T>, rhs: Wrap<T>) -> Bool {
        return lhs.x === rhs.x
      }

      I defy you to find any scenario where Wrap<T> doesn't have value
      semantics, whether T is mutable or not.

      Alternately, you can look at the Array implementation. Array is a
      struct wrapping a mutable class. It has value semantics by virtue of
      CoW.

      This goes back to where you draw the line as to the “boundary of the
      value”.
      Wrap and Array are “value semantic” in a shallow sense and are capable
      of deep
      value semantics when T is deeply value semantic.

      No, I'm sorry; this “deep-vs-shallow” thing is a fallacy that comes from
      not understanding the boundaries of your value. Or, put more
      solicitously: sure, you can look at the world that way, but it just
      makes everything prohibitively complicated, so why would you want to?

      In my world, there's no such thing as a “deep copy” or a “shallow copy;”
      there's just “copy,” which logically creates an independent version of
      everything up to the boundaries of the value. Likewise, there's no
      “deep value semantics” or “shallow value semantics.”

      Equality defines
      value semantics, and the boundaries of an Array value always includes
      the values of its elements. The *only* problem here is that we have no
      way to do equality comparison on some arrays because some types aren't
      Equatable. IMO the costs of not having everything be equatable, in
      complexity-of-programming-model terms, are too high.

      Thank you for clarifying the terminology for me. This is helpful.

      I think I may have misunderstood what you meant by “boundary of the
      value”. Do
      you mean that the boundary of an Array value stops at the reference
      identity for
      elements with reference semantics?

  Yes.

      If you have an Array whose elements are of an immutable reference type
      that has value semantics would you say the boundary extends past the
      reference identity of an element and includes a definition of equality
      defined by that type?

  Yes!

      Are you arguing that reference types should be equatable by default,
      using
      equality of the reference if the type does not provide a custom
      definition of
      equality?

  Yes!!

      Both have their place, but the maximum benefit of value semantics
      (purity)

      I don't know what definition of purity you're using. The only one I
      know of applies to functions and implies no side effects. In that
      world, there is no mutation and value semantics is equivalent to
      reference semantics.

      I was using it in the sense of “PureValue” as discussed in this
      thread.

  Sorry, this is the first mention I can find in the whole thread, honest.
  Oh, it was a different thread. Joe describes it as a protocol for
  “types that represent fully self-contained values,” which is just fuzzy
  enough that everyone reading it can have his own interpretation of what
  it means.

      I was using it to mean values for which no *observable* mutation is
      possible (allowing for CoW, etc). Is there a better term for this than
      purity?

  You're still not making any sense to me. A type for which no observable
  mutation is possible is **immutable**. The “write” part of
  copy-on-write is a pretty clear indicator that it's all about
  **mutation**. I don't see how they're compatible.

Sorry, I did not write that very clearly. I should have said no observable
mutation *that happens behind your back*. In other words, the only *observable*
mutation possible is local.

Yeah, but you need to ask the question, “mutation in what?” The answer:
mutation in the value instance. Then you need to ask, “how do you
determine whether there was mutation?”

Immutability accomplishes this by simply prohibiting all
mutation. Primitive value types like Int and structs or enums that
only contain primitive value types accomplish this by getting copied
everywhere.

Swift’s collections also accomplish this through copying, but only when the
elements they contain also have the same property.

Only if you think mutable class instances are part of the value of the
array that stores references to those class instances. As I said
earlier, you can *take* that point of view, but why would you want to?
Today, we have left that question wide open, which makes the whole
notion of what is a logical value very indistinct. I am proposing to
close it.

I think part of the disconnect here might be the domains in which we
work. Maybe you're coming at this primarily from an algorithmic
perspective and I'm coming at it primarily from an app development
perspective.

IMO that's a false distinction. Suggestion: look up the definition of
“algorithm.” Your apps are built out of algorithms. FWIW, I was an app
developer long before I was a library writer. What I discovered, after
many years living with my own software and learning from mistakes, was
that “an algorithmic perspective” is essential to building any piece of
software that you or someone else might have to maintain, that users can
rely on, that doesn't have catastrophic performance problems, etc.

I know quite well what an algorithm is and I agree that apps *contain* algorithms. However they contain more than just algorithms. I am concerned here about the ability to be clear in my code about aggregate values which cannot be changed by code elsewhere in the app. I.e. creating conditions that prevent shared mutable state from being a possibility in various parts of the app.

For example, I think it is perfectly reasonable to write a generic
view controller that works with various data types and is initialized
with an Array<T> but only works properly when it isn't possible to
observe any mutation in the subgraph of T.

And my claim is that you have picked a really complicated way of saying
“T has value semantics,” or if there are differences in your intended
constraint, you don't actually care about those differences.

I disagree wholeheartedly. I am trying to say that T is a pure value, not that T simply has value semantics. This is the difference between Array<UIView> and Array<Int>.

Just taking the nontrivial case where T is a reference type, let's look
at the phrase “it isn't possible to observe any mutation in the
subgraph of T.” This is still a rather fuzzy notion, but let me try to
nail it down. To me that means, if the behavior of “f” only depends on
data reachable through this array, and f makes no mutations, then in
this code, the two calls to f() are guaranteed to have the same effect.

     func g<T>(a: [T]) {
       var vc = MyViewController(a)
       vc.f() // #1
       h()
       vc.f() // #2
     }

But clearly, the only way that can be the case is if T is actually
immutable (and contains no references to mutable data), because
otherwise anybody can write:

   class X { ... }
   let global = [ X() ]
   func h() { global[0].mutatingMethod() }
   g(global)

Conclusion: your definition of PureValue, as written, implies conforming
reference types must be immutable. I'm not saying that's necessarily
what you meant, but if it isn't, you need to try to define it again.
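
The argument above can be run end to end with a few stand-ins. In this sketch `Reader` is a hypothetical replacement for `MyViewController`, and the `state` property and the body of `mutatingMethod` are assumptions filling in the elided `class X { ... }`; `f` only reads data reachable through the array.

```swift
final class X {
    var state = 0
    func mutatingMethod() { state += 1 }
}

// Hypothetical stand-in for MyViewController; f makes no mutations.
struct Reader {
    let items: [X]
    func f() -> Int {
        return items.reduce(0) { $0 + $1.state }
    }
}

let global = [X()]
func h() { global[0].mutatingMethod() }

func g(_ a: [X]) -> (Int, Int) {
    let vc = Reader(items: a)
    let first = vc.f()   // #1
    h()
    let second = vc.f()  // #2
    return (first, second)
}

let result = g(global)
print(result.0, result.1)  // 0 1
```

The two calls disagree even though `g` never mutates anything itself, so the guarantee can only hold if `X` is immutable.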

Yes, my definition of PureValue would require conforming reference types to be immutable.

On the other hand, it is immediately obvious that non-local mutation
is quite possible in the elements of a Swift Array<AnyObject> unless
they are all uniquely referenced.

If you interpret the elements of the array as being *references* to
objects, there is no possibility of non-local mutation. If you
interpret the elements as being objects *themselves*, then you've got
problems.

In application code we are concerned with the objects, not the
references.

Not necessarily, not at all. A Set<UIView> where you're interested in
the references is totally reasonable.

You are correct. I was thinking of model objects here and should have been more careful.

  I think perhaps what you mean by “purity” is just, “has value
  semantics.” But I could be wrong.

No, an array storing instances of reference types that are not immutable would
not be “pure” (or whatever you want to call it).

      is derived from deep value semantics. This is when there is no
      possibility of shared mutable state. This is an extremely important
      property.

      It's the wrong property, IMO.

      Wrong in what sense?

  Wrong in the sense that it rules out using things like Array that are
  logically value types but happen to be implemented with CoW, and if you
  have proper encapsulation there's no way for these types to behave as
  anything other than values, so it would be extremely limiting.

I’m a big fan of CoW as an implementation detail. We have definitely been
miscommunicating if you thought I was suggesting something that would prohibit
CoW.

Then what, precisely, are the syntactic and semantic requirements of “PureValue?”

I believe it is a purely semantic concept. It means that every name
binding is logically and observably distinct, including any objects in
the aggregate (if it includes references).

That sounds like “value semantics” to me, although I get the sense maybe
you're also adding the restriction that you're *not allowed* to define
the boundary of values as stopping at a reference but not including the
instance it references. IMO that restriction is not actually useful and
probably harmful.

I’m not saying you’re *not allowed* to define it as stopping there. I’m saying that it depends on the type. When you’re dealing with a reference type that has value semantics it should not stop there. If the type has reference semantics then I agree that it should stop at the reference.

This allows for local mutation on the same binding and also for
unobservable mutation such as CoW in the implementation. But it does
not allow for a mutation applied to one name binding having the type
to be observed through another name binding having the type.

It sounds like you're trying to capture some notion of “can't possibly
reach shared mutable state through this instance,” but IMO there's a
false distinction here.

That is not correct. I am trying to capture the notion of an aggregate for which all operations observing the state of any part of the aggregate will always return the same result at any point in time. This notion still allows for local mutation, which logically replaces the entire aggregate with a new aggregate. This is a recursive property built from the leaves of the aggregate up. As long as I am composing types that inherently have this property it will be preserved.

It is possible to preserve this property while making use of types that do not have this property as in the case of Array<Int>, Box<Int>, etc. The examples I can think of here are all generic, value semantic types that preserve this property when Element has it, but obviously do not introduce the property when Element does not have it.

Fundamentally, there's no difference between a
reference to an object and an integer that can be used as an index into
a global array that contains a reference to the object, or even an
integer that can be used as an index into a global array that contains
an equivalent struct.

I understand what you’re saying here but I couldn’t disagree more. There is a huge difference between the “value” of an integer and the “value” of a reference. This difference is exhibited by the fact that we can perform arithmetic on integers and we can’t on references. Just because you can *use* an integer to do something doesn’t mean that its *value* is intricately related to that use. The *value* of a reference is intricately related to the operation of following the reference whereas the *value* of an integer is not.

Again, I would like to see some piece of code that *actually depends on
this PureValue property for its correctness*.

I think Andrew gave a good example. If I am writing a function that I intend to be a pure function my inputs must be “pure” (referentially transparent). Identical inputs receive identical outputs. Maybe I wish to memoize the result. My function will not work correctly if it is supplied with Array<UIView>. There is no guarantee that any state of the UIViews it inspects will be the same across function calls even when passed the exact same array in close succession.
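
A minimal sketch of the memoization case, for illustration only. PureValue is a hypothetical marker protocol (nothing like it exists in the standard library), and Size, Memoizer, and value(for:) are names made up for this example. The cache is only sound if equal inputs always present the same observable state, which is exactly what the constraint is meant to promise.

```swift
// Hypothetical marker protocol; conformance is a promise made by the author
// of the type, not something the compiler verifies.
protocol PureValue {}

// A plain struct of Ints: nothing reachable from it can change behind our back.
struct Size: Hashable, PureValue {
    var width: Int
    var height: Int
}

// Caches results by input. Sound only because Input is constrained to PureValue.
final class Memoizer<Input: Hashable & PureValue, Output> {
    private var cache: [Input: Output] = [:]
    private let compute: (Input) -> Output
    private(set) var computeCount = 0

    init(compute: @escaping (Input) -> Output) {
        self.compute = compute
    }

    func value(for input: Input) -> Output {
        if let cached = cache[input] {
            return cached
        }
        computeCount += 1
        let result = compute(input)
        cache[input] = result
        return result
    }
}

let area = Memoizer<Size, Int>(compute: { size in size.width * size.height })
let firstResult = area.value(for: Size(width: 3, height: 4))
let secondResult = area.value(for: Size(width: 3, height: 4))  // served from the cache
```

An array of mutable class instances would be rejected by the constraint as long as nobody declares the conformance for that class, so a stale cached result could not be returned for it.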

      I don’t mean to imply that it is the *only* valuable
      property. However, I (and many others) do believe it is an
      extremely valuable property in many cases. Do you disagree?

  I think I do. What is valuable about such a protocol? What generic
  algorithms could you write that work on models of PureValue but don't
  work just as well on Array<Int>?

Array<Int> provides the semantics I have in mind just fine so there wouldn’t be
any. Array<AnyObject> is a completely different story. With
Array<AnyObject> you cannot rely on a guarantee the objects contained
in the array will not be mutated by code elsewhere that also happens
to have a reference to the same objects.

Okay then, what algorithms can you write that operate on PureValue that
don't work equally well on Array<AnyObject>?

I am not sure. It is possible that it does not apply to purely
algorithmic work. That does not mean it is unimportant.
It is quite valuable in application level code. I think it would be
valuable to reify it with a protocol rather than leaving it to
documentation even if the compiler can't always prove our code meets
this semantic.

For the purposes of library and language design, the ability to produce
use-cases (and solid definitions) is crucial. The ability to show how
it substantively differs from concepts we already have is crucial. If
we can't find these things, it doesn't belong.

I agree with you here. I will continue trying to make the distinction more clear and precise.

If you don't like the name PureValue for this concept, let's bikeshed.
I only used it because others had already used it. Maybe there is a
better name.

It's not the name that's the problem. I don't even understand what
you're reaching for, or why. Without a demonstration of what this is
for, I'm going to continue to argue against it (though I'm about to be
on vacation so I'll be out of your hair for a week).

Enjoy your vacation!

···

On May 7, 2016, at 3:48 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On May 6, 2016, at 8:54 PM, Dave Abrahams <dabrahams@apple.com> wrote:

on Fri May 06 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
  On May 6, 2016, at 7:48 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
  on Thu May 05 2016, Matthew Johnson <swift-evolution@swift.org> wrote:
      On May 5, 2016, at 10:02 PM, Dave Abrahams <dabrahams@apple.com> wrote:
      on Thu May 05 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
      On May 5, 2016, at 4:59 PM, Dave Abrahams <dabrahams@apple.com> wrote:
      on Wed May 04 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
      On May 4, 2016, at 5:50 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
      on Wed May 04 2016, Matthew Johnson <swift-evolution@swift.org> wrote:
      On May 4, 2016, at 1:29 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
      on Wed May 04 2016, Adrian Zubarev <swift-evolution@swift.org> wrote:

I have been trying to get you to nail down what you mean by PureValue,
and I was trying to illustrate that merely being “a struct wrapping a
mutable reference type” is not enough to disqualify anything from being
in the category you're trying to describe. What are the properties of
types in that category, and what generic code would depend on those
properties?

I hope my previous comments have helped to clarify this.

I'm afraid not yet.

--
-Dave

anandabits · May 8, 2016, 4:11am

           I don’t mean to imply that it is the *only* valuable
           property. However, I (and many others) do believe it is an
           extremely valuable
           property in many cases. Do you disagree?

       I think I do. What is valuable about such a protocol? What generic
       algorithms could you write that work on models of PureValue but don't
       work just as well on Array<Int>?

   class Storage {
       var element: Int = 0
   }

   struct Value {
       var storage: Storage
   }

   func amIPure(v: Value) -> Int {
       v.storage.element = 3
       return v.storage.element
   }

   I (the optimizer) want to know if 'amIPure' is a pure function. The
   developer needs to tell me where the boundaries of the value lie. Does
   'storage' lie inside the Value, or outside? If it is inside, then Value is a
   'PureValue' and 'amIPure' is a pure function. To enforce that, the developer
   will need to implement CoW, or we need to add some language features.

Thank you for this clear exposition of how PureValue relates to pure functions.
This is the exact intuition I have about it but you have stated it much more
clearly.

Language features to help automate CoW would be great. It would eliminate
boilerplate, but more importantly it would likely provide more information to
the compiler.

Whoa; Andy never suggested this would help automate CoW. Are you
suggesting that? How would it work?

Quoting Andy:

"I (the optimizer) want to know if 'amIPure' is a pure function. The developer needs to tell me where the boundaries of the value lie. Does 'storage' lie inside the Value, or outside? If it is inside, then Value is a 'PureValue' and 'amIPure' is a pure function. To enforce that, the developer will need to implement CoW, or we need add some language features."

I was referring to new language features that eliminate the need for the developer to implement CoW manually while preserving the same semantics.

I don’t know about the general case, but in simple cases I can imagine a feature such as “indirect struct” or Box<T: ValueType> which would contain a reference to a struct on the heap. Any time a mutating operation was performed on a non-uniquely referenced struct it would be copied first and the internal reference updated to point to the new copy on the heap. This is the kind of thing I had in mind when I said “automating CoW”.
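
For illustration, here is roughly what a hand-written version of such a box looks like today. This is only a sketch under assumptions: BoxStorage is a hypothetical helper class, the ValueType constraint mentioned above is dropped because no such protocol exists, and isKnownUniquelyReferenced is the current standard library spelling of the uniqueness check. A language feature would presumably generate the equivalent of the setter.

```swift
// Hypothetical heap storage for the boxed value.
final class BoxStorage<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

// A hand-written copy-on-write box: copies share storage until one is mutated.
struct Box<T> {
    private var storage: BoxStorage<T>

    init(_ value: T) {
        storage = BoxStorage(value)
    }

    var value: T {
        get { return storage.value }
        set {
            if isKnownUniquelyReferenced(&storage) {
                // Sole owner: mutating in place cannot be observed elsewhere.
                storage.value = newValue
            } else {
                // Shared: give this copy its own storage first.
                storage = BoxStorage(newValue)
            }
        }
    }
}

var a = Box(1)
let b = a        // shares storage with a
a.value = 2      // storage is shared, so a gets a fresh copy; b is unaffected
```

The boilerplate is small here, but it has to be repeated for every mutating operation on a real type, which is the cost the proposed feature would remove.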

···

On May 7, 2016, at 3:53 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On May 7, 2016, at 2:21 AM, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:
On May 6, 2016, at 5:48 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

--
-Dave

Andrew_Trick · May 8, 2016, 12:07am

It sounds like you’re changing the definition of value semantics to make it impossible to define PureValue. Does Array<T> have value semantics then only if T also has value semantics?

The claim has been made that Array always has value semantics, implying that the array value’s boundary ends at the boundary of its element values. That fact is what allows the compiler to ignore mutation of the buffer.

It's perfectly clear that Array<T> is a PureValue iff T is a PureValue. PureValue is nothing more than transitive value semantics.
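
As a rough illustration of the "iff" above, the relationship can be approximated in library code with a conditional conformance. This is a sketch, not the "magic" compiler-checked protocol: PureValue here is a hypothetical empty marker protocol, Counter and isPureValue are made-up names, and each conformance is an unchecked promise by whoever writes it.

```swift
// Hypothetical marker protocol, not part of the standard library.
protocol PureValue {}

extension Int: PureValue {}
extension String: PureValue {}

// Array is a PureValue exactly when its elements are.
extension Array: PureValue where Element: PureValue {}

// A mutable reference type: deliberately given no conformance.
final class Counter {
    var count = 0
}

// Runtime query, used here only to make the result visible.
func isPureValue<T>(_ value: T) -> Bool {
    return value is PureValue
}

let intsArePure = isPureValue([1, 2, 3])          // Array<Int>
let nestedArePure = isPureValue([[1], [2, 3]])    // Array<Array<Int>>
let countersArePure = isPureValue([Counter()])    // Array<Counter>
```

In practice one would use the protocol as a generic constraint, so that passing an Array<Counter> is rejected at compile time rather than tested at run time.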

At any rate, we could add a PureValue magic protocol, and it would have well-defined meaning. I'm not sure that it is worthwhile or even a good way to approach the problem. But we don't need to argue about the definition.

-Andy

···

On May 7, 2016, at 2:04 PM, Dave Abrahams <dabrahams@apple.com> wrote:

2. Value types are not "pure" values if any part of the aggregate contains a
reference whose type does not have value semantics.

Then Array<Int> is not a “pure” value (the buffer contained in an
Array<Int> is a mutable reference type that on its own, definitely does
*not* have value semantics). I don't think this is what you intend, and
it indicates that you need to keep working on your definition.

anandabits · May 8, 2016, 1:43am

This depends on the type. For types representing resources, etc it works just
fine. But for models it does not work unless the model subgraph is entirely
immutable and instances are unique.
I agree that it isn't a good idea to provide a default that will
certainly be wrong in many cases.

Please show an example of a mutable model where such an equality would
be wrong.

This is somewhat orthogonal to the main points I have been making in this thread. I have been focused on discussion about reference types that have value semantics and the distinction between value semantics and pure values. In any case, here you go:

let a: NSMutableArray = [1, 2, 3]
let other: NSMutableArray = [1, 2, 3]
let same = a === other // false
let equal = a == other // true

Reference equality does not match the behavior of many existing mutable model types. You seem to be making a case that in Swift it should. But that is a separate discussion from the one I am trying to engage in because mutable reference types *do not* have value semantics.

   I assume what is meant by "PureValue", is any object A, whose own references
   form a subgraph, within which a change to any of the values would constitute
   a change in the value of A (thus impermissible if A is immutable). Thus
   structs would qualify as “PureValues”.

As you noted in a followup, not all structs qualify. Structs whose members
all qualify will qualify. References to a subgraph that doesn't allow for any
observable mutation (i.e. deeply immutable reference types) also qualify.

This means the following qualify:

* primitive structs and enums
* observably immutable object subgraphs
* any type composed from the previous

It follows that generic types often conditionally qualify depending on their
type arguments.

   I also assume that enforcing immutability on an object graph, via CoW or
   otherwise, would be unfeasible. You could enforce it on all values
   accessible by traversing a single reference for reference types, however.

   This is why I don’t really buy the argument that there is no such thing as
   deep vs shallow copy. Deep copy means copying the whole “PureValue” or
   subgraph, shallow copy means traversing a single reference and copying all
   accessible values.

               I don’t mean to imply that it is the *only* valuable
           property. However, I (and many others) do believe it is an
           extremely
           valuable
           property in many cases. Do you disagree?

           I think I do. What is valuable about such a protocol? What generic
           algorithms could you write that work on models of PureValue but
           don't
           work just as well on Array<Int>?

           Array<Int> provides the semantics I have in mind just fine so there
           wouldn’t be
           any. Array<AnyObject> is a completely different story. With
           Array<AnyObject> you cannot rely on a guarantee the objects
           contained
           in the array will not be mutated by code elsewhere that also happens
           to have a reference to the same objects.

       Okay then, what algorithms can you write that operate on PureValue that
       don't work equally well on Array<AnyObject>?

You haven't answered this question. How would you use this protocol?

I answered elsewhere but I’ll repeat that one use that immediately comes to mind is to constrain values received in the initializer of a (view) controller to ensure that the observable state will not change over time. This is not an algorithmic use but is still perfectly valid IMO.
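
A small sketch of that controller use case, under stated assumptions: PureValue is a hypothetical marker protocol, and Profile and ProfileController are invented for the example (a real view controller would subclass UIViewController). The constraint documents, and partly enforces, that the model handed to the initializer cannot change underneath the controller.

```swift
// Hypothetical marker protocol; conformance is a promise, not a compiler check.
protocol PureValue {}

struct Profile: PureValue {
    var name: String
    var age: Int
}

// The controller only accepts models that promise no shared mutable state.
final class ProfileController<Model: PureValue> {
    let model: Model

    init(model: Model) {
        self.model = model
    }
}

var profile = Profile(name: "Ada", age: 36)
let controller = ProfileController(model: profile)

// Mutates the caller's copy only; the controller's observable state is unchanged.
profile.age = 37
```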

If I read Andrew’s post correctly it sounds like it may also be of use to the optimizer in some cases.

···

On May 7, 2016, at 3:03 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

           let t = MyClass()
           foo.acceptWrapped(Wrap(t))
           t.mutate()

           In this example, foo had better not depend on the wrapped instance
           not
           getting
           mutated.

           foo has no way to get at the wrapped instance, so it can't depend on
           anything about it.

           Ok, but this is a toy example. What is the purpose of Wrap? Maybe
           foo
           passes the
           wrapped instance back to code that *does* have visibility to the
           instance. My
           point was that shared mutable state is still possible here.

           And my point is that Wrap<T> encapsulates a T (almost—I should have
           let
           it construct the T in its init rather than accepting a T parameter)
           and
           the fact that it's *possible* to code something with the structure
           of
           Wrap so that it has shared mutable state is irrelevant.

           The point I am trying to make is that the semantic properties of
           Wrap<T> depend
           on the semantic properties of T (whether or not non-local mutation
           may be
           observed in this case).

       No they do not; Wrap<T> was specifically designed *not* to depend on the
       semantic properties of T. This was in answer to what you said:

               A struct wrapping a mutable reference type certainly doesn’t
           “feel” value semantic to me and certainly doesn’t have the
           guarantees usually associated with value semantics (won’t
           mutate behind your back, thread safe, etc).

       I have been trying to get you to nail down what you mean by PureValue,
       and I was trying to illustrate that merely being “a struct wrapping a
       mutable reference type” is not enough to disqualify anything from being
       in the category you're trying to describe. What are the properties of
       types in that category, and what generic code would depend on those
       properties?

Again, the key questions are above, asked a different way.

--
-Dave

dabrahams · May 8, 2016, 5:39am

       Swift’s collections also accomplish this through copying, but only when
       the
       elements they contain also have the same property.

   Only if you think mutable class instances are part of the value of the
   array that stores references to those class instances. As I said
   earlier, you can *take* that point of view, but why would you want to?
   Today, we have left that question wide open, which makes the whole
   notion of what is a logical value very indistinct. I am proposing to
   close it.

       On the other hand, it is immediately obvious that non-local mutation
       is quite possible in the elements of a Swift Array<AnyObject> unless
       they are all uniquely referenced.

   If you interpret the elements of the array as being *references* to
   objects, there is no possibility of non-local mutation. If you
   interpret the elements as being objects *themselves*, then you've got
   problems.

This does not make sense, because you’ve got problems either way. You are
arguing, essentially, that everything is a value type because
references/pointers are a value.

I am arguing that every type can be viewed as a value, allowing us to
preserve a sense in which Array<T> has value semantics irrespective of
the details of T.

If that were the case then the *only* valid way to compare the
equality of types would be to compare their values. Overriding the
equality operator would inherently violate the property of
immutability, i.e. two immutable objects can change their equality
even without mutation of their “values”.

Not at all. In my world, you can override equality such that it
includes referenced storage when either:

1. the referenced storage will not be mutated
2. or, the referenced storage will only be mutated when uniquely-referenced.

func ==(lhs, rhs) {
...
}

class MyClass {
var a: Int
...

}

let x = MyClass(a: 5)
let y = MyClass(a: 5)

x == y // true
y.a = 6
x == y // false

I don't understand what point you're trying to make, here. I see that x
and y are immutable. Notwithstanding the fact that the language tries to
hide the difference between a reference and the instance to which it
refers from the user (the difference would be clearer if you had to
write y->a = 6 as in C, but it's still there), that immutability doesn't
extend beyond the variable binding. The class instance to which y
refers, as you've ably demonstrated, is mutable.

The point I’m trying to make is that in the above code, I am able to
violate rule 1 of your world insofar as I am including referenced
storage in my definition of equality which can be mutated even though
my reference is immutable.
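
For readers who want to run the example, here is one way the elided parts might be filled in. The body of == and the initializer are assumptions made for this sketch; the original leaves them unspecified.

```swift
final class MyClass {
    var a: Int
    init(a: Int) { self.a = a }
}

// Assumed definition: equality looks through the reference at mutable storage.
extension MyClass: Equatable {
    static func == (lhs: MyClass, rhs: MyClass) -> Bool {
        return lhs.a == rhs.a
    }
}

let x = MyClass(a: 5)
let y = MyClass(a: 5)

let equalBefore = (x == y)   // true
y.a = 6                      // allowed: `let` fixes the reference, not the instance
let equalAfter = (x == y)    // false
```

Both bindings are declared with let, yet the answer to x == y changes, because the equality includes storage that neither of the two rules above protects.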

Sorry, I guess I don't understand what difference it makes that it's
possible to write code that violates my rules. It's not news to me, as
I'm sure you knew when you posted it.

   Are you arguing that reference types should be equatable by default,
   using
   equality of the reference if the type does not provide a custom
   definition of
   equality?

   Yes!!

Custom definitions of equality, inherently, decouple immutability from
equality,

Not a bit. They certainly *can* do that, if we allow it, but I am
proposing to ban that. There are still useful custom definitions of
equality as I have outlined above.

If you’re proposing to ban that, then I may have misunderstood your
position. I think we are in agreement on that, however… (more below)

as shown above. Swift makes it appear as though references and values
are on the same level in a way that C does not.

Yep. It's an illusion that breaks down at the edges and can be really
problematic if users fully embrace it. You can't write a generic
algorithm with well-defined semantics that does mutation-by-part on
instances of T without constraining T to have value or reference semantics.
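
A short sketch of why an unconstrained generic cannot give mutation-by-part a single meaning. HasCount, ValueCounter, ReferenceCounter, and the function name are all hypothetical names for this illustration.

```swift
protocol HasCount {
    var count: Int { get set }
}

struct ValueCounter: HasCount {
    var count = 0
}

final class ReferenceCounter: HasCount {
    var count = 0
}

// Copies its argument, mutates one part of the copy, and reports whether the
// original observed the change. The answer depends on what kind of type T is.
func mutatingCopyAffectsOriginal<T: HasCount>(_ original: T) -> Bool {
    var copy = original
    copy.count += 1
    return original.count == copy.count
}

let structResult = mutatingCopyAffectsOriginal(ValueCounter())      // false
let classResult = mutatingCopyAffectsOriginal(ReferenceCounter())   // true
```

The same generic body has two different observable behaviors, so its documentation cannot state what happens to the original without knowing whether T has value or reference semantics.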

I am not advocating that we require “y->a” for class member access, but
I *am* suggesting that we should accept the fact that reference and
value semantics are fundamentally different and make design choices
(including language rules) accordingly.

let x = MyStruct()
let y = MyClass()

x.myFoo
y.myFoo

vs

my_struct *x = …
my_struct y = …

x.my_foo
y->my_foo

With C it is explicit that you are crossing a reference. Thus there is only
*one* type of equality in C, that the values *are equal*.

Well, C doesn't even *have* struct equality; see
“How do you compare structs for equality in C?” on Stack Overflow.

This is exactly the type of equality you are referring to, but this does
not carry over to Swift for precisely the reason that Swift paves over
the difference between value and reference types, and then allows you
to redefine equality.

Therefore, in essentially no circumstances does it make sense to
compare a type by its reference if it has any associated data in
Swift.

Disagreed. Instances whose *identity* is significant, i.e. basically
everything that actually ought to be a class, can be very usefully
compared by their references. For example, if equality and hashing were
defined for UIViews, based on their references, you could use a
Set<UIView> to keep track of which views had user interaction during a
given time interval.
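
A minimal sketch of that bookkeeping, assuming a hypothetical View class standing in for UIView. Equality and hashing are defined purely in terms of the reference, via === and ObjectIdentifier from the standard library.

```swift
// Hypothetical stand-in for UIView.
final class View {
    var frame: Int = 0
}

// Identity-based equality and hashing: two views are equal only if they are
// the same instance, no matter what their properties contain.
extension View: Hashable {
    static func == (lhs: View, rhs: View) -> Bool {
        return lhs === rhs
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(self))
    }
}

let button = View()
let label = View()

var touched: Set<View> = []
touched.insert(button)
button.frame = 10          // mutation does not disturb set membership
touched.insert(button)     // same instance, so the set does not grow
```

Because the hash never depends on mutable state, mutating a view after insertion cannot corrupt the set.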

This use case is exploiting the fact that the reference is a unique
identifier for a view.

That's a fundamental property of class instances.

For any distributed application this is no longer true for objects.

That certainly depends on your programming model for distributed
applications. If you want to try to maintain the illusion that there's
really only one object when you have a pair communicating across a
process or machine boundary, or that a given object travels across
process or machine boundaries, *and* you want to build a Set that
“holds” objects that may live in other processes, then yes, you'll need
a different system.

Equality should be used to uniquely define data.

A mutable thing that has no identity apart from its value is a value
type. Don't use classes for that, or everything breaks down, because a
mutable class always eventually reveals that it's not a value.

In a non-distributed application comparing references is also an
implicit comparison of the entire object graph referenced by that
reference.

I don't think so. It's possible for references x and y to have exactly
isomorphic object graphs, but still x !== y.

When you allow any other definition of equality for reference types,
unless that comparison explicitly includes all values in each of
the object’s referenced object graphs, it is only *partial*
equality.

It still might not be equality. Equal things should be effectively
interchangeable (except for parts explicitly designated inessential,
such as an Array's capacity). As soon as you expose identity, that
falls apart.

Thus custom equality is a lie that should probably be expressed with
something more like ~=. It’s only equal up to the boundary, which is
arbitrarily defined.

So yes, I agree that at least equality should be consistent with
immutability, but in my opinion the only way to accomplish that is to
ban custom equality.

For all types, or for reference types? I'd be totally OK with banning
it for reference types. I'd disagree strongly with banning it for all
types.

I’m of the opinion that there are only two ways of accomplishing this.

EITHER

One could imagine a definition of equality that did explore the entire
object graph comparing values (only using references to find other
values, not for comparison) as it went. However, this would not
be able to align with the semantics of immutability (maybe by only
allowing a single entry point into the graph which was guaranteed to
be a unique reference?).

OR

=== should be the only valid equality operator for classes (and you’re
right it should be spelled ==), and that if you want to compare
classes you should just put all of the data that acts as the
“identity” of that class in a value type which can be compared by
value. Value types could then have generated equality operators based
on the equality of each of their constituent values, some of which
could be references (but as I mentioned including references in the
identity does not work for distributed applications).

let x = MyClass(…)
let y = MyClass(…)

x.identityStruct == y.identityStruct
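
Spelled out as a runnable sketch, with assumptions: Person and PersonIdentity are hypothetical names, and the struct relies on the compiler-synthesized memberwise Equatable conformance, which plays the role of the "generated equality operators" described above.

```swift
// The data that defines "the same person"; == is synthesized memberwise.
struct PersonIdentity: Equatable {
    var name: String
    var birthYear: Int
}

// The class itself is compared only by reference (===).
final class Person {
    var identityStruct: PersonIdentity

    init(name: String, birthYear: Int) {
        identityStruct = PersonIdentity(name: name, birthYear: birthYear)
    }
}

let x = Person(name: "Ada", birthYear: 1815)
let y = Person(name: "Ada", birthYear: 1815)

let sameInstance = (x === y)                                    // false
let sameIdentityData = (x.identityStruct == y.identityStruct)   // true
```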

As it stands now in Swift, a class is more than just a reference. It
also includes all sorts of assumptions about its associated data
based on the fact that a class “pretends” to include the data it’s
associated with. Hence the need for custom equality operators.

I really think we’re on the same page here, probably. Or at least in
the same book.

Yes, although I don't understand a lot of what you're saying and my
instinct is that arguments about “whole object graph” are barking up the
wrong tree, or at least making it more complicated than necessary.

There are still useful custom definitions of
equality as I have outlined above.

I think I am missing these. Could you provide an example?

,----[ quoting myself ]

In my world, you can override equality such that it
includes referenced storage when either:

1. the referenced storage will not be mutated
2. or, the referenced storage will only be mutated when uniquely-referenced.

`----

Array<T> has a custom equality operator; IMO that's indispensable.

···

on Sat May 07 2016, Tyler Fleming Cloutier <cloutiertyler-AT-aol.com> wrote:

On May 7, 2016, at 12:52 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Fri May 06 2016, Tyler Fleming Cloutier <cloutiertyler-AT-aol.com> wrote:

   On May 6, 2016, at 6:54 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
   on Fri May 06 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
       On May 6, 2016, at 7:48 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

--
-Dave

anandabits · May 8, 2016, 4:11am

This depends on the type. For types representing resources, etc it works just
fine. But for models it does not work unless the model subgraph is entirely
immutable and instances are unique.
I agree that it isn't a good idea to provide a default that will
certainly be wrong in many cases.

Please show an example of a mutable model where such an equality would
be wrong.

Maybe wrong is a little bit too strong a word, but it certainly isn’t the behavior people are accustomed to. I think most people consider model instances to be logically equal if their properties are equal regardless of the address of the instances in memory. Reference identity only works as expected if the model instances are uniqued in memory.

let a: NSMutableArray = [1, 2, 3]
let b: NSMutableArray = [1, 2, 3]

let referenceEquality = a === b // false
let elementEquality = a == b // true

   I assume what is meant by "PureValue", is any object A, whose own references
   form a subgraph, within which a change to any of the values would constitute
   a change in the value of A (thus impermissible if A is immutable). Thus
   structs would qualify as “PureValues”.

As you noted in a followup, not all structs qualify. Structs whose members
all qualify will qualify. References to a subgraph that doesn't allow for any
observable mutation (i.e. deeply immutable reference types) also qualify.

This means the following qualify:

* primitive structs and enums
* observably immutable object subgraphs
* any type composed from the previous

It follows that generic types often conditionally qualify depending on their
type arguments.

   I also assume that enforcing immutability on an object graph, via CoW or
   otherwise, would be unfeasible. You could enforce it on all values
   accessible by traversing a single reference for reference types, however.

   This is why I don’t really buy the argument that there is no such thing as
   deep vs shallow copy. Deep copy means copying the whole “PureValue” or
   subgraph, shallow copy means traversing a single reference and copying all
   accessible values.

               I don’t mean to imply that it is the *only* valuable
           property. However, I (and many others) do believe it is an
           extremely
           valuable
           property in many cases. Do you disagree?

           I think I do. What is valuable about such a protocol? What generic
           algorithms could you write that work on models of PureValue but
           don't
           work just as well on Array<Int>?

           Array<Int> provides the semantics I have in mind just fine so there
           wouldn’t be
           any. Array<AnyObject> is a completely different story. With
           Array<AnyObject> you cannot rely on a guarantee the objects
           contained
           in the array will not be mutated by code elsewhere that also happens
           to have a reference to the same objects.

       Okay then, what algorithms can you write that operate on PureValue that
       don't work equally well on Array<AnyObject>?

You haven't answered this question. How would you use this protocol?

I think the best example was given by Andy when discussing pure functions. Maybe I want to write a generic function and ensure it is pure. I can only do this if I know that any arguments received that compare equal will always present the same observable state. For example, maybe I wish to memoize the result.

I cannot write such a function for all T, and I also cannot write such a function for all T that have value semantics if we adopt the “references are values” view of the world. I need an additional constraint that rejects things like Array<UIView>. (T would obviously also be constrained by a protocol that exposes the properties or methods my function requires to compute its result)

In general, it would be used where you need to ensure that the result of any operation observing the state of any part of the aggregate value will always return the same value at any point in the future. If I observe a[0].foo now I know with certainty the result of observing a[0].foo at any point in the future. This aspect of preservation of observed values across time is essential to the distinction between Array<LayoutValue> (see below) and Array<UIView>. It doesn’t matter when I observe the frames of the elements of Array<LayoutValue>, I will always get the same rects back. With Array<UIView> that is obviously not the case as the frame of the view could be mutated by anyone with a reference to the views at any time in between my observations of the frame values.

import CoreGraphics

struct LayoutValue {
    var frame: CGRect
}
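
The difference between the two arrays can be observed directly. The sketch below is self-contained and therefore uses hypothetical stand-ins: Rect in place of CGRect and View in place of UIView; only LayoutValue comes from the discussion above.

```swift
// Hypothetical stand-in for CGRect.
struct Rect: Equatable {
    var x: Double
    var y: Double
    var width: Double
    var height: Double
}

struct LayoutValue {
    var frame: Rect
}

// Hypothetical stand-in for UIView: a mutable reference type.
final class View {
    var frame: Rect
    init(frame: Rect) { self.frame = frame }
}

let initial = Rect(x: 0, y: 0, width: 10, height: 10)
let view = View(frame: initial)

let views = [view]
let layouts = [LayoutValue(frame: initial)]

var otherLayouts = layouts
otherLayouts[0].frame.width = 99   // local mutation of a separate binding
view.frame.width = 99              // mutation through another reference

let layoutFrameUnchanged = (layouts[0].frame == initial)   // true
let viewFrameUnchanged = (views[0].frame == initial)       // false
```

Both arrays are immutable let bindings, but only the array of LayoutValue keeps returning the same frame.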

anandabits · May 8, 2016, 4:11am

I've been thinking about this further and can now state my position more clearly
and concisely.

1. If we're going to have reference types with value semantics the boundary of
the value must extend through the reference to the value of the object. Two
instances may have the same logical value so reference equality is not good
enough.

My (radical) position has been that we should decree that if you really
want this thing to have value semantics, it should be a struct. That
is, wrap your reference type in a struct and provide an == that looks at
what's in the instance. This radically simplifies the model because we
can then assume that value types have value semantics and reference
types only have value semantics if you view their identity as their
value.

I agree with this longer term, but it is too soon for that.

Rather than suggest wrapping the reference in a struct I would suggest that most of the time just making it a struct in the first place is the right path. The problem with this is that it can lead to excessive copying, reference counting, etc if you’re not careful. I argue that mainstream developers should not need to bother with writing a reference type and wrapping it in a struct just to get around this. It would be nice if there were better, less boilerplate-y solutions to this in the future.

2. Value types are not "pure" values if any part of the aggregate contains a
reference whose type does not have value semantics.

Then Array<Int> is not a “pure” value (the buffer contained in an
Array<Int> is a mutable reference type that on its own, definitely does
*not* have value semantics). I don't think this is what you intend, and
it indicates that you need to keep working on your definition.

I have elaborated elsewhere as to why Array<Int> does meet my notion of “pure” value. I understand that it contains a buffer pointer, etc that does not have value semantics. But that is an implementation detail and is not externally observable. I believe that implementation strategies like this are extremely important. I am only concerned with the externally observable semantics and behavior of the type, not the implementation.

Just as the internal mutable reference type does not disqualify Array<Int> from having value semantics, it also does not disqualify it from being a “pure value”.
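
The claim that Array's buffer sharing is not externally observable can be illustrated with a minimal sketch. It uses only standard library behavior; the variable names are arbitrary.

```swift
// Array may share one buffer between copies, but the public API never
// exposes that sharing: mutating one copy leaves the other unchanged.
let a = [1, 2, 3]
var b = a          // b may share a's buffer until one of them is mutated
b.append(4)        // the mutation works on b's own storage

print(a)
print(b)
```

Whether this observable behavior is enough to call Array<Int> a “pure” value is the question under discussion; the sketch only shows the behavior itself.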

Purity must include the entire aggregate. Array<UIView> has value
semantics but it is not a pure value.

In what sense does it have value semantics? Unless we can define
equality for Array<UIView> it's hard to make any claim about its value
semantics.

Well it should have value semantics using reference equality of the views because UIView has reference semantics so reference identity is the appropriate definition of equality. Isn’t that your position as well?

The primary reasons I can think of for creating reference types with value
semantics are avoiding copying everything all the time or using inheritance. (I
could also list pre-existing types here but am not as concerned with those)

One could argue that you can avoid copying by writing a struct with a handle and
one can simulate inheritance by embedding and forwarding. The problem is that
this involves a lot of boilerplate and makes your code more complex.

The “forwarding boilerplate problem” is something we need to solve in
the language regardless.

Yes I agree that it needs to be solved regardless. In fact, you might remember that I invested quite a bit of effort into drafting a proposal on the topic. I shelved it mostly because I became very busy with client work, but also partly due to the lukewarm reaction.

The fact that we don't have an answer today
shouldn't prevent us from adopting the right model for values and
references.

I think that depends on what you mean by this. If you mean providing a default equality of reference identity for reference types I disagree. I think that should wait until the language reaches a place where there is no good reason to write value semantic reference types. And I believe the boilerplate currently required to wrap them in a struct is sufficiently burdensome that this is not the case yet.

For something like the standard library these concerns are far
outweighed by the benefit we all gain by having our collections be
value types. However, in application code the benefit may not be worth
the cost thus it may be reasonable to prefer immutable objects.

I think there is a viable path for enhancing the language such that there is
little or no reason to implement a value semantic type as a reference type. If
we were able to declare value types as "indirect" and / or have a compiler
supported Box (probably with syntactic sugar) that automatically forwarded
calls, performed CoW, etc this would allow us much more control over copying
without requiring boilerplate. We could also add something along the lines of
Go's embedding (or a more general forwarding mechanism which is my preference)
which would likely address many of the reasons for using inheritance in a value
semantic reference type.
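
For reference, this is roughly the boilerplate that a compiler-supported Box would remove. It is a hand-written sketch in current Swift, not a proposed design: `Box` and `Storage` are hypothetical names, and only `isKnownUniquelyReferenced` is a real standard library function.

```swift
// Hypothetical names: Box and Storage are not standard library types.
final class Storage<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

struct Box<T> {
    private var storage: Storage<T>

    init(_ value: T) { storage = Storage(value) }

    var value: T {
        get { storage.value }
        set {
            // Copy on write: allocate fresh storage only if another Box
            // currently shares this one.
            if isKnownUniquelyReferenced(&storage) {
                storage.value = newValue
            } else {
                storage = Storage(newValue)
            }
        }
    }
}

var x = Box(1)
let y = x      // copies one reference; x and y share storage
x.value = 2    // x gets its own storage; y still sees 1
```

Every additional operation on the wrapped type needs similar hand-written forwarding, which is the burden a language-level solution would address.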

If we do go down that path I think the case that value semantic types should be
implemented as value types, thus reference equality should be the default
equality for reference types gets much stronger. In that hypothetical future
Swift we might even be able to go so far as saying that reference types with
value semantics are an anti-pattern and "outlaw" them. This would allow us to
simply say "reference types have reference semantics".

We might also be able to get to a place where we can "outlaw" value types that
do not have value semantics. I haven't thought deeply about that so I'm not
certain of the implications, particularly with regards to C interop. IIRC Dave A
indicated he would like to see this happen. If this is possible, we may
eventually have a language where "value types have value semantics", "some value
types are pure values", and "reference types have reference semantics and are
never pure values". If it is achievable it would be a significant step forward
in simplicity and clarity.

So far, I still don't believe that introducing a “pure values” distinction is
adding simplicity and clarity. To me it looks like a needless wrinkle.

Fair enough. I suspect that many folks who have been strongly influenced by functional programming may have a different opinion (btw, I don’t mean to imply anything about the degree to which functional programming has or has not influenced your opinion).

···

On May 7, 2016, at 4:04 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

Matthew

Sent from my iPad

On May 7, 2016, at 11:17 AM, Matthew Johnson <matthew@anandabits.com> wrote:

   Sent from my iPad

   On May 7, 2016, at 2:21 AM, Andrew Trick via swift-evolution <swift-evolution@swift.org> wrote:

           On May 6, 2016, at 5:48 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

               I don’t mean to imply that it is the *only* valuable
               property. However, I (and many others) do believe it is an
               extremely valuable property in many cases. Do you disagree?

           I think I do. What is valuable about such a protocol? What generic
           algorithms could you write that work on models of PureValue but
           don't work just as well on Array<Int>?

       class Storage {
         var element: Int = 0
       }

       struct Value {
         var storage: Storage
       }

       func amIPure(v: Value) -> Int {
         v.storage.element = 3
         return v.storage.element
       }

       I (the optimizer) want to know if 'amIPure' is a pure function. The
       developer needs to tell me where the boundaries of the value lie. Does
       'storage' lie inside the Value, or outside? If it is inside, then Value
       is a 'PureValue' and 'amIPure' is a pure function. To enforce that, the
       developer will need to implement CoW, or we need to add some language
       features.

   Thank you for this clear exposition of how PureValue relates to pure
   functions. This is the exact intuition I have about it but you have stated
   it much more clearly.

   Language features to help automate CoW would be great. It would eliminate
   boilerplate, but more importantly it would likely provide more information
   to the compiler.

       If I know about every operation inside 'amIPure', and know where the
       value's boundary is, then I don't really need to know that 'Value' is a
       'PureValue'. For example, I know that this function is pure without
       caring about 'PureValue'.

       func IAmPure(v: Value, s: Storage) -> Int {
         var t = v
         t.storage = s
         return t.storage.element
       }

       However, I might only have summary information. I might know that the
       function only writes to memory reachable from Value. In that case, it
       would be nice to have summary information about the storage type.
       'PureValue' is another way of saying that it does not contain references
       to objects outside the value's boundary (I would add that it cannot have
       a user-defined deinit). The only thing vague about that is that we don't
       have a general way for the developer to define the value's boundary. It
       certainly should be consistent with '==', but implementing '==' doesn't
       tell the optimizer anything.

   I think the ability to define the value's boundary would be wonderful. If we
   added a way to do this it would be a requirement of PureValue.

       Anyway, these are only optimizer concerns, and the programming model
       should take precedence in these discussions. But I thought that might help.

       -Andy

       _______________________________________________
       swift-evolution mailing list
       swift-evolution@swift.org
       https://lists.swift.org/mailman/listinfo/swift-evolution

--
-Dave

dabrahams · May 8, 2016, 5:51am

This depends on the type. For types representing resources, etc it works just
fine. But for models it does not work unless the model subgraph is entirely
immutable and instances are unique.
I agree that it isn't a good idea to provide a default that will
certainly be wrong in many cases.

Please show an example of a mutable model where such an equality would
be wrong.

This is somewhat orthogonal to the main points I have been making in
this thread. I have been focused on discussion about reference types
that have value semantics and the distinction between value semantics
and pure values. In any case, here you go:

let a: NSMutableArray = [1, 2, 3]
let other: NSMutableArray = [1, 2, 3]
let same = a === other // false
let equal = a == other // true

That's not proof that an == for NSMutableArray that matches the behavior
of === would be wrong, just that it would be different from what we
currently have.

Reference equality does not match the behavior of many existing
mutable model types. You seem to be making a case that in Swift it
should.

Yes.

But that is a separate discussion from the one I am trying to engage
in because mutable reference types *do not* have value semantics.

Then maybe I should disengage here?

Okay then, what algorithms can you write that operate on PureValue that
don't work equally well on Array<AnyObject>?

You haven't answered this question. How would you use this protocol?

I answered elsewhere but I’ll repeat that one use that immediately
comes to mind is to constrain values received in the initializer of a
(view) controller to ensure that the observable state will not change
over time.

My claim is that substituting the constraint of “it has value
semantics,” while presumably looser than the PureValue constraint, would
not compromise the correctness of your view controller, so not only is
the meaning of PureValue hard to define, but it doesn't buy you
anything. If you want to refute that, just show me the code.

This is not an algorithmic use but is still perfectly valid IMO.

If the properties of PureValue matter to your view controller, there's
an algorithm somewhere that depends on those properties for its
correctness.

If I read Andrew’s post correctly it sounds like it may also be of use
to the optimizer in some cases.

FWIW, I'm only interested in how you use this protocol in the
programming model, and I'm not even sure Andrew is talking about the
same constraint that you are.

···

on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On May 7, 2016, at 3:03 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

--
-Dave

dabrahams · May 8, 2016, 6:02am

You haven't answered this question. How would you use this protocol?

I think the best example was given by Andy when discussing pure
functions. Maybe I want to write a generic function and ensure it is
pure. I can only do this if I know that any arguments received that
compare equal will always present the same observable state.

And that it doesn't touch any globals.

For example, maybe I wish to memoize the result.

I cannot write such a function for all T, and I also cannot write such
a function for all T that have value semantics if we adopt the
“references are values” view of the world.

Oh, you absolutely can, because if the function applies to all T that
have value semantics, it only has a few operations to work with:
initialization, assignment, and equality. Assignment is the only
mutating one of these.
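
The memoization use case can be sketched as follows. This is an illustration under stated assumptions, not a resolution of the disagreement: `PureValue` is a hypothetical marker protocol, `Memoizer` is an invented helper, and whether the `PureValue` constraint adds anything over a plain value-semantics constraint is exactly what is disputed here.

```swift
// Hypothetical marker protocol; nothing here is an existing Swift API.
protocol PureValue {}
extension Int: PureValue {}

final class Memoizer<Input: Hashable & PureValue, Output> {
    private var cache: [Input: Output] = [:]
    private let compute: (Input) -> Output
    private(set) var computeCount = 0

    init(_ compute: @escaping (Input) -> Output) { self.compute = compute }

    func callAsFunction(_ input: Input) -> Output {
        // Reusing a cached result is only sound if inputs that compare
        // equal always present the same observable state.
        if let hit = cache[input] { return hit }
        computeCount += 1
        let result = compute(input)
        cache[input] = result
        return result
    }
}

let square = Memoizer<Int, Int> { $0 * $0 }
let first = square(12)
let second = square(12)   // served from the cache
```

The sketch does not check that the closure avoids globals; as noted above, that is a separate requirement for purity.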

I need an additional constraint that rejects things like
Array<UIView>. (T would obviously also be constrained by a protocol
that exposes the properties or methods my function requires to compute
its result)

Did you just start referring to T as the element type of the array
instead of the function's parameter type? I think you're
unintentionally pulling a fast one, reasoning-wise. It might help to
write down some actual code.

In general, it would be used where you need to ensure that the result
of any operation observing the state of any part of the aggregate
value will always return the same value at any point in the future.
If I observe a[0].foo now I know with certainty the result of
observing a[0].foo at any point in the future.

Sure, but what you need then is a constraint on a's Element type that it
has value semantics, not some kind of new PureValue concept to use as a
constraint on the array itself.

···

on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

This aspect of preservation of observed values across time is
essential to the distinction between Array<LayoutValue> (see below)
and Array<UIView>. It doesn’t matter when I observe the frames of the
elements of Array<LayoutValue>, I will always get the same rects back.
With Array<UIView> that is obviously not the case as the frame of the
view could be mutated by anyone with a reference to the views at any
time in between my observations of the frame values.

struct LayoutValue {
  var frame: CGRect
}
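
The difference in observation over time can be shown with a small self-contained sketch. `Rect` and `View` are hypothetical stand-ins for CGRect and UIView, used so the example has no UIKit dependency.

```swift
// Rect and View are hypothetical stand-ins for CGRect and UIView.
struct Rect: Equatable { var x = 0.0, y = 0.0, width = 0.0, height = 0.0 }

struct LayoutValue { var frame: Rect }
final class View { var frame = Rect() }

let view = View()
let values = [LayoutValue(frame: view.frame)]  // copies the frame
let views = [view]                             // stores a reference

// Some other holder of the reference mutates the view.
view.frame.width = 100

let observedValueWidth = values[0].frame.width  // unchanged by the mutation
let observedViewWidth = views[0].frame.width    // reflects the mutation
```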

           let t = MyClass()
           foo.acceptWrapped(Wrap(t))
           t.mutate()

           In this example, foo had better not depend on the wrapped instance
           not getting mutated.

           foo has no way to get at the wrapped instance, so it can't depend on
           anything about it.

           Ok, but this is a toy example. What is the purpose of Wrap? Maybe
           foo passes the wrapped instance back to code that *does* have
           visibility to the instance. My point was that shared mutable state
           is still possible here.

           And my point is that Wrap<T> encapsulates a T (almost—I should have
           let it construct the T in its init rather than accepting a T
           parameter) and the fact that it's *possible* to code something with
           the structure of Wrap so that it has shared mutable state is
           irrelevant.

           The point I am trying to make is that the semantic properties of
           Wrap<T> depend on the semantic properties of T (whether or not
           non-local mutation may be observed in this case).

       No they do not; Wrap<T> was specifically designed *not* to depend on the
       semantic properties of T. This was in answer to what you said:

               A struct wrapping a mutable reference type certainly doesn’t
           “feel” value semantic to me and certainly doesn’t have the
           guarantees usually associated with value semantics (won’t
           mutate behind your back, thread safe, etc).

       I have been trying to get you to nail down what you mean by PureValue,
       and I was trying to illustrate that merely being “a struct wrapping a
       mutable reference type” is not enough to disqualify anything from being
       in the category you're trying to describe. What are the properties of
       types in that category, and what generic code would depend on those
       properties?

Again, the key questions are above, asked a different way.

--
-Dave


dabrahams · May 8, 2016, 6:19am

        I've been thinking about this further and can now state my position more
        clearly and concisely.

        1. If we're going to have reference types with value semantics the
        boundary of the value must extend through the reference to the value of
        the object. Two instances may have the same logical value so reference
        equality is not good enough.

    My (radical) position has been that we should decree that if you really
    want this thing to have value semantics, it should be a struct. That
    is, wrap your reference type in a struct and provide an == that looks at
    what's in the instance. This radically simplifies the model because we
    can then assume that value types have value semantics and reference
    types only have value semantics if you view their identity as their
    value.

I agree with this longer term, but it is too soon for that.

We don't have much longer to establish the programming model. It needs
to happen soon or it will be too late.

Rather than suggest wrapping the reference in a struct I would suggest that most
of the time just making it a struct in the first place is the right
path.

Well of course. But if you already have a reference type and aren't
ready to rewrite it, this is how you do it.

The problem with this is that it can lead to excessive copying,
reference counting, etc if you’re not careful. I argue that mainstream
developers should not need to bother with writing a reference type and
wrapping it in a struct just to get around this.

Sure, maybe our codegen could be smarter about this, but that shouldn't
hold back the programming model.

It would be nice if there were better, less boilerplate-y solutions to
this in the future.

        2. Value types are not "pure" values if any part of the aggregate
        contains a reference whose type does not have value semantics.

    Then Array<Int> is not a “pure” value (the buffer contained in an
    Array<Int> is a mutable reference type that on its own, definitely does
    *not* have value semantics). I don't think this is what you intend, and
    it indicates that you need to keep working on your definition.

I have elaborated elsewhere as to why Array<Int> does meet my notion of “pure”
value. I understand that it contains a buffer pointer, etc that does not have
value semantics. But that is an implementation detail and is not externally
observable. I believe that implementation strategies like this are extremely
important. I am only concerned with the externally observable semantics and
behavior of the type, not the implementation.

Just as the internal mutable reference type does not disqualify Array<Int> from
having value semantics, it also does not disqualify it from being a “pure
value”.

As I've indicated, then, you need a different definition than the one
above. And you have to get the definition all together in one place so
it can be evaluated.

        Purity must include the entire aggregate. Array<UIView> has value
        semantics but it is not a pure value.

    In what sense does it have value semantics? Unless we can define
    equality for Array<UIView> it's hard to make any claim about its value
    semantics.

Well it should have value semantics using reference equality of the views
because UIView has reference semantics so reference identity is the appropriate
definition of equality. Isn’t that your position as well?

Yes.

        The primary reasons I can think of for creating reference types with
        value semantics are avoiding copying everything all the time or using
        inheritance. (I could also list pre-existing types here but am not as
        concerned with those)

        One could argue that you can avoid copying by writing a struct with a
        handle and one can simulate inheritance by embedding and forwarding. The
        problem is that this involves a lot of boilerplate and makes your code
        more complex.

    The “forwarding boilerplate problem” is something we need to solve in
    the language regardless.

Yes I agree that it needs to be solved regardless. In fact, you might remember
that I invested quite a bit of effort into drafting a proposal on the topic. I
shelved it mostly because I became very busy with client work, but also partly
due to the lukewarm reaction.

    The fact that we don't have an answer today
    shouldn't prevent us from adopting the right model for values and
    references.

I think that depends on what you mean by this. If you mean providing a default
equality of reference identity for reference types I disagree. I think that
should wait until the language reaches a place where there is no good reason to
write value semantic reference types. And I believe the boilerplate currently
required to wrap them in a struct is sufficiently burdensome that this is not
the case yet.

As I've said, we can't wait. We should make the change and use that to
drive development of the necessary features to reduce the burden of
writing optimized code.

Remember that the only value semantic reference types are immutable, so
the struct rendition of such types has only immutable properties.
Personally, I don't think that transforming

        struct X {
          ...
        private:
          let prop1: Type1
          let prop2: Type2
          let prop3: Type3
        }

into

        struct X {
           ...
        private:
          class Storage {
            let prop1: Type1
            let prop2: Type2
            let prop3: Type3
          }
          let value: Storage
        }

is so awful if you find you need to optimize away some reference
counting manually; you just need to add “.value” to property accesses in
X's methods, and this doesn't require any forwarding.
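
A compilable rendition of that transformation might look like the sketch below. The property types (`Int`, `String`, `[Double]`) and the `summary` method are placeholders chosen for illustration; the original leaves them unspecified.

```swift
// Property types and the summary() method are illustrative placeholders.
struct X {
    private final class Storage {
        let prop1: Int
        let prop2: String
        let prop3: [Double]
        init(prop1: Int, prop2: String, prop3: [Double]) {
            self.prop1 = prop1
            self.prop2 = prop2
            self.prop3 = prop3
        }
    }

    // All properties are immutable, so sharing Storage between copies
    // of X is never observable.
    private let value: Storage

    init(prop1: Int, prop2: String, prop3: [Double]) {
        value = Storage(prop1: prop1, prop2: prop2, prop3: prop3)
    }

    // Methods reach the properties through `.value`; no forwarding needed.
    func summary() -> String {
        "\(value.prop1) \(value.prop2) \(value.prop3.count)"
    }
}

let a = X(prop1: 1, prop2: "two", prop3: [3.0])
let b = a   // copies a single reference instead of three properties
```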

···

on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On May 7, 2016, at 4:04 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

    So far, I still don't believe that introducing a “pure values”
    distinction is adding simplicity and clarity. To me it looks like
    a needless wrinkle.

Fair enough. I suspect that many folks who have been strongly influenced by
functional programming may have a different opinion (btw, I don’t mean to imply
anything about the degree to which functional programming has or has not
influenced your opinion).

--
-Dave

anandabits · May 8, 2016, 4:26am

2. Value types are not "pure" values if any part of the aggregate contains a
reference whose type does not have value semantics.

Then Array<Int> is not a “pure” value (the buffer contained in an
Array<Int> is a mutable reference type that on its own, definitely does
*not* have value semantics). I don't think this is what you intend, and
it indicates that you need to keep working on your definition.

It sounds like you’re changing the definition of value semantics to make it impossible to define PureValue. Does Array<T> have value semantics then only if T also has value semantics?

The claim has been made that Array always has value semantics, implying that the array value’s boundary ends at the boundary of its element values. That fact is what allows the compiler to ignore mutation of the buffer.

It's perfectly clear that Array<T> is a PureValue iff T is a PureValue. PureValue is nothing more than transitive value semantics.
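
One way to write down "Array<T> is a PureValue iff T is a PureValue" in current Swift is a conditional conformance. This is only a sketch: `PureValue` is a hypothetical marker protocol, `View` is a stand-in reference type, and the conformances are declared by hand rather than verified by the compiler.

```swift
// PureValue is a hypothetical marker protocol, not part of Swift.
protocol PureValue {}
extension Int: PureValue {}

// Array<T> is declared a PureValue exactly when T is.
extension Array: PureValue where Element: PureValue {}

final class View {}   // a reference type; deliberately not a PureValue

func requiresPureValue<T: PureValue>(_ value: T) -> Bool { true }

let ok = requiresPureValue([[1, 2], [3]])   // [[Int]] conforms transitively
// requiresPureValue([View()])              // would be rejected at compile time
```

The declaration states the rule at the level of the element type; it says nothing about Array's internal buffer, and it does not by itself provide a procedure for deciding which types may conform.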

At any rate, we could add a PureValue magic protocol, and it would have well-defined meaning. I'm not sure that it is worthwhile or even a good way to approach the problem. But we don't need to argue about the definition.

Thanks for jumping in again. I hope we can get past the discussion of definition!

Are you speaking specifically about this being of use to the optimizer or about the value of such a protocol in general?

For example, if we introduce a notion of pure functions into the language wouldn’t it be useful to be able to write generic pure functions by constraining the argument types to PureValue?

IMO this property is important enough that the ability to express it directly in code (rather than documentation) and to take advantage of it in generic code is very desirable. A PureValue protocol seems like a good way to do this but I am certainly open to other solutions as well.

Long term it would be really nice if Swift had a logically pure subset and the ability to clearly distinguish code that lives inside that world from code that is outside that world. I say “logically” pure because I think implementation techniques like CoW, memoization, etc are very valuable and do not violate the spirit of purity despite the fact that they rely on side effects.

-Matthew

···

On May 7, 2016, at 7:07 PM, Andrew Trick <atrick@apple.com> wrote:

On May 7, 2016, at 2:04 PM, Dave Abrahams <dabrahams@apple.com> wrote:

-Andy

dabrahams · May 8, 2016, 6:51am

        2. Value types are not "pure" values if any part of the aggregate
        contains a reference whose type does not have value semantics.

    Then Array<Int> is not a “pure” value (the buffer contained in an
    Array<Int> is a mutable reference type that on its own, definitely does
    *not* have value semantics). I don't think this is what you intend, and
    it indicates that you need to keep working on your definition.

It sounds like you’re changing the definition of value semantics to make it
impossible to define PureValue.

Not on purpose.

Does Array<T> have value semantics then only if T also has value
semantics?

This is a great question; I had to rewrite my response four times.

In my world, an Array<T> always has value semantics if you respect the
boundaries of element values as defined by ==. That means that if T is
a mutable reference type, you're not looking through references, because
== is equivalent to ===.

Therefore, for almost any interesting SomeConstraint that doesn't refine
ValueSemantics, then

Array<T: SomeConstraint>

only has value semantics if T has value semantics, since SomeConstraint
presumably uses aspects of T other than reference identity.

The claim has been made that Array always has value semantics,
implying that the array value’s boundary ends at the boundary of its
element values.

Yes, an array value ends at the boundary of its elements' values.

That fact is what allows the compiler to ignore mutation of the
buffer.

I don't know what you mean here.

It's perfectly clear that Array<T> is a PureValue iff T is a PureValue.
PureValue is nothing more than transitive value semantics.

You're almost there. “Transitive” implies that you are going to look at
the parts of a type to see if they are also PureValue's. So which parts
of the Array struct does one look at, and why? Just tell me the
procedure for determining whether a type is a PureValue.

At any rate, we could add a PureValue magic protocol, and it would have
well-defined meaning. I'm not sure that it is worthwhile or even a good way to
approach the problem. But we don't need to argue about the definition.

I don't want to argue about anything, really. I just want a definition.

···

on Sat May 07 2016, Andrew Trick <atrick-AT-apple.com> wrote:

On May 7, 2016, at 2:04 PM, Dave Abrahams <dabrahams@apple.com> wrote:

--
-Dave

cloutiertyler · May 8, 2016, 7:44am

      Swift’s collections also accomplish this through copying, but only when
      the elements they contain also have the same property.

  Only if you think mutable class instances are part of the value of the
  array that stores references to those class instances. As I said
  earlier, you can *take* that point of view, but why would you want to?
  Today, we have left that question wide open, which makes the whole
  notion of what is a logical value very indistinct. I am proposing to
  close it.

      On the other hand, it is immediately obvious that non-local mutation
      is quite possible in the elements of a Swift Array<AnyObject> unless
      they are all uniquely referenced.

  If you interpret the elements of the array as being *references* to
  objects, there is no possibility of non-local mutation. If you
  interpret the elements as being objects *themselves*, then you've got
  problems.

This does not make sense, because you’ve got problems either way. You are
arguing, essentially, that everything is a value type because
references/pointers are a value.

I am arguing that every type can be viewed as a value, allowing us to
preserve a sense in which Array<T> has value semantics irrespective of
the details of T.

If that were the case then the *only* valid way to compare the
equality of types would be to compare their values. Overriding the
equality operator would inherently violate the property of
immutability, i.e. two immutable objects can change their equality
even without mutation of their “values”.

Not at all. In my world, you can override equality such that it
includes referenced storage when either:

1. the referenced storage will not be mutated
2. or, the referenced storage will only be mutated when uniquely-referenced.

func ==(lhs, rhs) {
...
}

class MyClass {
var a: Int
...

}

let x = MyClass(a: 5)
let y = MyClass(a: 5)

x == y // true
y.a = 6
x == y // false

I don't understand what point you're trying to make, here. I see that x
and y are immutable. Notwithstanding the fact that the language tries to
hide the difference between a reference and the instance to which it
refers from the user (the difference would be clearer if you had to
write y->a = 6 as in C, but it's still there), that immutability doesn't
extend beyond the variable binding. The class instance to which y
refers, as you've ably demonstrated, is mutable.
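
A runnable version of the example under discussion might look like this. The body of `==` is elided in the original, so comparing the `a` properties is an assumption made here for illustration.

```swift
final class MyClass: Equatable {
    var a: Int
    init(a: Int) { self.a = a }

    // Assumed definition: equality looks through the reference at the
    // mutable stored property.
    static func == (lhs: MyClass, rhs: MyClass) -> Bool { lhs.a == rhs.a }
}

let x = MyClass(a: 5)
let y = MyClass(a: 5)

let before = x == y   // equal while both hold 5
y.a = 6               // `let` fixes the binding, not the instance
let after = x == y    // no longer equal
```

The `let` bindings stay fixed throughout; only the instance that `y` refers to changes.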

The point I’m trying to make is that in the above code, I am able to
violate rule 1 of your world insofar as I am including referenced
storage in my definition of equality which can be mutated even though
my reference is immutable.

Sorry, I guess I don't understand what difference it makes that it's
possible to write code that violates my rules. It's not news to me, as
I'm sure you knew when you posted it.

  Are you arguing that reference types should be equatable by default, using
  equality of the reference if the type does not provide a custom definition
  of equality?

  Yes!!

Custom definitions of equality, inherently, decouple immutability from
equality,

Not a bit. They certainly *can* do that, if we allow it, but I am
proposing to ban that. There are still useful custom definitions of
equality as I have outlined above.

If you’re proposing to ban that, then I may have misunderstood your
position. I think we are in agreement on that, however… (more below)

as shown above. Swift makes it appear as though references and values
are on the same level in a way that C does not.

Yep. It's an illusion that breaks down at the edges and can be really
problematic if users fully embrace it. You can't write a generic
algorithm with well-defined semantics that does mutation-by-part on
instances of T without constraining T to have value or reference semantics.

I am not advocating that we require “y->a” for class member access, but
I *am* suggesting that we should accept the fact that reference and
value semantics are fundamentally different and make design choices
(including language rules) accordingly.

let x = MyStruct()
let y = MyClass()

x.myFoo
y.myFoo

vs

my_struct *x = …
my_struct y = …

x.my_foo
y->my_foo

With C it is explicit that you are crossing a reference. Thus there is only
*one* type of equality in C, that the values *are equal*.

Well, C doesn't even *have* struct equality;
How do you compare structs for equality in C? - Stack Overflow

This is exactly the type of equality you are referring to, but this does
not carry over to Swift for precisely the reason that Swift paves over
the difference between value and reference types, and then allows you
to redefine equality.

Therefore, in essentially no circumstances does it make sense to
compare a type by its reference if it has any associated data in
Swift.

Disagreed. Instances whose *identity* is significant, i.e. basically
everything that actually ought to be a class, can be very usefully
compared by their references. For example, if equality and hashing were
defined for UIViews, based on their references, you could use a
Set<UIView> to keep track of which views had user interaction during a
given time interval.
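A sketch of that use case, assuming a hypothetical `View` class in place of UIView so that it runs without UIKit. Equality and hashing are derived from the reference via `ObjectIdentifier`, so mutating an instance does not disturb its membership in the set.

```swift
// Hypothetical stand-in for UIView; any class with significant identity works.
final class View: Hashable {
    var title: String
    init(title: String) { self.title = title }

    // Equality and hashing are based on the reference, not on the contents.
    static func == (lhs: View, rhs: View) -> Bool {
        return lhs === rhs
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(self))
    }
}

let okButton = View(title: "OK")
let lookalike = View(title: "OK")   // same contents, distinct identity

var touched = Set<View>()
touched.insert(okButton)
okButton.title = "Done"             // mutation does not change the hash

let okTouched = touched.contains(okButton)
let lookalikeTouched = touched.contains(lookalike)
```

Only the instance that was actually inserted is reported as touched, regardless of what either instance currently contains.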

This use case is exploiting the fact that the reference is a unique
identifier for a view.

That's a fundamental property of class instances.

For any distributed application this is no longer true for objects.

That certainly depends on your programming model for distributed
applications. If you want to try to maintain the illusion that there's
really only one object when you have a pair communicating across a
process or machine boundary, or that a given object travels across
process or machine boundaries, *and* you want to build a Set that
“holds” objects that may live in other processes, then yes, you'll need
a different system.

This is basically all web applications, or even just saving out to disk.

Equality should be used to uniquely define data.

A mutable thing that has no identity apart from its value is a value
type. Don't use classes for that, or everything breaks down, because a
mutable class always eventually reveals that it's not a value.

In a non-distributed application comparing references is also an
implicit comparison of the entire object graph referenced by that
reference.

I don't think so. It's possible for references x and y to have exactly
isomorphic object graphs, but still x !== y.

When you allow any other definition of equality for reference types,
unless that comparison explicitly includes all values in each of
the objects’ referenced object graphs, it is only *partial*
equality.

It still might not be equality. Equal things should be effectively
interchangeable (except for parts explicitly designated inessential,
such as an Array's capacity). As soon as you expose identity, that
falls apart.

Thus custom equality is a lie that should probably be expressed with
something more like ~=. It’s only equal up to the boundary, which is
arbitrarily defined.

So yes, I agree that at least equality should be consistent with
immutability, but in my opinion the only way to accomplish that is to
ban custom equality.

For all types, or for reference types? I'd be totally OK with banning
it for reference types. I'd disagree strongly with banning it for all
types.

I’m thinking all types, but maybe I’m overlooking something here? There are three different ways equality is implemented for a linked list built with value types in this blog post:

“We can also implement == for List the way users expect == to behave for value-semantic collections, by comparing the elements.”

If it’s the way users expect, why not have == be defined as exactly that? And why not just have that automatically generated?

Incidentally, “indirect” for structs would mean it could be generated for structs as well. Best of all structs that do not contain any mutable reference types would be PureValues™, (Andrew or Matthew correct me if I’m misrepresenting PureValues) and would be able to implement any complex structure, without having to wrap the structs in reference types.

It could be enforced by the compiler that nothing inside of these could change. A function foo which takes an array of PureValues as a parameter would *always* return the same result if it is a pure function, no matter how it uses the array.

You're almost there. “Transitive” implies that you are going to look at
the parts of a type to see if they are also PureValues. So which parts
of the Array struct does one look at, and why? Just tell me the
procedure for determining whether a type is a PureValue.

I’m jumping conversations, but shouldn’t this be every value in the Array struct? All internal state as well as all of the elements of the array.

I’m of the opinion that there are only two ways of accomplishing this.

EITHER

One could imagine a definition of equality that did explore the entire
object graph comparing values (only using references to find other
values, not for comparison) as it went. However, this would not
be able to align with the semantics of immutability (maybe by only
allowing a single entry point into the graph which was guaranteed to
be a unique reference?).

OR

=== should be the only valid equality operator for classes (and you’re
right it should be spelled ==), and that if you want to compare
classes you should just put all of the data that acts as the
“identity” of that class in a value type which can be compared by
value. Value types could then have generated equality operators based
on the equality of each of their constituent values, some of which
could be references (but as I mentioned including references in the
identity does not work for distributed applications).

let x = MyClass(…)
let y = MyClass(…)

x.identityStruct == y.identityStruct
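A sketch of this second option, assuming hypothetical `Identity` fields. The struct gets a synthesized memberwise ==, and the class itself is compared only by reference.

```swift
// Hypothetical names: `Identity` holds the data that defines equality.
struct Identity: Equatable {      // == is synthesized memberwise
    var name: String
    var version: Int
}

final class MyClass {
    var identityStruct: Identity
    init(name: String, version: Int) {
        identityStruct = Identity(name: name, version: version)
    }
}

let x = MyClass(name: "doc", version: 1)
let y = MyClass(name: "doc", version: 1)

let sameValue = (x.identityStruct == y.identityStruct)  // compares the data
let sameReference = (x === y)                           // compares identity
```

The two questions stay separate at the call site: the caller chooses between comparing data and comparing identity.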

As it stands now in Swift, a class is more than just a reference. It
also includes all sorts of assumptions about its associated data
based on the fact that a class “pretends” to include the data it’s
associated with. Hence the need for custom equality operators.

I really think we’re on the same page here, probably. Or at least in
the same book.

Yes, although I don't understand a lot of what you're saying and my
instinct is that arguments about “whole object graph” are barking up the
wrong tree, or at least making it more complicated than necessary.

I guess same book, but different languages.

Suffice it to say that if I were going to create an equality (==) operator for a LinkedList type (or any other complex data structure), implemented with reference types, there are basically two valid ways to do it.

1. Recursively or iteratively explore the list and check that every value in the list is the same. (Could this not be auto-generated?)
2. Just compare the references of each list and call it a day.

What we have now are no guarantees that it will be either of those things. My understanding is that your approach would be number two. I don’t think it’s a drastic approach at all, and it is correct in one sense. But in another sense it’s not correct at all. If I read the list in from a file and want to compare it to another one I already have in memory, no dice. I’ll have to have a special function for that type of comparison.
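The two options can be sketched side by side, assuming a hypothetical `Node` class for a singly linked list of Int. The second list stands in for one that was read from a file.

```swift
// Hypothetical singly linked list built from class nodes.
final class Node {
    let value: Int
    var next: Node?
    init(_ value: Int, next: Node? = nil) {
        self.value = value
        self.next = next
    }
}

// Option 1: walk both lists and compare every element.
func elementsEqual(_ lhs: Node?, _ rhs: Node?) -> Bool {
    var l = lhs
    var r = rhs
    while let a = l, let b = r {
        if a.value != b.value { return false }
        l = a.next
        r = b.next
    }
    return l == nil && r == nil   // both lists must end together
}

// Option 2: compare the references and call it a day.
func sameList(_ lhs: Node?, _ rhs: Node?) -> Bool {
    return lhs === rhs
}

let inMemory = Node(1, next: Node(2, next: Node(3)))
let fromFile = Node(1, next: Node(2, next: Node(3)))  // e.g. freshly decoded

let byElements = elementsEqual(inMemory, fromFile)
let byReference = sameList(inMemory, fromFile)
```

With these inputs the element-wise comparison succeeds and the reference comparison fails, which is the "no dice" case described for a list loaded from disk.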

Anyway, I’m not sure I’m adding much to this conversation. I apologize if I am being confusing.

···

On May 7, 2016, at 10:39 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Tyler Fleming Cloutier <cloutiertyler-AT-aol.com> wrote:

On May 7, 2016, at 12:52 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Fri May 06 2016, Tyler Fleming Cloutier <cloutiertyler-AT-aol.com> wrote:

  On May 6, 2016, at 6:54 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
  on Fri May 06 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
      On May 6, 2016, at 7:48 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

There are still useful custom definitions of
equality as I have outlined above.

I think I am missing these. Could you provide an example?

,----[ quoting myself ]
> In my world, you can override equality such that it
> includes referenced storage when either:
>
> 1. the referenced storage will not be mutated
> 2. or, the referenced storage will only be mutated when uniquely-referenced.
`----

Array<T> has a custom equality operator; IMO that's indispensable.
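Rule 2 is the copy-on-write pattern. A minimal sketch, assuming hypothetical `Buffer` and `Stack` types (this is not how Array is actually implemented): the struct owns a class reference, mutates it only when `isKnownUniquelyReferenced` reports a unique owner, and defines == over the referenced contents.

```swift
// Hypothetical copy-on-write wrapper; `Buffer` and `Stack` are illustrative names.
final class Buffer {
    var items: [Int]
    init(_ items: [Int]) { self.items = items }
}

struct Stack: Equatable {
    private var buffer = Buffer([])

    init() {}

    var items: [Int] { return buffer.items }

    mutating func push(_ item: Int) {
        // Rule 2: mutate referenced storage only when uniquely referenced.
        if !isKnownUniquelyReferenced(&buffer) {
            buffer = Buffer(buffer.items)
        }
        buffer.items.append(item)
    }

    // Equality includes the referenced storage, not the reference itself.
    static func == (lhs: Stack, rhs: Stack) -> Bool {
        return lhs.buffer.items == rhs.buffer.items
    }
}

var s1 = Stack()
s1.push(1)
let s2 = s1        // shares the buffer for now
s1.push(2)         // triggers a copy; s2 is unaffected
```

Because no mutation is ever visible through a second owner, an == that reads the buffer stays consistent with the immutability of `s2`.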

--
-Dave

Andrew_Trick · May 8, 2016, 9:20pm

If I read Andrew’s post correctly it sounds like it may also be of use to the optimizer in some cases.

I’ll just requote Dave’s example, which made perfect sense to me (so I’m not sure why there’s an argument):

To me that means, if the behavior of “f” only depends on
data reachable through this array, and f makes no mutations, then in
this code, the two calls to f() are guaranteed to have the same effect.

     func g<T>(a: [T]) {
       var vc = MyViewController(a)
       vc.f() // #1
       h()
       vc.f() // #2
     }

But clearly, the only way that can be the case is if T is actually
immutable (and contains no references to mutable data), because
otherwise anybody can write:

   class X { ... }
   let global = [ X() ]
   func h() { global[0].mutatingMethod() }
   g(global)

Conclusion: your definition of PureValue, as written, implies conforming
reference types must be immutable. I'm not saying that's necessarily
what you meant, but if it isn't, you need to try to define it again.

Yes, of course. If a PureValue contains a reference it must be immutable or only mutated when uniquely referenced.
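A runnable reduction of the quoted example, under stated assumptions: `f` is a free function standing in for `vc.f()`, and `X` gets a hypothetical `count` field so that the mutation is observable.

```swift
// Hypothetical types following the quoted example; names are illustrative.
final class X {
    var count = 0
    func mutatingMethod() { count += 1 }
}

let global = [X()]

// Stands in for vc.f(): reads only data reachable through the array.
func f(_ a: [X]) -> Int {
    return a.reduce(0) { $0 + $1.count }
}

func h() { global[0].mutatingMethod() }

func g(_ a: [X]) -> (Int, Int) {
    let first = f(a)    // #1
    h()                 // mutates an element through the global alias
    let second = f(a)   // #2
    return (first, second)
}

let results = g(global)
```

The two calls to `f` disagree even though `g` never mutates its argument, because the array elements are shared mutable references.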

There are other ways to communicate what the optimizer needs. I think the more interesting question is how users should express the value semantics of their types.

-Andy

···

On May 7, 2016, at 6:43 PM, Matthew Johnson via swift-evolution <swift-evolution@swift.org> wrote:

anandabits · May 9, 2016, 11:06pm

You haven't answered this question. How would you use this protocol?

I think the best example was given by Andy when discussing pure
functions. Maybe I want to write a generic function and ensure it is
pure. I can only do this if I know that any arguments received that
compare equal will always present the same observable state.

And that it doesn't touch any globals.

For example, maybe I wish to memoize the result.

I cannot write such a function for all T, and I also cannot write such
a function for all T that have value semantics if we adopt the
“references are values” view of the world.

Oh, you absolutely can, because if the function applies to all T that
have value semantics, it only has a few operations to work with:
initialization, assignment, and equality. Assignment is the only
mutating one of these.

I was implicitly assuming additional constraints exposing other behavior. I should have stated that explicitly.

I need an additional constraint that rejects things like
Array<UIView>. (T would obviously also be constrained by a protocol
that exposes the properties or methods my function requires to compute
its result)

Did you just start referring to T as the element type of the array
instead of the function's parameter type? I think you're
unintentionally pulling a fast one, reasoning-wise. It might help to
write down some actual code.

I was intending to refer to T as the element type of the array all along. The signature I was thinking of would look something like:

pure foo<T>(bar: [T]) -> SomeReturnType

I should have written this down to be clear about it.

In general, it would be used where you need to ensure that the result
of any operation observing the state of any part of the aggregate
value will always return the same value at any point in the future.
If I observe a[0].foo now I know with certainty the result of
observing a[0].foo at any point in the future.

Sure, but what you need then is a constraint on a's Element type that it
has value semantics, not some kind of new PureValue concept to use as a
constraint on the array itself.

That works if you follow all observable paths until you get to scalars. For example, maybe a’s element is Array<Array<UIView>> and `foo` is implemented in an extension on Array where Element == UIView (after we get same-type constraints). Once we have conditional conformance that extension could conform to a protocol that makes `foo` visible and it would still have value semantics but it would not be a pure value.

···

On May 8, 2016, at 1:02 AM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

This aspect of preservation of observed values across time is
essential to the distinction between Array<LayoutValue> (see below)
and Array<UIView>. It doesn’t matter when I observe the frames of the
elements of Array<LayoutValue>, I will always get the same rects back.
With Array<UIView> that is obviously not the case as the frame of the
view could be mutated by anyone with a reference to the views at any
time in between my observations of the frame values.

struct LayoutValue {
  let frame: CGRect
}
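A sketch of the difference in observed values over time. To keep it free of framework dependencies it assumes a hypothetical `Rect` in place of CGRect and a hypothetical `FakeView` class in place of UIView.

```swift
// `Rect` stands in for CGRect so the sketch has no framework dependency.
struct Rect: Equatable {
    var x = 0.0
    var y = 0.0
    var width = 0.0
    var height = 0.0
}

struct LayoutValue {
    let frame: Rect
}

// Hypothetical stand-in for UIView.
final class FakeView {
    var frame: Rect
    init(frame: Rect) { self.frame = frame }
}

let view = FakeView(frame: Rect(width: 10, height: 10))
let views = [view]
let layouts = [LayoutValue(frame: view.frame)]

let viewBefore = views[0].frame
let layoutBefore = layouts[0].frame

view.frame.width = 99   // anyone holding the reference can do this

let viewChanged = views[0].frame != viewBefore
let layoutChanged = layouts[0].frame != layoutBefore
```

Neither array was touched, yet only the array of layout values reports the same frame at both observation points.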

          let t = MyClass()
          foo.acceptWrapped(Wrap(t))
          t.mutate()

          In this example, foo had better not depend on the wrapped instance
          not getting mutated.

          foo has no way to get at the wrapped instance, so it can't depend on
          anything about it.

          Ok, but this is a toy example. What is the purpose of Wrap? Maybe foo
          passes the wrapped instance back to code that *does* have visibility
          to the instance. My point was that shared mutable state is still
          possible here.

          And my point is that Wrap<T> encapsulates a T (almost; I should have
          let it construct the T in its init rather than accepting a T
          parameter) and the fact that it's *possible* to code something with
          the structure of Wrap so that it has shared mutable state is
          irrelevant.

          The point I am trying to make is that the semantic properties of
          Wrap<T> depend on the semantic properties of T (whether or not
          non-local mutation may be observed in this case).

      No they do not; Wrap<T> was specifically designed *not* to depend on the
      semantic properties of T. This was in answer to what you said:

          A struct wrapping a mutable reference type certainly doesn’t
          “feel” value semantic to me and certainly doesn’t have the
          guarantees usually associated with value semantics (won’t
          mutate behind your back, thread safe, etc).

      I have been trying to get you to nail down what you mean by PureValue,
      and I was trying to illustrate that merely being “a struct wrapping a
      mutable reference type” is not enough to disqualify anything from being
      in the category you're trying to describe. What are the properties of
      types in that category, and what generic code would depend on those
      properties?

Again, the key questions are above, asked a different way.

--
-Dave


L_Mihalkovic · May 8, 2016, 11:42am

       The primary reasons I can think of for creating reference types with
       value semantics are avoiding copying everything all the time or using
       inheritance. (I could also list pre-existing types here but am not as
       concerned with those)

       One could argue that you can avoid copying by writing a struct with a
       handle and one can simulate inheritance by embedding and forwarding.
       The problem is that this involves a lot of boilerplate and makes your
       code more complex.

   The “forwarding boilerplate problem” is something we need to solve in
   the language regardless.

Yes I agree that it needs to be solved regardless. In fact, you might remember
that I invested quite a bit of effort into drafting a proposal on the topic. I
shelved it mostly because I became very busy with client work, but also partly
due to the lukewarm reaction.

   The fact that we don't have an answer today
   shouldn't prevent us from adopting the right model for values and
   references.

I think that depends on what you mean by this. If you mean providing a default
equality of reference identity for reference types I disagree. I think that
should wait until the language reaches a place where there is no good reason to
write value semantic reference types. And I believe the boilerplate currently
required to wrap them in a struct is sufficiently burdensome that this is not
the case yet.

As I've said, we can't wait. We should make the change and use that to
drive development of the necessary features to reduce the burden of
writing optimized code.

Remember that the only value semantic reference types are immutable, so
the struct rendition of such types has only immutable properties.
Personally, I don't think that transforming

       struct X {
         ...
       private:
         let prop1: Type1
         let prop2: Type2
         let prop3: Type3
       }

into

       struct X {
          ...
       private:
         class Storage {
           let prop1: Type1
           let prop2: Type2
           let prop3: Type3
         }
         let value: Storage
       }

is so awful if you find you need to optimize away some reference
counting manually; you just need to add “.value” to property accesses in
X's methods, and this doesn't require any forwarding.
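A compilable sketch of the transformed shape, with hypothetical property names and types filled in for illustration. The nested class is immutable, so sharing it between copies of the struct is unobservable.

```swift
// Hypothetical property names and types, following the X/Storage shape above.
struct X {
    private final class Storage {
        let prop1: Int
        let prop2: String
        init(prop1: Int, prop2: String) {
            self.prop1 = prop1
            self.prop2 = prop2
        }
    }

    // One reference to copy instead of one field per property.
    private let value: Storage

    init(prop1: Int, prop2: String) {
        value = Storage(prop1: prop1, prop2: prop2)
    }

    // Methods reach the properties through `.value`; no forwarding layer.
    func describe() -> String {
        return "\(value.prop2): \(value.prop1)"
    }
}

let original = X(prop1: 1, prop2: "one")
let copy = original              // copies a single reference
let text = copy.describe()
```

Copying `X` now retains one object rather than copying each stored property, and callers of `describe()` see no difference.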

FWIW +1. Pandora’s box never fully seals back... so it is easier to force explicit extra syntax early and later decide that under some special conditions the requirements can be relaxed. Considering the timeline for 3.00 as well as its associated stake, the aforementioned syntax represents a coherent (if not yet fully minimizing of user’s efforts) programming model that could be lived with for the foreseeable future, all the while not closing the door on certain types of future simplification.

For example the following syntax is a straw man representation of what could be done at a future date to reduce the boilerplate:

       struct X {
         ...
       private:
         @strawman_syntax1 let prop1: Type1
         let prop2: @strawman_syntax2 Type2
         strawman_syntax3 let prop3: Type3
       }

To extend on ‘value semantic reference type’, esthetic considerations would push me towards a solution expressing it at the usage site (like in the straw man syntax above), rather than via the introduction of a new protocol to be used at the definition site. i.e. favoring "my usage here of Type1 instances is to be construed as having value semantic” (with the compiler or me crafting a different definition of identity) over the less self documenting “Class Type1 is for all eternity carrying value semantic”, which ultimately seems more confusing to me.
I call it an esthetic argument as it is based on nothing else than a personal view on symmetry: (for no good reason) I value symmetry as the sign of a *good* design. Struct and Class are structural constructs to express a specific notion in the language. Out of my desire for symmetry I would expect that anything altering my perception of this notion would have to be equally conveyed within the structure of the language. Resorting to protocol compliance to do the same strikes me as the kind of deep imbalance I try to avoid.

/LM

···

   So far, I still don't believe that introducing a “pure values”
   distinction is adding simplicity and clarity. To me it looks like
   a needless wrinkle.

Fair enough. I suspect that many folks who have been strongly influenced by
functional programming may have a different opinion (btw, I don’t mean to imply
anything about the degree to which functional programming has or has not
influenced your opinion).

--
-Dave

Andrew_Trick · May 9, 2016, 8:01am

I just had a chance to digest Dave's answer. It explains a lot.

PureValue was defined in terms of the type's physical representation:
- A struct with no reference properties
- Recursively, a reference to immutable or uniquely referenced memory.

It's defined such that we can say Array<T> is a PureValue iff T is a PureValue.

There is currently no procedure for determining PureValue because we have no way to declare that references are immutable or uniquely referenced. It would be a promise by the developer.

Now attempting to look at it from Dave's direction, value semantics apply to the variable's type, not the object's physical representation:

let v2 = v1
f(v1)
assert(v1 == v2)

If everything is a value, then this always works. Great!
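A sketch of that check from the variable's point of view, assuming hypothetical `Point` and `Counter` types. `Counter` treats its identity as its value, so the check holds at the level of the reference even though the instance behind it was mutated.

```swift
struct Point: Equatable {          // hypothetical value type
    var x: Int
}

final class Counter: Equatable {   // hypothetical reference type
    var n = 0
    // Identity is the value: == is the same as ===.
    static func == (lhs: Counter, rhs: Counter) -> Bool {
        return lhs === rhs
    }
}

// Can only change its own copy of the argument.
func f(_ p: Point) -> Point {
    var local = p
    local.x += 1
    return local
}

// Changes the shared instance behind the reference.
func f(_ c: Counter) {
    c.n += 1
}

let v1 = Point(x: 1)
let v2 = v1
let moved = f(v1)
let valueHolds = (v1 == v2)

let r1 = Counter()
let r2 = r1
f(r1)
let referenceHolds = (r1 == r2)
let observedThroughCopy = r2.n
```

Both checks pass, but `r2.n` has changed. The property `v1 == v2` says nothing about shared state unless == looks through the reference.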

If the variable's type does not allow mutating shared state, then operations on the variable are operating on a value.

protocol ValueP {
  func compute() -> Result // nonmutating
}

func g(v1 : ValueP) {
  let v2 = v1
  v1.compute()
  assert(v1 == v2)
}

Nice. 'compute' cannot change the value. Those value semantics do not tell me anything about shared state or function purity. For that, I need some additional constraint on 'compute'. Knowing that it does not mutate the 'self' value is insufficient.

One way of doing that, for example, is to declare that 'compute' transitively cannot access globals *and* ValueP must be a PureValue. Now I can safely write this:

protocol ValueP : PureValue {
  @strawman_noglobal func compute() -> Result
}

/// Return (v1.compute, v2.compute)
func g(v1 : ValueP, v2 : ValueP) -> (Result, Result) {
  let r1 = v1.compute()
  if v1 == v2 {
    return (r1, r1)
  }
  return (r1, v2.compute())
}
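A version of this that compiles in current Swift, under stated assumptions: `PureValue` is a hypothetical marker protocol that only records the developer's promise, the no-globals attribute is replaced by a comment because no such attribute exists, `Result` is narrowed to Int, and `g` is generic so that == is available on its arguments.

```swift
// Hypothetical marker protocol: a promise by the developer,
// not something the compiler checks.
protocol PureValue: Equatable {}

protocol ValueP: PureValue {
    // Assumed (not enforced): does not access globals.
    func compute() -> Int
}

struct Square: ValueP {   // illustrative conforming type
    let side: Int
    func compute() -> Int {
        return side * side
    }
}

/// Return (v1.compute, v2.compute), reusing the first result when the
/// arguments compare equal.
func g<V: ValueP>(_ v1: V, _ v2: V) -> (Int, Int) {
    let r1 = v1.compute()
    if v1 == v2 {
        return (r1, r1)   // sound only if the PureValue promise is kept
    }
    return (r1, v2.compute())
}

let same = g(Square(side: 3), Square(side: 3))
let different = g(Square(side: 3), Square(side: 4))
```

The shortcut in `g` is a small instance of memoization: equal inputs are assumed to produce equal results.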

So, Dave is right that we need to decide soon whether we can make stronger assumptions about value semantics. But that is a separate question from how to express function purity. I don't think there is any urgency in introducing things like the PureValue protocol or @strawman_noglobal attribute, now that we have clearly established shared-state-mutability-by-default. When we want to seriously have that discussion, we should consider other alternatives. I would prefer to wait until indirect structs and improved CoW support have had more discussion.

-Andy

···

On May 7, 2016, at 11:51 PM, Dave Abrahams <dabrahams@apple.com> wrote:

Does Array<T> have value semantics then only if T also has value
semantics?

This is a great question; I had to rewrite my response four times.

In my world, an Array<T> always has value semantics if you respect the
boundaries of element values as defined by ==. That means that if T is
a mutable reference type, you're not looking through references, because
== is equivalent to ===.

Therefore, for almost any interesting SomeConstraint that doesn't refine
ValueSemantics, then

Array<T: SomeConstraint>

only has value semantics if T has value semantics, since SomeConstraint
presumably uses aspects of T other than reference identity.
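A sketch of both halves of this answer, assuming a hypothetical `Box` class whose == is ===. The array behaves as a value as long as elements are compared by identity; code that reads an element's contents sees mutation made through the other copy.

```swift
// Hypothetical mutable reference type whose == is equivalent to ===.
final class Box: Equatable {
    var n: Int
    init(_ n: Int) { self.n = n }
    static func == (lhs: Box, rhs: Box) -> Bool {
        return lhs === rhs
    }
}

let a = [Box(1), Box(2)]
var b = a              // copies the references, not the instances
a[0].n = 100           // mutates an instance both arrays refer to

// Respecting the element boundary defined by ==, nothing changed.
let stillEqual = (a == b)

// Looking through the references, the change is visible via `b`.
let sumThroughB = b.reduce(0) { $0 + $1.n }

// The arrays themselves remain independent values.
b.append(Box(3))
let independent = (a.count == 2 && b.count == 3)
```

The sum plays the role of a constraint that "uses aspects of T other than reference identity": an algorithm relying on it only has value semantics if the element type does.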

anandabits · May 9, 2016, 11:06pm

This depends on the type. For types representing resources, etc it works just
fine. But for models it does not work unless the model subgraph is entirely
immutable and instances are unique.
I agree that it isn't a good idea to provide a default that will
certainly be wrong in many cases.

Please show an example of a mutable model where such an equality would
be wrong.

This is somewhat orthogonal to the main points I have been making in
this thread. I have been focused on discussion about reference types
that have value semantics and the distinction between value semantics
and pure values. In any case, here you go:

let a: NSMutableArray = [1, 2, 3]
let other: NSMutableArray = [1, 2, 3]
let same = a === other // false
let equal = a == other // true

That's not proof that an == for NSMutableArray that matches the behavior
of === would be wrong, just that it would be different from what we
currently have.

Reference equality does not match the behavior of many existing
mutable model types. You seem to be making a case that in Swift it
should.

Yes.

But that is a separate discussion from the one I am trying to engage
in because mutable reference types *do not* have value semantics.

Then maybe I should disengage here?

Okay then, what algorithms can you write that operate on PureValue that
don't work equally well on Array<AnyObject>?

You haven't answered this question. How would you use this protocol?

I answered elsewhere but I’ll repeat that one use that immediately
comes to mind is to constrain values received in the initializer of a
(view) controller to ensure that the observable state will not change
over time.

My claim is that substituting the constraint of “it has value
semantics,” while presumably looser than the PureValue constraint, would
not compromise the correctness of your view controller, so not only is
the meaning of PureValue hard to define, but it doesn't buy you
anything. If you want to refute that, just show me the code.

This is not an algorithmic use but is still perfectly valid IMO.

If the properties of PureValue matter to your view controller, there's
an algorithm somewhere that depends on those properties for its
correctness.

In many cases it may just be view configuration that depends on those properties. I suppose you can call view configuration code an algorithm but I think that would fall outside of common usage.

If I read Andrew’s post correctly it sounds like it may also be of use
to the optimizer in some cases.

FWIW, I'm only interested in how you use this protocol in the
programming model, and I'm not even sure Andrew is talking about the
same constraint that you are.

I am also primarily interested in the programming model. That said, as far as I can tell everything Andrew has said is talking about the exact same thing I am. Andrew, if I have said anything that doesn’t align with the constraint you’re talking about please let me know!

···

On May 8, 2016, at 12:51 AM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On May 7, 2016, at 3:03 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

--
-Dave