[Idea] Pin down IteratorProtocol's mutation semantics


(Brent Royal-Gordon) #1

IteratorProtocol imposes some strange limitations and preconditions which basically boil down to "this protocol is fundamentally mutating, but we don't want to promise whether you're going to get value or reference semantics, so don't do anything that might behave differently depending on that, and by the way we can't possibly enforce that in the type system, so good luck finding any mistakes yourself". As far as I can tell, this is because some iterators *must* provide reference semantics (e.g. when reading from a socket), while others can be implemented perhaps more efficiently with a value type.
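
To make the distinction concrete, here is a minimal sketch of the same countdown written both ways. `CountdownValue` and `CountdownReference` are hypothetical names invented for illustration, not standard library types. Copying the struct forks the iteration state, while copying the class reference shares it, and that is exactly the difference the protocol asks you not to observe.

```swift
// Hypothetical iterators for illustration; neither is a standard library type.
struct CountdownValue: IteratorProtocol {
    var remaining: Int
    mutating func next() -> Int? {
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }   // runs after the return value is computed
        return remaining
    }
}

final class CountdownReference: IteratorProtocol {
    var remaining: Int
    init(remaining: Int) { self.remaining = remaining }
    func next() -> Int? {
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }
        return remaining
    }
}

// Value type: the copy carries its own state, so both yield 3.
var valueOriginal = CountdownValue(remaining: 3)
var valueCopy = valueOriginal
let fromValueOriginal = valueOriginal.next()   // 3
let fromValueCopy = valueCopy.next()           // 3

// Reference type: both names point at one object, so the second call sees 2.
let referenceOriginal = CountdownReference(remaining: 3)
let referenceCopy = referenceOriginal
let fromReferenceOriginal = referenceOriginal.next()   // 3
let fromReferenceCopy = referenceCopy.next()           // 2
```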

How bad would it be to force all iterators to provide reference semantics by making IteratorProtocol a class protocol? Particularly if we assume that iterators could generally be made `final`?

Alternatively, how valuable is it to specify that you can't use an iterator in a way that would expose whether it is a reference or a value type? The only iterators I can think of which would *need* to be reference types are ones which draw elements from an outside source; for those, it's unsurprising that copying doesn't "work". Are there cases I'm not thinking of?

Or have I totally misread the situation in some way?


--
Brent Royal-Gordon
Architechies


(Dave Abrahams) #2

> IteratorProtocol imposes some strange limitations and preconditions
> which basically boil down to "this protocol is fundamentally mutating,
> but we don't want to promise whether you're going to get value or
> reference semantics, so don't do anything that might behave
> differently depending on that, and by the way we can't possibly
> enforce that in the type system, so good luck finding any mistakes
> yourself".

Actually we could possibly enforce reference semantics. I have asked
some optimizer folks around here to look into whether it would be
feasible to put a “: class” constraint on IteratorProtocol without
hurting performance, but I'm not sure when they'll be able to get to it.
If someone in the community would like to do some experiments, that
would be great.

> As far as I can tell, this is because some iterators *must* provide
> reference semantics (e.g. when reading from a socket), while others
> can be implemented perhaps more efficiently with a value type.

It's all speculation until we see what the optimizer can do (and can be
trained to do). I believe we already are promoting some class instances
to the stack, but I could be wrong.

> How bad would it be to force all iterators to provide reference
> semantics by making IteratorProtocol a class protocol? Particularly if
> we assume that iterators could generally be made `final`?
>
> Alternatively, how valuable is it to specify that you can't use an
> iterator in a way that would expose whether it is a reference or a
> value type?

Well, that's not a problem specific to iterators at all. The general
problem is that:

* Reference semantics and value semantics are fundamentally different.

* We have no way to declare in the language that something has value or
  reference semantics. It's possible to write classes whose instances
  have value semantics (make it final and don't provide any mutating
  methods or writable properties) and structs that have reference
  semantics (just make it delegate mutations and comparison to a
  contained class instance), and we haven't outlawed doing so.
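
Both cases in the second bullet can be sketched in a few lines. All the type names here (`ImmutablePoint`, `CounterStorage`, `SharedCounter`) are made up for illustration. The class exposes no way to mutate an instance, so sharing it is unobservable; the struct forwards both mutation and comparison to a contained class instance, so its copies alias one another.

```swift
// A class whose instances have value semantics: final, with no mutating
// methods or writable properties.
final class ImmutablePoint {
    let x: Int
    let y: Int
    init(x: Int, y: Int) { self.x = x; self.y = y }
}

// Backing storage for the struct below (hypothetical).
final class CounterStorage {
    var count = 0
}

// A struct with reference semantics: mutation and comparison are delegated
// to a contained class instance.
struct SharedCounter: Equatable {
    private let storage = CounterStorage()
    var count: Int { return storage.count }

    // Not marked `mutating`, yet every copy observes the change.
    func increment() { storage.count += 1 }

    // Equality is identity of the shared storage.
    static func == (lhs: SharedCounter, rhs: SharedCounter) -> Bool {
        return lhs.storage === rhs.storage
    }
}

let first = SharedCounter()
let second = first              // copies the struct, but not the storage
first.increment()
let observedThroughCopy = second.count   // 1
let unrelated = SharedCounter()          // separate storage, so not equal to `first`
```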

I suppose we could *almost* constrain something to having reference
semantics by making it a class protocol with a mutating member, except
for this error:

  'mutating' isn't valid on methods in classes or class-bound protocols.

…well, there is this horrible workaround:

  protocol HasMutatingMethod { mutating func f() }
  protocol HasReferenceSemantics : class, HasMutatingMethod {}

Although the semantic requirements on f ought to be enough:

  protocol HasReferenceSemantics : class {
    /// Updates `self` to its next state or whatever. <====
    func f()
  }
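
As a rough sketch of how the workaround behaves in practice: the snippet below spells the class constraint as `AnyObject`, on the assumption of a current compiler where that is the preferred spelling of `: class`. `Ticker` and `advance` are hypothetical names. A class satisfies the `mutating` requirement with an ordinary method, and generic code constrained to the refined protocol only ever sees class instances.

```swift
protocol HasMutatingMethod { mutating func f() }

// `AnyObject` is assumed here as the modern spelling of the `: class` constraint.
protocol HasReferenceSemantics: AnyObject, HasMutatingMethod {}

// Hypothetical conforming type: a plain method satisfies the mutating requirement.
final class Ticker: HasReferenceSemantics {
    var ticks = 0
    func f() { ticks += 1 }
}

// Hypothetical generic client. The requirement is still formally `mutating`,
// so the call goes through a `var`; because T is a class, `alias` and `x`
// refer to the same object.
func advance<T: HasReferenceSemantics>(_ x: T) {
    var alias = x
    alias.f()
}

let ticker = Ticker()
advance(ticker)
advance(ticker)
let ticksAfterTwoCalls = ticker.ticks   // 2
```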

I'm sure the standard library is full of generic code that doesn't
actually work with reference types; I know of several specific
instances.
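
To illustrate the kind of failure meant here (this is a made-up helper, not an actual standard library function), consider generic code that "peeks" by advancing a copy. `peek` and `LineSource` are hypothetical names. The helper is harmless for a value-semantic iterator but silently consumes an element from a reference-semantic one.

```swift
// Hypothetical helper: assumes that copying an iterator forks its state.
func peek<I: IteratorProtocol>(_ iterator: I) -> I.Element? {
    var copy = iterator
    return copy.next()
}

// Hypothetical reference-semantic iterator standing in for an outside source.
final class LineSource: IteratorProtocol {
    private var lines: [String]
    init(_ lines: [String]) { self.lines = lines }
    func next() -> String? {
        return lines.isEmpty ? nil : lines.removeFirst()
    }
}

// Array's iterator is a struct, so peeking leaves the original untouched.
var arrayIterator = [1, 2, 3].makeIterator()
let peekedValue = peek(arrayIterator)    // 1
let nextValue = arrayIterator.next()     // 1 again

// The class instance is shared, so "peeking" has consumed the first line.
let source = LineSource(["a", "b"])
let peekedLine = peek(source)            // "a"
let nextLine = source.next()             // "b"
```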

I used to think that we should have a language feature that allows us to
declare value semantics, but I am coming around to the idea that we
ought to simply require structs and enums to have value semantics (and
mutable classes, reference semantics—parenthesized because it just
happens naturally). I believe we could give some really great
diagnostics for mistakes in this area, both in the compiler and in a
separate static analyzer, and that we could generate correct default
implementations of == and < for almost everything.

> The only iterators I can think of which would *need* to be reference
> types are ones which draw elements from an outside source; for those,
> it's unsurprising that copying doesn't "work". Are there cases I'm not
> thinking of?
>
> Or have I totally misread the situation in some way?

Not at all, except perhaps for thinking that the general problem is
about iterators. Still, it might be possible, as noted above, to solve it
for iterators.


on Fri Apr 29 2016, Brent Royal-Gordon <swift-evolution@swift.org> wrote:

--
Dave