I'm moving the discussion from "Proposal proposal: @pure keyword" here and jumping in this time.
1. Pure means that the function always return the same value given the same arguments, and has no side effects (it purely computes a result from its inputs), making it possible for the compiler, or a cache, to reuse the result from a previous call. This is the simplest definition, and it provide strong guaranties. Let's call that "strongly pure".
2. Pure just mean that the function has no access to global variables. It can only mutate "outside" things through inout parameters or pointers (including class references) passed to it by the caller. So in the general case you can't reuse the results. But you can use this function to mutate the state inside a strongly pure one. A strongly pure function in this case is one with no inout or pointer in the signature.
There are two independently useful proporties, which I choose to call @pure and @noglobals. (Those names haven't been bikeshedded). However, I don't think @pure needs to be as strong as was suggested. A pure function can mutate its copy of any values that it takes and return those to the caller. It cannot access shared state--that is, state the may be reachable via other values. Put simply, it can't access heap objects unless we can guarantee those objects are uniquely referenced or immutable.
These properties can and should be inferred by the compiler. However, I feel that at least @noglobals should be the default, if not @pure, so that published APIs permit future optimization. If they aren't accepted as default, then we should aggressively add annotations to stdlib entry points. The compiler can then infer purity in higher level user code.
Note that I'm looking at this mainly from an optimizer-hint perspective, and haven't seriously considered supporting computed lets.
Is a method impure if it uses self? I suppose it could be. I guess self is
an inout parameter. I presume an inout parameter is a known expected.
side-effect.
I think normal inout parameters, including self are still @pure (closure captures are not pure). inout arguments do not have pointer-like semantics and are guaranteed not to alias for the purpose of side-effect visibility. The inout arguments simply need to be treated as results. Naturally, a @pure call cannot be eliminated if the inout argument is used after the call. Similarly, two mutating pure method calls on the same struct are not redundant. These facts are obvious to the optimizer, so I don't see the issue.
The value of the inout argument can mutate locally within a @pure function. If that argument is a struct, then those struct members can mutate.
The argument values, whether inout or not, can be copied without losing purity. Incrementing reference counts should not be considered a side effect from the perspective of function purity. It's true that we have an isUniquelyReferenced() API, but there is a requirement on the user of this API to ensure identical program behavior regardless of the return value (it is purely an optimization for CoW implementations).
All that's good in theory, but there is a major detail that needs addressing. Memory allocation breaks the guaranties of a "strongly pure" function. For instance, if you return a newly allocated object, or a struct with a pointer to an object, the object is going to be a different one every time. That object is mutable memory, and returning a different chunk of mutable memory is quite different in semantics from returning the same one. If you want strong purity guaranties when returning objects (and thus be able to optimize by reusing the result from a previous call), there needs to be a way to return objects that have a language-enforced guaranty of immutability... same for structs that can have a pointer to an object or other memory. Without immutability guaranties, `@pure` has almost no optimization value.
I think broader support for immutability requires a separate proposal. I will say that we would like to optimize around calls that allocate objects but are otherwise @pure (they can't be marked @pure because they are not idempotent). We can probably get a lot of mileage out of marking them @noglobals. If we need to do better, we could introduce a @pure_with_alloc sort of attribute later.
I'm concerned that with this definition we won't be able to mark many APIs
as pure, even though they actually are pure. The issue is that this
definition disallows local mutation. Consider CollectionType.sort() -- the
way it is implemented is that it first copies the collection into an array,
and then sorts that array in-place. sortInPlace() isn't pure, but because
the mutation happens on local state, the whole operation is pure.
Array<T>.sortInPlace() should definitely be considered @pure. Of course, that won't be inferred from the rules above because it's implementation uses UnsafeMutableBufferPointer, but we will annotate it as such. Chris L. already mentioned that we need this escape hatch.
There is a more general problem with CoW data types. Simply reading an array element is superficially impure because it accesses array storage. We would work around this using the same "force pure" mechanism because we know the storage is uniquely referenced or immutable.
This leads me to a much bigger issue though. Consuming a generic value is not necessarilly @pure because of deinit(). In fact it isn't even @noglobals. So this single argument nop function, which doesn't even have an inout, is impure and may access globals:
func foo<T>(t: T) {}
To fix this we need to:
- Assert that freeing a non-ObjC Swift object has no side effects other than it's deinit().
- Introduce a type modifier that prevents any subclasses or protocol conformance from introducing an impure deinit().
Then we could write generic code that guarantees purity:
func @pure foo<T : PureType>(t: T) {}
We still have a problem because our core library routines need to work on all types and we aren't going to accept an explosion of purity in the API. It's particularly problematic if we want @pure or @noglobal to be default.
This seems like it will require polymorphic effects/attributes that can be derived from the generic type parameters. Generally, we want to be able to say that foo<T>(t: T) is pure whenever T is a "pure type" as I explained above. This idea could be extended to support declaring a function purity conditionally depending on the purity (or @noglobals property) of any closure arguments.
So, to conclude, I'm strongly in favor of defining @pure and @noglobals semantics, getting the defaults right, and annotating APIs from the outset. However, I don't have a compelling design proposal short of introducing a significant language feature. Suggested alternatives or proposals for the necessary language support are welcome.
AndyT
···
On Sat, Jan 9, 2016 at 11:01 PM, Michel Fortin via swift-evolution < swift-evolution at swift.org> wrote:
On 9 Jan 2016, at 8:04 PM, Andrew Bennett <cacoyi at gmail.com> wrote:
On Sat, Jan 9, 2016 at 11:01 PM, Michel Fortin via swift-evolution <swift-evolution at swift.org> wrote:
On Sat, Jan 9, 2016 at 9:29 PM, Dmitri Gribenko <gribozavr at gmail.com> wrote: