Compiler-enforced immutability for reference types

Coming from a C/C++ background, one of the language design topics that I have struggled to wrap my head around is Swift's immutability model.

Let's say that you've created an Image class that will be allocated on the heap. In C++, you could create an immutable instance of this class as follows:

const Image * image = new Image();

As a newcomer to Swift, you might think that the following code does the same thing:

let image = Image() 

Unfortunately, what this really does is create an immutable pointer to a mutable instance of the Image class. In C++, this would be expressed as follows:

Image * const image = new Image();

Is there any way to get the Swift compiler to prevent you from modifying a reference type—as in the first C++ code snippet above—or at the very least warn you if a reference type is being modified? To be clear, I don't think that this should be the default behavior, but I think that it should at least be possible.
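To make the difference concrete, here is a minimal sketch. The `width` property and the `Size` struct are made up for illustration and are not part of any real API.

```swift
// Hypothetical types for illustration only.
final class Image {
    var width = 0
}

struct Size {
    var width = 0
}

let image = Image()
image.width = 100      // compiles: `let` fixes the reference, not the instance

let size = Size()
// size.width = 100    // error: cannot assign to property: 'size' is a 'let' constant

print(image.width, size.width)
```

The class instance is happily mutated through a `let` binding, while the equivalent struct assignment is rejected at compile time.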

We'll probably need the concepts of unique objects, move assignment, and shared immutable objects. Otherwise, we won't be able to guarantee that an object won't be mutated through other references without heavily restricting ourselves.

FWIW, if you need this right away and you're in control of the library, you could create a mutable class/immutable struct pair and convert between them when needed.
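One possible reading of that suggestion, as a rough sketch. The names `MutableImage`, `ImageSnapshot`, and `snapshot()` are invented here and are only one way such a pair could look.

```swift
// Hypothetical names; a sketch of a mutable class / immutable struct pair.
struct ImageSnapshot {
    let width: Int
    let height: Int
}

final class MutableImage {
    var width: Int
    var height: Int

    init(width: Int, height: Int) {
        self.width = width
        self.height = height
    }

    // Convert back from the immutable value when editing is needed again.
    convenience init(_ snapshot: ImageSnapshot) {
        self.init(width: snapshot.width, height: snapshot.height)
    }

    // Freeze the current state into a value that cannot change afterwards.
    func snapshot() -> ImageSnapshot {
        return ImageSnapshot(width: width, height: height)
    }
}

let working = MutableImage(width: 10, height: 20)
let frozen = working.snapshot()
working.width = 99                  // does not affect `frozen`
let editableAgain = MutableImage(frozen)
```

The cost is that every conversion copies the data, and the struct side gives up object identity and inheritance.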

In C++, the const keyword prevents you from accidentally mutating an immutable value through another pointer:

const int * a = new int;
int * b = a; // error: cannot initialize a variable of type 'int *' with an lvalue of type 'const int *'

In Swift, the let keyword appears to work just like const for value types. Would it be possible to introduce a new keyword that works like const for reference types, as well?

I'm not sure I'm familiar with this design pattern. Are you proposing the creation of an immutable struct that wraps the API of a mutable class (via composition)? That seems like a lot of boilerplate code just to create an immutable value on the heap (or an immutable value that can also use class inheritance).

Another option would be to add a keyword to a reference type's definition to change its behavior from mutating to non-mutating by default. For example:

immutable class C {
    var data: Int
}

By adding the immutable keyword, you would be telling the Swift compiler that this class shouldn't be modified through a let reference except by a mutating method, just like a struct. Does this sound reasonable, or are there some major downsides to this approach that I haven't considered?
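For comparison, this is the struct behavior that the proposed keyword would mirror. The `immutable class` syntax itself is hypothetical and does not compile today, so the sketch below uses an ordinary struct with an invented name, `Counter`.

```swift
// Existing struct semantics that the hypothetical `immutable class` would copy.
struct Counter {
    var data: Int

    mutating func increment() {
        data += 1
    }
}

let fixed = Counter(data: 1)
// fixed.data = 2       // error: 'fixed' is a 'let' constant
// fixed.increment()    // error: cannot use mutating member on immutable value

var editable = fixed    // an independent copy, because Counter is a value type
editable.increment()
```

With a class, the last two lines would not produce an independent copy, which is where most of the complications discussed in this thread come from.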

If you are in control of the library, if all of a class's properties are declared with let they are immutable, unless they are themselves instances of a class with var properties.

class Image {
  let property: Int // etc.

  init(property: Int = 0) {
    self.property = property
  }
}

// Roughly equivalent to C++: const Image * image = new Image();
var image = Image()

Note, however, that if any of the class's properties (except computed ones) are declared with var, the class is partially const and partially non-const.
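A small sketch of the caveat about nested class instances. `Photo` and `Metadata` are invented names used only for this example.

```swift
// Hypothetical types: every property of Photo is a `let`,
// but one of them refers to a class with a `var` property.
final class Metadata {
    var title = "untitled"
}

final class Photo {
    let width: Int
    let metadata: Metadata

    init(width: Int, metadata: Metadata) {
        self.width = width
        self.metadata = metadata
    }
}

let photo = Photo(width: 640, metadata: Metadata())
// photo.width = 1                  // error: 'width' is a 'let' constant
photo.metadata.title = "changed"    // compiles: the nested object is still mutable
```

So declaring everything with `let` gives shallow immutability only.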

To help ensure this, you could have a single stored property per inheritance level, e.g.:

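class Super {
  struct Properties {
    var x: Int
    var y: Int // etc.
  }

  let properties: Properties

  init(properties: Properties) {
    self.properties = properties
  }
}

class Sub: Super {
  struct SubProperties {
    var z: Int // etc.
  }

  let subProperties: SubProperties

  init(properties: Properties, subProperties: SubProperties) {
    self.subProperties = subProperties
    super.init(properties: properties)
  }
}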

The difference between C++ and Swift is that Swift doesn't have the concept of a pointer. const in C++ relies on you being able to access the underlying storage, set up some data, and then restrict mutability later.
If we only add the mutability annotation, you'll lose mutability as soon as you leave init. Any later and you'll risk exposing the storage via other references.

You'll also need to rely on the type author to provide adequate initializers to cover most use cases. This is not much of a problem for structs, because a struct is a value type that supports copying out of the box. With the notion of identity in classes, it could be tricky. That's why I think we'll need unique references to make this part work.

I'm thinking of having the class be a thin reference box wrapping a struct. With the help of dynamicMemberLookup, it shouldn't be too much boilerplate, though one wrapper per class could still be considerable.

@dynamicMemberLookup
class Mutable {
  struct Storage { ... }

  var storage: Storage
  subscript<U>(dynamicMember keyPath: WritableKeyPath<Storage, U>) -> U {
    get { storage[keyPath: keyPath] }
    set { storage[keyPath: keyPath] = newValue }
  }
  // Do we need another for read-only keypath?
}

Note that key path dynamic member lookup is coming in Swift 5.1. This would simulate the C++ pointer in Swift.
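A filled-in version of the same idea, as a sketch that assumes Swift 5.1 or later. `Box` and `ImageStorage` are invented names, and making the box generic is my own variation rather than something proposed above.

```swift
// Hypothetical storage struct holding the actual data.
struct ImageStorage {
    var width: Int
    var height: Int
}

// A thin reference box that forwards property access to the wrapped struct.
@dynamicMemberLookup
final class Box<Storage> {
    var storage: Storage

    init(_ storage: Storage) {
        self.storage = storage
    }

    subscript<U>(dynamicMember keyPath: WritableKeyPath<Storage, U>) -> U {
        get { return storage[keyPath: keyPath] }
        set { storage[keyPath: keyPath] = newValue }
    }
}

let box = Box(ImageStorage(width: 1, height: 2))
box.width = 640             // forwarded to box.storage.width
let frozen = box.storage    // value copy: an immutable snapshot of the storage
box.height = 480            // later mutation does not touch `frozen`
```

The box plays the role of the pointer and `frozen` plays the role of the pointee seen through a const view, except that it is a copy rather than a shared view.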

This again goes back to the mutability restriction mentioned earlier. The type author will need to provide adequate initializers to cover most use cases. This doesn't scale well to more complex classes, which I think should be addressed.

It's a bit more complicated than that. What you're describing is a read-only variable, not an immutable value. That is, it stops you changing the underlying representation via that variable, but there's nothing stopping the object from (for example) changing itself.

Also, what happens to object-valued properties of the object? Are they constrained to be read-only too? Immutable too?

In Swift, a let property is genuinely immutable. The value (or the reference, in the case of an object property) is guaranteed not to change during execution. Extending this to the values of objects (and their properties and sub-properties) is a subtly harder problem.

True…and, as you say, if any of those class's properties are themselves reference types, they won't necessarily be immutable (in fact, they probably won't be).

This is an interesting workaround, but again, if the single stored property contains any reference types, you're back to where you started.

Agreed.

Could you be a bit more specific? I'm not sure I follow you here…

I was thinking that you could have two different types of references: one that allows the referenced object to be mutated, and one that does not. Attempts to assign the former to the latter—or vice versa—would result in a compiler error. Put another way, the compiler would guarantee that any given instance of a reference type could either be modified by all references to it or by no references at all.

I'm still trying to parse this—could you provide an example of what you mean?

Yes, @dynamicMemberLookup and key path member lookup make the process of wrapping a type's API less cumbersome, but I wouldn't exactly consider either to be an elegant solution to this particular problem.

How does key path member lookup simulate C++ pointers in Swift? Isn't it essentially a mechanism for synthesizing computed properties without writing them all out by hand?

I can see the distinction, but my understanding was that, once created as a const variable, a C++ object is only allowed to change data members that have been explicitly marked mutable. (Of course, if it turns out that I'm wrong, I agree that my argument has a bit of a hole in it. :wink:)

Theoretically? Yes, in this case I would expect all of a reference type's properties to be immutable unless explicitly marked otherwise (e.g. unless they are modified by a function marked with the mutating keyword).

Yes…it's probably relatively straightforward to guarantee that a particular value (or set of values) in a stack frame will not change during execution, but extending that guarantee to the entire memory graph might be a bit trickier of a proposition. I'm not familiar enough with the LLVM internals to know how feasible it is, though.

I'm talking about initialisation. You must allocate memory and then mutate some data at the beginning; otherwise there'd be no point in having a reference to it. C++ lets you mutate the data through a normal pointer and then unsafely convert it to a const pointer. During this initial period you need to keep track of the pointers manually to make sure that no mutating reference escapes after you restrict mutation.

The only safe place to set up a non-mutating class (and therefore to mutate it) is in its init methods. Because, as you've said, you can't assign a mutating class to a non-mutating one, you must immediately assign the new instance to a non-mutating variable. So whatever setup you need to do must be done by the time init finishes.

Non-mutation isn't exactly the problem per se. The problem is how you populate the non-mutating class with data. After initialisation we can surely do whatever we like, so we might as well add the annotation you suggest.

Taking the Image class in your example, you may want to load data from a URL, set up the width and height, optimise the data for a certain pixel format, etc. Since we're assigning this instance directly to a non-mutating reference (you can't assign it to a mutating one and then change it later), you must be able to do all of that within the init function.

With a struct, you can have a mutable Image value, do the necessary setup, and then assign it to a non-mutating variable. Semantically, this copies the values from the mutable one to the non-mutating one. We can then safely discard the mutable copy.
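The struct workflow looks roughly like this. The `ImageData` struct and its fields are placeholders chosen for the example.

```swift
// Hypothetical value type standing in for the Image example.
struct ImageData {
    var width = 0
    var height = 0
    var pixelFormat = "unknown"
}

// Stage 1: set things up through a mutable variable.
var draft = ImageData()
draft.width = 640
draft.height = 480
draft.pixelFormat = "RGBA8"

// Stage 2: copy into an immutable binding.
let finished = draft

// The draft can keep changing (or be discarded) without affecting `finished`.
draft.width = 1
```

Because assignment copies the value, there is no remaining mutable alias to `finished`, which is exactly the guarantee that is hard to get for classes.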

The sentences are in the wrong order, my bad :stuck_out_tongue:.
It's the wrapping that simulates a pointer, since you now have a pointer (Mutable) pointing to storage (Storage). Key paths make the wrapping easier.

What is the purpose of an immutable class that is not addressed by a class that simply has no setters (either via computed properties or let stored properties)? You’ve provided an example above, but that seems addressed by a let stored property so I’m not sure what the value of the keyword is.

Oh, I see. Don't Swift class initializers already require all stored properties to be set before they exit, though?
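For reference, Swift does require every stored property to have a value before a class initializer returns, and a `let` property can be assigned exactly once inside `init`. A sketch with an invented `Thumbnail` class, where all of the setup happens on local values first:

```swift
// Hypothetical class: all setup must fit inside init because
// the properties cannot be changed afterwards.
final class Thumbnail {
    let width: Int
    let height: Int
    let pixelFormat: String

    init(sourceWidth: Int, sourceHeight: Int, maxSide: Int) {
        // Do the work on local values...
        let longest = max(sourceWidth, sourceHeight, 1)
        let side = min(maxSide, longest)   // never scale up
        // ...then assign each `let` property exactly once.
        width = sourceWidth * side / longest
        height = sourceHeight * side / longest
        pixelFormat = "RGBA8"
    }
}

let thumb = Thumbnail(sourceWidth: 4000, sourceHeight: 2000, maxSide: 400)
```

This works for simple cases, but everything the object will ever need has to be reachable from the initializer's parameters, which is the scaling concern raised earlier.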

Right, okay, that makes sense—if you couldn't copy a mutable instance of a reference type to an immutable one, you might have problems initializing it properly without a sufficiently flexible initializer. Good point. :+1:

In fact, you just made me realize that I probably misspoke earlier: you should actually be able to convert a mutable reference into an immutable one, but not vice versa. So, for example, you could use a mutable reference to set up an instance of a reference type within a function call and then return an immutable reference to the caller, but the caller could not then turn around and convert that immutable reference back into a mutable one. Alternatively, you could use a mutable reference to set up an instance of a reference type and then pass it as an argument to a function that takes an immutable reference to guarantee that the reference type will not be modified as a side effect of the function call. That should (hopefully) address the potential initialization problems that you identified above.
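Something close to this one-way conversion can be approximated today with a get-only protocol, although it is a convention rather than a compiler guarantee. The names `ReadOnlyImage`, `makeImage`, and `describe` are invented for this sketch.

```swift
// Hypothetical read-only view of a mutable class.
protocol ReadOnlyImage: AnyObject {
    var width: Int { get }
}

final class Image: ReadOnlyImage {
    var width = 0
}

// Set up through a mutable reference, hand back a read-only one.
func makeImage() -> ReadOnlyImage {
    let image = Image()
    image.width = 640
    return image
}

// A function that promises, through its parameter type, not to mutate.
func describe(_ image: ReadOnlyImage) -> String {
    // image.width = 1   // error: cannot assign to property: 'width' is a get-only property
    return "width: \(image.width)"
}

let view = makeImage()
print(describe(view))

// Caveat: nothing stops a caller from downcasting back to the mutable type.
let recovered = view as? Image
```

The implicit upcast from `Image` to `ReadOnlyImage` is the mutable-to-immutable direction. The `as?` downcast at the end shows why this is weaker than C++ const: the reverse direction is still available at run time.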

I can't remember exactly what got me thinking about this, but I think I may have been working with some DateFormatter instances without realizing that they were reference types. After spending some time tracking down a bug, I found that I had been changing a shared DateFormatter instance somewhere instead of modifying a local copy. I have to imagine that this type of mistake is relatively common because Swift doesn't appear to make any distinction between value and reference types at the call site—when using a new API, you really need to read the documentation closely to figure out what has been declared as a class and what has been declared as a struct. I had hoped that the ability to declare an instance of a reference type as immutable would help to avoid bugs like this by preventing you from inadvertently modifying a shared instance of a reference type, but I would be open to exploring other possible approaches, as well.
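The kind of mistake described above can be reproduced in a few lines. This is a reconstruction for illustration, not the original code, and the format strings are arbitrary.

```swift
import Foundation

// A formatter that several call sites are meant to share.
let shared = DateFormatter()
shared.dateFormat = "yyyy-MM-dd"

// Looks like a local copy, but DateFormatter is a class,
// so `alias` and `shared` are the same object.
let alias = shared
alias.dateFormat = "HH:mm"       // silently changes `shared` as well

// An explicit copy is needed to get an independent instance.
let local = shared.copy() as! DateFormatter
local.dateFormat = "yyyy"        // `shared` keeps "HH:mm"
```

Nothing at the call site distinguishes `let alias = shared` from the same line with a struct, where it would have produced an independent copy.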

I was referring to this:

I tried to explore this direction as well. The problem is that nothing guarantees that the object will not escape through other reference variables. You can still set up the data, save a mutable reference somewhere else, and then return a (cast) immutable one. I came to the conclusion that you’ll at least need a unique reference to handle that.
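A sketch of that escape problem, using a get-only protocol as a stand-in for the hypothetical immutable reference. `ReadOnlyCounter`, `Counter`, and `Registry` are invented names.

```swift
// Stand-in for an "immutable reference": a get-only protocol view.
protocol ReadOnlyCounter: AnyObject {
    var count: Int { get }
}

final class Counter: ReadOnlyCounter {
    var count = 0
}

// Somewhere else in the program that can hold on to a mutable reference.
final class Registry {
    var saved: Counter?
}

func makeCounter(leakingInto registry: Registry) -> ReadOnlyCounter {
    let counter = Counter()
    counter.count = 1
    registry.saved = counter     // the mutable reference escapes here
    return counter               // the caller only sees the read-only view
}

let registry = Registry()
let readOnly = makeCounter(leakingInto: registry)
let before = readOnly.count      // 1

registry.saved?.count = 99       // mutation through the escaped reference
let after = readOnly.count       // 99, observed through the "immutable" view
```

The read-only reference prevents mutation through itself but says nothing about other references, which is why uniqueness keeps coming up.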

Do note that most instances that admit immutability will likely forfeit identifiability. After all, you can use any instance that has the same underlying data, whether it’s the original object or a copy. This is much more in line with a struct than a class.

Most Foundation objects are classes because that’s the only thing ObjC can work with. In those cases, you want less of an immutable value and more of a local/non-escaping instance. Foundation gets by with NSCopying and requires the user to copy an instance at the start of the local context.

Perhaps the concept of a non-escaping class would also be a tempting addition.

Truth be told, I see immutable classes as an optimization attempt rather than a design problem, since most objects that would benefit from them would (aptly) be structs, while things like Foundation use their own solution.

You know, I actually think this is okay. The more I mull it over, the more I'm coming to the conclusion that what I really want is a way to guarantee that an instance of a reference type will not be modified in a particular scope, not a way to prevent an instance of a reference type from ever being modified anywhere.

Tell me more… :thinking:

I'm thinking of marking a class variable as non-escaping. It may still be referenced in multiple places, but none of those references may escape the scope of the original variable:

func foo() {
  do {
    // bar must be `init`ed here, not taking outside reference.
    nonescaping var bar = ...
    nonescaping var sameBar = bar
    ...
    // `bar` and `sameBar` must never leave this place.
  }
}

The downside is that you'll probably need to annotate function parameters as well. I think it's ABI-additive, but I'm not too sure. Also, the default is currently escaping, so it'd be tricky to get other frameworks to adopt this.

May as well add one for class instances:

class Foo {
  // Must be created inside `init`,
  // and never leave the enclosing `Foo` instance
  nonescaping var bar: ...
}

I'm not sure if I want [ this + move operator ], or [ unique reference + move operator ] more.

Oh, interesting! I think it's orthogonal to the idea of enforcing immutability for reference types, but it might be worth considering separately.

You know, I just had another thought: passing a reference type as a function parameter has more or less the same semantics as passing a value type as an inout parameter. I'm not convinced that in-out behavior is always necessary—or even desirable—for reference types; it would be nice to have a way to selectively opt out of that behavior if desired.
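A side-by-side sketch of that observation. `RefPoint`, `ValPoint`, and the function names are invented for the example.

```swift
// Hypothetical reference and value versions of the same type.
final class RefPoint {
    var x = 0
}

struct ValPoint {
    var x = 0
}

// A class parameter can be mutated with no annotation at all.
func bumpReference(_ point: RefPoint) {
    point.x += 1
}

// A struct parameter is an immutable copy inside the function...
func bumpedCopy(_ point: ValPoint) -> ValPoint {
    // point.x += 1   // error: 'point' is a 'let' constant
    var copy = point
    copy.x += 1
    return copy
}

// ...unless the caller opts in with inout and `&`.
func bumpInPlace(_ point: inout ValPoint) {
    point.x += 1
}

let ref = RefPoint()
bumpReference(ref)               // caller's object changes, nothing marks it

var val = ValPoint()
let other = bumpedCopy(val)      // caller's value is untouched
bumpInPlace(&val)                // caller's value changes, visibly marked with &
```

For value types the side effect is visible at both the declaration (`inout`) and the call site (`&`). For reference types there is currently no equivalent marker in either place.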

What do you think? Would you find immutable reference types in function parameters useful to avoid unintended side effects?

You know, I just realized that you can't even apply the nonmutating keyword to instance methods for reference types if you wanted to:

error: 'nonmutating' isn't valid on methods in classes or class-bound protocols

What's the justification for this decision? At the very least, wouldn't it be helpful to have the ability to mark some instance methods as nonmutating to prevent you from inadvertently modifying a reference type's state?
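For context, this is what `nonmutating` does where it is allowed, on a struct. In a struct, a setter is `mutating` by default, and `nonmutating` opts out of that. Class methods never receive `self` as a mutable value in the first place, so they are all already nonmutating in this sense, which is one plausible reason the keyword is rejected there. `Handle` and `Storage` are invented names.

```swift
// Hypothetical reference-type storage.
final class Storage {
    var value = 0
}

// A struct whose setter changes the referenced object, not the struct itself.
struct Handle {
    private let storage = Storage()

    var value: Int {
        get { return storage.value }
        nonmutating set { storage.value = newValue }
    }
}

let handle = Handle()
handle.value = 5      // allowed on a `let` because the setter is nonmutating
```

So in current Swift, `nonmutating` means "does not replace `self`", not "does not change any observable state", which is a different guarantee from the one asked for here.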