Avoiding unbreakable reference cycle with value types and closures?

That's a very good point, and it makes perfect sense. What actually got me thinking about this is that I find it a bit inelegant that Swift conflates the concepts of classes and reference types. When you need shared state that could be perfectly expressed as, say, an enum or a struct, you're still pushed to use classes, either directly or as wrappers. And classes carry their own baggage, like inheritance (classes aren't final by default) and initializer rules (convenience/required and so on). I understand this conflation was mostly done for first-class (no pun intended!) compatibility with Objective-C, but wouldn't it be more elegant if Swift had separate standard immutable and mutable reference types, like Rust has with & and &mut?

A naive implementation could look like this:

final class ClassReference<T> {
  var value: T

  init(_ value: T) {
    self.value = value
  }
}

public struct Reference<T> {
  fileprivate let reference: ClassReference<T>

  public var value: T {
    get {
      return reference.value
    }
  }

  public init(_ value: T) {
    reference = ClassReference(value)
  }

  public init(_ mutable: MutableReference<T>) {
    reference = mutable.reference
  }
}

public struct MutableReference<T> {
  fileprivate let reference: ClassReference<T>

  public var value: T {
    get {
      return reference.value
    }
    set {
      reference.value = newValue
    }
  }

  public init(_ value: T) {
    reference = ClassReference(value)
  }
}

public struct WeakReference<T> {
  private weak var reference: ClassReference<T>?

  public var value: T? {
    get {
      return reference?.value
    }
    set {
      guard let v = newValue else {
        reference = nil
        return
      }

      reference?.value = v
    }
  }

  public init(_ strong: Reference<T>) {
    self.reference = strong.reference
  }
}

If Swift supported typealias operators, it could be quite handy to have something like this as well:

// pseudocode
typealias &T = Reference<T>

An obvious use case for it is shared state where you can explicitly make it immutable when needed, while still keeping it readable and up to date after it's been modified:

enum State {
  case foo
  case bar
}

let state = MutableReference(State.foo)

let result1 = stateModifier(state)
let result2 = stateReader(Reference(state)) // making it immutable here

That's correct. What's worse (or better, depending on how you look at it) is that more than one closure can capture the same shared state, even if that state is a value type. Here's an example that I previously thought couldn't work at all, but it actually does:

func referenceWithoutClasses<T>(_ initial: T) -> (() -> T, (T) -> ()) {
  var state = initial

  return ({ state }, { state = $0 })
}

let (getter, setter) = referenceWithoutClasses(42)
// `state` is gone and deallocated here, right?

print(getter()) // prints 42
setter(43)
print(getter()) // prints 43, which is a bit unexpected

You should think of the thing being captured as the variable, not the value, unless you use a capture list. As for not having a way to break the cycle, you certainly do: set capture.c back to nil (or to another closure). It's only if you don't have any other references to capture that you've made a cycle…but that's the same as with classes.

(But yeah, if someone says you can't have reference cycles without classes, they're wrong, because variables have identity in some sense too.)
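
To make the "variable, not the value" distinction concrete, here is a minimal sketch. The names counter, byVariable and byValue are made up for illustration; the only point is the difference a capture list makes.

```swift
var counter = 0

// Captures the variable itself: later writes are visible inside the closure.
let byVariable = { counter }

// The capture list copies the current value when the closure is created.
let byValue = { [counter] in counter }

counter = 10

print(byVariable()) // prints 10
print(byValue())    // prints 0
```

The second closure holds an independent copy, so it can't participate in a cycle through the original variable.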

Class references aren't really equivalent to either of those, a class behaves more like Rust's Arc<T>, albeit without the static checks for thread-safe access enabled by Rust's borrow model. Rust's & is Swift's implicit borrowing behavior for function arguments, and &mut is inout. Swift's separation of struct and class is not merely a holdover from Objective-C, but based on the idea that in practice, a type is generally intended to be used as either a value or a shared object.
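
As a rough illustration of that mapping (Counter, read and increment are hypothetical names, and the analogy to Rust is only approximate):

```swift
struct Counter {
    var count = 0
}

// A plain parameter is read-only inside the function, roughly like Rust's `&`.
func read(_ counter: Counter) -> Int {
    return counter.count
}

// An `inout` parameter gives temporary exclusive mutable access, roughly like `&mut`.
func increment(_ counter: inout Counter) {
    counter.count += 1
}

var counter = Counter()
increment(&counter)     // the `&` at the call site marks the mutable access
print(read(counter))    // prints 1
```

Neither function can keep hold of the value after it returns, which is what separates these from a class reference.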

Right. What I do find confusing in this asymmetry is the lack of compile-time mutability checks for classes, something similar to the requirement to mark functions as mutating in structs and the var/let mutability checks. If I store shared state in a class, there is no easy way to make it read-only for a consumer of that state. While the explicit MutableReference and Reference wrappers I used in the examples above solve that, I have a feeling that either they should be part of the standard library or there should be some other, more ergonomic way to do this.

As in this example, there really is no compile-time guarantee that stateReader is well-behaved and won't mutate sharedState:

final class State {
  enum Value {
    case foo
    case bar
  }
  
  var value: Value

  init(_ initial: Value) {
    self.value = initial
  }
}

let sharedState = State(.foo)

let result1 = stateModifier(sharedState)

// `sharedState` is still mutable here despite being declared with `let`
let result2 = stateReader(sharedState)

Part of the reason there's no mutability checking for classes is that it's (1) hard for the programmer to get right and (2) doesn't provide any safety or optimization guarantees anyway. This is well-explored in C++, where having a const State & means that you can't mutate the state, but it doesn't mean that someone else can't mutate the state while you're using it. ObjC/Foundation's NSFoo/NSMutableFoo pattern is roughly equivalent to that, with an added culture (and a little bit of language sugar) around eagerly copying values so they don't change out from under you.

A truly immutable class has value semantics, but that's harder to get set up correctly in the first place. You'd really want some kind of "freeze" operation that says "okay, I'm done setting this up so now it's okay to share it", and a possible "unfreeze" for "okay, I know I have unique ownership; let me modify this again".

But rather than adding these mechanisms directly to classes, it might be more interesting to explore making Swift's efficient-copy-on-write patterns easier to adopt. (The same mechanisms allow for error-on-shared-write, although without the compile-time enforcement.)
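
For anyone who hasn't written the copy-on-write pattern by hand, a minimal sketch looks something like this. Storage and CopyOnWrite are hypothetical names; isKnownUniquelyReferenced is the standard library function the pattern is usually built on.

```swift
// Hypothetical backing store; any final class would do.
final class Storage<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

struct CopyOnWrite<T> {
    private var storage: Storage<T>

    init(_ value: T) {
        storage = Storage(value)
    }

    var value: T {
        get { return storage.value }
        set {
            if isKnownUniquelyReferenced(&storage) {
                // Nobody else shares the storage, so mutate in place.
                storage.value = newValue
            } else {
                // Storage is shared: allocate fresh storage instead of mutating it.
                storage = Storage(newValue)
            }
        }
    }
}

let a = CopyOnWrite(1)
var b = a            // shares storage with `a` for now
b.value = 2          // triggers the copy; `a` is unaffected
print(a.value, b.value) // prints 1 2
```

Replacing the else branch with a trap would give the error-on-shared-write variant, checked at run time rather than compile time.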

Yeah, an interesting extension of the language once we do have move-only types and a borrow checker would be to promote "is uniquely referenced" to the type system, so that you can have a unique ClassType reference and have its uniqueness maintained by the type system. That would be useful for auditing the correctness of COW containers, but would also allow the safety of class mutation to be checked by the compiler. Only mutations of uniquely-owned class instances would be multi-threading safe, and in a context where you ask for strict thread-safe ownership safety, we could conceivably enforce that.

This was considered out-of-scope for the immediate ownership work though, correct?

Reference types easily form reference cycles because we allow an instance to have multiple points of access (POAs). For example,

let a = Class() // Class #1
let b = a // also Class #1

So we say that Class #1 has 2 POAs: a and b.

Semantically, a value type always has only one POA. For example,

let a = Value() // Value #1
let b = a // treated as Value #2

So Value #1 has only one POA, a, and Value #2 has only b. That's why a weak reference makes no sense for a value type; it would turn nil the moment it is assigned.

The problem with closures is that they allow any variable to have multiple POAs, regardless of value/reference semantics.

let a = Value() // Value #1
let c = {
   use(a) // also Value #1
}

Now Value #1 has 2 POAs: one in the local context and another inside the closure.
Despite both being called a, they should be treated as different POAs because they lie in different scopes, so Value #1 becomes inaccessible only when both scopes expire.

So I think the short-term way of addressing this would be to allow closures to weakly capture value types.

let a = Value()
let c = { [weak a] in // proposed syntax; 'weak' currently applies only to class types
    use(a)
}

Though if we're revamping the memory model anyway, the effort-to-payoff ratio may not be very attractive.

I'm not sure what you mean by "weakly capturing the value type". You could just write [a] and get the same effect, and if the original value is declared with let it's already got that effect. It's only var that introduces possible cycles.

It's also worth noting that all of this can only happen with @escaping closures, because (1) with a non-escaping closure you know when the closure is going to be destroyed, and (2) you can't assign a non-escaping closure anywhere, which prevents the cycle from being formed in the first place.
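
A small sketch of the difference, with made-up names (runNow, runLater, stored):

```swift
// Non-escaping by default: the closure cannot outlive this call.
func runNow(_ body: () -> Void) {
    body()
}

var stored: [() -> Void] = []

// Only an @escaping closure may be stored for later use.
func runLater(_ body: @escaping () -> Void) {
    stored.append(body)
}

var total = 0
runNow { total += 1 }       // runs immediately
runLater { total += 10 }    // kept alive by `stored`
print(total)                // prints 1

stored.forEach { $0() }
print(total)                // prints 11
```

Trying to append body to stored inside runNow is rejected by the compiler, which is why a cycle can't be formed through a non-escaping closure.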

What I tried to say is that, even though we use let a in both the local context and the closure context, the two should be treated as different points of access.
The value doesn't get deallocated until both scopes expire (the local context finishes executing and the closure is deallocated); even semantically this would be the case.

We then get behaviour similar to a reference type, where multiple points of access point to the same instance.
So we could use the same solution and let the programmer decide which point of access is to be treated as weak.

Agreed. It doesn't undermine my point above, though; we can then limit our attention to those capturing scenarios.

I see, you want a to be inaccessible (well, nil) after exiting the scope where it was originally declared. Can I ask what that would improve?

It really wouldn't do much more than avoid reference cycles for value types; it's more of an ad hoc suggestion than a fully fledged idea.

The thing I originally wanted to point out is that the same variable in different scopes should be treated as different points of access, and that some ideas can go from there.

Cycle discussion aside, I agree that this is pretty surprising (I've been writing a lot of Swift code, for years, and I never knew about it). I'm not sure how many people expect value types to behave this way.

var a = [1, 2, 3]
let closure = { print(a) }
a.append(4)
closure() // prints: [1, 2, 3, 4]

Jordan's comment about capturing the 'variable' rather than the 'value' makes it seem like intended behaviour, though. Is there some strong motivating reason for that?

I strongly agree with the last comment. This was also a surprise to me, I always thought that value types are captured by value with a copy, not by reference. I understand it's late to change the behaviour, but I wish that a copying capture was the default.

That's how closures work in most languages that have closures and mutable lexical bindings, going back to Scheme. There are benefits to always capturing by copy, for sure, but then you limit what can be done with closures as control flow constructs. In something like x.forEach { a.append($0) }, you would expect a to receive the mutations that occur inside the loop.
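
Spelled out as a runnable snippet (the arrays here are just placeholders):

```swift
let x = [1, 2, 3]
var a: [Int] = []

// The closure captures the variable `a`, so each append lands in the outer array.
x.forEach { a.append($0) }

print(a) // prints [1, 2, 3]
```

With capture-by-copy as the default, the closure would be working on its own copy of a and the outer array would stay empty, which is not what anyone writing that loop wants.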

While I don't use this behaviour either, I think it's easier to see if we throw in multiple closures.

func foo() -> (()->(), ()->()) {
    var a = [1, 2, 3]
    let printing = { print(a) }
    let adding = { a.append(4) }
    return (printing, adding)
}

let (printing, adding) = foo()
adding()
printing()
adding()
printing()

// [1, 2, 3, 4]
// [1, 2, 3, 4, 4]

That's a really good point. Didn't think about that.

Yeah, this would still fall afoul of the "weak references need a strong reference to keep the value alive" principle. While it may avoid a cycle, it also means you couldn't usefully pass the closure out of the variable's original scope, since the reference would immediately go to nil, unless you happened to also have another closure somewhere else keeping the variable alive too. (And if you don't need to pass the closure out of the original scope, it's probably a good idea to try to make it work as a non-escaping closure instead.)
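
The same thing can be seen today with a class instance, which is the only kind of thing weak currently applies to. A minimal sketch, with Box and makeWeakReader as hypothetical names:

```swift
final class Box {
    let value: Int
    init(_ value: Int) { self.value = value }
}

func makeWeakReader() -> () -> Int? {
    let box = Box(42)
    // The closure holds only a weak reference, so it does not keep `box` alive.
    return { [weak box] in box?.value }
}

let read = makeWeakReader()
// The only strong reference went away when makeWeakReader returned.
print(read() as Any) // prints nil
```

A weakly captured value-type variable would presumably behave the same way once the closure left the variable's scope.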
