Avoiding unbreakable reference cycles with value types and closures?

ownership
memory-safety

(Max Desiatov) #1

When working with UIKit and Foundation, which quite frequently use classes storing escaping closures, we're used to capturing those classes as weak to avoid reference cycles:

class CaptureClass {
  var x = 42
  var c: (() -> ())?
}

let capture = CaptureClass()
capture.c = { [weak capture] in
  capture?.x += 1
}

It's very obvious and expected that reference types are captured in closures by reference. I was quite surprised though to discover that value types are captured by reference as well and there's no way to avoid a reference cycle in that situation:

struct CaptureValue {
  var x = 42
  var c: (() -> ())?
}

var capture = CaptureValue()
capture.c = {
  capture.x += 1
}
// `capture` will never be deallocated
// because it's owned by closure `capture.c`

The problem here is that you can't add [weak capture] to a capture list, as the Swift compiler will complain with 'weak' may only be applied to class and class-bound protocol types. This doesn't actually make much sense to me: why is a strong reference allowed to be created to a value type, but a weak reference isn't? In this situation I would also expect a copy specifier to be available in a capture list for value types (or __consuming in Swift 5?), but that's not available either, if I understand correctly?

I find it quite frustrating, especially as value types are a default choice in many cases due to the fact that you get compile-time (im)mutability guarantees with let and mutating. What I find even more confusing is that there's no way to avoid these reference cycles with value types and closures without rearchitecting everything from scratch, while with classes it's as easy as sticking an explicit capture list with a weak specifier. And you wouldn't even get a compiler warning when creating a reference cycle with value types.

So here are a few one-line pitches that could fix this problem (and these aren't mutually exclusive):

  • Why not allow weak reference capture of value types in closures? This would make perfect sense, as value types are captured with a strong reference by default, so why not weak references as well?
  • Why not allow something like copy (or __consuming in Swift 5) in closure capture lists for value types? In a lot of scenarios a user could expect an instance of a value type to be copied into a closure environment, which would also help avoid unintended reference cycles.

Hope this makes sense, but would also be very happy to discover if I'm missing anything and there already is a way to break these reference cycles with closures and value types.


(Sebastian) #2

Two thoughts:

  • If var capture = CaptureValue() is so significant that you want it to deallocate when its scope "ends" (what's the term here?), then it's probably not really a "value" that can be copied and passed around. You want it to have some sort of identity, so that the "original" can deallocate.
  • If you were able to capture a value weakly, you would lose the ability to make it immutable. Constants can't be weak (and shouldn't be in the future, imho).

(Kem Chen) #3

There is a feature called "Capture List". You could use it to control how values are captured in a closure.

struct CaptureValue {
  var x = 42
  var c: (() -> ())?
}

var capture = CaptureValue()
capture.c = { [capture] in 
  capture.x += 1
}

print(capture.x) // 42
capture.c?()
print(capture.x) // still 42

(Max Desiatov) #4

Thanks for linking to the capture list docs; I mentioned capture lists in the first post of this thread as well. I do use a capture list to make sure a CaptureClass instance is captured weakly, but that doesn't work for CaptureValue. The example you've shared won't compile, failing with this error: left side of mutating operator isn't mutable: 'capture' is an immutable capture.


(Max Desiatov) #5

I wouldn't say that my complaint has anything to do with identity or scopes. I wouldn't mind the value being deallocated a bit later or at some other point. The main problem is that after creating this reference cycle there's no way to deallocate it at all, especially once it's out of scope. I would be quite surprised if a value having an identity or not had any impact on how memory allocation works for it. A user might use a large number of huge String and Data instances. You might argue those instances don't have any identity, but they are deterministically deallocated as expected.

I totally agree with this one. My point though is about an inconsistency in capturing a value stored as var, which is expected to be mutable. I don't think there would be a huge problem if that could become weak as well when requested by a user of the language; there would only be the benefit of being able to break a reference cycle of this kind.


(Sebastian) #6

Ah true. So you'd have a memory leak of basically one value clutching at itself 😂
My personal impulse is that I want to be able to declare weak closure references, since closures are reference types anyway, but I'm sure there's a technical reason for why weak closures are not possible...


(Max Desiatov) #7

Yeah, that one would be great too. Although this probably wouldn't break the reference cycle in the example code I've shared here, I do think it would make perfect sense to allow weak variables holding closures. I do have a few examples of those causing reference cycles too, but those look a bit different and more contrived.


(Joe Groff) #9

A struct by itself can never be responsible for a cycle, because it has no fixed location in memory. It doesn't make sense for a value to be weak because it has no shared identity; a weak reference would always immediately go to nil since there are no shared strong references to keep it alive. For example, if you captured a copy of capture, and changed the value of capture to something else, there would no longer be a cycle. What's creating the cycle in your second example is the var itself. Each var is itself essentially a small class object. Now, if you copied the value inside capture somewhere else, it would probably not do what you're trying to do here, since the closure would still be acting on the value inside the capture var and not your copy at that point. It seems to me like you really intend capture to be an object with shared identity.
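One way to picture the "each var is essentially a small class object" point is to write the box out by hand. The sketch below is only a mental model, not a description of what the compiler actually emits, and `Box` is a hypothetical helper that does not appear elsewhere in this thread. Once the box is an explicit class, the closure can capture it weakly:

```swift
// Hypothetical stand-in for the heap box that a captured `var` behaves like.
final class Box<T> {
  var value: T
  init(_ value: T) { self.value = value }
}

struct CaptureValue {
  var x = 42
  var c: (() -> Void)?
}

let box = Box(CaptureValue())

// The box is a class instance, so `weak` is allowed here and no cycle forms.
box.value.c = { [weak box] in
  box?.value.x += 1
}

// Copy the closure out before calling it, so the call does not overlap
// with an access to `box.value`.
let increment = box.value.c
increment?()
```

After the call, the struct stored in the box has been mutated in place, which is the shared-identity behavior described above.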


(Grzegorz Leszek) #10

I think it is a semantic problem.

A closure is basically a small class. If you wrap a class in a struct, you don't have a true value type. You just have a reference to memory wrapped in a struct.

If you really would like to use struct and a closure and break the reference cycle, you could just wrap your closure in a class like so:

class Closure {
    var task: (() -> ())?
    init(_ task: (() -> ())?) {
        self.task = task
    }
}

Then in your example capture will be deallocated.

For me, if you want to assign a closure to a property, keeping it as a class makes more sense (as in your first example).


(Sebastian) #11

This might point to why I got confused. So when the closure mutates a captured value, that value gets mutated in the original outer scope as well. Then when the outer scope deallocates, the closure practically works on its own copy ...


(Max Desiatov) #12

This is a very good point and it does make perfect sense. What actually got me thinking about this is that I find it a bit inelegant that Swift conflated the concepts of classes and reference types. When you do need shared state that can be perfectly expressed as, let's say, an enum or a struct, you're still inclined to use classes either directly or as wrappers. And classes have their own baggage, like inheritance (classes aren't final by default) and initializer rules (convenience/required rules etc). I understand this conflation was mostly done for first-class (no pun intended!) compatibility with Objective-C, but wouldn't it be more elegant if Swift had separate standard immutable and mutable reference types, like Rust has with & and &mut?

A naive implementation could look like this:

final class ClassReference<T> {
  var value: T

  init(_ value: T) {
    self.value = value
  }
}

public struct Reference<T> {
  fileprivate let reference: ClassReference<T>

  public var value: T {
    get {
      return reference.value
    }
  }

  public init(_ value: T) {
    reference = ClassReference(value)
  }

  public init(_ mutable: MutableReference<T>) {
    reference = mutable.reference
  }
}

public struct MutableReference<T> {
  fileprivate let reference: ClassReference<T>

  public var value: T {
    get {
      return reference.value
    }
    set {
      reference.value = newValue
    }
  }

  public init(_ value: T) {
    reference = ClassReference(value)
  }
}

public struct WeakReference<T> {
  private weak var reference: ClassReference<T>?

  public var value: T? {
    get {
      return reference?.value
    }
    set {
      guard let v = newValue else {
        reference = nil
        return
      }

      reference?.value = v
    }
  }

  public init(_ strong: Reference<T>) {
    self.reference = strong.reference
  }
}

If Swift supported typealias operators, it could be quite handy to have something like this as well:

// pseudocode
typealias &T = Reference<T>

An obvious use case for it is shared state where you can explicitly make it immutable when needed, while still keeping it readable and up to date after it's been modified:

enum State {
  case foo
  case bar
}

let state = MutableReference(State.foo)

let result1 = stateModifier(state)
let result2 = stateReader(Reference(state)) // making it immutable here

(Max Desiatov) #13

That's correct, what's worse (or better depending on how you look at it) is that you can have more than one closure capturing shared state, even if that state is a value type. Here's an example that I previously thought couldn't work at all, but it actually does:

func referenceWithoutClasses<T>(_ initial: T) -> (() -> T, (T) -> ()) {
  var state = initial

  return ({ state }, { state = $0 })
}

let (getter, setter) = referenceWithoutClasses(42)
// `state` is gone and deallocated here, right?

print(getter()) // prints 42
setter(43)
print(getter()) // prints 43, which is a bit unexpected

(Jordan Rose) #14

You should think of the thing being captured as the variable, not the value, unless you use a capture list. As for not having a way to break the cycle, you certainly do: set capture.c back to nil (or to another closure). It's only if you don't have any other references to capture that you've made a cycle…but that's the same as with classes.

(But yeah, if someone says you can't have reference cycles without classes, they're wrong, because variables have identity in some sense too.)
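A rough way to watch this happen, assuming an unoptimized build: `Payload`, `Probe`, and `run(breakCycle:)` are hypothetical names made up for this sketch. The struct stores a class instance only so that a weak reference can report whether the variable's storage was released.

```swift
// Hypothetical class stored in the struct so its lifetime can be observed.
final class Payload {}

// Hypothetical observer holding only a weak reference to the payload.
final class Probe {
  weak var payload: Payload?
}

struct CaptureValue {
  var x = 42
  var payload = Payload()
  var c: (() -> Void)?
}

func run(breakCycle: Bool) -> Probe {
  let probe = Probe()
  var capture = CaptureValue()
  probe.payload = capture.payload

  // The closure captures the variable `capture`, which also stores the closure.
  capture.c = { capture.x += 1 }

  if breakCycle {
    // Dropping the closure removes its reference to the variable.
    capture.c = nil
  }
  return probe
}

let leaked = run(breakCycle: false)
let released = run(breakCycle: true)

print("cycle left in place, payload alive:", leaked.payload != nil)
print("cycle broken, payload alive:", released.payload != nil)
```

With the cycle left in place, nothing outside the function can reach `capture` any more, which is why it can no longer be broken after the fact.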


(Joe Groff) #15

Class references aren't really equivalent to either of those, a class behaves more like Rust's Arc<T>, albeit without the static checks for thread-safe access enabled by Rust's borrow model. Rust's & is Swift's implicit borrowing behavior for function arguments, and &mut is inout. Swift's separation of struct and class is not merely a holdover from Objective-C, but based on the idea that in practice, a type is generally intended to be used as either a value or a shared object.
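As a small illustration of that mapping (the names `Counter`, `peek`, and `bump` are made up for this sketch): a plain parameter gives the callee read-only access to the value, roughly like Rust's `&`, while `inout` gives exclusive mutable access for the duration of the call, roughly like `&mut`.

```swift
struct Counter {
  var count = 0
}

// Read-only access: the parameter is immutable inside the function.
func peek(_ counter: Counter) -> Int {
  return counter.count
}

// Exclusive mutable access for the duration of the call.
func bump(_ counter: inout Counter) {
  counter.count += 1
}

var counter = Counter()
bump(&counter)
let current = peek(counter)
```

Neither form needs a class, so the mutability of `counter` is still checked through `var`/`let` at the call site.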


(Max Desiatov) #16

Right, what I do find confusing in this asymmetry is the lack of compile-time checks for mutability in classes, especially something similar to the requirement to mark functions as mutating in structs and the var/let mutability checks. If I store shared state in a class, there is no easy way to make it read-only to a consumer of the state. While the explicit MutableReference and Reference wrappers I used in the examples above solve that, I have a feeling that either these should be a part of the standard library or there should be some other, more ergonomic way to do this.

For instance, in this example there really is no compile-time guarantee that stateReader contains well-behaving code and won't mutate sharedState:

final class State {
  enum Value {
    case foo
    case bar
  }
  
  var value: Value

  init(_ initial: Value) {
    self.value = initial
  }
}

let sharedState = State(.foo)

let result1 = stateModifier(sharedState)

// `sharedState` is still mutable here despite being declared with `let`
let result2 = stateReader(sharedState)

(Jordan Rose) #17

Part of the reason there's no mutability checking for classes is that it (1) is hard for the programmer to get right, and (2) doesn't provide any safety or optimization guarantees anyway. This is well-explored in C++, where having a const State & means that you can't mutate the state, but it doesn't mean that someone else can't mutate the state while you're using it. ObjC/Foundation's NSFoo/NSMutableFoo pattern is roughly equivalent to that, with an added culture (and a little bit of language sugar) around eagerly copying values so they don't change out from under you.

A truly immutable class has value semantics, but that's harder to get set up correctly in the first place. You'd really want some kind of "freeze" operation that says "okay, I'm done setting this up so now it's okay to share it", and a possible "unfreeze" for "okay, I know I have unique ownership; let me modify this again".

But rather than adding these mechanisms directly to classes, it might be more interesting to explore making Swift's efficient-copy-on-write patterns easier to adopt. (The same mechanisms allow for error-on-shared-write, although without the compile-time enforcement.)
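For readers who haven't written one by hand, a minimal sketch of the copy-on-write pattern looks roughly like this. `Storage` and `COWList` are hypothetical names for illustration; `isKnownUniquelyReferenced` is the standard library function used to decide whether the storage must be copied before a write.

```swift
// Hypothetical reference-type storage shared between copies of the struct.
final class Storage {
  var values: [Int]
  init(_ values: [Int]) { self.values = values }
}

struct COWList {
  private var storage: Storage

  init(_ values: [Int]) {
    storage = Storage(values)
  }

  var values: [Int] {
    return storage.values
  }

  mutating func append(_ value: Int) {
    // Copy the storage only if another COWList still shares it.
    if !isKnownUniquelyReferenced(&storage) {
      storage = Storage(storage.values)
    }
    storage.values.append(value)
  }
}

let first = COWList([1, 2])
var second = first   // shares storage with `first` until the first write
second.append(3)
```

Replacing the copy with a `preconditionFailure` in the non-unique branch would give the error-on-shared-write variant mentioned above, checked at run time only.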


(Joe Groff) #18

Yeah, an interesting extension of the language once we do have move-only types and a borrow checker would be to promote "is uniquely referenced" to the type system, so that you can have a unique ClassType reference and have its uniqueness maintained by the type system. That would be useful for auditing the correctness of COW containers, but would also allow the safety of class mutation to be checked by the compiler. Only mutations of uniquely-owned class instances would be multi-threading safe, and in a context where you ask for strict thread-safe ownership safety, we could conceivably enforce that.


(Erik Little) #19

This was considered out-of-scope for the immediate ownership work though, correct?


#20

Reference types easily form reference cycles because we allow them to have multiple points of access (POAs). For example,

let a = Class() // Class #1
let b = a // also Class #1

So we say that Class #1 has 2 POAs: a and b.

Semantically, a value type always has only one POA. For example,

let a = Value() // Value #1
let b = a // treated as Value #2

So Value #1 has only one POA, a, and Value #2 has only b. That's why a weak reference makes no sense for a value type; it would turn nil the moment it is assigned.

The problem with closures is that they allow any variable to have multiple POAs, regardless of value/reference semantics.

let a = Value() // Value #1
let c = {
   use(a) // also Value #1
}

Now Value #1 has 2 POAs: one in the local context, and another inside the closure.
Despite both being called a, they should be treated as different POAs since they lie in different scopes, and so Value #1 becomes inaccessible only when both scopes expire.

So I think the short-term way of addressing this would be to allow closures to capture value types weakly.

let a = Value()
let c = { [weak a] in // proposed syntax; not valid Swift today
    use(a)
}

Though if we're revamping the memory model anyway, the effort/pay-off may not be very attractive.


(Jordan Rose) #21

I'm not sure what you mean by "weakly capturing the value type". You could just write [a] and get the same effect, and if the original value is declared with let it's already got that effect. It's only var that introduces possible cycles.
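To make the difference concrete, here is a small sketch (`Counter` and `demo` are made-up names): without a capture list the closure sees the variable, and with `[counter]` it gets an immutable copy of the value taken when the closure is created.

```swift
struct Counter {
  var x = 42
}

func demo() -> (shared: Int, copied: Int) {
  var counter = Counter()

  // Captures the variable itself: later writes are visible.
  let readShared = { counter.x }

  // Captures a copy of the current value: later writes are not visible.
  let readCopy = { [counter] in counter.x }

  counter.x = 100
  return (readShared(), readCopy())
}

let result = demo()
print(result.shared, result.copied)
```

Since the copy cannot refer back to the variable, a closure written with `[counter]` cannot take part in the kind of cycle discussed in this thread.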