Should weak be a type?

Speaking of weak and AnyObject, I encountered this before:

protocol X: AnyObject {}

struct Weak<T: AnyObject> {
    weak var value: T?
}

weak var x1: X? // okay
var x2: Weak<X> // error: 'Weak' requires that 'X' be a class type

There's perhaps a good reason for this error, but I couldn't figure it out. It surely has nothing to do with weak, but adding "weak as a type" would probably need to overcome this.

That’s the limitation I was referring to above; it’s what Daryle and I were talking about. Non-@objc class-bounded protocol types are subtypes of AnyObject but don’t conform to it.

Another way to put this: a permanent memory leak.

While I see the utility in language support for durable unique identifiers for objects, there are better ways to do it (e.g. have a two-word weak ref containing a copy of the object's UUID in addition to the pointer).
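
As a rough sketch of the shape I mean (the names are mine, and how the UUID gets onto the object is left out):

import Foundation

// Sketch only: pair the weak pointer with a copy of the object's UUID so that
// "same address, different object" can still be detected after the original
// object is gone.
struct IdentifiedWeakRef<T: AnyObject> {
    weak var object: T?
    let uuid: UUID   // copied from the object when the reference is created
}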

Cool, well that would itself be a nice improvement to the generics system. It will be exciting to see these pieces phase in over time!

-Chris

I also find it baffling that protocol types constrained by AnyObject don't conform to AnyObject. AnyObject doesn't require anything that would prevent such a conformance. I understand this is a result of ABI decisions more than anything fundamental. A stored value of type P & AnyObject is expected to carry around its conformance "subtype -> P" in addition to the class reference representing its value, presumably just to avoid runtime calls. @objc protocols are only special because the Objective-C ABI already demands a single pointer, so there's no way to carry around conformances even if we wanted to.

I think it would be neat to have a protocol type like P & AsAnyObject to force an objc-like ABI and thus selectively force conformance to AnyObject without type erasure, but it sounds like a lot of plumbing to make that happen, and probably not what people in this thread are asking for.

What I suspect is being proposed instead is more like a T: Reference constraint that could be used to enforce an underlying reference-counted value without imposing AnyObject conformance at all.

I’m working on an in-house reactive programming framework. The framework relies heavily on comparing inputs to decide whether something should be recomputed or update propagation can stop.

And the data that users feed into the framework may contain pretty much anything, including weak references and closures. To make the framework easier to use, we ship a collection of property wrappers that help the compiler synthesize equality conformances for user types.
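
To illustrate the idea, here is a minimal sketch of the kind of wrapper I mean (the names are illustrative, not our actual API): it compares a weak reference by object identity, so the compiler can still synthesize Equatable for the containing type.

@propertyWrapper
struct WeakIdentityEquatable<Object: AnyObject>: Equatable {
    weak var wrappedValue: Object?

    init(wrappedValue: Object?) {
        self.wrappedValue = wrappedValue
    }

    // Two wrapped references are equal when they point at the same object
    // (or are both nil).
    static func == (lhs: Self, rhs: Self) -> Bool {
        lhs.wrappedValue === rhs.wrappedValue
    }
}

final class Delegate {}

// The compiler can now synthesize Equatable for a user type that contains
// a weak reference.
struct Inputs: Equatable {
    var title: String
    @WeakIdentityEquatable var delegate: Delegate?
}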

False positives in equality lead to bugs - if I were using ObjectIdentifier as a comparison key, there is a small chance that the old object (some sort of delegate) would be destroyed, a new one would be created at the same address, and the framework would never propagate a reference to the new one.

False negatives lead to redundant updates - negatively affecting performance, but preserving correct behavior.

I think in most of our users' use cases weak references don't outlive the referent, up to some unaccounted-for dispatch_async. So typically weak is used instead of unowned as a defensive measure.

Why? Side tables are reference-counted themselves. When all the weak references are destroyed, the side table will be deallocated. A side table does not reference anything but the relevant object, and after that object is destroyed the side table references nothing. So there cannot be a retain cycle.

I'm not the first to try to write a WeakRef<T>, and probably not the last:

Also related to this:

So, the case of conformance to AnyObject is an important one. But in Swift, normally protocols don't conform to themselves. And I think we can do better than making AnyObject magical and behave differently from other protocols. I believe we can come up with a solution that works for any protocol.

I don't think any protocol should conform to itself - there are good reasons why that is not possible. But if we take a step back, that is actually not needed. I think that in cases where people hit the "protocol does not conform to itself" error, what they are trying to express is "a type that can be converted to the existential of P", rather than "a type that conforms to P":

Given:

protocol P {}
protocol Q: P {}
struct S: P {}
struct R: Q {}

Only types S and R conform to P. But all of the types (P, Q, R, S) can be converted to P. They are subtypes of P.

I think what we are missing here is subtyping as a generic constraint.

In terms of syntax, the : symbol in generic constraints is currently overloaded - it means subclassing when the RHS is a class, and protocol conformance when the RHS is a protocol. Subclassing is a special case of subtyping; conformance is not. If we want to be able to express a subtyping constraint for an existential type, we need to disambiguate these two cases.

I can suggest two options to do this:

  1. Separate symbols for subtyping and conformance. For example, : for conformance and :> for subtyping. Then T: P would mean that T must conform to P and T cannot be an existential type, while T :> P would mean that T must be a subtype of P, and T can be an existential type, including P itself. For classes we can keep : to mean subclassing for backwards compatibility, but since for classes there is no difference between subclassing and subtyping, we end up with two ways to express the same thing.

  2. Disambiguate how the protocol is used on the RHS. For example, P would mean the protocol, and any P would mean the existential type for P. Then T: P would mean conformance, and T: any P would mean subtyping (the sketch below shows how this distinction maps onto what the compiler accepts today).
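
To make the conformance/subtyping difference concrete, here is a small example using the any P spelling that already exists for existential types (though not yet as a constraint); the declarations repeat the ones above so it stands alone.

protocol P {}
protocol Q: P {}
struct S: P {}
struct R: Q {}

// Conformance constraint: accepts S and R, rejects the existentials.
func requiresConformance<T: P>(_ value: T) {}

// The closest thing to a subtyping constraint today is to take the
// existential itself, which accepts anything convertible to P.
func requiresSubtype(_ value: any P) {}

let s = S()
let q: any Q = R()

requiresConformance(s)      // OK: S conforms to P
// requiresConformance(q)   // error: 'any Q' does not conform to 'P'
requiresSubtype(s)          // OK: S converts to 'any P'
requiresSubtype(q)          // OK: 'any Q' converts to 'any P'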

Yep, that would work, but it wouldn't solve your problem, as I understand it:

False positives in equality lead to bugs - if I were using ObjectIdentifier as a comparison key, there is a small chance that the old object (some sort of delegate) would be destroyed, a new one would be created at the same address, and the framework would never propagate a reference to the new one.

Okay, so if I understand correctly, you want to keep one of these references around as a record of the last value you propagated. It's okay if your reference allows the object to be deinitialized — i.e. deinit is called and the stored properties are destroyed — but it can't allow the object's memory to be deallocated because then a new object could be allocated there and cause semantic problems. But it's okay in your case to keep such a "zombie" around because it won't be for too long:

  1. If the old object is destroyed, it's because the model is no longer referencing it, which means something has changed in the model.
  2. When something changes the model like that, it should always be notifying your framework that it might need to update.
  3. Your framework will compare the new reference value with the old reference, recognize that it's changed, and trigger an update.
  4. The update will propagate the new value and, accordingly, replace its record of the last value it propagated.
  5. That will destroy the zombie reference, which should allow the zombie to finally be deallocated.

If that's accurate, then the main problem is what I mentioned before: while the native Swift object model (used for classes that don't inherit from ObjC classes) does distinguish deinitialization from deallocation in the way you want, we don't really have a reliable way to make that work for Objective-C objects. We may be able to support it for most Objective-C objects when running on a future Apple operating system, but that's not in place today, and I don't know if it'd be good enough for your framework.

If you don't care about Objective-C objects, then it's likely that just storing an unowned reference next to your ObjectIdentifier (without ever reading the former) will work. In principle, the optimizer could recognize that the unowned reference is never used and remove it, so it would be better to have a language/library feature; but in the short term, it should work.
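
Roughly this shape, as a sketch (the names are invented for illustration):

// The unowned field is never read; it only keeps the (native Swift) object's
// memory from being deallocated, so its address - and therefore the
// ObjectIdentifier - cannot be reused by a new object.
struct IdentityToken: Equatable {
    let id: ObjectIdentifier
    private unowned let pin: AnyObject

    init(_ object: AnyObject) {
        id = ObjectIdentifier(object)
        pin = object
    }

    static func == (lhs: IdentityToken, rhs: IdentityToken) -> Bool {
        lhs.id == rhs.id
    }
}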

AnyObject as a generic constraint does nothing but constrain the representation of the type. It doesn't have any additional semantic meaning; it has no other requirements. So saying that AnyObject shouldn't constrain the representation is basically saying that AnyObject should be meaningless as a generic constraint.

It avoids looking up the conformance every time, which is typically very good for performance but also means that we can allow multiple conformances of a type to a protocol within a process without inherent implementation confusion.

@objc protocols are special because their requirements must be implemented with Objective-C message sends, which both (1) precludes the need to pass around a conformance and (2) makes confusion unavoidable if there are multiple conformances in the process.

Correct me if I'm wrong: X in Weak<X> is the protocol used as a type - that is, an existential - which itself does not conform to the protocol constraint. If we had the any keyword and some more generalization, then we could write extension any X: X {}, which would let Weak<X> compile.

The obvious meaning is "my subtype is a class" regardless of whether AnyObject is a type or a type constraint. So confusion is likely for anyone who isn't thinking about implementing dispatch and representing conformances, which is basically everyone. At least I still find it confusing. There are two reasons I wanted to respond to this thread:

  1. Before a new thing is proposed, I want to make sure there's a full acknowledgment and understanding about why the old thing is so confusing.

  2. I think it would be interesting to be able to programmatically trade off a little code size to improve heap size without erasing protocol types, and I think that conformance lookup speed could be addressed in some cases. But I'll admit this probably isn't an important use case.

I agree that when people run into this restriction, they are often confused. There isn't much we can do about this restriction now, though — adding a new type wouldn't let us change the ABI of existing types. And I do think that being explicit about which conformance is intended is important for ensuring soundness in cases where multiple conformances are dynamically known, which can happen with dynamic linking.

Side tables have their own reference count and keep living even after the object is destroyed, as long as there are weak references left. So even if a new object is allocated at the same address, its side table will still get a different address from the old side table.

And if a side table was allocated at the same address, that means the previous one got deallocated first, which implies there are no weak references left to compare against.

While it is not possible to replicate the behavior of unowned references for Objective-C objects (call dealloc, but don’t free the memory), implementing fake side tables using associated objects is pretty straightforward.
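
For illustration, here is a sketch of what I mean by a fake side table (Apple platforms only, simplified, and not thread-safe):

import ObjectiveC

// A token that plays the role of a side table for an Objective-C object: the
// associated object keeps the token alive for as long as the host object, so a
// weak-reference wrapper can hold the token strongly instead of the object.
final class FakeSideTable {
    weak var object: AnyObject?   // zeroed when the host object is deallocated
    init(_ object: AnyObject) { self.object = object }
}

private var fakeSideTableKey: UInt8 = 0

func fakeSideTable(for object: AnyObject) -> FakeSideTable {
    if let existing = objc_getAssociatedObject(object, &fakeSideTableKey) as? FakeSideTable {
        return existing
    }
    let table = FakeSideTable(object)
    objc_setAssociatedObject(object, &fakeSideTableKey, table, .OBJC_ASSOCIATION_RETAIN_NONATOMIC)
    return table
}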

I still need to read the reference in a defensive way.

Essentially, this assumption does not hold in our case:

  • An invalid back-reference is a consistency error that we should encourage programmers to fix rather than work around by spot-testing for validity.

It is a consistency error, but being notified about it through a crash in production is too high a price. Logging something would be helpful, though.

So, I guess I could have a pair of unowned + weak references - the weak one would allow safely reading the reference, and the unowned one would prevent address reuse. But that sounds expensive. My understanding is that Swift has both weak and unowned to let developers make a trade-off between the amount of memory held after an object is destroyed and the total amount of memory allocated. But in this case I’m getting the worst of both worlds.
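
In code, the pair would look something like this (just a sketch):

// The weak reference is the one that is actually read; the unowned one is
// never read and exists only to keep the destroyed object's memory from
// being reused for a new allocation.
struct DefensiveRef<T: AnyObject> {
    private weak var readable: T?
    private unowned let pin: T

    init(_ object: T) {
        readable = object
        pin = object
    }

    var value: T? { readable }   // nil once the object has been destroyed
}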

@John_McCall, could you share your thoughts about subtyping as a generic constraint in this context?

You're right, and in fact we do something like this in the unowned implementation, so we could presumably extend the same concept.

Why? I mean, yes, this is probably supportable, but I'd like to know why you want to be able to turn the reference into a strong reference. In your use case, you already know the reference has changed; there's no need to check anything about the contents of the object.

Well, now that I've read what you actually mean by this, it would be sufficient for this use case but possibly not very useful as a general feature. You want a constraint that basically just means "T can be converted to the existential type P". That constraint is broader than T: P in that it permits existential types that don't self-conform, but it limits the requirements you could use from P, since in order to do anything you would have to (1) have a T value that you (2) convert to P before invoking the operation, which means you couldn't use any static, init, or mutating requirements from P. My intuition is that this wouldn't be worth the complexity outside of the narrow case of a generalized AnyObject representational constraint.
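
A small illustration of that restriction (written with T: P so it compiles; imagine T could also be the existential itself):

protocol P {
    init()
    static func makeDefault() -> Self
    mutating func update()
    var label: String { get }
}

// With only "T converts to the existential P" available, every use has to go
// through an `any P` copy, so only plain instance requirements like `label`
// remain usable; `init`, the static `makeDefault()`, and mutating `value` in
// place would all require a real conformance of T.
func describe<T: P>(_ value: T) {
    let erased: any P = value   // the conversion step described above
    print(erased.label)         // fine: instance requirement on the copy

    var copy = erased
    copy.update()               // mutates the copy, not `value`
}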

As a framework author I don't, but my users do. They are passing around this reference to eventually be dereferenced.

That's pretty subjective, but in my personal experience protocols with static/init requirements are not so common outside framework code. Most of the protocols I see in business-logic code are used in a more Objective-C/Java style.

Do I understand correctly that your concern about mutability is because the constraint witness in this case would look like a function (T) -> P, so it would effectively return a copy?

For existential subtyping, casting to P does not change the type of the value, so casting to P, mutating, and then casting back is legal. But for the general case of value subtyping this does not work: String is a subtype of String?, but if the latter is mutated to be nil, it cannot be written back.

And actually, that makes sense. Contravariance works this way. The function type (String?) -> Void is a subtype of (String) -> Void, but (inout String?) -> Void is not a subtype of (inout String) -> Void.
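
Spelled out in code (everything compiles except the commented-out line):

// Parameters are contravariant: a function accepting the supertype String?
// can stand in for one accepting String...
let takesOptional: (String?) -> Void = { print($0 ?? "nil") }
let takesString: (String) -> Void = takesOptional
takesString("hello")                      // prints "hello"

// ...but inout parameters are invariant, because nil could not be written
// back into a plain String.
let mutatesOptional: (inout String?) -> Void = { $0 = nil }
var s: String? = "start"
mutatesOptional(&s)                       // fine: the parameter really is String?
// let mutatesString: (inout String) -> Void = mutatesOptional   // error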

But now I'm starting to think that a subtyping constraint may not be the correct typing tool for WeakRef<T>. For WeakRef we need T to be something which has a strong pointer inside it: we want to read that pointer, get a pointer to the side table, replace the original pointer with the pointer to the side table, and store it like that - and do the reverse replacement when loading.

For the general case of subtyping, a struct storing a value of T constrained by T: any P would store the T and cast it to P as needed. But in WeakRef we cannot store a T.