Should weak be a type?

Nickolas_Pohilets · February 29, 2020, 4:07pm

Why? Side tables are reference-counted themselves. When all the weak references are destroyed, the side table is deallocated. A side table does not reference anything but the relevant object, and after that object gets destroyed the side table references nothing. So there cannot be a retain cycle.

Nickolas_Pohilets · February 29, 2020, 5:29pm

I'm not the first one who tried to write a WeakRef<T> and probably not the last one:

Also related to this:

swift - Unmanaged Object and Protocol - Stack Overflow

So, the case of conformance to AnyObject is an important one. But in Swift, normally protocols don't conform to themselves. And I think we can do better than making AnyObject magical and behave differently from other protocols. I believe we can come up with a solution that works for any protocol.

I don't think any protocol should conform to itself - there are good reasons why that is not possible. But if we take a step back, that is actually not needed. I think in cases where people hit the "protocol does not conform to itself" error, what they are trying to express is "a type that can be converted to an existential of P", rather than "a type that conforms to P":

Given:

protocol P {}
protocol Q: P {}
struct S: P {}
struct R: Q {}

Only types S and R conform to P. But all of the types (P, Q, R, S) can be converted to P. They are subtypes of P.
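To make the distinction concrete, here is a minimal sketch using the four types above. `Box` is a hypothetical generic type added only for illustration, and the exact compiler diagnostic varies between Swift versions.

```swift
protocol P {}
protocol Q: P {}
struct S: P {}
struct R: Q {}

// Hypothetical generic type whose parameter must *conform* to P.
struct Box<T: P> {
    var value: T
}

// Conversion to the existential P works for values of all four types.
let s: P = S()
let r: P = R()
let q: Q = R()
let qAsP: P = q              // existential Q converts to existential P

// Conformance is narrower: only the concrete types S and R satisfy T: P.
let boxS = Box(value: S())
let boxR = Box(value: R())
// let boxP = Box<P>(value: s)   // rejected: the existential P does not conform to P
```

The commented-out line is the case that a subtyping constraint would be meant to allow.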

I think what we are missing here, is the subtyping as a generic constraint.

In terms of syntax, currently the : symbol in generic constraints is overloaded - it means subclassing when the RHS is a class, and protocol conformance when the RHS is a protocol. Subclassing is a special case of subtyping, conformance is not. If we want to be able to express a subtyping constraint for an existential type, we need to disambiguate these two cases.

I can suggest two options to do this:

Separate symbols for subtyping and conformance. For example : for conformance and :> for subtyping. Then T: P would mean that T should conform to P and T cannot be an existential type. And T :> P would mean that T should be a subtype of P, and T can be an existential type, including P itself. For classes we can keep : to mean subclassing for backwards compatibility. But since for classes there is no difference between subclassing and subtyping - we end up having two ways to express the same thing.
Disambiguate how the protocol is used on the RHS. For example, P would mean a protocol, and any P would mean an existential type for P. Then T: P would mean conformance, and T: any P would mean subtyping.

AlexanderM · February 29, 2020, 6:42pm

Yep, that would work, but it wouldn't solve your problem, as I understand it:

False positives in equality lead to bugs - if I were using ObjectIdentifier as a comparison key, there is a small chance that old object (some sort of delegate) would be destroyed, new one would be created at the same address, but framework would never propagate a reference to the new one.

John_McCall · February 29, 2020, 7:40pm

Okay, so if I understand correctly, you want to keep one of these references around as a record of the last value you propagated. It's okay if your reference allows the object to be deinitialized — i.e. deinit is called and the stored properties are destroyed — but it can't allow the object's memory to be deallocated because then a new object could be allocated there and cause semantic problems. But it's okay in your case to keep such a "zombie" around because it won't be for too long:

If the old object is destroyed, it's because the model is no longer referencing it, which means something has changed in the model.
When something changes the model like that, it should always be notifying your framework that it might need to update.
Your framework will compare the new reference value with the old reference, recognize that it's changed, and trigger an update.
The update will propagate the new value and, accordingly, replace its record of the last value it propagated.
That will destroy the zombie reference, which should allow the zombie to finally be deallocated.

If that's accurate, then the main problem is what I mentioned before: while the native Swift object model (used for classes that don't inherit from ObjC classes) does distinguish deinitialization from deallocation in the way you want, we don't really have a reliable way to make that work for Objective-C objects. We may be able to support it for most Objective-C objects when running on a future Apple operating system, but that's not in place today, and I don't know if it'd be good enough for your framework.

If you don't care about Objective-C objects, then it's likely that just storing an unowned reference next to your ObjectIdentifier (without ever reading the former) will work. In principle, the optimizer could recognize that the unowned reference is never used and remove it, so it would be better to have a language/library feature; but in the short term, it should work.
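A minimal sketch of that short-term workaround, assuming only native Swift classes are involved. `IdentityRecord` and `Delegate` are hypothetical names, and as noted above nothing guarantees that the optimizer will keep the unused unowned reference.

```swift
final class Delegate {}

// Hypothetical record of the last propagated reference.
struct IdentityRecord: Equatable {
    let id: ObjectIdentifier

    // Never read. It exists only so that the object's memory cannot be
    // deallocated (and its address reused) while this record is alive.
    private unowned let pin: AnyObject

    init(_ object: AnyObject) {
        id = ObjectIdentifier(object)
        pin = object
    }

    // Equality looks at the identifier only.
    static func == (lhs: IdentityRecord, rhs: IdentityRecord) -> Bool {
        return lhs.id == rhs.id
    }
}

let first = Delegate()
let second = Delegate()
let sameObject = IdentityRecord(first) == IdentityRecord(first)        // true
let differentObjects = IdentityRecord(first) == IdentityRecord(second) // false
```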

John_McCall · February 29, 2020, 7:50pm

AnyObject as a generic constraint does nothing but constrain the representation of the type. It doesn't have any additional semantic meaning; it has no other requirements. So saying that AnyObject shouldn't constrain the representation is basically saying that AnyObject should be meaningless as a generic constraint.

It avoids looking up the conformance every time, which is typically very good for performance but also means that we can allow multiple conformances of a type to a protocol within a process without inherent implementation confusion.

@objc protocols are special because their requirements must be implemented with Objective-C message sends, which both (1) precludes the need to pass around a conformance and (2) makes confusion unavoidable if there are multiple conformances in the process.

DevAndArtist · February 29, 2020, 8:26pm

Correct me if I'm wrong. X in Weak<X> is the protocol as a type, or simply an existential, which itself does not conform to the protocol in the constraint. If we had the any keyword and some more generalization, then we could say extension any X: X {}, which would let Weak<X> compile.
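For readers following along, a minimal sketch of the situation being described. `Weak`, `X` and `Impl` here are illustrative definitions, not the ones from the original post, and the diagnostic text depends on the compiler version.

```swift
// Illustrative wrapper: T is constrained to class types.
struct Weak<T: AnyObject> {
    weak var value: T?
}

// A class-bound (non-@objc) protocol and one conforming class.
protocol X: AnyObject {}
final class Impl: X {}

let impl = Impl()

// Works: Impl is a concrete class type.
let concrete = Weak(value: impl)

// Rejected: the existential X does not satisfy the AnyObject constraint,
// even though every value stored in it is an object.
// let erased = Weak<X>(value: impl)
```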

Andrew_Trick · February 29, 2020, 9:26pm

The obvious meaning is "my subtype is a class" regardless of whether AnyObject is a type or a type constraint. So confusion is likely for anyone who isn't thinking about implementing dispatch and representing conformances, which is basically everyone. At least I still find it confusing. There are two reasons I wanted to respond to this thread:

Before a new thing is proposed, I want to make sure there's a full acknowledgment and understanding about why the old thing is so confusing.
I think it would be interesting to be able to programmatically trade off a little code size to improve heap size without erasing protocol types, and think that conformance lookup speed could be addressed in some cases. But I'll admit this probably isn't an important use case.

John_McCall · February 29, 2020, 11:19pm

I agree that when people run into this restriction, they are often confused. There isn't much we can do about this restriction now, though — adding a new type wouldn't let us change the ABI of existing types. And I do think that being explicit about which conformance is intended is important for ensuring soundness in cases where multiple conformances are dynamically known, which can happen with dynamic linking.

Nickolas_Pohilets · March 1, 2020, 4:25pm

Side tables have their own reference count and keep living even after the object is destroyed, as long as there are weak references left. So even if a new object is allocated at the same address, its side table will still get a different address from the old side table.

And if a side table was allocated at the same address, that means that the previous one got deallocated first, which implies that there are no weak references left to compare.

Nickolas_Pohilets · March 1, 2020, 4:46pm

While it is not possible to replicate behavior of the unowned references for ObjC (call dealloc, but don’t free the memory), implementing fake side tables using associated objects is pretty straightforward.

I still need to read the reference in a defensive way.

Essentially, this assumption does not hold in our case:

An invalid back-reference is a consistency error that we should encourage programmers to fix rather than work around by spot-testing for validity.

It is a consistency error, but being notified about it through a crash in production is too high a price. Logging something would be helpful, though.

So, I guess I could have a pair of unowned + weak references - weak would allow me to safely read the reference, and unowned would prevent address reuse. But that sounds expensive. My understanding is that Swift has both weak and unowned to let developers make a trade-off between the amount of memory held after the object is destroyed vs the total amount of memory allocated. But in this case I'm getting the worst of both worlds.
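A rough sketch of such a pair, assuming native Swift classes only. `PinnedWeakRef` and `Node` are hypothetical names. The unowned field is never read, so the same optimizer caveat mentioned earlier in the thread applies.

```swift
final class Node {}

// Hypothetical reference combining both kinds of reference.
struct PinnedWeakRef<T: AnyObject>: Hashable {
    private let id: ObjectIdentifier

    // Never read: keeps the object's memory allocated after deinit,
    // so the address cannot be reused while this value exists.
    private unowned let pin: T

    // Safe to read: becomes nil once the object is deinitialized.
    private(set) weak var object: T?

    init(_ object: T) {
        id = ObjectIdentifier(object)
        pin = object
        self.object = object
    }

    static func == (lhs: PinnedWeakRef, rhs: PinnedWeakRef) -> Bool {
        return lhs.id == rhs.id
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
    }
}

// The only strong reference lives inside this function.
func makeDanglingRef() -> PinnedWeakRef<Node> {
    let node = Node()
    return PinnedWeakRef(node)
}

let dangling = makeDanglingRef()
let live = Node()
let liveRef = PinnedWeakRef(live)
```

Reading `dangling.object` returns nil instead of trapping, while `liveRef.object` still returns the object. The cost is the one described above: a side table plus the retained object memory.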

Nickolas_Pohilets · March 1, 2020, 5:00pm

@John_McCall, could you share your thoughts about subtyping as generic constraint in this context?

John_McCall · March 1, 2020, 6:48pm

You're right, and in fact we do something like this in the unowned implementation, so we could presumably extend the same concept.

Why? I mean, yes, this is probably supportable, but I'd like to know why you want to be able to turn the reference into a strong reference. In your use case, you already know the reference has changed, there's no need to check anything about the contents of the object.

John_McCall · March 1, 2020, 7:04pm

Well, now that I've read what you actually mean by this, it would be sufficient for this use case but possibly not very useful as a general feature. You want a constraint that basically just means "T can be converted to the existential type P". That constraint is broader than T: P in that it permits existential types that don't self-conform, but it limits the requirements you could use from P, since in order to do anything you would have to (1) have a T value that you (2) convert to P before invoking the operation, which means you couldn't use any static, init, or mutating requirements from P. My intuition is that this wouldn't be worth the complexity outside of the narrow case of a generalized AnyObject representational constraint.

Nickolas_Pohilets · March 3, 2020, 9:44am

As a framework author I don't, but my users do. They are passing around this reference to eventually be dereferenced.

That's pretty subjective, but from my personal experience protocols with static/init requirements are not so common outside framework code. Most of the protocols I see in business-logic code are used in a more Objective-C/Java style.

Do I understand correctly that your concern about mutability is because the constraint witness in this case would look like a function (T) -> P, so it would effectively return a copy?
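One way to picture that witness is to pass the conversion explicitly. This is only an emulation in today's Swift, not a proposed design. `Counter`, `Tally` and `ConvertibleToCounter` are hypothetical names.

```swift
protocol Counter {
    var count: Int { get }
    mutating func increment()
}

struct Tally: Counter {
    var count = 0
    mutating func increment() { count += 1 }
}

// Emulates "T can be converted to Counter" with an explicit (T) -> Counter witness.
struct ConvertibleToCounter<T> {
    var value: T
    let upcast: (T) -> Counter

    // Non-mutating requirements work: convert, then call.
    func currentCount() -> Int {
        return upcast(value).count
    }

    // A mutating requirement only changes the converted copy;
    // there is no way to write the result back into `value`.
    func countAfterIncrementingCopy() -> Int {
        var copy = upcast(value)
        copy.increment()
        return copy.count
    }
}

// T can be a concrete type or the existential itself.
let concrete = ConvertibleToCounter(value: Tally(), upcast: { $0 })
let erased = ConvertibleToCounter<Counter>(value: Tally(), upcast: { $0 })

let copyCount = concrete.countAfterIncrementingCopy()   // 1
let storedCount = concrete.currentCount()               // still 0
```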

For existential subtyping, casting to P does not change the type of the value, so casting to P, mutating and then casting back is legal. But for the general case of value subtyping this does not work: String is a subtype of String?, but if the latter is mutated to be nil, it cannot be written back.

And actually, that makes sense. Contravariance works this way. Function type (String?) -> Void is a subtype of (String) -> Void, but (inout String?) -> Void is not a subtype of (inout String) -> Void.
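A small sketch of both points. The function names are made up for illustration, and the functions return a value only so that the result can be observed.

```swift
// Accepts the wider parameter type String?.
func length(of text: String?) -> Int {
    return text?.count ?? 0
}

// Allowed: a function taking String? can be used where one taking String is expected.
let narrow: (String) -> Int = length(of:)
let helloLength = narrow("hello")

// Takes the wider type by reference and may store nil into it.
func clear(_ text: inout String?) {
    text = nil
}

// Not allowed: the callee could write nil into a variable of type String.
// let broken: (inout String) -> Void = clear

// The same problem with plain values: the upcast is a copy, and a mutated
// String? cannot be assigned back to a String without unwrapping.
let name: String = "swift"
var maybeName: String? = name
clear(&maybeName)
// let restored: String = maybeName   // does not compile
```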

But now I'm starting to think that a subtyping constraint may not be the correct typing tool for WeakRef<T>. For WeakRef we need T to be something which has a strong pointer inside it: we want to read that pointer, get a pointer to the side table, replace the original pointer with a pointer to the side table, and store it like this. And do the reverse replacement when loading.

For the general case of subtyping, a struct storing a value of T constrained by T: any P would be storing T and casting it to P as needed. But in WeakRef we cannot store a T.

John_McCall · March 3, 2020, 2:41pm

I thought you were just using this internally in your cache of the last propagated value. Your users would presumably ask you to propagate a strong reference.

Nickolas_Pohilets · March 4, 2020, 12:22pm

Framework users want to propagate a struct which, among other things, contains a weak reference. And the framework requires the struct to be Hashable.

And also we have a guideline that says "Don't write func == and func hash(into:) by hand! Let the compiler do it, and use property wrappers if the compiler needs a hint". And an MR template with a checklist item saying "[ ] I did not write any func == and func hash(into:) by hand".

In the framework we provide:

@Ref - for comparing strong references by identity
@ObjectRef - for comparing ObjC protocols using isEqual: - operator == works for ObjC classes, but not the protocols.
@HashableRef - bad name, but for comparing existential containers where we know each of the conforming types also conforms to Hashable, so we compare by force-casting both sides to AnyHashable.
@HashableError - specialized version of @HashableRef for Error - has a fallback strategy of comparing code and domain.
And finally - @WeakRef - for weak references.
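As an illustration of how such wrappers let the compiler synthesize == and hash(into:), here is a guess at what the simplest one, @Ref, might look like. The framework's actual implementation is not shown in this thread, so `Session` and `ViewState` are hypothetical and the wrapper body is an assumption.

```swift
// Assumed shape of @Ref: compares and hashes a strong reference by identity.
@propertyWrapper
struct Ref<T: AnyObject>: Hashable {
    var wrappedValue: T

    static func == (lhs: Ref, rhs: Ref) -> Bool {
        return lhs.wrappedValue === rhs.wrappedValue
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(wrappedValue))
    }
}

final class Session {}

// No hand-written == or hash(into:): both are synthesized from
// `title` and the wrapper's backing storage.
struct ViewState: Hashable {
    var title: String
    @Ref var session: Session
}

let sessionA = Session()
let sessionB = Session()
let state1 = ViewState(title: "Home", session: sessionA)
let state2 = ViewState(title: "Home", session: sessionA)
let state3 = ViewState(title: "Home", session: sessionB)
```

Here `state1 == state2` holds because both refer to the same object, while `state3` differs. A @WeakRef written the same way needs the T: AnyObject constraint, which is where the existential problem discussed in this thread comes from.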

John_McCall · March 4, 2020, 8:45pm

Why does a value-centric reactive framework need weak references?