Macro meta-programming the reference storage types

Hello (and CC @Joe_Groff @John_McCall),

I’ve been maintaining a branch for a while now that adds two new reference storage types.[1] This has become mildly annoying to maintain because adding a new reference storage type involves a considerable amount of SIL and IRGen boilerplate.

I’ve been creating a branch over the last couple days that converts the reference storage types to use LLVM-style macro meta-programming with a swift/AST/ReferenceStorage.def file that roughly contains:

// In this context, "static" merely means no "dynamic" (i.e. runtime) checking.
ADDRESS_ONLY_DYNAMIC_REF_STORAGE(Weak, weak, /*misc*/)
LOADABLE_DYNAMIC_REF_STORAGE(Unowned, unowned, /*misc*/)
STATIC_REF_STORAGE(Unmanaged, unmanaged, /*misc*/)
REF_STORAGE_RANGE(Weak, Unmanaged)

The “misc” includes things like optionality, strength relative to the default reference ownership kind, etc. Before I create a pull request, can anybody see any problems with this approach? How should various IRGen TypeInfo data structures be meta-programmed? I can do the naive approach and just use macros, but I suspect that templates would be preferred (if possible).

For whatever it may be worth, this deduplication / clean up work has already saved about 300 lines of code so far and found a few oversights/bugs in the process.

Dave

[1] – I wish not to go into the details right now. Please let it suffice for me to say the two reference storage types are ReadLock and WriteLock, which is why the “etc” above includes “strength relative to the default ownership kind”.

The basic idea seems reasonable. I’d caution against trying to include information in the metaprogram that probably isn’t going to be used in multiple places; so maybe optionality but maybe not whatever strength relative to the default means. The TypeInfo question is complicated and I don’t have a good answer for you.

Hi John,

Thanks! I suspected that was the answer for the TypeInfo data structures.

On the topic of optionality, SIL and IRGen are aware of the fact that ‘weak’ is optional and the underlying type is sometimes preferred over the optional wrapper when working with a given reference storage type. In contrast, SIL and IRGen seem to be unaware of the fact that the Clang importer creates optional unmanaged types, which are normally disallowed by Sema for pure Swift code.

I can change SIL and IRGen to be more permissive in this regard, and trust that if any reference storage type is optional, then Sema and/or the importer “knows what they’re doing”. Can you see any problem with this?

Thanks,
Dave

Hi John,

I’ve made good progress and I’ve deleted almost 700 lines of code in the process. The validation test suite passes on my Linux box. I haven’t checked my Mac yet. I’ll need you and probably other key contributors to eventually review the changes, which are massive and growing (48 files changed, 1845 insertions, 2515 deletions so far). In the meantime and as a heads up, here is the branch.

Dave

IIRC there’s a contributor who volunteered to work on allowing optional unowned values, so generalizing in a direction that would make that easier would be great.

Interesting. From a technical perspective, that should just fall out of what I’m working on. It’ll just be a single-line policy knob.

What’s the point of optional unowned? I thought “weak == optional” and “unowned == non-optional” made some amount of “sense”. If optionals become allowed on unowned, then people are going to ask for weak to allow non-optional types for symmetry, and then we have two identical features from a language perspective.

Weak always has to be optional because the weak reference can become nil if the object is deallocated. Unowned doesn’t do that; it just asserts. But it’s nice to be able to clear out such a reference, or initialize it to nil to begin with.

I understand that. I think you missed my point though. Please compare and contrast the following code under a multithreaded scenario:

// 'weak var x : T?' exists somewhere
// 'unowned var y : T?' exists somewhere
if let z = x { ... } // no crash in a threading scenario
if let z = y { ... } // may crash in a threading scenario

In a multithreaded environment and from the user’s perspective, the semantic difference between optional unowned and weak doesn’t add value. If anything, optional unowned is futile.

That’s true of just loading from a non-optional unowned variable as well, though.

The purpose of unowned is to be a non-owning reference that (1) can be dynamically invalidated and (2) alerts you when you make the semantic mistake of using the reference after it’s been invalidated. Not everybody appreciates the value of that, but the people who do frequently want to be able to have a nullable reference, for all the general reasons that a reference can be nil — order of initialization and so on. This isn’t something that we just dreamed up, it’s a frequently-requested extension, the lack of which really cripples unowned.

Thanks for the feedback. I’ll try and test that as a part of the work I’m doing. I’ll leave the flipping of the actual policy decision until later.

Hi @John_McCall,

It appears that WeakClassExistentialTypeInfo does not override the extra inhabitant methods whereas AddressOnlyUnownedClassExistentialTypeInfo does. Is this intentional? Whether it is intentional or not, they behave differently. Which extra inhabitant behavior is the right default for address-only existential reference storage types?

Dave

[EDIT] – Shift focus away from specific extra inhabitant methods.

Hmm. I think both should be treating the weak/unowned component of the value as opaque, so any spare bits should come from the first protocol witness table pointer (if there are any).

I tried to do this a long time ago when the project was first open-sourced, but got bogged down in IRGen long enough that I had to shift focus to other things.

I’ve always thought it’s a bit of an arbitrary limitation. It would be great to lift it!

It is just a single-line policy decision/change on the aforementioned branch.
