Pitch: `@globalConstructor`

Keith · December 7, 2022, 6:57pm

Hey everyone!

I've written a short pitch about adding a new attribute called @globalConstructor, which allows you to mark functions to be called immediately when your binary is executed, or when a dylib containing the function is loaded. This mirrors the behavior of __attribute__((constructor)) in Objective-C/C++/C, and #[ctor] in Rust.

The pitch text is here and you can also see the initial implementation here.

I'm excited to hear any feedback that folks have on this, thanks!

kylebshr · December 7, 2022, 7:07pm

I think this is an excellent addition - some of the frameworks I've helped maintain have had to use Objective-C just as you've described in the pitch in order to set things up correctly before being used. This is pretty annoying to do in a pure Swift framework, since you now have to deal with bridging when you wouldn't have otherwise.

John_McCall · December 7, 2022, 7:42pm

This is something we've been trying to avoid in the language. There are a few reasons for that:

The first is that global constructors have an inherent ordering problem. Global constructors often end up having dependencies on other global constructors, and solving that well is sort of a pit of complexity, especially without elaborate toolchain support.
The second is that global constructors are inherently very brittle because by design they precede all normal user code (other than other global constructors, as discussed above). In particular, global construction very tightly constrains how the subsystem can be configured — it has to be configured by things like environment variables because normal command-line arguments are usually not available. And then that means the global constructor is very heavyweight, which feeds into the next problem.
The third is that global constructors are eagerly run, which means they impose substantial launch-time penalties on the program even if the subsystem they're part of is never used. Subsystem designers have a tendency to think of their subsystem as being centrally important to the overall program, and so anything that makes the subsystem simpler or more performant is worthwhile. This is often much less true in practice.

These arguments are why Swift has always pushed designs where initialization is lazy on first use. What we have is not perfect — in particular, it doesn't completely solve the configuration problem — but I think it's a better basis for the language than phased global construction.

I'm not going to say that your pitch is a non-starter, but these are the issues you need to address before I think we can consider it.

beccadax · December 7, 2022, 9:39pm

If this were to go in, I would prefer to see the constructor priority redesigned to state dependencies rather than just list a number. Contributors who've been hanging around Evolution since 2016 might remember that operator precedence was originally expressed with raw integers in a similar fashion, and SE-0077's introduction of precedence groups which were described relative to each other dramatically improved the situation. If we took on this feature, I would hope to see something similar here—perhaps you would say things like @globalConstructor(after: first) func second().
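To make the "state dependencies instead of numbers" idea concrete, here's a rough sketch of how a run order could be derived from `after:`-style declarations. This is only an illustration: `InitStep`, `OrderingError`, and `resolveOrder` are made-up names, not part of the pitch or of any Swift API, and it assumes step names are unique.

```swift
/// Hypothetical description of one initializer and the steps it must run after.
struct InitStep {
    let name: String
    let after: [String]
}

enum OrderingError: Error {
    case cycle(String)
    case unknownDependency(String)
}

/// Returns the step names in an order that respects every `after` relationship,
/// using a depth-first topological sort. Assumes step names are unique.
func resolveOrder(_ steps: [InitStep]) throws -> [String] {
    let byName = Dictionary(uniqueKeysWithValues: steps.map { ($0.name, $0) })
    var finished: [String: Bool] = [:]   // false = being visited, true = done
    var order: [String] = []

    func visit(_ name: String) throws {
        if let isDone = finished[name] {
            if isDone { return }
            // We reached a step that is still being visited: a dependency cycle.
            throw OrderingError.cycle(name)
        }
        guard let step = byName[name] else {
            throw OrderingError.unknownDependency(name)
        }
        finished[name] = false
        for dependency in step.after {
            try visit(dependency)
        }
        finished[name] = true
        order.append(name)
    }

    for step in steps {
        try visit(step.name)
    }
    return order
}
```

Unlike raw integer priorities, a cycle or a reference to a missing step shows up as an explicit error rather than as a silent misordering.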

But that's burying the lede, which is that I agree with John. I'm very skeptical of global constructors or other features that automatically run code on launch, and I believe we'd be better off designing lower-overhead features for specific important use cases. For example, your proposal mentions registering plug-ins; that job is probably better done by adding a special metadata section to binaries that lists conformances for protocols that have requested registration, and providing APIs that allow for incremental scanning of this metadata after new dynamic libraries are loaded into the process. Similarly, your library configuration use case could be better handled by repurposing the existing lazy global initialization mechanism to run configuration code on the first use of a library. Mechanisms like these delay the setup costs until they actually need to be incurred, which shifts those costs away from busy process launch phases and often avoids incurring them at all.
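A rough sketch of the "configure on first use" approach, assuming a hypothetical library called `Telemetry` (all names here are invented for illustration). It relies on Swift initializing static stored properties lazily and exactly once, so routing every public entry point through one such property means setup runs on first use rather than at launch.

```swift
// Hypothetical library; every name here is illustrative, not a real API.
public enum Telemetry {
    /// Settings the library needs before it can do any work.
    struct Configuration {
        let prefix: String
    }

    /// Static stored properties are initialized lazily, exactly once, and in a
    /// thread-safe way. This closure runs the first time `configuration` is
    /// read, not when the binary or dylib is loaded.
    private static let configuration: Configuration = {
        // One-time setup (reading settings, registering handlers, ...) goes here.
        return Configuration(prefix: "telemetry")
    }()

    /// Every public entry point reads `configuration`, which triggers setup on
    /// first use. If nothing ever calls into the library, setup never runs.
    public static func format(_ event: String) -> String {
        return "\(configuration.prefix):\(event)"
    }
}
```

A client just calls `Telemetry.format("launch")`; there is no separate `setUp()` call to remember, and a process that never touches the library pays nothing for it.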

ksluder · December 8, 2022, 4:39am

I can confirm from personal experience that +load (the Objective-C spelling of a global constructor) is an attractive but dangerous time to do any sort of registration-like work. Objective-C tries to solve the dependency problems at runtime, but this leads to extremely fragile systems where a change in one +load method can rearrange the entire ordering of +load invocations in a process, often exposing reentrancy issues.

Keith · December 8, 2022, 9:08pm

Thanks for the feedback everyone. My intent with this definitely wasn't to re-introduce a variant of +load, although I understand it could be misused like that. The intent was much more around providing a way for the library use cases I mentioned to run independent initialization code, especially with the DYLD_INSERT_LIBRARIES type of use case.

Overall I wouldn't expect this feature to make it into 90% of codebases, but in the case that you think it's the best way to solve a problem, it feels much nicer to provide this directly vs requiring users to fall back to calling Swift from a C function.

One option for enforcing this ideal could be to remove the prioritization altogether so that it strongly discourages inter-dependent initializers, since there would be no way to order them even if you tried (although that could also just encourage folks to rely on the undefined ordering as well). What are your general thoughts on that or other options to avoid the pitfalls of +load?

John_McCall · December 8, 2022, 9:25pm

The problem with inter-initializer dependencies is that people usually think they don't have any, and then it turns out they do (or they evolve them over time). And then if you don't have a direct mechanism for resolving them, people have to invent their own ways to do that, like relying on compiler/linker order in subtle and undocumented ways.

I agree with Becca: it would be far better to identify and support the reasons people want things like this (e.g. to support implicit discovery of services at runtime) than to support global constructors.

tgoyne · December 8, 2022, 9:27pm

This is something which would be nice to have totally independently of the dynamic plugin-loading use case. We're currently using objc_copyClassList() and scanning the result of that to check each type for subclassing and protocol conformance, which has some well-known drawbacks.

Joe_Groff · December 8, 2022, 11:36pm

I agree with both of you in principle, though at the same time, there are existing systems where global constructors are the prescribed way of doing things, and/or the only way to force the existing system to behave a certain way. New systems should be designed in ways that don't need them, but we don't always have the luxury of getting to build a new system.

As food for thought, could we use concurrency isolation checking as a way to enforce that a global constructor has no external dependencies? For instance, if the constructor had to be performed in a single non-async method on an actor, that seems like it'd prevent the constructor from accessing any mutable state outside of the actor itself.

John_McCall · December 9, 2022, 12:16am

I don't think this is true. An existing system isn't going to know how to use a Swift-specific passive registry, so it probably does require active registration, yes. However, it almost certainly doesn't require that active registration to happen prior to main, and there's almost certainly some definable point in the program that precedes all uses of registrations. It should not be difficult to trigger a passive registry to be turned into active registrations at that point if that's what the system needs.
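As a minimal sketch of what "turning a passive registry into active registrations" might look like: the registry below is just a list of types that does nothing on its own, and the host walks it at a point it chooses (for example, at the start of main). `Plugin`, `PluginManifest`, `HostSystem`, and `activate` are hypothetical names for illustration, not an existing API.

```swift
/// Hypothetical plug-in interface.
protocol Plugin {
    static var identifier: String { get }
    init()
}

struct PNGDecoder: Plugin {
    static let identifier = "png"
    init() {}
}

struct JPEGDecoder: Plugin {
    static let identifier = "jpeg"
    init() {}
}

/// Passive registry: only a list of types. Nothing is instantiated or
/// registered when the binary is loaded.
enum PluginManifest {
    static let types: [Plugin.Type] = [PNGDecoder.self, JPEGDecoder.self]
}

/// Stand-in for an existing system that expects active registration calls.
final class HostSystem {
    private(set) var registered: [String: Plugin] = [:]

    func register(_ plugin: Plugin, as identifier: String) {
        registered[identifier] = plugin
    }
}

/// Called at a well-defined point that precedes any use of the registrations.
func activate(_ pluginTypes: [Plugin.Type], in host: HostSystem) {
    for pluginType in pluginTypes {
        host.register(pluginType.init(), as: pluginType.identifier)
    }
}
```

The host would call `activate(PluginManifest.types, in: host)` once, wherever it decides registrations must be in place. Here the list is written by hand; the discussion above imagines the toolchain producing something like it.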

More generally, offering a facility in the language is a statement that we think it's okay to use, and if we don't think that, we shouldn't offer it. If it's absolutely necessary to use global constructors in some case instead of registering things at the start of main, okay, people can use C to get that effect. But that would be a poor architectural design that is probably already causing problems for pure C clients, and pushing people towards designs based around lazy and/or passive initialization is the right thing to do.

If all uses of global mutable state had to be actor-constrained, then you're right, a global function that was not actor-constrained would not be able to directly or indirectly touch any global mutable state. However, things like global registries are usually locked, which would subsume an actor constraint.

Zhu_Shengqi · December 9, 2022, 3:29am

+1 for the direction of providing global constructor support for library authors. This would help with decoupling logic between the hosting app and the libraries. In Objective-C we have +load to do some one-time setup like method swizzling, but I can't find the equivalent in Swift.

Joe_Groff · December 9, 2022, 3:42am

A counterexample that comes to mind is Swift's own runtime on non-Apple dynamically linked platforms, where other platforms' dynamic loaders don't have any public API akin to Apple's _dyld_register_func_for_add_image that can be used to lazily register load-time triggers, and we rely on a single static constructor in the Swift equivalent of crt0 to register images with the Swift runtime. (We could still argue that we set up the global constructor so you don't have to.)

Joe_Groff · December 13, 2022, 6:11pm

Speaking of registration use cases, @xedin just pitched a cool proposal for user-defined, runtime-discoverable metadata attributes:

It'd be interesting to hear from you all how many use cases for global constructors could use this functionality instead.

Keith · December 21, 2022, 5:39pm

Thanks for the discussion folks, I'll close out the PRs.

kabiroberai · December 21, 2022, 10:20pm

Food for thought: if I understand this proposal and the Custom Metadata Attributes proposal correctly, global constructors and runtime metadata attributes are in theory isomorphic, but IMO the latter would be better as a language feature, with the former being implementable as a library instead.

Specifically, a third-party Swift Package could declare an @GlobalConstructor runtime attribute, and include its own __attribute__((constructor)) C function that gets the list of GlobalConstructors and invokes them one by one. The converse is also true, in that runtime attributes can be "registered" in constructors, but that would force running code at load-time whereas baking runtime attributes into the language would allow laziness or eagerness as desired.

Another reason to favour the library-based approach is that it would prevent Swift having to define global initialization semantics as a part of the language, which is probably a good idea because, in addition to what @John_McCall said about implicit endorsement, it also seems like it'd be hard to define the semantics uniformly across platforms — think platforms like WASM where (afaict) constructors can barely do anything at all.

All in all I agree that global constructors can be super useful in some scenarios, but it seems they'd be more flexible and versatile as a library implemented on top of the Custom Metadata proposal, rather than baked into the language.

tourultimate · December 28, 2022, 9:09pm

This sounds like something I need. Can you clarify how this is done with some details on "global initialization mechanism to run configuration code on the first use of a library"? Thank you