[Discussion] `#if hasSymbol(Module.Symbol)`, a compile time check for a module symbol's availability

0x41c · June 18, 2022, 4:09pm

Description

The hasSymbol compiler directive aims to be a viable alternative to availability where it has not been explicitly attributed, or where a symbol is available in a library on one platform, but not available on another.

There are many instances where you would want to check symbol availability at compile time. One good reason is binary size: any code inside the #if block would not be compiled if the symbol is unavailable. If the symbol has been marked with @available, the check would take that into account, but it would not replace the #available check where there is no deployment target. Since it also takes into account whether the module is importable, in addition to the symbol check, it is trivial to use.

Since this also works where symbols are not annotated with @available, it would also be the go-to for platforms outside of Apple's, as attributes like @available(Linux, *) do not work/exist.

Example

Let's say you want to write some cross-platform code to access the System library's FileDescriptor.OpenOptions.sharedLock symbol. You can't access this symbol on Linux, but you can on Apple (Darwin) systems. Now, you could check for the symbol like this:

#if os(iOS) || os(macOS) || os(tvOS) || os(watchOS)
// Access symbol here
#endif

And this is OK. There's no clear alternative, as #if canImport(System) would be true on both platforms. But there is more than one trade-off here. The first is that if Linux gets the sharedLock symbol in the future, you have to remove the check to enable the code on Linux. The second trade-off is readability. There has been an increase in groans associated with using an os chain like this one.
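For reference, the status quo can be exercised end to end with today's compiler. This is a minimal sketch: `lockingStrategy` and the strings it returns are hypothetical names for illustration, and the function only reports which branch was compiled in rather than calling into the System library.

```swift
// Status quo: select code by platform with an os chain.
// `lockingStrategy` is a hypothetical helper used only for illustration.
func lockingStrategy() -> String {
    #if os(macOS) || os(iOS) || os(tvOS) || os(watchOS)
    // Compiled only on Apple platforms, where sharedLock is said to exist.
    return "shared lock via open options"
    #else
    // Compiled everywhere else, including Linux.
    return "fallback without shared lock"
    #endif
}

print(lockingStrategy())
```

Only one of the two return statements ends up in the binary, which is the binary-size argument made in the description.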

Using "`#if hasSymbol(...)`"

The example changes into a symbol-specific check that reflects whether the module and the symbol exist on the target platform.

#if hasSymbol(System.FileDescriptor.OpenOptions.sharedLock)
// Access symbol here
#endif

From reading the check, you can immediately tell why it is there. It removes the ambiguity of the alternatives and is symbol-specific, so it confirms whether the symbol is present, as the developer intended.

Resolving "What if there are two symbols named X?"

Well, the only way for two symbols to have the same name is if they are function symbols; otherwise they wouldn't share the same hierarchy. Therefore, a selector-like autocomplete would be necessary for that.

Other ideas

As this is just a pitch, there are still things that can be improved in the way this is implemented. Feel free to critique and help make this idea even better.

Summary

This directive will be checking a few things to determine whether it resolves to true:

Symbol scope. (Is the symbol private from the access location?)
Target version. (Is the symbol available on the target version or future versions?)
Platform. (Does the platform support the symbol trying to be accessed?)
Symbol existing. (Does the module even export such a symbol?)

Serena · June 18, 2022, 4:15pm

Good idea, I feel.

But I think you'll have to list what is checked by #if hasSymbol, ie:

Scope availability
Version / Platform availability
etc etc

0x41c · June 18, 2022, 4:27pm

I’ve added a list, thanks!

xwu · June 18, 2022, 4:49pm

It is to be noted that you are proposing a compile-time check, unlike if #available, which is a runtime check (note how it’s spelled if and not #if).

On a platform where Swift has a stable ABI and supports library evolution, a symbol not available at the time your app is compiled may be available at runtime.

For that reason, there’s some subtlety here that causes misalignment between the user-expressed intention and the actual behavior of the feature. It’s likely (indeed, it’s strongly implied by how you explain the motivation for this pitch) that, when a user writes such code, they want to express: “If a symbol is available at the time this code is run, I want to use it.” However, as it stands, it will not matter at all whether the symbol is actually available at runtime.
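The distinction between the two kinds of check can be seen with features that exist today. This is a small sketch, not part of the pitch: the names `compiledFor` and `runtimeBranch` are made up for illustration, and the version numbers are arbitrary.

```swift
// `#if` is resolved while compiling: only one branch ends up in the binary.
#if os(Linux)
let compiledFor = "Linux"
#else
let compiledFor = "a non-Linux platform"
#endif

// `if #available` is evaluated while running: both branches are compiled.
// The version numbers are arbitrary; `*` covers every other platform.
func runtimeBranch() -> String {
    if #available(macOS 12, iOS 15, *) {
        return "newer OS path"
    } else {
        return "older OS path"
    }
}

print(compiledFor, runtimeBranch())
```

A pitched `#if hasSymbol(...)` would behave like the first half of this sketch: its answer is fixed when the code is built.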

I’d imagine that one of the most common ways in which users might want to take advantage of this feature would be to polyfill missing symbols. That is, if there is no symbol Foo.bar, then define Foo.bar—allowing users to extend others’ library types like this is a tentpole feature of Swift.

How would you propose the feature to behave in such a (likely, IMO) circumstance? Would hasSymbol evaluate to false in another file in the same project if that file is compiled first but true if the polyfill is compiled first?
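To make the polyfill scenario concrete, here is how it can be written with today's os checks. Everything in this sketch is hypothetical: `Foo` stands in for a type from someone else's library, and the macOS-only `bar` is an assumption chosen so the example compiles on any platform.

```swift
// `Foo` stands in for a type owned by someone else's library (hypothetical).
struct Foo {}

#if os(macOS)
// Pretend the library ships `bar` on macOS only.
extension Foo {
    func bar() -> String { "library implementation" }
}
#else
// Polyfill: supply `bar` ourselves where the library does not.
extension Foo {
    func bar() -> String { "polyfill implementation" }
}
#endif

print(Foo().bar())
```

With an os chain the guard does not depend on the polyfill itself. A guard written in terms of the symbol would, which is where the ordering question above comes from.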

Jumhyn · June 18, 2022, 5:06pm

I don’t think this quite captures #available—the symbol in question is available at ~~runtime~~ compile time (the compiler must know how to emit code that calls it), but the compiler knows that it is only conditionally available at runtime.

IOW, #available doesn’t support introspection into “can this symbol be looked up?” Rather, you must be compiling against an SDK for which lookup succeeds, but with an availability annotation explaining the availability conditions. In cases where you’re using a bunch of os directives to winnow down to just the platforms which define some symbol, I do see some value in allowing users to express that semantic intent more directly.

That said, I agree with this intuition:

And I agree that in such a case hasSymbol becomes a bit messy. One way to resolve this: once we have a fully-qualified name syntax for unambiguously referring to a specific declaration even in the face of shadowing between modules, we could require that the argument to hasSymbol be a fully-qualified name.

0x41c · June 18, 2022, 5:19pm

I agree this definitely should be noted. I hope I made this clear in the syntax of the pitch. Ex: "(#)if hasSymbol".

I agree with you here as well. However, the purpose of this pitch was also to be an alternative to @available and #available on non-Apple platforms. I believe focusing on that primarily will help us come to a resolution for using this on Apple platforms in the future.

Since this is a preprocessor directive (I apologize if I unintentionally mislabeled it as a compiler directive, my bad), the outcome of the check should be resolved before any compilation is done. The order in which it gets resolved should be inconsequential in this context, unless I'm wrong about which phase of compilation this would be resolved in.

I believe that would benefit this directive greatly and makes sense for this situation.

This part is what I was alluding to in the proposal, and I'm happy you brought that out.

xwu · June 18, 2022, 5:21pm

Sorry, I’ve clearly failed to be clear in my meaning. I didn’t mean to suggest that #available supports such introspection. As you describe, if #available is indeed a runtime toggle between two options both understood by the compiler.

My point though is that the motivation for this pitch must consider that symbol availability at runtime is likely what the user will want to know, which does require introspection support for ABI-stable platforms, and neither any current facility nor what’s pitched here delivers that.

Swift has no preprocessor as such.

0x41c · June 18, 2022, 5:37pm

I believe that indeed is false. The scope encompasses whether or not the symbol is able to be looked up as was mentioned before by @Jumhyn. This lookup has nothing to do with runtime availability. Anyone looking for runtime availability will want to use #available.

Ok, that's fair. Would there be any implications to implementing this as a sort of pseudo-preprocessor so that it's resolved before much compilation takes place?

Jumhyn · June 18, 2022, 5:46pm

I don't think that Xiaodi is unclear on the proposed semantics. Rather, I believe that the concern (please correct me @xwu if I'm wrong ) is that a user attempting to compile against a platform X where Foo.bar is undefined will intuitively reach for this hasSymbol feature, mistakenly believing that it has the property that the code inside the #if will be executed if X eventually provides the Foo.bar symbol. So there's a risk that by providing this functionality, we'd be leading users down a path where their code appears to do something that it doesn't actually do, while the status quo of #if os(X) || os(Y) || os(Z) is verbose, but potentially clearer about future behavior.

0x41c · June 18, 2022, 5:50pm

Yeah, you're both touching on a good point.

The thing is that they wouldn't be completely wrong. What they're assuming would be true, but for it to do what they expect, they'd have to recompile. It's the same as with linking. If the symbol appears in a library's public interface, the binary would have to re-link against the symbols. With #if os(X) || os(Y), however, they would have to recompile their code and also remove the condition, whereas with hasSymbol only the recompile is necessary, and it helps with support for other operating systems without having to be verbose about it.

Jumhyn · June 18, 2022, 6:02pm

This is also a good point, and a large part of the value that I see of allowing users to express the hasSymbol intent directly. I'm also (personally) less worried about the potential confusion for a couple reasons:

Swift's #if has quite clear semantics as a conditional compilation mechanism, so while it's possible that there could be some initial confusion around hasSymbol I expect it to be quite easy to learn and remember the actual semantics.
To the extent that this feature would be used to polyfill symbols on certain platforms, the potential downside of a user misunderstanding the feature seems quite low—when X eventually ships Foo.bar, the program will continue to use its own definition, which is presumably close enough in functionality that it would have been fine to have a transparent shift to the new symbol anyway.
For users that end up in this situation, there's not really an alternative that would provide the semantics they want anyway. Users wouldn't be led to hasSymbol, thereby missing the hasSymbolButUpdatesWhenThePlatformDoes which implements the semantics they actually want—the best we could do is tell users 'what you actually want is impossible'.

tshortli · June 18, 2022, 6:54pm

Hi Corban, I'm actually in the midst of researching adding this exact feature to the compiler as part of a suite of tools for ad-hoc evolution of libraries, so it's nice to see that you are thinking about this problem as well. My goal is somewhat different from your use cases, but it is compatible with the design you have described.

In an environment where an SDK is evolving on a very frequent basis (e.g. daily), the current availability tools in Swift are cumbersome to use. In order to use @available and if #available, the SDK and runtime would need very fine-grained version numbers that change with each SDK and runtime release. Additionally, developers working on a common code base may be using a variety of SDKs and runtimes, which presents a challenge: some developers may be integrating with APIs that were introduced into the SDK very recently, while others would like to keep using older SDKs to avoid disruption.

To give developers flexibility in this environment I think there are two capabilities that would be useful for the language to have:

To be able to query at build time whether a particular symbol is present in the SDK (this pitch).
To be able to weakly link symbols that may not be available at run time and query for their presence at run time before using them (essentially what @available provides, but on a more ad-hoc basis that doesn't require the library to be annotated with fine grained version numbers).
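The second capability has a loose C-level analogue today: asking the dynamic loader whether a symbol is present in the running process. The sketch below uses the POSIX `dlopen` and `dlsym` calls and assumes a Darwin or glibc-based platform. `symbolIsLoaded` is a hypothetical helper; it works on C symbol names rather than Swift declarations, and it is not the design being researched here.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

// Hypothetical helper: reports whether a C symbol can be found at run time.
func symbolIsLoaded(_ name: String) -> Bool {
    // Passing nil asks for a handle to the running program itself and
    // the libraries already loaded with it.
    guard let handle = dlopen(nil, RTLD_NOW) else { return false }
    defer { dlclose(handle) }
    // dlsym returns nil when no symbol with this name is found.
    return dlsym(handle, name) != nil
}

print(symbolIsLoaded("malloc"))
print(symbolIsLoaded("surely_no_such_symbol_exists"))
```

Unlike an availability check, nothing here is verified by the compiler, which is part of why a language-level feature is attractive.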

I do not want to push for an expansion of the scope of this thread by delving too deeply into the potential runtime query capability, but I did want to mention it as a bigger picture that I see this fitting into and I'm actively researching.

Allowing any declaration to be unambiguously identified in the predicate is one of the bigger problems to solve and is relevant to both the runtime and build time queries. Can you elaborate more on how you were imagining the "selector-like autocomplete" would work? It would be nice if this capability were not limited to use in this preprocessor context and could allow for unambiguous references to unapplied functions, for instance.

0x41c · June 18, 2022, 8:27pm

Hey Allan, It is very exciting on my end to hear that this is an idea that has been under active consideration. Made my day!

My opinion on this is that a runtime version of this functionality would be not necessarily a separate feature, but an adaptation that could work hand-in-hand with this pitch. Semantically, it almost makes sense that it could be determined by where the # symbol is located. That already implicitly determines whether something is a runtime check or a compiler check. So, at a high level, it might look like this:

if #hasSymbol(...) {
  // This gets compiled in, and allows you to use the symbol
} else {
  // This still gets compiled in, but is only called if the symbol is unavailable during runtime
}

Now, addressing the autocomplete and the structure of the symbol being passed in: I'm assuming there would be more than one format, depending on how much disambiguation the symbol needs.

For example, let's first look at specifically the way(s) it could be passed into the compiler directive.
If the module the symbol is defined in is importable, the autocomplete could supply the names of the symbols (regardless of availability) for the developer to use. By default that would result in something like this:

#if hasSymbol(Module.some.ambiguous.symbol)
// Names are hard, autocomplete would help in this case
#endif

If the module was unable to be imported though, there would be no autocomplete. That would suck, especially if the developer gets the name wrong, but there is usually adequate information about the symbol that should lead the developer along the right path.

In another circumstance though, you're trying to check if you can access a function symbol. This can be tricky because of overloads. An example of a tricky situation would be something like this:

func someFunction() -> Void
func someFunction(_ parameter: Any) -> Void

These two can be disambiguated from each other by using the same syntax that #selector uses.

// func someFunction() -> Void
#if hasSymbol(Module.someFunction)
#endif

// func someFunction(_ parameter: Any) -> Void
#if hasSymbol(Module.someFunction(_:))
#endif
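For comparison, ordinary Swift code can already refer to an overload by its argument labels. This is a sketch of the existing expression syntax only; the function bodies and the names `withArgument` and `withoutArgument` are invented so the result can be observed.

```swift
func someFunction() -> String { "no arguments" }
func someFunction(_ parameter: Any) -> String { "one argument" }

// A compound name with argument labels picks the one-argument overload.
let withArgument = someFunction(_:)

// The bare name matches both overloads, so the annotation on the
// binding is what selects the zero-argument one.
let withoutArgument: () -> String = someFunction

print(withArgument(1), withoutArgument())
```

Note that the bare name needs help from a type annotation here, so a hasSymbol argument written as a bare name would need some rule for what it refers to.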

Unfortunately, though, we can be met with two functions whose signatures differ only in the return type:

func someFunction() -> Int
func someFunction() -> String

The only way to disambiguate the use of these functions is to determine them from their use context which would be unavailable to us as a compiler directive.

This is where the second possible alternative way to pass a symbol into hasSymbol comes in handy. By passing the symbol in as a mangled name, all the type information is able to be used for disambiguation:

// func someFunction() -> String
#if hasSymbol("$s6Module12someFunctionSSyF")
#endif

An alternative to this is passing in the return type as a second parameter, but for implementation purposes that might not be the best idea. Nevertheless, it's an option to consider:

#if hasSymbol(Module.someFunction, String)
#endif

To someone reading this without knowing what the parameters are, it may not be initially clear why "String" is passed in as a second argument.

For the runtime counterpart, this unclear use of a type can be clarified nicely. Here's an example of what it may look like for #hasSymbol to disambiguate a function:

if #hasSymbol(Module.someFunction, withType: (() -> String).self) {}

For unapplied functions, this could still be used because the type information can now be passed into the check. Inside the if scope, the function could then be accessed as if it was available to the developer the entire time without needing it to be explicitly passed into the scope.

xwu · June 18, 2022, 8:46pm

0x41c:

Unfortunately, though, we can be met with two functions whose signatures differ only in the return type:
func someFunction() -> Int
func someFunction() -> String
The only way to disambiguate the use of these functions is to determine them from their use context which would be unavailable to us as a compiler directive.

Type annotations and type coercion (someFunction as () -> Int) would work fine to disambiguate these.
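As a short illustration in ordinary code (outside any `#if` condition), both forms select an overload by its full function type. The function bodies and the names `intVersion` and `stringVersion` are invented for the example.

```swift
func someFunction() -> Int { 42 }
func someFunction() -> String { "forty-two" }

// Coercion with `as` selects the overload by its full function type.
let intVersion = someFunction as () -> Int

// A type annotation on the binding has the same effect.
let stringVersion: () -> String = someFunction

print(intVersion(), stringVersion())
```

Whether the same expressions could be evaluated inside a conditional compilation condition is a separate implementation question.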

I think the question of ambiguity becomes trickier when one considers how to distinguish Foo.bar as implemented as an extension in module Bar versus the exact same member with the exact same signature implemented as an extension in module Baz.

0x41c · June 18, 2022, 8:48pm

The spam filter just removed my post, I hope to see it brought back soon.

Jumhyn · June 18, 2022, 8:49pm

Yeah, this is the idiomatic type-based disambiguation strategy in Swift. It did occur to me whether there could potentially be some ordering issue here, implementation-wise (can conditional compilation conditions depend on the type-checking machinery?), but I can't say for sure. Same goes for the mangling-based strategy, in the other direction—does the compiler frontend have knowledge about the presence or absence of specific mangled symbols?

xwu · June 18, 2022, 8:49pm

That's most unfortunate, seeing as I just replied to it, clearly providing a signal that it's not spam. Mods?

0x41c · June 18, 2022, 9:05pm

So, I also felt this concern when explaining disambiguation methods and decided that this strategy wouldn't work because of the aforementioned reasons. Because of that, I'm not entirely sure what the best strategy would be in the case of the compiler directive.

The only solution that came to mind resembles @_silgen_name because it uses the mangled name to look up the signature and symbol. Instead of disregarding the name from the signature like @_silgen_name (If the function interface doesn't match up with the type signature, there is no warning or error), it would use the type information and cross-reference that with known symbols matching the name and type signature provided. It's not the most ergonomic solution and requires knowing the mangled name, so I discredited its use, but I provided it as an example anyway.

0x41c · June 19, 2022, 7:55pm

My post has been reviewed and restored by a moderator if anyone still has residual thoughts about it. I'm beginning to work on a proposal along with an implementation. I also want to thank you all for the questions about it and your thoughts on the matter!

tshortli · June 20, 2022, 5:01pm

0x41c:

My opinion on this is that a runtime version of this functionality would be not necessarily a separate feature, but an adaptation that could work hand-in-hand with this pitch. Semantically, it almost makes sense that it could be determined by where the # symbol is located. That already implicitly determines whether something is a runtime check or a compiler check. So, at a high level, it might look like this:
if #hasSymbol(...) {
  // This gets compiled in, and allows you to use the symbol
} else {
  // This still gets compiled in, but is only called if the symbol is unavailable during runtime
}

Yeah, this is what I would expect it to look like to use the runtime query. I think the tough part of designing the feature, though, will be working out how type checking might work. It probably ought to behave similarly to the way existing availability checking works where the compiler knows which declarations are potentially unavailable and requires you to prove that they are available before using them to avoid crashing. But then we will also need to consider whether it should be possible to limit which declarations are considered potentially unavailable and also whether additional syntax should be available to indicate that entire declarations require a symbol to be available (think about how you can use @available on an entire struct to limit its use to OS versions with a certain API that is fundamental to its implementation). There's a lot of design space to consider and I think it probably deserves a separate thread as the concepts are complementary but not entirely dependent on each other.
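The existing behavior referred to above can be sketched as follows. `Thumbnailer` and `makeThumbnail` are hypothetical names and the version numbers are arbitrary; the point is only how an annotated declaration and a runtime check fit together today.

```swift
// The whole type is marked as requiring a minimum OS version.
// `Thumbnailer` is a hypothetical type; the versions are arbitrary.
@available(macOS 12, iOS 15, *)
struct Thumbnailer {
    func render() -> String { "rendered with the newer API" }
}

func makeThumbnail() -> String {
    // When the deployment target is older than the versions above, the
    // compiler rejects `Thumbnailer()` outside a check like this one.
    if #available(macOS 12, iOS 15, *) {
        return Thumbnailer().render()
    } else {
        return "rendered with the fallback"
    }
}

print(makeThumbnail())
```

A symbol-based runtime query would presumably need an equivalent of both pieces: a way to mark declarations as depending on a symbol, and a check that proves the symbol is present.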

The type coercion approach does seem like the right syntax for consistency but the implementation feasibility question is an important one that I don't know the answer to yet.

I also considered mangled names as an option for identifying declarations when I was thinking about approaches for this but I believe that using them would lead to a poor developer experience since deriving the mangled name associated with a declaration would be a hassle for the writer and also isn't very readable. The topic also raises another question, which is whether this preprocessor query is semantically meant to match symbols in the library's binary or to match Swift declarations (some declarations yield more than one symbol). I think matching declarations is a more intuitive model overall.

If we needed to abandon the idea of matching declarations due to fundamental layering problems, one option to consider would be to allow the definition of a kind of preprocessor condition directly in source, similar to #define in C-like languages. I'm imagining something similar to what you get by passing a -DSOME_CONDITION flag to the compiler but declared in the source instead. Lookup would probably need to be scoped by module to avoid collisions. This is not my first choice for a lot of reasons (one being that discussion might immediately get bogged down by the question of whether Swift ought to have hygienic macros instead) but it feels a bit cleaner to me than using mangled names.
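For readers unfamiliar with the existing flag, this is what a custom condition looks like today. It is a minimal sketch: `activeBranch` is an illustrative name, and the condition is set from the command line rather than in source, which is the part the idea above would change.

```swift
// Build with `swiftc -D SOME_CONDITION main.swift` to take the first branch.
#if SOME_CONDITION
let activeBranch = "SOME_CONDITION is set"
#else
let activeBranch = "SOME_CONDITION is not set"
#endif

print(activeBranch)
```

Such flags are global to the compiler invocation, which is why scoping lookup by module would matter if they could be declared in source.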