[Pitch] [Embedded] "Unicode" availability domain for APIs requiring the Unicode tables

I'd like to figure out how to make these features work together, because there are a lot of advantages to doing so. A package trait can be used to cause additional targets to get linked in, which nicely models what we want for the standard library: if you ask for the Unicode tables via the trait, we depend on the target that includes those tables in the binary.
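
As a rough sketch of what "a trait causes an additional target to get linked in" looks like for an ordinary package, assuming the package traits manifest API from SE-0450 (tools version 6.1). The package and target names (`MyLibrary`, `MyUnicodeTables`) are hypothetical, and the standard library is not actually built this way today:

```swift
// swift-tools-version: 6.1
import PackageDescription

// Hypothetical package: "MyLibrary" and "MyUnicodeTables" are illustrative
// names only, standing in for the standard library and its Unicode tables.
let package = Package(
    name: "MyLibrary",
    products: [
        .library(name: "MyLibrary", targets: ["MyLibrary"]),
    ],
    traits: [
        // Not enabled by default; a client opts in when it needs the tables.
        .trait(name: "Unicode", description: "Link in the Unicode data tables"),
    ],
    targets: [
        // Target that carries the (large) table data.
        .target(name: "MyUnicodeTables"),
        .target(
            name: "MyLibrary",
            dependencies: [
                // Assumption: this dependency is only built and linked
                // when the "Unicode" trait is enabled.
                .target(
                    name: "MyUnicodeTables",
                    condition: .when(traits: ["Unicode"])
                ),
            ]
        ),
    ]
)
```

Inside `MyLibrary`, an enabled trait would also show up as a compilation condition, so `#if Unicode` could guard the code that needs the tables.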

Spitballing a bit here, but if the answer for Embedded Swift were that Unicode was a trait on the standard library, then the standard library might end up with something like:

#if Unicode
@availabilityDomain(Unicode)
public var _unicodeDomain: Bool { true }
#else
@availabilityDomain(Unicode)
@const public var _unicodeDomain: Bool = false
#endif

i.e., we use the trait to decide between "can be available" and "is never available".

We would still need a separate way for a module to decide between "available" (one can only use Unicode APIs from inside @available(Unicode)) and "always available" (one can always use Unicode APIs).

We'd need some checking of how the Unicode domain is configured when importing a module:

  • A Unicode-never-available module can be imported by a Unicode-never-available module or a Unicode-available module
  • A Unicode-always-available module can be imported by a Unicode-always-available module
  • A Unicode-available module can be imported by anything
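
The three import rules above can be written down as a small compatibility check. This is only a model of the rules as stated, not compiler code; the `UnicodeDomainMode` enum and the `canImport(_:into:)` function are hypothetical names made up for the illustration:

```swift
// Hypothetical model of how a module configures the Unicode domain.
enum UnicodeDomainMode {
    case neverAvailable   // Unicode APIs can never be used
    case available        // Unicode APIs only inside @available(Unicode)
    case alwaysAvailable  // Unicode APIs can be used anywhere
}

// Returns true if a module configured as `imported` may be imported
// by a module configured as `importer`, per the three rules above.
func canImport(_ imported: UnicodeDomainMode,
               into importer: UnicodeDomainMode) -> Bool {
    switch (imported, importer) {
    case (.available, _):
        // A Unicode-available module can be imported by anything.
        return true
    case (.neverAvailable, .neverAvailable),
         (.neverAvailable, .available):
        return true
    case (.alwaysAvailable, .alwaysAvailable):
        return true
    default:
        return false
    }
}
```

For example, `canImport(.alwaysAvailable, into: .neverAvailable)` is `false`: a module that uses Unicode APIs unconditionally cannot be pulled into a module that promises never to need the tables.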

The standard library would be built in either configuration that can always be imported (Unicode-available or Unicode-never-available), which also means that we don't have to build it from a package for things to work: we can build with Unicode-available in the toolchain, and separately deal with linking in the Unicode tables (or not) based on whether the Unicode trait was provided.

The default for non-Embedded Swift would need to be that the trait is enabled and each module is set to "always available". Unfortunately, this does mean that for Embedded Swift, you would need to update each of the libraries you link against to be either "available" or "never-available" for the Unicode domain to work, making this a bottom-up rollout. We can probably be a little more lax about this domain specifically because the failure mode (a link error) isn't catastrophic.

FWIW, I don't think it's derailing the pitch at all. These features are solving overlapping problems and we should figure that out.

You are correct that availability domains still require that the code type checks. They move the "compiling out code" to a later place in the compilation pipeline: if you always-disable an availability domain, we type check but don't emit any code for anything that has that availability. For an optional feature that doesn't involve dependencies, I think that's a better user experience, because you get the same diagnostics whether the feature is enabled or not, without having to compile twice. But when there are dependencies (say, a module you can import or not), you need the #if that traits provide.
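
To make the difference concrete, here is a small sketch using features that exist today. Ordinary platform availability stands in for the pitched Unicode domain, and `HYPOTHETICAL_UNICODE_TRAIT` and both function names are made up for the illustration:

```swift
// Inactive #if block: the body is parsed but never type checked, so the
// call to a nonexistent method below goes unnoticed until someone builds
// with the condition set.
#if HYPOTHETICAL_UNICODE_TRAIT
func uppercasedViaTrait(_ s: String) -> String {
    return s.thisMethodDoesNotExist()
}
#endif

// Availability-gated declaration: always type checked, whether or not
// the availability condition holds for the current target. A mistake
// like the one above would be diagnosed in every build.
@available(macOS 10.15, iOS 13, *)
func uppercasedViaAvailability(_ s: String) -> String {
    return s.uppercased()
}
```

This file compiles cleanly with the condition unset even though the first function is wrong, which is exactly the "compile twice to see all diagnostics" problem; the second function cannot hide an error that way.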

Yeah, with such a huge difference in code size cost between the two, separating them into UnicodeNormalization and UnicodeScalarProperties seems like the best course of action. That 500 KB is big enough that one might even want to avoid linking those libraries in a non-Embedded static binary build (e.g., with the static Linux SDK).

They feel orthogonal to me, but I don't have a strong sense of why. When I rework the pull request to separate the domains, we'll see how often they end up being tied together within the standard library itself.

Doug
