Targeting specific microarchitectures

wadetregaskis · January 17, 2024, 11:24pm

What's the correct way to target a specific microarchitecture in Swift; what one would use -march / -mcpu for with Clang?

swiftc has -target and -target-cpu parameters, but they only accept regular target triples and the CPU architecture (e.g. arm64), respectively.

I looked at -Xcc, but according to the documentation flags specified that way have no effect on Swift code ("Pass[es] to the C/C++/Objective-C compiler").

Relatedly, does Swift take into account minimum deployment targets to derive suitably-optimal default microarchitecture targets, e.g. skylake / apple-m1 for macOS 14?

Max_Desiatov · January 18, 2024, 9:08am

Would you clarify what documentation you're referring to?

-Xcc options related to the LLVM backend do have an effect on Swift code, and IIRC -march and -mcpu are some of those.

ole · January 18, 2024, 10:29am

I learned this recently through @kubamracek's pull request:

This might not be obvious: -Xcc flags are typically only used to alter behavior of the Clang importer, but passing flags to Clang this way also works to specify LLVM target options like selecting a specific CPU architecture (-march, -mcpu, -mmcu), FPU unit availability (-mfpu), which registers are used to pass floating-point values (-mfloat-abi), and others.
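As a minimal sketch of what that looks like in practice: the file name, function name, and build lines below are made up for illustration, and -march=skylake only makes sense when the target is x86_64. The loop is the kind of code where a different microarchitecture baseline may change which instructions get emitted.

```swift
// Hypothetical file: products.swift
// Default baseline:          swiftc -O products.swift
// Specific microarchitecture: swiftc -O -Xcc -march=skylake products.swift
// (Assumes an x86_64 target; compare the disassembly of the two builds.)

/// Sums element-wise products with wrapping arithmetic, so there are no
/// overflow traps standing in the way of vectorization.
func sumOfProducts(_ a: [Int32], _ b: [Int32]) -> Int32 {
    var total: Int32 = 0
    for (x, y) in zip(a, b) {
        total &+= x &* y
    }
    return total
}

print(sumOfProducts([1, 2, 3], [4, 5, 6]))  // prints 32
```

The result is the same either way; only the generated machine code should differ.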

wadetregaskis · January 18, 2024, 6:45pm

swiftc --help

Max_Desiatov · January 18, 2024, 6:56pm

It does say that the provided options are passed to the C/C++/Objective-C compiler; it doesn't say that these options have no effect on Swift code. In fact, the way C and C++ code is handled has a direct impact on Swift code, as you're always using ClangImporter under the hood when building anything with swiftc.

scanon · January 18, 2024, 6:58pm

We wouldn't be able to infer skylake because of rosetta, which is basically equivalent to nehalem as far as ISA extensions go, IIRC.

wadetregaskis · January 18, 2024, 7:41pm

Well, it doesn't say infinitely many things; I think my interpretation of it is reasonable. I'm glad that documentation's wrong, though.

You mean x86_64 on arm64 Rosetta ("Rosetta 2", unofficially)?

Nehalem is ancient. It was a particularly important step up for Intel's microarchitectures, but it is now a dinosaur. It's a weird omission to not at least provide some compatibility with newer x86_64 microarchitectures, even if e.g. the performance for AVX instructions is poor.

Ugh, I see now - per Apple's documentation:

Rosetta translates all x86_64 instructions, but it doesn’t support the execution of some newer instruction sets and processor features, such as AVX, AVX2, and AVX512 vector instructions.

Sidenote: What an absurd opening statement. Obviously it doesn't support all x86_64 instructions - two seconds later they provide examples!

The potential saving grace here is that in a fat binary Rosetta is of no concern. So it does complicate the build system a bit more, but in principle it should still be possible for it to default to a sensible x86_64 microarchitecture baseline iff an arm64 slice is being included anyway. I'm guessing swiftc doesn't actually handle the fat binary aspect, though, so this is therefore something the driver (e.g. Xcode) would have to cause?

It looks like it doesn't, though, based on the fact that I see noticeable performance improvements by simply adding -march=skylake even to a fat binary build.

Apple advise dynamic code selection to work around this, but to my knowledge the Swift compiler and standard toolchains (SPM & Xcode) have no support for this…? In the sense of doing it for me. I can obviously write manual code for dynamic selection, and duplicate implementations, but in the context of -march / -mcpu those are not applicable (and more to the point, writing SIMD code by hand is extremely difficult, so I'm not even going to try).
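For reference, the manual version would look roughly like the sketch below. It assumes the hw.optional.avx2_0 sysctl key that Apple's Rosetta documentation points to; cpuHasFeature, sumScalar, sumWide, and selectedSum are made-up names, and sumWide is only a placeholder. In a real project it would have to be built separately with different flags, which is exactly the duplication I'd like to avoid.

```swift
#if canImport(Darwin)
import Darwin
#endif

/// Returns true if the named sysctl exists and reports a nonzero Int32 value.
/// On non-Darwin platforms this sketch simply reports false.
func cpuHasFeature(_ name: String) -> Bool {
#if canImport(Darwin)
    var value: Int32 = 0
    var size = MemoryLayout<Int32>.size
    return sysctlbyname(name, &value, &size, nil, 0) == 0 && value != 0
#else
    return false
#endif
}

/// Baseline implementation, safe on any x86_64 CPU and under Rosetta.
func sumScalar(_ xs: [Int32]) -> Int32 {
    xs.reduce(0) { $0 &+ $1 }
}

/// Placeholder for a variant compiled elsewhere with AVX2 enabled.
/// Here it is identical to the baseline so the sketch stays runnable.
func sumWide(_ xs: [Int32]) -> Int32 {
    xs.reduce(0) { $0 &+ $1 }
}

// Pick an implementation once, based on what the CPU reports at run time.
let selectedSum: ([Int32]) -> Int32 =
    cpuHasFeature("hw.optional.avx2_0") ? sumWide : sumScalar

print(selectedSum([1, 2, 3, 4]))  // prints 10
```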

Finagolfin · January 19, 2024, 6:41am

No, you are likely correct: those -Xcc flags have no impact on Swift code generation. They are passed to clang/clang++ for compilation of C and C++ files, and to the ClangImporter component of the Swift compiler, which parses C/C++ header files but does no code generation. I don't believe they affect Swift code generation.

I believe you are raising a genuine deficiency in the Swift compiler, that it just doesn't have flags to support such microarch targeting.

Max_Desiatov · January 19, 2024, 10:31am

They do affect Swift code generation.

You can see this in action by building for the wasm32-unknown-none-wasm triple in embedded mode with the latest nightly toolchain. Architecture-specific Clang flags will affect Swift code generation. For example, -Xcc -mmultivalue will allow generated Wasm functions to return Swift tuples on the Wasm stack as Wasm tuples, instead of storing the tuple in Wasm linear memory and returning an address to it. The same goes for other flags; you can control whether your generated Wasm code contains SIMD or atomic instructions: Clang command line argument reference — Clang 18.0.0git documentation

This works the same way for other architectures. I'm only using Wasm as an example here as I've spent enough time looking at disassembly of produced Wasm binaries with different combinations of these flags.

-Xcc flags are handled by ClangImporter, but they are also passed to the LLVM context that produces machine code for given LLVM IR, whatever that IR was generated from: Swift or any of the C family languages.
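A small function like the following is enough to try this. The name divmod is made up for illustration, and the exact set of other flags needed for an embedded Wasm build is not spelled out here, so treat it as a sketch: build it once with and once without -Xcc -mmultivalue and compare how the tuple is returned in the disassembly.

```swift
// Hypothetical example for comparing codegen with and without
// `-Xcc -mmultivalue` when targeting wasm32-unknown-none-wasm.

/// Returns two values as a tuple, so the multi-value return convention
/// (if enabled) has something to act on.
func divmod(_ a: Int32, _ b: Int32) -> (quotient: Int32, remainder: Int32) {
    (a / b, a % b)
}

let result = divmod(7, 2)
print(result.quotient, result.remainder)  // prints 3 1
```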

scanon · January 19, 2024, 2:55pm

No, they absolutely change Swift codegen. This is easy to check: Compiler Explorer

Finagolfin · January 19, 2024, 3:11pm

You're right. I've looked at some of the ClangImporter source in the past and submitted small modifications, but didn't recall seeing anything about codegen. I had not looked in IRGen, where it looks like the Importer is used to initialize the codegen options. I had only seen issues like this, where @ColemanCDA complained that the Swift compiler had no such flag.

I still think the Swift compiler should handle this config directly, but I guess it's good this back door currently exists for those who know where to look.

Max_Desiatov · January 19, 2024, 3:25pm

It isn't a backdoor, I see it as a designated way to customize it, given how integral C language family interop is to Swift. What would be the benefit of replicating every single relevant Clang option in swiftc?

Say Clang/LLVM adds, updates, or removes an option for your favorite architecture, and there are more than a dozen architectures supported by LLVM now. What would be the point in maintaining that duplication in the swiftc codebase, with an inevitable time lag between Clang and Swift release cycles and a mismatch of available options in each caused by such duplication?

scanon · January 19, 2024, 3:30pm

It would be pretty nice to have more discoverable and less-verbose flags for these things. On the other hand, most targeting of specific extensions should be done at function granularity, not module or resilience boundary (and Swift doesn't really have an in-between notion of translation unit).

Max_Desiatov · January 19, 2024, 3:32pm

Right, something like __attribute__((target("arch=cortex-a75+nosimd"))) in Clang?

scanon · January 19, 2024, 3:33pm

Right. [SR-11660] Umbrella: function multiversioning and dispatch on CPU features · Issue #54069 · apple/swift · GitHub

scanon · January 19, 2024, 3:40pm

Realized I forgot to reply to the Apple Silicon side of this; yes, we have M1 as a baseline for macOS / arm64. E.g., you can unconditionally use Float16 when building for arm64-apple-macos11, and we will generate the ARMv8.2 half-precision arithmetic instructions.
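A minimal sketch of what "unconditionally" means here: the function name is made up, and the arch(arm64) guard is only there so the snippet also compiles when building for x86_64 macOS, where Float16 is not available.

```swift
#if arch(arm64)
/// Scales every element by k using Float16 arithmetic directly,
/// with no availability check around the Float16 usage itself.
func scaled(_ xs: [Float16], by k: Float16) -> [Float16] {
    xs.map { $0 * k }
}

print(scaled([1, 2, 3], by: 0.5))
#endif
```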

Finagolfin · January 19, 2024, 3:50pm

It clearly has not been "designated," hence the confusion in this thread and the issue I linked.

I would hope the Swift compiler would not simply rely on the clang driver to set such important target info, and that the importance of C will fade with time.

Not every single option, just the important ones.

To optimize Swift codegen, you should be setting Swift flags, not old clang flags, particularly given that nobody finds clang's arch-targeting flags that great. If you want, you can just have the Swift compiler translate the Swift flag to the clang equivalent, if that works, like I now see -target-cpu has been doing from the beginning.

But we definitely shouldn't leave such important arch-targeting flags to an undocumented hack of passing it to clang.