Differences in C++-based LLVM IR depending on the target

Hey,

I noticed that Clang produces slightly different LLVM IR for a C++ program when compiling with the target "arm64-apple-macosx13.0.0" than with e.g. "x86_64-pc-linux-gnu".

This confuses me since I was under the impression that the only differences would be "Swift-specific changes", and I never read about any changes to the IR due to the target. In particular, since I compiled without optimizations, only the very first part of the compiler middle end should run, and I expected all platform-specific changes to happen at the end of the middle end or, more likely, in the platform-specific back ends.
So I'm a bit confused by the differences. Is this behavior documented somewhere, and can I get an overview of the differences without comparing both code bases?

Any help would be greatly appreciated :slight_smile:

Code for reproduction:

C++ Program:

class MyClass {
public:
    MyClass() {}
};

int main() {
    MyClass c = MyClass();
}

Compiler Info: Apple clang version 14.0.0 (clang-1400.0.29.202)
Used flags in addition to the target flag: -emit-llvm -S -std=c++20 -fno-discard-value-names -O0
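
For anyone who wants to reproduce this, the full invocations would look roughly like the following (my reconstruction, assuming the program above is saved as main.cpp; the flags are the ones listed above plus the target triple):

clang++ -target arm64-apple-macosx13.0.0 -std=c++20 -O0 -fno-discard-value-names -emit-llvm -S main.cpp -o arm64.ll
clang++ -target x86_64-pc-linux-gnu -std=c++20 -O0 -fno-discard-value-names -emit-llvm -S main.cpp -o x86_64.ll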

The difference is in the return type of the generated constructor:

For target "arm64-apple-macosx13.0.0"

define linkonce_odr %class.MyClass* @_ZN7MyClassC2Ev(%class.MyClass* returned %this) unnamed_addr #2 align 2 {
entry:
  %this.addr = alloca %class.MyClass*, align 8
  store %class.MyClass* %this, %class.MyClass** %this.addr, align 8
  %this1 = load %class.MyClass*, %class.MyClass** %this.addr, align 8
  ret %class.MyClass* %this1
}

For target "x86_64-pc-linux-gnu"

define linkonce_odr dso_local void @_ZN7MyClassC2Ev(%class.MyClass* nonnull dereferenceable(1) %this) unnamed_addr #1 comdat align 2 {
entry:
  %this.addr = alloca %class.MyClass*, align 8
  store %class.MyClass* %this, %class.MyClass** %this.addr, align 8
  %this1 = load %class.MyClass*, %class.MyClass** %this.addr, align 8
  ret void
}

I don't think there's a way to track such differences. While LLVM IR may be portable to some degree, it doesn't have to be.

While I don't have full knowledge of the example you give, I can provide another one I'm more familiar with. Calling conventions on most platforms allow calling a function with fewer arguments than the number of parameters it declares. Swift relies on this for throwing functions, which (under the hood) take a pointer to an error as an extra argument. Thus, when you pass a non-throwing closure where a throwing closure is expected, everything works fine on such platforms.

This won't work when targeting WebAssembly though, as that platform requires the number of arguments passed by a caller to always match the number expected by the callee. On an argument mismatch your code will just crash, as required by the Wasm spec. For this reason, for Swift to support WebAssembly, it has to emit different LLVM IR for this different calling convention. I wouldn't be surprised if there were more platforms that Swift doesn't support yet that would need similar adjustments.
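
To make the argument-count point concrete, here is a small C++ sketch I'm adding (it is not from Max's post, and calling through a mismatched function pointer type is undefined behaviour in standard C++). On most native ABIs the callee simply reads whatever happens to sit in the parameter register, so the call appears to work; a Wasm engine instead checks the signature of an indirect call and traps:

#include <cstdio>

void callee(int value) {
    // On a typical native target this prints whatever garbage was in the argument
    // register; on WebAssembly the mismatched indirect call below traps instead.
    std::printf("value = %d\n", value);
}

int main() {
    // Pretend the function takes no arguments and call it that way.
    auto mismatched = reinterpret_cast<void (*)()>(&callee);
    mismatched(); // undefined behaviour, but tolerated by most native calling conventions
}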

Also, see this SO question and its answers:

If you're very careful, you can use LLVM IR across multiple platforms, but not without doing significant additional work (which Clang doesn't do) to abstract over the ABI differences.

and

Given an IR file, can I be sure it could compile to my target?

You can not assume an arbitrary IR file will always be cross-platform, as there are things in a given file that might not be platform-independent. The most notable example is that the IR can contain actual assembler sequences (via module-level or inline assembly segments), but there are other examples - e.g. usage of target specific intrinsics or calling conventions that are only supported on some targets.

Can I generate an IR file that is guaranteed to compile on all targets?

I don't know, but I believe you can, especially if you avoid specifying things like inline assembly, calling conventions, required / preferred ABI for types, etc. It can affect the optimizations the compiler will perform, though.
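
To make the first answer's point about assembler sequences a bit more tangible, here is a small C++ example I'm adding (not from the thread): the inline assembly below is copied verbatim into the IR as an x86 "rdtsc" asm string, so the resulting IR file can only ever be lowered for x86 targets.

unsigned long long read_timestamp() {
    unsigned int lo, hi;
    // The x86-specific instruction ends up as a literal asm string in the LLVM IR,
    // which no arm64 or wasm backend can lower.
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return (static_cast<unsigned long long>(hi) << 32) | lo;
}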


This question is really just off-topic for the Swift forums; you are using a C++ compiler to compile C++ code. You might try LLVM's Discourse.

I can give you a quick answer anyway, though, just to be helpful. As a general rule, as Max says, LLVM IR is not an abstract representation of source code and is not portable between targets. Type layouts, high-level ABIs, and object-file formats can all vary between targets and result in completely different code patterns in IR. The difference you've identified is that the ARM C++ ABI specifies that constructors and destructors return this, which is not a standard rule of the Itanium C++ ABI that the ARM C++ ABI is based on. Linux x86_64 uses the base C++ ABI.


Hey @John_McCall

first of all, sorry for asking this seemingly off-topic question here. To give you a bit more context on my research: I'm trying to apply LLVM-based data flow analysis tools, which so far have only targeted C++-based LLVM IR, to Swift-based LLVM IR. So I compared the different constructs in the two IRs to figure out whether Swift creates something we need to specifically take into account.

I then stumbled across the platform-specific differences when compiling "only" C++ with "arm64-apple-macosx13.0.0" as the target and couldn't pin down where the difference came from, so thank you very much for pointing out the ABI differences. I have to admit that, at the level I was looking at it, I had completely forgotten that these differences already have to be reflected in the IR at this early stage, my bad.

Thanks for your help!

Hey @Max_Desiatov

thanks for pointing me in the right direction. As I mentioned in my reply to John, I was coming from comparing IR for different languages (Swift, C++) on the same platform and was just thoroughly confused to find differences in an area where I didn't expect any. Thanks especially for also pointing me to the calling conventions :slight_smile:


For anyone else who stumbles across this post: the observed behavior is documented in Apple's documentation.
For the case mentioned above, it clearly states:

  • The ABI requires the complete object (C1) and base-object (C2) constructors to return this to their callers. Similarly, the complete object (D1) and base-object (D2) destructors return this. This behavior matches the ARM 32-bit C++ ABI.