I've been looking at a performance problem we've been seeing on Arm32/Linux since switching to Swift 5.3 (from 5.1). This is in an environment where it's cross-compiled via Yocto/bitbake, and we're also working on upstreaming the
meta-swift layer we created. We cross-compiled the same way with 5.1 and did not see this problem.
I've been struggling a little to pin it down completely, but it looks like libswiftCore is holding the bag at the moment. When we run our executables via
strace we see an astounding number of system calls of the form:
futex(0xb6f608a0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
Some of these system calls take many seconds to complete.
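To put numbers on how many of these calls there are and how long they take, here is a minimal sketch of a log summarizer. It assumes the trace was captured with strace's -T option, which appends the time spent in each call as a trailing <seconds> field. The function name summarize_futex is hypothetical, not part of any tool.

```python
import re
from collections import defaultdict

# Matches completed futex calls as printed by strace, for example:
#   futex(0xb6f608a0, FUTEX_WAKE_PRIVATE, 2147483647) = 0 <0.000042>
# The trailing <seconds> field is only present when strace runs with -T
# (assumption: the log was captured that way).
FUTEX_RE = re.compile(
    r"futex\((0x[0-9a-f]+), ([A-Z_|]+), .*\)\s+=\s+(-?\d+).*<(\d+\.\d+)>"
)


def summarize_futex(lines):
    """Return {(address, op): (call_count, total_seconds, max_seconds)}."""
    stats = defaultdict(lambda: [0, 0.0, 0.0])
    for line in lines:
        match = FUTEX_RE.search(line)
        if match is None:
            continue  # not a completed futex call, or no -T timing present
        addr, op, _ret, secs = match.groups()
        entry = stats[(addr, op)]
        entry[0] += 1                          # number of calls
        entry[1] += float(secs)                # cumulative time in the call
        entry[2] = max(entry[2], float(secs))  # slowest single call
    return {key: tuple(value) for key, value in stats.items()}
```

Passing it the lines of a saved trace (any iterable of strings, such as an open file) gives one entry per futex address and operation, which should show whether the multi-second calls all hit the same address.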
We have been able to collect time profiles of the process, and the hottest function of the entire program is
swift::RefCounts<swift::SideTableRefCountBits>::incrementUnownedNonAtomic(unsigned int) (at about 6% of all samples). I suspect my sample set doesn't include the futex system call at all, though it is present in strace output.
I'm hoping that someone has an idea about this section of the code, what might have changed, or something I could use for a breadcrumb to start working on.