Swift and Apple Silicon symbiosis?

I recently re-read C Is Not a Low-level Language, in which the author makes the point that modern hardware isn't the abstract machine we think of when writing C code, and implies that the industry would design hardware differently if they didn't have to ensure that C code runs fast on it.

As Apple famously controls both software and hardware, they'd be in a position to break this cycle. The article predates Apple's M-series chips, which makes me wonder:

  • Is there anything in Apple's CPUs that optimizes Swift-specific things (ARC atomics? objc_msgSend?)
  • Is there anything in the Swift compiler that optimizes for Apple Silicon chips specifically?

While you'll hopefully get more direct answers from the compiler team members that visit these forums, particularly regarding specific examples (e.g. improved latency for cache line sharing re. reference counts), a few broader or tangential thoughts:

Old man yells at cloud

I wouldn't get too excited about that ACM article - it's a little misguided with its complaints and conclusions. Mostly it's just lamenting that there exists real-world work which is inherently serial and/or branchy.

The architectural debate it's fighting (brainiacs vs speed-demons, and the related brawny vs wimpy) is many decades old and largely settled by now - particularly thanks to Apple's ARM core designs, which bucked convention by going heavily "brainiac" in exactly the space ("mobile") where conventional wisdom said that was insane, and frankly embarrassed the rest of the industry with their resulting real-world performance and efficiency.

It also points out that C is an unsafe language (in Swift's sense of the word), which is a surprise to nobody ever. It sounds like the author just really doesn't like C, for their own personal reasons.

Compiler codegen optimisation

A lot of codegen optimisations are implemented by the CPU vendors - or the architecture vendor, in ARM's case. So a lot of performance improvements for Swift are inherited generically through improvements in LLVM, many of which come from Arm (the company) or even other ARM licensees (e.g. Ampere). For example, Swift's ARC is ultimately just atomic compare-and-swaps (or equivalent), which is a more fundamental pattern and performance-significant to a lot of code - so it gets plenty of focus in both compilers and hardware design, irrespective of Swift's influence.
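As a concrete sketch of that pattern - hypothetical, not the actual runtime's swift_retain, and assuming Swift 6's Synchronization module is available - a retain is essentially a compare-and-swap loop on the reference count:

```swift
// Hypothetical sketch of a retain: an atomic compare-and-swap loop on a
// reference count. The real Swift runtime (swift_retain) is more involved;
// this just illustrates why ARC cost is ultimately atomic-RMW cost.
// Requires the Synchronization module (Swift 6 / recent SDKs).
import Synchronization

final class ToyObjectHeader {
    let refCount = Atomic<Int>(1)  // a fresh object starts with one reference

    func retain() {
        var old = refCount.load(ordering: .relaxed)
        while true {
            // Try to bump the count; if another thread raced us, retry with
            // the value we actually observed.
            let (exchanged, original) = refCount.compareExchange(
                expected: old, desired: old + 1, ordering: .relaxed)
            if exchanged { return }
            old = original
        }
    }
}
```

On hardware with ARMv8.1's LSE atomics, loops like this can lower to a single atomic-add instruction - exactly the kind of generic improvement Swift inherits for free.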

Sadly (for us on Apple platforms), Arm seem to still be putting a lot of effort into optimising ARM codegen in gcc (e.g. New performance features and improvements in GCC 12), which is essentially a waste of time for us. You can't really fault them for that - they have to follow the world's compiler choices, and gcc is still widely used (especially in certain domains that Arm has been heavily courting in recent years, like HPC).

Nonetheless, Arm have been steadily transitioning their focus to LLVM over the last decade, driven in no small part by Apple and some notable hyperscalers (e.g. Google). See for example What is new in LLVM 15 and What is new in LLVM 16. Their official compiler toolchains, for example, are now based on Clang+LLVM rather than gcc, although that is pretty recent and even some people at Arm haven't caught up with that news (e.g.). (they still maintain a GCC toolchain too, for whatever reason)

So we can look forward to increasing focus on LLVM from Arm, that will benefit Swift.

ARM is bigger than Swift

One of the dominant points of the ARM architectures - if not the dominant one - is that they are enforced across all implementations. Arm certainly won't provide you any reference implementations that aren't fully compliant with the relevant ARM architecture, nor can you buy an ARM architecture license and then make a non-compliant implementation (or at least, not publicly - what you do in the privacy of your own datacentre is your business, I suppose). Compliance means implementing the ISA as-is, without random implementation-specific instructions or the like.

This is (IMO) a big blessing, since it means portable binaries, reusable codegen in compilers, and even more broadly predictable semantics up in the higher-level languages themselves.

But, it does mean you can't do some hardware optimisations that you can with e.g. RISC-V, which distinguishes itself from ARM by saying "go nuts" re. architecture and [lack of concern for] compatibility. Arm wouldn't allow an "objc_msgSend" instruction to be added to any implementation, for example (unless they chose to incorporate it into their actual architecture, which they won't - I think Jazelle burnt them too badly, among other reasons). If, for argument's sake, Apple were using RISC-V (or a fully proprietary architecture), then they could.

It's interesting to think about truly Swift-customised CPUs, and there certainly is some precedent - the aforementioned Jazelle, among several other attempts at "Java hardware" - but it seems highly unlikely. More targeted architecture additions are possible, though - see for example the ARMv8.3 'enhancement' that added an instruction (FJCVTZS) specifically for dealing with JavaScript's horrific numerics system.


A short note from me - I am not (!) an expert in these things, so maybe someone from the compiler team can clarify?

To my understanding, Swift initially relied a lot on the optimizations already available in LLVM (Chris Lattner once said that Swift is syntactic sugar for LLVM, which might be a bit of an exaggeration). Many of the optimizations for Swift are specific to the platform (OS + architecture), and at least some of them were newly implemented as part of LLVM - LLVM now even contains optimizations specific to Swift. The goal for Swift is that even complex code should be optimizable quite far, and this is why many changes to the language are far from trivial.

So I would say it is more the reverse: the development of Swift itself is guided by what is possible on the given platforms. But yes, it would be interesting to know whether the development of Apple Silicon is now somehow affected by the development of Swift (I haven't heard anything to that effect yet).

I've worked with David for many years and generally respect his opinion. His central thesis seems to be that we've gotten too preoccupied with scalar code performance when we ought to be focusing on vector programs for vector processors. I can't quite imagine how to apply that to the programming world I know, so it seems impolite to respond.

Swift doesn't rely on any special low-level operations that you wouldn't see in a lot of other programs. Neither objc_msgSend nor ARC are Swift-specific. Swift and Objective-C both do a lot of atomic reference-counting, so they both benefit from processors that optimize uncontended atomics. However, I'd say there's growing recognition among all processor designers that uncontended atomics are important; it's not Apple-specific at all. There's nothing really surprising here.

This is a broader question than you might think. On a (somewhat) high level, porting Swift to run on AArch64 included designing a calling convention for it, and that is in some sense "optimizing for Apple Silicon". On a low level, LLVM can tune instruction scheduling for the capabilities and timings of specific CPUs, which of course is microarchitecture-specific. But if you mean, does Swift ever do things drastically differently when it knows it's running on Apple Silicon? No, I can't think of anything like that.


I would not say that is his central thesis. I had never read that article till now, but I would say this is his thesis:

"A processor designed purely for speed, not for a compromise between speed and C support, would likely support large numbers of threads, have wide vector units, and have a much simpler memory model. Running C code on such a system would be problematic, so, given the large amount of legacy C code in the world, it would not likely be a commercial success."

Given Wade's point above about the constraints of the ARMv8/9 ISA, maybe that is not possible for Apple Silicon - I don't know - but Apple is certainly slowly removing the C constraint with Swift. I would hope someone would experiment with new hardware like this using these new languages, given Chris's talk about what is possible these days.


I've occasionally joked that the indirect branch predictor is actually the "ObjC unit" (because objc_msgSend) and the reorder buffer and regular branch predictor are the "Swift units" (because bounds and overflow checking), so in that sense every modern CPU optimizes for these two languages :wink:
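To make the joke concrete, here's a small illustration of the two kinds of checks: every default subscript and arithmetic operation carries a compare-and-branch that is almost never taken, which is exactly the kind of work a modern branch predictor absorbs for free:

```swift
// Swift's default subscripts and arithmetic are checked: each one compiles
// to a comparison plus an almost-never-taken branch to a trap.
let xs = [1, 2, 3]
let first = xs[0]          // bounds check passes; xs[3] would trap at runtime

let big = Int32.max
// `+` would trap on overflow; the `&+` variant wraps and skips the check.
let wrapped = big &+ 1
assert(wrapped == Int32.min)
assert(first == 1)
```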


It would be interesting to see what high-level abstractions could be introduced to the language to make it harder to write code that is suboptimal from a hardware perspective.
For example, a guarantee that actors are always allocated in a separate memory page.

The ARM ISA is quite modular - e.g. you don't have to include SVE, or SME, or authenticated pointers, or memcpy/memset instructions, etc. - same as any evolving architecture, like x86. It just enforces more of a baseline than RISC-V (or a fully custom architecture) does. E.g. you basically have to include NEON with ARMv8+, irrespective of whether you want SIMD (or an FPU at all).

One can build for a wide range of applications - from relatively tiny embedded control processors up to vector supercomputer processors (e.g. Fujitsu's A64FX (technical details)).

So Apple do have a lot of room in which to play (not to mention that they have a direct line to all of Arm's architects and leadership, so Apple can pretty easily get whatever they want if they really want it).

But, keep in mind that Swift is still very close to C, in the big picture. It's still fundamentally procedural (functional sugar notwithstanding), and a far cry even from Haskell or Curry, let alone Datalog or VHDL. So dramatic hardware architecture changes - to the CPU microarchitecture, at least - may be unwarranted. As Apple (among others) has demonstrated, thinking outside the CPU core seems more fruitful, e.g. on-package DRAM, a unified xPU memory space, etc.

Of course the problem is then what are you going to do with the other 16,368 bytes of the page, for your actor with just one member variable? :slightly_smiling_face:

It's not memory page sharing you care about anyway, it's cache line sharing - so 64-byte granularity. But it's a good question: does Swift take any pains re. memory layout for mutable state, in actors or more generally? Does it reorder member variables to group mutable ones, let alone ensure they get their own lines?

From my tests so far - albeit with structs, not actors - Swift never rearranges or varies the padding of data structures (SROA aside).
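That's easy to observe with MemoryLayout (a sketch; the exact numbers assume a typical 64-bit platform, and the layout of non-frozen types isn't something to rely on across compiler versions):

```swift
// Swift lays struct fields out in declaration order, inserting padding as
// each field's alignment requires - it does not reorder them for you.
struct Padded {        // a(1) + pad(7) + b(8) + c(1)  => size 17, stride 24
    var a: UInt8
    var b: UInt64
    var c: UInt8
}

struct Repacked {      // b(8) + a(1) + c(1)           => size 10, stride 16
    var b: UInt64
    var a: UInt8
    var c: UInt8
}

assert(MemoryLayout<Padded>.size == 17)
assert(MemoryLayout<Padded>.stride == 24)
assert(MemoryLayout<Repacked>.size == 10)
assert(MemoryLayout<Repacked>.stride == 16)
```

So the same three fields cost a third more memory in one order than the other, and the compiler leaves that choice entirely to you.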

So I'm guessing not [today], because it seems like the conservative default is to minimise memory footprint, which is at odds with the sometimes substantial padding required for cache line (let alone memory page) alignment…?

or 128B (pretty common for outer caches) or 256B (Fujitsu's ARM64 implementation) or 32B (Cortex-A7 and A9, IIRC, and maybe some arm64 designs as well).

64B is the norm on x86, but it's by no means universal outside of x86, and cacheline size is not architectural on most platforms, so trying to do this sort of layout "right" is pretty subtle.


Indeed. I said 64 not because it's universal but because it's the cache line size on all of Apple's Ax and Mx designs since at least the A7 (according to LLVM). Combined with 64 bytes being the line size on all relevant Intel & AMD CPUs (as far as I'm aware), for Swift's purposes 64 bytes is an excellent rule of thumb. The compiler would of course use the actual cache line size of the target microarchitecture, in any case. Larger line sizes are thankfully rare - the A64FX's 256 bytes is a real outlier.

Unless it changed in recent years, the smallest possible malloc allocation on macOS is 16 bytes (and all allocations are at least 16-byte aligned), so - tagged pointers aside - you can't have more than four heap-allocated objects sharing a cache line on Apple or x86 platforms. Since Swift never strips unused fields from types (as far as my testing has shown), you can "strongly encourage" unique cache lines by padding your sensitive type(s) to at least 49 bytes (including object header).
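A sketch of that "strong encouragement" (the type and field names here are made up; the 16-byte object header and 16-byte allocation granularity are as described above):

```swift
// Hypothetical: pad a hot class so its allocation is at least 49 bytes.
// With a 16-byte object header and 16-byte malloc granularity, the heap
// block then rounds up to a full 64 bytes, so other small allocations are
// much less likely to share its cache line.
final class HotCounter {
    var value: Int = 0                               // 8 bytes of payload
    // 40 bytes of padding: header(16) + value(8) + pad(40) = 64.
    private var pad: (Int64, Int64, Int64, Int64, Int64) = (0, 0, 0, 0, 0)
}

// The padding tuple itself accounts for 40 of those bytes:
assert(MemoryLayout<(Int64, Int64, Int64, Int64, Int64)>.size == 40)
```

Note this only rounds the allocation up to 64 bytes; since malloc guarantees 16-byte (not 64-byte) alignment, the object can still straddle two lines - hence "strongly encourage" rather than "guarantee".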

I'd love to see actual data on whether any of this matters, though. I vaguely recall seeing some real-world examples of false sharing over fifteen years ago, but even then I'm pretty sure it was exceedingly rare. The nature of performance bottlenecks changes over time, reflecting the increased overheads and inefficiencies of newer languages, libraries, and practices. So I suspect 'micro-optimisations' like this pale in relevance for Swift, compared to the impact of major frameworks (e.g. SwiftUI) and patterns (e.g. eager evaluation of map et al).

A bunch of those designs have 128B outer cachelines. Whether you want to use inner or outer cacheline size to avoid destructive interference (or guarantee constructive interference) is itself somewhat subtle. In any event, Swift doesn’t let you specify type alignment higher than 16B, so making these guarantees requires manual allocation. It’s definitely worth doing for some use cases, however.
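For completeness, a minimal sketch of that manual-allocation route using UnsafeMutableRawPointer.allocate, which (unlike Swift type alignment) accepts alignments above 16 bytes:

```swift
// Manually allocate one line-aligned slot for a contended value.
// 128 is the outer cache line size mentioned above; adjust per target.
let lineSize = 128
let raw = UnsafeMutableRawPointer.allocate(byteCount: lineSize,
                                           alignment: lineSize)
defer { raw.deallocate() }

let counter = raw.bindMemory(to: Int.self, capacity: 1)
counter.initialize(to: 0)
counter.pointee += 1

assert(Int(bitPattern: raw) % lineSize == 0)  // allocation is line-aligned
assert(counter.pointee == 1)
```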

It can get really important for highly-contended atomics. In such cases it also helps if the contending threads get assigned to the same processor group so they can share the physical cache instead of synchronizing.

I don't think not being a fully functional PL really matters, as non-functional languages have adopted parallelism-enabling features like functional purity, immutable sharing of data, message passing, and Actor models. What we're discussing here is instead whether evolving hardware-software codesign can deliver much greater performance, while still being much safer than existing systems, by tying these language primitives into new hardware primitives - in the way that hardware embraced multi-core over the last couple of decades, long after C rose up.

I suspect we could do an order of magnitude or two better with such a design shift, which is why it's something I'd like to dig into myself someday, but that is just a hunch right now, as I have not looked into all the details yet.
