I imagine the issue is just that Atomic is generic, so in theory it may cause a metadata allocation if the generic argument is not statically known or if the compiler doesn't specialize the type. In practice, the generic argument for Atomic is almost always statically known, and every operation on it is fully transparent and emitted into clients so that the compiler can eliminate any generic shenanigans and lower these operations down to single atomic ops. Perhaps we should teach the compiler about this type for @_noLocks and @_noAllocation?
I would expect the current implementation to be able to handle a concrete use of a generic type like this, as long as it's actually getting fully specialized. We should check whether some generic part of the implementation isn't @inlinable, or the compiler is deciding it's unable to specialize and leaving behind metadata accesses. cc @Erik_Eckstein
The Atomic initialiser is not @inlinable but has the @_alwaysEmitIntoClient and @_transparent attributes, which sounds like it should be enough, but this is outside of my expertise.
Maybe? I don't think it has been spelled out explicitly and it would be something new. @_transparent usually does a good job at enabling optimisation in debug builds, but I'm not aware of what it actually guarantees.
It's my understanding that in a @_noLocks/@_noAllocation function we're supposed to try to optimize out calls that require metadata instantiations even if the compiler wouldn't normally decide to (assuming that's fundamentally possible, with the functions in question being @inlinable or in the same module, of course).
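As a rough illustration of the kind of concrete use being discussed, here is a minimal sketch. It assumes a Swift 6 toolchain with the Synchronization module available; `bump` and `hits` are hypothetical names, not anything from the standard library. Since the generic argument is statically `Int`, the expectation described above is that the call lowers to a single atomic read-modify-write with no metadata access, though whether a given toolchain accepts the annotation is worth checking.

```swift
import Synchronization

// Hypothetical helper: increments a concrete Atomic<Int> and returns the new value.
// The annotation asks the compiler to diagnose anything in here that could take a lock.
@_noLocks
func bump(_ counter: borrowing Atomic<Int>) -> Int {
    // wrappingAdd returns (oldValue:newValue:); the ordering is a compile-time constant.
    counter.wrappingAdd(1, ordering: .relaxed).newValue
}

let hits = Atomic<Int>(0)
_ = bump(hits)
_ = bump(hits)
print(hits.load(ordering: .relaxed))
```

The counter is passed as a `borrowing` parameter rather than read from a global inside the function, so the sketch only exercises the atomic operation itself and not global initialization.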
Great news.
Please clarify what would happen to code shipped (to the App Store, etc.) with -Onone (strange, but it happens). Will an individual function marked with @_noLocks effectively be compiled with -O while the rest of the code is compiled with -Onone? Or will it be compiled with -Onone as well (potentially taking locks)?
I believe that's what happens now (Erik might be able to confirm), but it wouldn't necessarily always be the case. We would probably ultimately want to run only lighter guaranteed transformations that eliminate potential sources of runtime locks without sacrificing debuggability of the code.
The issue is that the deinit devirtualizer is not yet turned on for move-only types, so it must allocate metadata right now to call the destroy value witness function. In this example:
```swift
struct Hello: ~Copyable {
  let counter = Atomic(0)
}
```
```asm
output.Hello.init() -> output.Hello:
        mov     qword ptr [rax], 0
        ret

destroy value witness for output.Hello:
        push    rbx
        mov     rbx, rdi
        lea     rdi, [rip + (demangling cache variable for type metadata for Synchronization.Atomic<Swift.Int>)]
        call    __swift_instantiateConcreteTypeFromMangledName
        mov     rcx, qword ptr [rax - 8]
        mov     rcx, qword ptr [rcx + 8]
        mov     rdi, rbx
        mov     rsi, rax
        pop     rbx
        jmp     rcx
```
You'll see that Hello's initialization doesn't allocate metadata for Atomic; it's only Hello's destroy that allocates it right now (this is resolved, but the optimizer pass is turned off, as I mentioned).
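To make it concrete where that destroy happens in source terms, here is a minimal sketch, assuming a Swift 6 toolchain with the Synchronization module; `useHello` is a hypothetical function name. The function is deliberately left unannotated: going by the description above, the implicit destroy at the end of the scope is the part that would currently need Atomic<Int>'s metadata, so adding @_noLocks here is the case one would expect to be diagnosed until the devirtualizer is enabled.

```swift
import Synchronization

struct Hello: ~Copyable {
    let counter = Atomic(0)
}

// Hypothetical function: creates a Hello, touches its counter, and lets it go out of scope.
func useHello() -> Int {
    let hello = Hello()                                   // init: a plain store, no metadata
    hello.counter.wrappingAdd(1, ordering: .relaxed)      // a single atomic op on a concrete Atomic<Int>
    let value = hello.counter.load(ordering: .relaxed)
    return value
    // `hello` is implicitly destroyed when the scope ends; this is where the destroy comes in.
}

print(useHello())
```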
Regardless of whether the devirtualizer is enabled, I don't think we serialize the SIL record for the deinit yet to make it available across modules. We would also need to do that for @frozen public types so the devirtualizer has the information it needs.