Atomic with @_noLocks can cause metadata allocation or locks

I have the following code snippet:

import Synchronization

struct Counter: ~Copyable {
    let changeCounter = Atomic<Int>(0)

    @_noAllocation
    @_noLocks
    mutating func add() {
        changeCounter.add(1, ordering: .sequentiallyConsistent)
    }
}

This code reports the violation:

"Using type 'Counter' can cause metadata allocation or locks"

Since Atomic is quite new, I guess performance annotation need to take them into account somehow?

3 Likes

This seems something specific to Atomic. I am not able to reproduce this with my ~Copyable types (even if they are generic).

Also, you don't need to interact with atomic variable at all. E.g. this raises the same error:

import Synchronization

struct Counter: ~Copyable {
    let changeCounter = Atomic<Int>(0)
    var x = 1

    @_noAllocation
    @_noLocks
    mutating func add() {
        x = 2
    }
}

@Erik_Eckstein do you maybe know what is happening here?

I imagine the issue is just that Atomic is generic so theoretically it may cause a metadata allocation if the generic argument is not statically known or if the compiler doesn't specialize the type. In reality, for Atomic it's almost always statically known and every operation on it is fully transparent and emitted into clients for the compiler to eliminate any generic shenanigans and lower these operations down to single atomic ops. Perhaps we should teach the compiler about this type for @_noLocks and @_noAllocation?

3 Likes

Yeah, this is a common problem with generic-but-always-specialized-in-practice stuff.

1 Like

I would expect the current implementation to be able to handle a concrete use of a generic type like this, as long as it's actually getting fully specialized. We should check whether some generic part of the implementation isn't @inlinable, or the compiler is deciding it's unable to specialize and leaving behind metadata accesses. cc @Erik_Eckstein

1 Like

Yeah, using Compiler Explorer reveals calls to:

swift_getTypeByMangledNameInContext2

Example:

The linked godbolt is without optimisations turned on so no surprise but it also doesn't change if compiled with -O.

It looks like the initialiser is causing problems. Operations like wrappingAdd seems fine.

The Atomic initialiser is not @inlinable but has the @_alwaysEmitIntoClient and @_transparent attributes which sounds like it should be enough but this is outside of my expertise.

Shouldn't Atomic guarantee no metadata access without relying on optimiser?

1 Like

Maybe? I don't think it has been spelled out explicitly and it would be something new. @_transparent usually does a good job at enabling optimisation in debug builds but I'm not aware what it actually guarantees.

It's my understanding that in a @_noLocks/@_noAllocations function that we're supposed to try to optimize out calls that require metadata instantiations even if the compiler wouldn't normally decide to (assuming that's fundamentally possible, with the functions in question being @inlinable or in the same module, of course).

4 Likes

Is it considered ok if the @_noLocks code compiles fine in -O but fails to compile in -Onone ?

I guess we could do the following if this is expected behaviour:

#if !DEBUG
@_noLocks
#endif
func add() { ... }

@_noLock wasn't used in the linked code sample.

Re -Onone: I personally haven't used @_noLocks but maybe someone else can comment on this.

No. @_noLocks should apply the compiler optimizations necessary to achieve the "no locks" constraint regardless of optimization mode.

2 Likes

Great news.
Please clarify what would happen to the code shipped (to the AppStore, etc) with -Onone (strange, but it happens). Will the individual function marked with @_noLocks be effectively compiled with -O while the rest of the code compiled with -Onone ? Or will it be compiled with "-Onone" (potentially taking locks)?

I believe that's what happens now (Erik might be able to confirm), but it wouldn't necessarily always be the case. We would probably ultimately want to run only lighter guaranteed transformations that eliminate potential sources of runtime locks without sacrificing debuggability of the code.

1 Like

The issue is that the deinit devirtualizer is not yet turned on for move only types, so it must allocate metadata right now to call the destroy value witness function. In this example:

struct Hello: ~Copyable {
    let counter = Atomic(0)
}
output.Hello.init() -> output.Hello:
        mov     qword ptr [rax], 0
        ret

destroy value witness for output.Hello:
        push    rbx
        mov     rbx, rdi
        lea     rdi, [rip + (demangling cache variable for type metadata for Synchronization.Atomic<Swift.Int>)]
        call    __swift_instantiateConcreteTypeFromMangledName
        mov     rcx, qword ptr [rax - 8]
        mov     rcx, qword ptr [rcx + 8]
        mov     rdi, rbx
        mov     rsi, rax
        pop     rbx
        jmp     rcx

You'll see that Hello's initialization doesn't allocate metadata for Atomic, but it's only in Hello's destroy that it allocates it right now (this is resolved, but the optimizer pass is turned off like I mentioned).

2 Likes

Regardless of whether the devirtualizer is enabled, I don't think we ever serialize the SIL record for the deinit yet to make it available across modules. We would need to do that for @frozen public types as well for the devirtualizer to have the information it needs.

1 Like

I think we record this information as part of the MoveOnlyDeinitList.

And if you explicitly turn on the optimizer pass, then you get some really nice codegen:


output.foo() -> Swift.Int:
        mov     qword ptr [rsp - 8], 1
        lock            inc     qword ptr [rsp - 8]
        mov     rax, qword ptr [rsp - 8]
        ret

output.increment(Synchronization.Atomic<Swift.Int>) -> ():
        lock            inc     qword ptr [rdi]
        ret
2 Likes

Interesting, I thought these are only diagnostics and they don't change generated code?