jcavar
(Josip Cavar)
1
I have the following code snippet:
import Synchronization
struct Counter: ~Copyable {
let changeCounter = Atomic<Int>(0)
@_noAllocation
@_noLocks
mutating func add() {
changeCounter.add(1, ordering: .sequentiallyConsistent)
}
}
This code reports the violation:
"Using type 'Counter' can cause metadata allocation or locks"
Since Atomic is quite new, I guess performance annotation need to take them into account somehow?
3 Likes
jcavar
(Josip Cavar)
2
This seems something specific to Atomic. I am not able to reproduce this with my ~Copyable types (even if they are generic).
Also, you don't need to interact with atomic variable at all. E.g. this raises the same error:
import Synchronization
struct Counter: ~Copyable {
let changeCounter = Atomic<Int>(0)
var x = 1
@_noAllocation
@_noLocks
mutating func add() {
x = 2
}
}
@Erik_Eckstein do you maybe know what is happening here?
Alejandro
(Alejandro Alonso)
3
I imagine the issue is just that Atomic is generic so theoretically it may cause a metadata allocation if the generic argument is not statically known or if the compiler doesn't specialize the type. In reality, for Atomic it's almost always statically known and every operation on it is fully transparent and emitted into clients for the compiler to eliminate any generic shenanigans and lower these operations down to single atomic ops. Perhaps we should teach the compiler about this type for @_noLocks and @_noAllocation?
3 Likes
scanon
(Steve Canon)
4
Yeah, this is a common problem with generic-but-always-specialized-in-practice stuff.
1 Like
Joe_Groff
(Joe Groff)
5
I would expect the current implementation to be able to handle a concrete use of a generic type like this, as long as it's actually getting fully specialized. We should check whether some generic part of the implementation isn't @inlinable, or the compiler is deciding it's unable to specialize and leaving behind metadata accesses. cc @Erik_Eckstein
1 Like
jcavar
(Josip Cavar)
6
Yeah, using Compiler Explorer reveals calls to:
swift_getTypeByMangledNameInContext2
Example:
dnadoba
(David Nadoba)
7
The linked godbolt is without optimisations turned on so no surprise but it also doesn't change if compiled with -O.
It looks like the initialiser is causing problems. Operations like wrappingAdd seems fine.
The Atomic initialiser is not @inlinable but has the @_alwaysEmitIntoClient and @_transparent attributes which sounds like it should be enough but this is outside of my expertise.
jcavar
(Josip Cavar)
8
Shouldn't Atomic guarantee no metadata access without relying on optimiser?
1 Like
dnadoba
(David Nadoba)
9
Maybe? I don't think it has been spelled out explicitly and it would be something new. @_transparent usually does a good job at enabling optimisation in debug builds but I'm not aware what it actually guarantees.
Joe_Groff
(Joe Groff)
10
It's my understanding that in a @_noLocks/@_noAllocations function that we're supposed to try to optimize out calls that require metadata instantiations even if the compiler wouldn't normally decide to (assuming that's fundamentally possible, with the functions in question being @inlinable or in the same module, of course).
4 Likes
tera
11
Is it considered ok if the @_noLocks code compiles fine in -O but fails to compile in -Onone ?
I guess we could do the following if this is expected behaviour:
#if !DEBUG
@_noLocks
#endif
func add() { ... }
dnadoba
(David Nadoba)
12
@_noLock wasn't used in the linked code sample.
Re -Onone: I personally haven't used @_noLocks but maybe someone else can comment on this.
Joe_Groff
(Joe Groff)
13
No. @_noLocks should apply the compiler optimizations necessary to achieve the "no locks" constraint regardless of optimization mode.
2 Likes
tera
14
Great news.
Please clarify what would happen to the code shipped (to the AppStore, etc) with -Onone (strange, but it happens). Will the individual function marked with @_noLocks be effectively compiled with -O while the rest of the code compiled with -Onone ? Or will it be compiled with "-Onone" (potentially taking locks)?
Joe_Groff
(Joe Groff)
15
I believe that's what happens now (Erik might be able to confirm), but it wouldn't necessarily always be the case. We would probably ultimately want to run only lighter guaranteed transformations that eliminate potential sources of runtime locks without sacrificing debuggability of the code.
1 Like
Alejandro
(Alejandro Alonso)
16
The issue is that the deinit devirtualizer is not yet turned on for move only types, so it must allocate metadata right now to call the destroy value witness function. In this example:
struct Hello: ~Copyable {
let counter = Atomic(0)
}
output.Hello.init() -> output.Hello:
mov qword ptr [rax], 0
ret
destroy value witness for output.Hello:
push rbx
mov rbx, rdi
lea rdi, [rip + (demangling cache variable for type metadata for Synchronization.Atomic<Swift.Int>)]
call __swift_instantiateConcreteTypeFromMangledName
mov rcx, qword ptr [rax - 8]
mov rcx, qword ptr [rcx + 8]
mov rdi, rbx
mov rsi, rax
pop rbx
jmp rcx
You'll see that Hello's initialization doesn't allocate metadata for Atomic, but it's only in Hello's destroy that it allocates it right now (this is resolved, but the optimizer pass is turned off like I mentioned).
2 Likes
Joe_Groff
(Joe Groff)
17
Regardless of whether the devirtualizer is enabled, I don't think we ever serialize the SIL record for the deinit yet to make it available across modules. We would need to do that for @frozen public types as well for the devirtualizer to have the information it needs.
1 Like
Alejandro
(Alejandro Alonso)
18
I think we record this information as part of the MoveOnlyDeinitList.
Alejandro
(Alejandro Alonso)
19
And if you explicitly turn on the optimizer pass, then you get some really nice codegen:
output.foo() -> Swift.Int:
mov qword ptr [rsp - 8], 1
lock inc qword ptr [rsp - 8]
mov rax, qword ptr [rsp - 8]
ret
output.increment(Synchronization.Atomic<Swift.Int>) -> ():
lock inc qword ptr [rdi]
ret
2 Likes
jcavar
(Josip Cavar)
20
Interesting, I thought these are only diagnostics and they don't change generated code?