Performance annotations

Erik_Eckstein · January 5, 2022, 6:55am

Folks have asked for ways to better control the performance of compiled code, so we're currently experimenting with an approach that we think might work well. This isn't ready to be a formal proposal yet, but we'd love to get more input and ideas from interested people so that we can eventually make this into an official part of the language.

In some usage domains, like system-level programming, it is highly undesirable for the compiler to implicitly generate calls to allocating or locking runtime functions.
We’d like to introduce a set of function annotations which help the developer to avoid unwanted runtime calls in certain parts of the code.

Motivation

Swift does not make any guarantees about which code patterns can call which runtime functions, which makes developing performance-aware code very hard. For example, the following language features can implicitly perform memory allocations:

escaping closures allocate storage for captured values
metadata instantiation, e.g. when calling a generic function or when doing a dynamic type cast
using copy-on-write types, like Array
creating existential types which exceed the size of three words
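To make the list above concrete, here is a minimal sketch in ordinary Swift of three of these patterns. The names (makeCounter, appended, Square, Quad, fitsInline) are hypothetical and chosen only for illustration. Whether a given call really allocates depends on the optimizer, and fitsInline checks size only, which is a simplification of the actual existential layout rules.

```swift
// Escaping closure: `count` outlives makeCounter(), so the captured
// variable is typically moved into heap-allocated storage.
func makeCounter() -> () -> Int {
    var count = 0
    return {
        count += 1
        return count
    }
}

// Copy-on-write: the copy shares storage until it is mutated;
// append may then allocate a new buffer.
func appended(_ values: [Int], _ x: Int) -> [Int] {
    var copy = values
    copy.append(x)
    return copy
}

protocol Shape { func area() -> Int }

// One word of payload: small enough for the three-word inline buffer.
struct Square: Shape {
    var side: Int
    func area() -> Int { side * side }
}

// Four words of payload: storing it as a `Shape` existential
// would require a heap-allocated box.
struct Quad: Shape {
    var a, b, c, d: Int
    func area() -> Int { (a + b) * (c + d) / 4 }
}

// Size-only check against three machine words (illustrative assumption).
func fitsInline<T>(_ type: T.Type) -> Bool {
    MemoryLayout<T>.size <= 3 * MemoryLayout<Int>.size
}
```

None of these sites contains an explicit allocation call, which is the point of the motivation: the cost is invisible in the source.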

The following language features can implicitly perform a lock:

static and global variables, because they are lazily initialized
protocol conformance lookup
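As a rough illustration of the second item, a dynamic cast to a protocol type is one place where a conformance has to be looked up at run time. The protocol and type names below are hypothetical; the claim that the lookup may take a lock comes from the pitch itself, not from anything this sketch can observe.

```swift
protocol Describable {
    var summary: String { get }
}

struct Sensor: Describable {
    var id: Int
    var summary: String { "sensor \(id)" }
}

// `value` has static type Any, so the conformance to Describable cannot
// be resolved at compile time; `as?` asks the runtime to look it up.
func describe(_ value: Any) -> String? {
    (value as? Describable)?.summary
}
```

A call such as describe(Sensor(id: 7)) looks entirely innocent at the call site, which is why such a cast would likely be rejected inside a @noLocks function.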

Without deep compiler knowledge it’s difficult to write Swift code which avoids memory allocation or locks.

Proposed Solution

We propose new annotations that developers can use to mark functions that should not have any implicit compiler-inserted allocation or lock operations. If an annotated function contains some code which requires unwanted runtime operations, the compilation will fail with an error explaining the reason.

Detailed Design

Two function annotations can be used to enforce performance characteristics:

@noAllocation
func foo() { ... }

@noLocks
func foo() { ... }

The compiler analyses the runtime impact of an annotated function, i.e. checks if a certain code pattern requires allocations or locks. The analysis is conservative. That means, if the compiler is not sure about a certain code pattern, it assumes the worst case, i.e. that it allocates or locks. For example, it will consider any type metadata runtime function as potentially allocating (which is not true in general).

Functions which are annotated with @noAllocation are assumed to not allocate any heap memory. If this cannot be proved by the analysis, the compiler issues an error which describes the reason why a function does or could allocate.
Functions which are annotated with @noLocks are assumed to not take any locks. Again, if this cannot be proved by the analysis, the compiler issues an error which describes the reason why a function does or could lock.

The attribute @noLocks implies @noAllocation because allocating heap memory also performs a lock.

Global and static variables

In Swift all global and static variables are lazily initialized, which involves a lock.
But the compiler can rewrite globals of trivial types (e.g. Int) to be initialized in the data section. Therefore using such global variables doesn’t need a lock, and using them in @noLocks functions is permitted.
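The distinction can be sketched as follows, assuming hypothetical names (Limits, Tables, square(at:)). Which initializers the compiler actually manages to place in the data section is an optimization detail, so treat the comments as the pitch's intent rather than a guarantee.

```swift
enum Limits {
    // Trivial type with a constant initializer: per the pitch, this can be
    // emitted directly into the data section, so reading it needs no lock.
    static let lastIndex: Int = 7
}

enum Tables {
    // Non-trivial type: the initializer runs lazily on first access,
    // guarded by a once-style check that involves a lock.
    static let squares: [Int] = (0..<8).map { $0 * $0 }
}

// Reads both kinds of global; only the Tables.squares access would be
// a concern for a hypothetical @noLocks annotation.
func square(at index: Int) -> Int {
    let i = min(max(index, 0), Limits.lastIndex)
    return Tables.squares[i]
}
```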

Generic Functions

It is not possible to put performance annotations on generic functions. In most cases the performance characteristics depend on the actual types for which a generic function is called. This cannot be expressed with simple annotations like @noLocks or @noAllocation.
But generic functions are a fundamental and important language feature. Therefore it is possible that performance annotated code calls generic functions. The compiler eagerly specializes generic functions which are called from performance annotated code. This avoids creating metadata for the generic parameters.
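A small sketch of what this means in practice, using hypothetical functions clamp and clampSample. The annotation appears only in a comment because it is not part of any released compiler; the code itself is ordinary Swift.

```swift
// Generic helper: not annotatable under the pitch, because its cost
// depends on the concrete type substituted for T.
func clamp<T: Comparable>(_ value: T, low: T, high: T) -> T {
    min(max(value, low), high)
}

// Under the pitch this caller would be marked @noAllocation. The compiler
// would then eagerly specialize clamp for T == Int, so no metadata for T
// has to be created at run time.
func clampSample(_ sample: Int) -> Int {
    clamp(sample, low: -128, high: 127)
}
```

The annotated caller always uses concrete types, which is what makes the eager specialization possible.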

Escape hatch

To silence the performance diagnostics in a certain code area, the developer can use a compiler-known unsafePerformance function. This is useful if either the developer knows that the callee is not allocating/locking or if it’s okay that the called function allocates or locks.

@noAllocation
func foo() -> Int {
    return unsafePerformance { doesAllocate() } // no diagnostic
}

The unsafePerformance function takes a closure. For all code within the closure, no performance diagnostics are issued.
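The pitch does not spell out a signature for unsafePerformance. The following is a stand-in written as an ordinary function, assuming a generic, rethrowing shape; the real function would be compiler-known and would suppress diagnostics, which this sketch obviously cannot do.

```swift
// Hypothetical stand-in with an assumed signature: it simply runs the
// closure and returns its result.
func unsafePerformance<T>(_ body: () throws -> T) rethrows -> T {
    try body()
}

// Allocates a temporary array on the heap.
func doesAllocate() -> Int {
    let scratch = Array(repeating: 1, count: 16)
    return scratch.reduce(0, +)
}

// Under the pitch this would be marked @noAllocation; the call to
// doesAllocate() is accepted only because it sits inside the closure.
func foo() -> Int {
    return unsafePerformance { doesAllocate() }
}
```

Scoping the opt-out to a closure keeps the unchecked region small and visible at the call site.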

Transitivity

In case of a function being called from a performance annotated caller, the compiler will also issue a diagnostic if the called function does or could allocate or lock.
For calls to functions in the same compilation unit, the compiler can derive this information either by analyzing the called function (if the function body is available) or if the called function is annotated itself. For example:

func add(x: Int, y: Int) -> Int {
    return x + y
}

@noAllocation
func somethingMoreComplex() { ... }

@noAllocation
func timeCriticalFunction() {
    let a = add(x: 1, y: 2)   // no diagnostic
    somethingMoreComplex()    // no diagnostic
}

For calls where the callee is not known statically, the compiler can only assume that it’s allocation or lock free if the declaration is annotated. For example:

protocol methods
class methods
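Both cases can be sketched like this, with hypothetical types Filter, Gain, Stage and Offset. The annotations are shown as comments only, since they are pitch syntax and not accepted by released compilers.

```swift
protocol Filter {
    // Under the pitch, the requirement itself would carry @noAllocation so
    // that callers going through the protocol can rely on it.
    func process(_ sample: Int) -> Int
}

struct Gain: Filter {
    var factor: Int
    func process(_ sample: Int) -> Int { sample * factor }
}

class Stage {
    // Overridable: the body that actually runs is chosen at run time.
    func run(_ sample: Int) -> Int { sample }
}

final class Offset: Stage {
    var delta: Int
    init(delta: Int) { self.delta = delta }
    override func run(_ sample: Int) -> Int { sample + delta }
}

// Both calls are dynamically dispatched, so the compiler cannot inspect
// the callee body and would have to trust the declaration's annotation.
func render(_ sample: Int, through filter: Filter, then stage: Stage) -> Int {
    stage.run(filter.process(sample))
}
```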

Error Handling

It is assumed that errors are only thrown in an exceptional case where performance is not relevant anymore.
Therefore, code which is within a throw or catch path is excluded from performance diagnostics.
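As an illustration, consider a hypothetical checkedIndex function whose only allocating code is on the throw path. Under the rule above, that string construction would not be diagnosed even if the function were annotated.

```swift
struct RangeError: Error {
    let message: String
}

// Under the pitch this function could be marked @noAllocation.
func checkedIndex(_ i: Int, count: Int) throws -> Int {
    guard i >= 0 && i < count else {
        // Throw path: building the message allocates, but the pitch
        // excludes this path from performance diagnostics.
        throw RangeError(message: "index \(i) out of range 0..<\(count)")
    }
    // Normal path: no allocation.
    return i
}
```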

Effect on ABI stability

None.

Effect on API resilience

There is no effect on API resilience for existing, not annotated code.
Removing a performance annotation from a function in a new version of a library would break the performance assumptions of a client. Therefore, removing a performance annotation is an API break.

Future Directions

One of the most important features to add is a safe and allocation-free alternative to Swift’s Array. Currently the only option for an allocation-free array-like data structure is to use unsafe APIs, like UnsafeMutableBufferPointer.
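For context, this is roughly what the unsafe workaround looks like today: a minimal sketch of a fixed-capacity buffer stored inline in a tuple and exposed through UnsafeMutableBufferPointer. FixedBuffer4 and fillAndSum are hypothetical names, and the capacity of four is arbitrary.

```swift
// Four Ints stored inline in the struct; no heap storage is involved.
struct FixedBuffer4 {
    private var storage: (Int, Int, Int, Int) = (0, 0, 0, 0)

    init() {}

    // Exposes the inline storage as a buffer for the duration of `body`.
    // The pointer must not escape the closure.
    mutating func withBuffer<R>(_ body: (UnsafeMutableBufferPointer<Int>) -> R) -> R {
        withUnsafeMutableBytes(of: &storage) { raw in
            body(raw.bindMemory(to: Int.self))
        }
    }
}

// Writes 1...4 into the buffer and returns the sum of its elements.
func fillAndSum() -> Int {
    var buffer = FixedBuffer4()
    return buffer.withBuffer { ints -> Int in
        for i in ints.indices { ints[i] = i + 1 }
        return ints.reduce(0, +)
    }
}
```

The code avoids the heap but gives up bounds safety in release builds and pushes lifetime rules onto the caller, which is the gap a safe alternative would close.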

Torust · January 5, 2022, 8:01am

This looks great as a general feature to add, and would be something I'd find very useful.

In terms of the annotations, however, I wonder if it'd be a more natural fit to attach these as effects to functions in the same way as async and throws. For example:

func someFunction() nonallocating throws
func someOtherFunction() lockfree throws

We'd also likely want to consider the equivalent of reasync or rethrows – functions that are only lock-free or non-allocating if the closure passed in doesn't allocate or lock.

As a future direction and on a different note, I wonder if we might want an annotation to restrict allocations to particular allocators (in some hypothetical future where Swift natively supports custom allocators). For example, in a game engine scenario, you might want all allocations for a frame to come from a linear allocator whose lifetime is bounded to that frame. The proposal as written already helps to enable that, since you could use a combination of unsafePerformance and @noAllocation to ensure that all allocations are manually done, but I could also see an allocators(none) vs. allocators(someParameter)-type annotation being useful. That's likely over-engineering at this point, but I think it's worth considering as a hypothetical.

tera · January 5, 2022, 12:29pm

Will @noLocks mean realtime safe? You may want to mention whether realtime considerations are (or are not) part of the pitch in the pitch description. Note, realtime is not about performance per se, it is about worst case execution time guarantees.

We don't do this for "throws" or for "async" annotations, so -1 on this particular part.

+1 overall, especially if this pitch is applicable to realtime code (e.g. code executed from the context of realtime audio threads, etc).

michelf · January 5, 2022, 1:16pm

Perhaps a naive question, but is there a reason this is two attributes? Is there a situation where you'd want to allow locks but not allocations?

masters3d · January 5, 2022, 2:51pm

Very cool. Could a nonLocking func be used to produce compile time const declaration?

noAllocation -> noLocking -> compileTimeConst?

Including throwing in these performance annotations makes sense only if the errors are not caught and swallowed by a catch within the performance annotated chain of functions. Performance code is an advanced feature imo so I am thinking folks can use Result enums for error propagation within performant code.

Not sure if the escape hatch should say unsafe in the name. Perhaps unknownPerformance or uncheckedPerformance might convey the message a little better.

itaiferber · January 5, 2022, 3:25pm

I think these annotations can be really interesting, but perhaps I'm missing something about the intended use-case here, if you could elaborate: is the intent here that these annotations describe to API consumers the type of performance they can expect from a given function (w.r.t. locking + allocation), or is it the intent that these annotations actually change codegen for these functions? (Or both?) From the intro + Motivation sections, it sounds to me like the latter, but the Detailed Design and API Resilience sections make it sound like the former ("Functions... are assumed to not...", "break the performance assumptions of a client").

If these annotations change codegen, then I think the biggest question in my mind is: when would I not want to apply these annotations, and why is this behavior not the default? i.e., when would I actually want Swift to introduce locks and allocations into my code when it can avoid it? (And what might stop me from cargo-culting application of @noLocks all over my codebase because I think it might improve performance?)

If these annotations don't change codegen but are primarily intended to document that "this function can do its job without locking/creating allocations", I think this could be pretty great for API consumers. (Though from my read of the pitch, perhaps it should be more clearly called out that these annotations don't actually change the performance characteristics of the code they're applied to; and maybe I'm the only one!)

Either way, exciting!

Philippe_Hausler · January 5, 2022, 3:44pm

So there is an interesting effect here. The postulation of no allocations or no locks presumes that either a) the functions being called are annotated or b) they are somehow transparent to the compiler.

The moment something crosses the C/ObjC boundary then it means that all bets are off per the propagation of that. If something is marked as @_cdecl or gods forbid... @_silgen_name. Then it may have the annotation but then report incorrectly.

The even more common case is that someone has a function that is trivial and can be fully slurped up with inline, but not marked as either - it is perfectly reasonable as an API author to change the implementation of something completely inlined. Normally changing that without changing the signature is legitimate and does not break ABI or API. Now with these annotations we run the risk of potentially breaking API.

Selfishly, I am also concerned about the amount of API surface throughout the SDKs that would now need to be marked because it provides locking. Things like NSLock are easily identified but other things that use dispatch_once may not be so obvious (which strictly speaking does an atomic operation). So the question is: where do we draw the line of what is locking and what is not, and what is allocation and what is not? After all I am not sure there is a decent malloc implementation that does not somehow use locking...

Philippe_Hausler · January 5, 2022, 4:48pm

Another common hit for perf is network/disk access. Have we considered the future expansion of annotations that may highlight other types of things like that?

Chris_Lattner3 · January 5, 2022, 5:18pm

w.r.t. the name unsafePerformance, I'd recommend a more specific name that conveys what is being allowed. unsafe has a specific connotation around memory safety, and we don't want to dilute/confuse that.

-Chris

Philippe_Hausler · January 5, 2022, 5:20pm

I tend to agree; would uncheckedPerformance be better?

Saklad5 · January 5, 2022, 5:41pm

I always interpreted unsafe as “preconditions aren’t checked in -O builds”.

hisekaldma · January 5, 2022, 5:41pm

I wonder if it would make sense to use the word "known" somewhere in the names to call this out. Something like @knownNotAllocating and @knownNotLocking?

100% this.

Philippe_Hausler · January 5, 2022, 6:29pm

It also connotes concurrency safety - withUnsafeContinuation. I think though it still can dilute the meaning because it seems like all the "unsafe" things are potential crashes or deadlocks if mishandled. UnsafePointer can dereference and wander off into junk memory. UnsafeContinuation can attempt to resume a dead task, or stall out. Whereas this particular feature is not about correctness of execution but about whether something behaves as expected performance-wise.

To me this feels like the same concept as @unchecked Sendable hence my naming suggestion of withUncheckedPerformance<T>(_ apply: () throws -> T) rethrows -> T. The issue there is that it is all performance characteristics that are unchecked. I would wonder if there is a need for a withUncheckedMemoryPerformance as well as a withUncheckedContentionPerformance?

The open questions for me are:
Do these only apply to runtime behaviors? Or do they apply to all APIs that can do this?
If it applies to all APIs; do pooling of storage types count as allocation?
Do atomics count as locking?
Do semaphores count as locking?
Do once-like flags count as locking? (the post seems to imply this is true)
Does thread local access count as locking?

Some of the answers are obvious to me after reading the pitch but it might be nice to clearly define what concrete calls are categorized as and which ones are not.

Side question:
Could either of these annotations ever be reasonably applied to an async function? I am thinking the answer is likely no?

I find locking/atomicity falls into a few categories of performance - it is considerably different to look at a highly contended lock in a perf trace than to look at a single shot atomic like a dispatch_once. Realtime audio may be able to handle atomics but may not be able to handle highly contended locks. Being able to discriminate between those granularities is likely important.

Minor suggestion:

That sentence seems backwards to me, noAllocation should imply noLocks since allocating heap memory may perform a lock. (granted some heap allocation systems have thread local buckets to avoid locking)

Lantua · January 5, 2022, 6:34pm

I brought this up during the review:

hassila · January 5, 2022, 6:48pm

Definitely interested in this direction - would be fantastic.

I would consider having the same granularity for allowing certain kinds of performance code as can be used for checks - so it’s symmetric. I might want to allow a specific function to allocate, for example, but if later additional locking is added I want my compilation to fail. A blanket unsafe will not catch that - and if considering e.g. a future @noSyscalls (which would be awesome also as an annotation) or I/O annotations, symmetry becomes even more important.

tbkka · January 5, 2022, 9:36pm

Many data containers (such as Array and Dictionary in the standard library) can potentially allocate memory anytime you modify them, either because a modification might trigger copy-on-write or because underlying storage needs to be extended. Similarly, locks are used whenever you access a lazily-initialized global or static variable as well as by the runtime metadata machinery which is invoked whenever you create an existential or perform a runtime cast. So any use of one of these features will fail to compile in an annotated function.

In the current implementation, when you apply one of these annotations, the compiler will verify that the function meets these constraints and will fail with an error if it does not. This verification is conservative: If the compiler is unsure, it will reject the code by default. If you strew these about your existing code base, you're likely to have to rewrite a lot of code. ;-)

We do plan to eventually modify codegen when these annotations are present. However, I don't see any way we can possibly fully eliminate locks and allocations from all code, so annotated functions will always be restricted to a subset of the full Swift language.

Obviously, we're not expecting this to be a widely-used feature right away. That's why we're opening up this discussion, to better understand how people might use a feature of this sort, to figure out whether "locks" and "allocations" are the right concepts, and to get help from folks in the community who have been thinking about these kinds of problems for a long time.

tbkka · January 5, 2022, 9:53pm

Certainly, the application areas we've been considering are subject to realtime concerns and we expect this will be a building block for certain types of realtime code. We don't think this is sufficient by itself, however, and we'd love to get more input from people who are actually building such systems to better understand where this falls short and what else we might consider to better support these kinds of applications.

As an extreme example, I believe there are "realtime safe" languages today that prohibit all loops (unless they have constant bounds) and all use of dynamic memory (our proposed annotation would only prohibit operations that resulted in an implicit allocation by the compiler or runtime). That's an awfully restrictive environment, though, so I'm not sure whether it's something we should be exploring or not.

itaiferber · January 5, 2022, 9:53pm

Okay, so it does sound like the former of the two understandings: these annotations are compiler-verified* promises that code does not lock, or allocate memory, whether implicitly or explicitly. Codegen will not change (for now), you will simply be prevented from applying annotations to functions which do require locking or allocation.

Sounds reasonable!

This strays into the weeds and isn't relevant for discussion yet, but I'd be interested to know what changes would be made in the presence of these annotations. (I'm assuming a relaxation of some semantics to prevent codegen from inserting locks/allocations, if possible — though again: if this were possible, I'd be interested to know why this wouldn't be done globally. But this is for a different thread.)

Yep, agreed! I'm mostly asking these questions to understand whether we risk introduction of annotations that imply to folks "slap this on if you want your code to go faster", which is neither the intent, nor the result. This all sounds reasonable to me, though.

Karl · January 5, 2022, 10:07pm

+1! I really like this idea!

I do have some comments though:

I'm also a little unsure about this point - but I would phrase it a bit differently: are these only for performance assertions made by the caller, or is the intent to expand these to things which the function itself wishes to guarantee about its internal implementation?

For example, an attribute to ensure that bounds-checks can be statically eliminated. Most callers probably won't care about that to the same extent that they care about locks/allocations (although maybe they do care; I've seen cases where eliminating bounds-checking allows loops to be merged).

On the bike-shedding side, I'm not sure how well this will scale as individual attributes. I can imagine that we might like to include more of these in future, so I think a combined attribute would work better:

@performanceAssertion(noLocks, noFatalErrors, /*... etc */) 
func myFunc(...)

Also I think the names could be a bit clearer. "no allocations" probably doesn't include stack allocations (withUnsafeTemporaryAllocation), and there's no way to prove things like arena allocations won't happen. Also, I've heard that task-local allocations can behave a bit differently (or may in future be able to).

But yeah, overall a really nice idea, and I can't wait to use it.

Jon_Shier · January 5, 2022, 10:16pm

I haven't thought about this too much, but this:

Karl:

On the bike-shedding side, I'm not sure how well this will scale as individual attributes. I can imagine that we might like to include more of these in future, so I think a combined attribute would work better:
@performanceAssertion(noLocks, noFatalErrors, /*... etc */) 
func myFunc(...)
Also I think the names could be a bit clearer. "no allocations" probably doesn't include stack allocations (withUnsafeTemporaryAllocation), and there's no way to prove things like arena allocations won't happen. Also, I've heard that task-local allocations can behave a bit differently (or may in future be able to).

is right on. Can we please, please not suddenly add a few dozen attributes or keywords in order to fix Swift's performance problems? I like something less verbose, like @performance(...), but the idea is the same: allow the language to grow new attributes without having them actually be separate attributes.

As an aside, can we please get an overall performance manifesto that outlines the current issues, why these are suddenly a priority, and the areas that will be focused on, rather than just separate proposals for each possible solution? The community is missing a huge amount of context here.