[Pitch 3] Section Placement Control

kubamracek · January 24, 2025, 5:39pm

Hello!

I'd like to pitch a new version of the "section control" or "low-level linkage attributes" proposal that was discussed before (pitch 1, pitch 2).

Note that this pitch builds on another feature that is currently being pitched and does not officially exist in the language yet, Swift Compile-Time Values (pitch, discussion thread). It's worth reading that pitch before or together with this proposal.

Any feedback is welcome!

Current version of the pitch is on GitHub:

swift-evolution/proposals/0nnn-section-control.md at section-placement-control · kubamracek/swift-evolution · GitHub

# Section Placement Control

* Proposal: [SE-0NNN](0nnn-section-control.md)
* Authors: [Kuba Mracek](https://github.com/kubamracek)
* Status: Pitch #3
* Discussion threads:
  * Pitch #1: https://forums.swift.org/t/pitch-low-level-linkage-control-attributes-used-and-section/65877
  * Pitch #2: https://forums.swift.org/t/pitch-2-low-level-linkage-control/69752
  * Pitch #3: TBD

## Introduction

This proposal builds on top of [Swift Compile-Time Values](https://TBD) and adds two new attributes to the Swift language: `@section` and `@used`. These allow users to directly control which section of the resulting binary global variables are emitted into, and give users the ability to disable DCE (dead code elimination) on those variables. The goal is to enable systems and embedded programming use cases like runtime discovery of test metadata from multiple modules, and also to serve as a low-level building block for higher-level features (e.g. linker sets, plugins).

The intention is that these attributes are to be used rarely and only by specific use cases; high-level application code should not need to use them directly and instead should rely on libraries, macros and other abstractions over the low-level attributes.

The scope of this proposal is limited to compile-time behavior and compile-time control. We expect that full user-facing solutions for features like linker sets, test discovery or plugins will also require runtime implementations to discover and iterate the contents of custom sections, possibly from multiple modules. This proposal makes sure to provide the right building blocks and artifacts in binaries for the runtime components, but doesn’t prescribe the shape of those. However, it is providing a significant step towards generalized and safe high-level mechanisms for those use cases. See the discussion of that in the sections [Runtime discovery of data in custom sections](#runtime-discovery-of-data-in-custom-sections) and [Linker sets, plugins as high-level APIs](#linker-sets-plugins-as-high-level-apis) in Future Directions.

## Motivation


grynspan · January 24, 2025, 6:18pm

Swift Testing would very much like this feature so we can do proper test discovery (instead of looking in the type metadata section) and eventually support Embedded Swift!

kubamracek · January 24, 2025, 8:05pm

Let me also highlight one part of the proposal with a few more examples -- the distinction between @const and @constInitialized:

// All the following examples are globals:

// (1) @const on a declaration makes the variable a "compile-time value", available for compile-time computations
@const let pointerBitWidth: Int = 64

// (2) @const declarations can be more complex than just trivial types
@const let complex: (Int, String, InlineArray<4, Int>) = (42, "hello", [1, 2, 3, 4])

// (3) @const might in the future allow even types that cannot be represented without runtime initialization,
// but that's okay -- @const declarations are only "compile-time values" during compile-time
// available as a compile-time value, but at runtime will need
// runtime initialization due to randomized hash seeding
@const let dict: [String: String] = ["hello": "world"]

// (4) we can require a variable to be represented without runtime initialization ("const initialized" or "statically
// initialized")
@constInitialized let s: Int = 64

// (5) @constInitialized will accept most types that @const does, but in the future, we might allow e.g.
// dictionaries to be @const, but not @constInitialized
@constInitialized let dict: [String: String] = ["hello": "world"] // ❌

This also explains why @const itself does not (and should not) guarantee static initialization -- it would preclude us from expanding the scope of @const to more use cases in the future. See also https://github.com/artemcm/swift-evolution/blob/const-values/proposals/0nnn-const-values.md#abi-and-memory-placement-and-runtime-initialization.

allevato · January 24, 2025, 8:12pm

Given these two examples:

@const let someNumber1: Int = 64
@constInitialized let someNumber2: Int = 64

Would it be accurate to say that someNumber1 could be completely optimized out by the compiler (since it's a compile-time value, it could be passed around directly so no storage would ever need to be allocated for it), while someNumber2 is the opposite—static storage must be allocated for it because that's what's being explicitly requested?

kubamracek · January 24, 2025, 8:18pm

I would not expect that @constInitialized globals would be mandatorily materialized. For example if such a global is used in an unused subsystem of your program, it should still be eligible for dead-stripping. But I would expect that if that global is materialized (subject to optimizations), then it will be guaranteed to be statically initialized.

allevato · January 24, 2025, 8:20pm

Right, that's fair. So @constInitialized @used would be the correct approach to forcing it to be materialized then? (Or @constInitialized @section(...).)

kubamracek · January 24, 2025, 8:25pm

In the draft of the proposal, @section implies @constInitialized:

// These are equivalent:
@section("...") let x = 42
@section("...") @constInitialized let x = 42

because it doesn't make sense to put anything into a section without a guarantee of static initialization.

@used would exactly be a way to force a global to be materialized, and @constInitialized is a separate toggle to request static initialization. I think that all 4 combinations are sensible:

let x = 42                         // normal global
@used let x = 42                   // cannot be DCE'd
@constInitialized let x = 42       // if materialized, then will be statically initialized
@constInitialized @used let x = 42 // must be materialized, must be statically initialized

michelf · January 24, 2025, 10:14pm

If @section implies @constInitialized, and @constInitialized is basically @section with a default chosen for you, wouldn't it make more sense for both to be the same attribute? Maybe like this:

// using default section
@constInitialized
let a = 1

// using custom section
@constInitialized(section: "__DATA,mysection")
let b = 1

xwu · January 26, 2025, 10:28pm

[Reposted from the other thread, which was an oopsie—]

On top of specifying a custom section name with the @section attribute, marking a variable as @used is needed [...]. When using section placement [...], such values are typically going to have no usage at compile time [...], therefore we need the @used attribute.

Does the pitched design present the right default, then, by separating a very common use case into two separable attributes?

If, as I understand it, "typically" one will want to mark anything with both @section and @used, shouldn't @section imply @used by default, with dead-strippability as an opt-in via another parameter (i.e., @section("...", strippable: true))?

Attribute @constInitialized

Static initialization of a global can be useful on its own, without placing data into a custom section

...such as for ____? While I don't argue against the point, given that @section implies static initializability, a dedicated spelling for just this doesn't seem to be necessary for the overarching feature (section placement control).

Since the @const pitch isn't yet to a point where the distinction between @const and @constInitialized can be exercised, and since there aren't any motivating use cases named in this pitch either, it seems to me better to defer this explicit feature to be holistically designed alongside that potential future direction for @const rather than right now.

kubamracek · January 27, 2025, 5:03pm

There are definitely use cases where @section and @used will be used together, but there are also pretty clear use cases for only doing section placement without @used -- one is described in the proposal itself (under @section implying @used), another one could be co-locating a group of global variables for better cache locality or other performance reasons.

Similarly, it's useful to only disable strippability on a global, without placing it in a custom section. For example, a "watermark" or some other meta-information that we just want to preserve in the binary. Furthermore, I expect we might want to expand @used to apply to more declarations than just globals in the future (e.g. a whole type), which would be precluded if we build strippability control as part of a section placement attribute.

Ultimately, both section placement and strippability control are tools that we need to have in the toolbox and they are orthogonal concepts in the view of a systems programmer who cares about and knows how the code/symbols are going to behave during linking. Thus it seems best to present them as separate toggles in the language.

kubamracek · January 27, 2025, 5:08pm

michelf:

If @section implies @constInitialized, and @constInitialized is basically @section with a default chosen for you, wouldn't it make more sense for both to be the same attribute? Maybe like this:
// using default section
@constInitialized
let a = 1

// using custom section
@constInitialized(section: "__DATA,mysection")
let b = 1

I expect we will want to expand @section in the future to apply to code (functions, closures, and more), where @constInitialized is not relevant and the syntax of @constInitialized(section: ...) would be nonsensical on a function declaration. But the concept of placing symbols into a section is still the same between data and code, so it should be a design goal to have a unified syntax for that -- that's why @section seems like the best option.

kubamracek · January 27, 2025, 5:57pm

Let me post a quote from @fclout from the other thread, https://forums.swift.org/t/pitch-3-swift-compile-time-values/77434/12:

Not that I have a strong opinion on whether @constInitialized is the best way to get that, but there is definitely value in ensuring that a global variable doesn't need runtime initialization. If you have a large global variable (like a large array) that the compiler must produce a runtime initializer for, the code to initialize that global will be bigger than the global itself, often 1.5x or 2x bigger. If binary size matters for you (for instance, because you have a firmware target), this is a problem, and an attribute that ensures it either has no runtime initializer or diagnoses is one way to solve it.

You're right that a dedicated spelling for @constInitialized is not strictly necessary to build a section placement feature, however, the proposal includes it because it's extremely relevant and related.

It's a good point that the proposal doesn't clearly provide the motivation and use cases, and I can improve the proposal on that front. For embedded/firmware/low-level engineering, a standalone "no runtime initialization" toggle is very valuable for multiple purposes: 1) performance tuning, and specifically avoiding the infamous "performance cliff", 2) binary size tuning (same reasoning, see what @fclout says above), 3) simpler reasoning about initialization ordering (no need to worry about @constInitialized globals), and 4) it's a necessity for having a stable pointer address on such a global.
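To make the "runtime initialization" part concrete, here is a minimal sketch that runs on today's Swift without any of the pitched attributes. It shows a static table whose initializer is ordinary code, so it is built lazily on first access. The names `InitCounter`, `Firmware`, and `lookupTable` are hypothetical and exist only for this illustration. Under the pitch, marking such a declaration `@constInitialized` would presumably be diagnosed, because this initializer cannot be evaluated at compile time.

```swift
// Hypothetical probe type for this sketch; it records how often the
// table initializer actually runs.
final class InitCounter: @unchecked Sendable {
    var runs = 0
}

enum Firmware {
    static let counter = InitCounter()

    // The initializer is an ordinary function call with a side effect, so
    // the table has to be built by code at run time, on first access.
    static let lookupTable: [UInt8] = Firmware.makeTable()

    static func makeTable() -> [UInt8] {
        Firmware.counter.runs += 1
        return (0..<256).map { UInt8(($0 * $0) % 251) }
    }
}

let runsBefore = Firmware.counter.runs       // table not touched yet
let firstByte = Firmware.lookupTable[1]      // first access triggers initialization
let runsAfterFirst = Firmware.counter.runs
_ = Firmware.lookupTable[2]                  // later accesses reuse the table
let runsAfterSecond = Firmware.counter.runs

print(runsBefore, runsAfterFirst, runsAfterSecond) // prints "0 1 1"
```

The initializer code, plus the once-only check on every access path, is the cost that a "no runtime initialization" guarantee would remove.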

allevato · January 27, 2025, 6:16pm

Another concrete example that applies not just to embedded environments, but mobile/server as well:

swift-protobuf currently generates a large amount of code for the message types to support a wide variety of operations: parsing, serializing, stringifying, testing equality, and so forth. We would love to convert all of these to table-driven algorithms where the only thing we have to generate is a table of field numbers mapped to information that tells the runtime how to manipulate that field, and the operations themselves are generic and live in the runtime.

This isn't possible today because the compiler is unpredictable when it comes to optimizing these kinds of static tables into constant data; there's no way to know when you'll get the table or when you'll get imperative code that just allocates and initializes the data on first access, which is another huge performance cliff. And since that static outlining depends on the optimizer, it would never occur for debug builds, making the performance of those much worse than they actually need to be (and worse than the direct code that we generate today).

This is an example where all we would need is @constInitialized; we don't particularly care which section it's generated in, and we wouldn't want it to imply @used either because if a message type is stripped, we'd want its tables to be strippable as well.
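A rough sketch of the table-driven shape described above, written in today's Swift. Everything here (`WireType`, `FieldInfo`, `PersonMessage`, `tags(for:)`, `fieldName(forTag:in:)`) is a hypothetical illustration and not swift-protobuf's actual runtime API. The only protobuf fact relied on is that a field's tag is `(fieldNumber << 3) | wireType`. The table is a plain `static let`, which is exactly the kind of declaration where a static-initialization guarantee would matter.

```swift
// Hypothetical descriptor types; not swift-protobuf's real runtime types.
enum WireType: UInt8 {
    case varint = 0
    case fixed64 = 1
    case lengthDelimited = 2
    case fixed32 = 5
}

struct FieldInfo {
    let number: UInt32
    let wireType: WireType
    let name: String
}

// The only per-message "generated code" is this table.
enum PersonMessage {
    static let fields: [FieldInfo] = [
        FieldInfo(number: 1, wireType: .varint, name: "id"),
        FieldInfo(number: 2, wireType: .lengthDelimited, name: "name"),
        FieldInfo(number: 3, wireType: .fixed32, name: "score"),
    ]
}

// Generic operations live in the runtime and work for any table.
func tags(for table: [FieldInfo]) -> [UInt32] {
    return table.map { ($0.number << 3) | UInt32($0.wireType.rawValue) }
}

func fieldName(forTag tag: UInt32, in table: [FieldInfo]) -> String? {
    return table.first(where: { $0.number == tag >> 3 })?.name
}

print(tags(for: PersonMessage.fields))                     // prints "[8, 18, 29]"
print(fieldName(forTag: 18, in: PersonMessage.fields) ?? "unknown") // prints "name"
```

Today, whether `PersonMessage.fields` ends up as constant data or as initialization code on first access is up to the optimizer, which is the unpredictability the post describes.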

Lancelotbronner · January 28, 2025, 2:16pm

Would an alternate spelling like @const(initialized) do? There seem to be two naming conventions in recent pitches, @namespace(attribute) and @namespaceAttribute. I personally prefer the first one.

Joe_Groff · February 6, 2025, 6:50pm

For the use case of placing metadata to be discovered by the runtime/libraries/out-of-process tools, @grynspan raised an interesting point that those other tools need to be able to know the format of the data to be able to interpret it. However, we still try not to make any guarantees about general struct/enum layout, with the hope that some time in the future we could do automatic layout optimization like most other languages claiming to be modern systems languages do. That could pose a problem for using user-defined structs to construct constant data for external consumption. Primitive numeric, pointer, tuple, and InlineArray types are unlikely to ever change layout, so it might be good advice to only use those types when constructing externally-parsable data. (I would hesitate to suggest we make that a hard limit on what's allowed in a @section-ed variable, though, since there are use cases for it outside of metadata encoding.)
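As an illustration of that advice, here is a small sketch in today's Swift that builds a record from fixed-width integers in a tuple and inspects its in-memory bytes, the way an external reader of a section would see them. The `Record` shape is hypothetical and not a format defined by the pitch. The sizes printed reflect how current compilers lay out such a tuple (elements in order, each at its natural alignment); the sketch observes that behavior rather than guaranteeing it.

```swift
// Hypothetical record shape for this sketch: only fixed-width integers,
// combined in a tuple rather than a user-defined struct.
typealias Record = (kind: UInt32, flags: UInt16, line: UInt16)

// Store every field in little-endian byte order so the byte image does
// not depend on the host's endianness.
let record: Record = (
    kind: UInt32(1).littleEndian,
    flags: UInt16(2).littleEndian,
    line: UInt16(0x0304).littleEndian
)

// Copy out the raw bytes as an out-of-process tool would read them.
let bytes = withUnsafeBytes(of: record) { Array($0) }

print(MemoryLayout<Record>.size,
      MemoryLayout<Record>.stride,
      MemoryLayout<Record>.alignment) // prints "8 8 4"
print(bytes)                          // prints "[1, 0, 0, 0, 2, 0, 4, 3]"
```

A consumer that parses this data still has to agree with the producer on field order, widths, and byte order; the tuple only avoids depending on struct layout decisions.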

allevato · February 6, 2025, 7:01pm

Structs imported from C should be safe to use here as well, right?

grynspan · February 6, 2025, 7:06pm

Yes, to the degree that a C structure's layout is well-defined. Stuff like padding and endianness is, of course, target-specific, but then you have other language features like C bitfields that can be unsafe to persist because they are implementation-specific under the C standard.

So, I guess… short answer "yes", long answer "nothing is sacred."

grynspan · February 6, 2025, 7:08pm

I think a carve-out for @frozen types is also reasonable since we can't change their layout without a Swift ABI break. Right?

allevato · February 6, 2025, 7:09pm

Right, but that situation is at least no worse than defining static data in C using C structs. I was mainly focused on the "Swift doesn't guarantee layout" aspect, and our usual recommendation for folks who want to define a struct that has a compatible layout with C is to define it in C and import it, so that seems like a viable approach for static data here as well.

(It does somewhat become unfortunate for projects like swift-protobuf, which explicitly have no C dependencies and would be nice to keep that way. So we would probably not end up using it.)

grynspan · February 6, 2025, 7:12pm

We're in agreement here. I'm just being cautious because I don't want to accidentally suggest that we've somehow solved all such issues by importing a C type into Swift, or anything along those lines.
