Clarification about semantics of `indirect enum`

jklausa · September 4, 2024, 6:39am

Heyo!

I recently proposed marking a couple of enums at $workProject as `indirect` to reduce how much stack memory they were consuming, but found myself struggling to clearly explain what that actually does without resorting to pointing at compiler internals like 1.

The only official mentions of `indirect enum` that I could find in the docs/book are at Documentation, which only mentions "Recursive Enumerations".

That's fair, since it's probably the most common use-case. However, there are other, valid ones, which are semi-frequently discussed on this forum (Memory used by enums - #3 by Joe_Groff, Who benefits from the `indirect` keyword? - #18 by Lantua), but just going off of the "official" docs it's not clear if that's the intended use.

I think it would be worthwhile to more explicitly document what `indirect` actually does under the hood, and maybe mention that it's useful in cases where you want/need to optimize for memory usage.
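For illustration, here is a minimal sketch of the memory effect in question. The enum names (`InlineEvent`, `BoxedEvent`) are made up for this example, and the exact sizes are an implementation detail of the current compiler rather than something the language promises, so the snippet only compares the two values instead of asserting specific numbers.

```swift
// Hypothetical enums for illustration only; they are not from any real project.

// The payload is stored inline, so every value of this enum must be
// large enough to hold four Int64s, even when it is `.idle`.
enum InlineEvent {
    case idle
    case moved(Int64, Int64, Int64, Int64)
}

// Same cases, but the payload of `.moved` is stored out of line.
// Note that nothing here is recursive.
enum BoxedEvent {
    case idle
    indirect case moved(Int64, Int64, Int64, Int64)
}

let inlineSize = MemoryLayout<InlineEvent>.size
let boxedSize = MemoryLayout<BoxedEvent>.size

// Exact values depend on the platform and compiler version.
print("inline: \(inlineSize) bytes, indirect: \(boxedSize) bytes")
```

With current toolchains the indirect variant should come out smaller, since the enum value itself only needs room for a reference to the payload. The trade-off is an allocation each time a `.moved` value is constructed.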

I can try taking a stab at writing something myself; but wanted to check if others think this is something worth clarifying.

hisekaldma · September 4, 2024, 8:18am

I think the docs here are just plain wrong. You do not, as the docs say, ”indicate that an enumeration case is recursive by writing indirect before it”. You indicate that it has indirect storage. This absolutely needs to be rewritten!

Alex_Martini · September 4, 2024, 5:39pm

The indirect keyword is also discussed in the reference:

https://docs.swift.org/swift-book/documentation/the-swift-programming-language/declarations#Enumerations-with-Indirection

CC @Joe_Groff who also reviewed the last revision to this part of the docs in 2017.

EDIT: Here, as elsewhere in the book, the guide tries to tell you the most common reason to use a feature in a way that's approachable to someone learning the language, and the reference tries to tell you the details about a feature.

The Swift book doesn't currently have discussion on techniques for optimizing memory usage, but that's something that we could pitch adding. Content about performance can be a little tricky, since some performance tuning involves depending on implementation details.

jklausa · September 5, 2024, 9:17am

I'm not tied to this being in the Book necessarily!

I'm fine if it lives in OptimizationTips.rst, or somewhere else — but I think it'd be good if there was a place that explained at least a little bit about the internals.

I can try writing something over the weekend that would go in the OptimizationTips, if that'd be useful?

Slava_Pestov · September 5, 2024, 1:15pm

Is there much to it other than “it wraps the case payload in a heap-allocated reference counted box”? The reference counting part is important, since it allows sharing.
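As a rough mental model only, the description above can be approximated by hand with a `final class` wrapper. This is a sketch under that assumption: `Box`, `Header`, `ManualMessage`, and `payloadBox` are hypothetical names, and the compiler's real box is not a user-visible class and does not expose its identity the way this one does.

```swift
// A hand-written stand-in for the box. Class instances are reference
// counted, so copies of the enum share one allocation.
final class Box<Payload> {
    let value: Payload  // immutable, like the payload of an indirect case
    init(_ value: Payload) { self.value = value }
}

// Hypothetical payload type for illustration.
struct Header {
    var id: Int64
    var length: Int64
}

// Roughly what `indirect case data(Header)` behaves like.
enum ManualMessage {
    case ping
    case data(Box<Header>)
}

// Helper that extracts the box, if any, so its identity can be compared.
func payloadBox(_ message: ManualMessage) -> Box<Header>? {
    if case let .data(box) = message { return box }
    return nil
}

let original = ManualMessage.data(Box(Header(id: 1, length: 64)))
let copy = original  // copies the reference, not the Header

// Both enum values refer to the same box object.
let sharesStorage = payloadBox(original) === payloadBox(copy)
print("copy shares storage with original:", sharesStorage)
```

Copying the enum value retains the existing box rather than duplicating the payload, which is the sharing that the reference count makes possible.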

The only other details I can think of are that indirect on the enum is equivalent to indirect on each case, and the box also stores generic metadata in a certain way so that the destructor can be dispatched correctly.

jklausa · September 5, 2024, 1:49pm

I think that would mostly cover it! Having a place that "officially" spells the heap-allocation bit is the important part, IMHO.

That bit at least is covered here.

To enable indirection for all the cases of an enumeration that have an associated value, mark the entire enumeration with the indirect modifier — this is convenient when the enumeration contains many cases that would each need to be marked with the indirect modifier.
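A small sketch of the two spellings side by side, using hypothetical enums (`PerCase`, `WholeEnum`). Comparing their sizes is only indirect evidence that they are laid out the same way, and the layout itself is an implementation detail.

```swift
// `indirect` written on each case that has an associated value.
enum PerCase {
    case empty
    indirect case quad(Int64, Int64, Int64, Int64)
    indirect case label(String, String)
}

// `indirect` written once on the whole enumeration. The case without
// an associated value has nothing to box.
indirect enum WholeEnum {
    case empty
    case quad(Int64, Int64, Int64, Int64)
    case label(String, String)
}

let perCaseSize = MemoryLayout<PerCase>.size
let wholeEnumSize = MemoryLayout<WholeEnum>.size
let sameSize = perCaseSize == wholeEnumSize

print("per-case: \(perCaseSize), whole enum: \(wholeEnumSize), same: \(sameSize)")
```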

xwu · September 5, 2024, 5:47pm

It's one thing to write down what currently happens, but I'm not aware that Swift, the language itself, has ever guaranteed "officially" that anything is allocated anywhere.

I think the closest we've come is withUnsafeTemporaryAllocation, which says that it's going to be allocated "on the heap or on the stack," as opposed to (say) being carved by a woodpecker into the side of a tree in the Amazonian rainforest. But I'm not sure that heap allocation is a part of the semantics of indirect.

hisekaldma · September 5, 2024, 6:07pm

Whether or not the allocation is on the heap or on the stack, the semantics surely must be the allocation isn’t inline, at least? That’s been the behavior for what, ten years now?

Nobody1707 · September 5, 2024, 7:05pm

Is there a reason why an indirect case can't box a ~Copyable payload?

Slava_Pestov · September 5, 2024, 7:18pm

I think the only reason is that no decision has been made if it should literally use the same representation, or something simpler/different.

That is, if the payload is concretely known to be ~Copyable, the allocation doesn't need an object header, reference count or generic metadata at all, because you know exactly when to free the memory and how.

Slava_Pestov · September 5, 2024, 7:22pm

Yeah, it's technically possible that you might have a LinkedList type that is realized with indirect enums, say, and the compiler is smart enough to prove that an entire list somewhere is actually non-escaping within some function, at which point the heap allocations for the indirect cases could then be transformed into dynamic stack allocations with alloca.
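For concreteness, a LinkedList of the kind described might look like the sketch below. The type and its helpers (`prepending`, `elements`) are hypothetical, and nothing in the code controls or reveals where the nodes are actually allocated.

```swift
// A singly linked list built from an indirect enum. Each `.node`
// payload lives in its own box.
indirect enum LinkedList<Element> {
    case empty
    case node(Element, LinkedList<Element>)
}

extension LinkedList {
    // Returns a new list with `element` in front. The existing list
    // becomes the tail and is not copied node by node.
    func prepending(_ element: Element) -> LinkedList<Element> {
        return .node(element, self)
    }

    // Walks the list front to back and collects the values.
    var elements: [Element] {
        var result: [Element] = []
        var current = self
        while case let .node(value, next) = current {
            result.append(value)
            current = next
        }
        return result
    }
}

let tail = LinkedList<Int>.empty.prepending(3).prepending(2)

// Two lists built on the same tail value.
let first = tail.prepending(1)
let second = tail.prepending(0)

print(first.elements, second.elements)
```

If a list like `first` never escapes the function that builds it, that is the situation in which the boxes could in principle be placed on the stack instead.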

Something that's easier to imagine is you have a global constant whose value is an indirect enum. I believe today we generate code to perform the heap allocations, but we could also stick the indirect cases into static space and set them up so that they appear to have immortal reference counts.

The other thing is that because indirect cases are immutable, we probably shouldn't ever make any guarantees about the reference identity for those boxes. But it's not even possible to observe the identity of the box without resorting to unsafe tricks.