Better existentials (or: i want to stop switching over endless enums)

i hope i’m not the only one who has way too many enums whose sole purpose is to simulate polymorphism:

enum FixedPage
{
    case home(HomePage)
    case blog(BlogPage)
    case profile(ProfilePage)
}
extension FixedPage
{
    var location:URI
    {
        switch self
        {
        case .home(let page):    return page.location
        case .blog(let page):    return page.location
        case .profile(let page): return page.location
        }
    }
}

AI copilots are very good at generating the necessary boilerplate without much effort. still, i can’t escape the feeling that this would be so much more naturally modeled as a protocol with witness requirements. in particular:

  • i do not need exhaustiveness checking. in fact, i do not want exhaustiveness: i want to be able to define new page types without asking the copilot to regenerate all the “switch-for-witness” shims.

  • there are few generics involved in this code, so the usual inlining/specialization concerns don’t apply.

protocol FixedPage
{
    var location:URI { get }
    var title:String { get }
    ...
}

now, the problem with any FixedPage is that it imposes a three-word inline size limit on conforming types, and all of the known page types are structs far too large to fit in a standard existential without heap allocation.
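to put numbers on that, here is a quick check with a hypothetical payload struct of my own invention (String standing in for URI so the snippet is self-contained; 64-bit platform assumed):

```swift
protocol FixedPage
{
    var location:String { get }
}
struct BlogPage:FixedPage
{
    var location:String   // a String is 2 words (16 bytes)
    var title:String      // 16 bytes
    var tags:[String]     // an Array is 1 word (8 bytes)
}

// the existential's inline buffer is 3 words (24 bytes), so a
// 40-byte BlogPage spills into a heap-allocated box
print(MemoryLayout<BlogPage>.size)        // 40
print(MemoryLayout<any FixedPage>.size)   // 40 (24-byte buffer + metadata + witness table)
```

the existential itself stays a fixed 5 words no matter how big the conforming type is; anything over the 24-byte buffer goes out of line.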

so, i wonder if there could be a way to explicitly specify the desired memory layout of the existential, along the lines of:

@frozen(atLeast: HomePage)
@frozen(atLeast: BlogPage)
@frozen(atLeast: ProfilePage)
protocol FixedPage
{
     ...
}

or in the case of conforming types defined in downstream modules, the ability to specify a minimum stride in bytes:

@frozen(minimumStrideBytes: 96)
protocol FixedPage
{
     ...
}

thoughts?


Is heap allocation bad for your project?

It sounds like you should use an existential regardless since they're a more natural fit for your problem domain. I suspect the overhead of copying existentials is probably a bigger overall performance drain than whether the payload is stored inline or not, and maybe you can use borrowing and consuming parameter modifiers now to mitigate that. Copying and destroying large enums have their own costs because of the need to switch over the various payloads and retain/release in the right places, so the enum isn't necessarily faster even if it does manage to avoid heap allocation; an overflowed existential box by comparison is copy-on-write so manipulating the existential box is only a retain or release until it's modified. It would be interesting to let protocols state their maximum conforming type size, though.
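To illustrate the mitigation mentioned above, a sketch using the Swift 5.9+ ownership modifiers (protocol and type names invented):

```swift
protocol Page
{
    var title:String { get }
}
struct HomePage:Page
{
    let title:String = "home"
}

// `borrowing` promises not to copy or consume the argument, so
// passing a boxed existential costs no retain/release here
func render(_ page:borrowing any Page) -> String
{
    "<h1>\(page.title)</h1>"
}

// `consuming` takes ownership, so the caller can forward its
// existing box instead of copying it
func store(_ page:consuming any Page) -> [any Page]
{
    [page]
}

print(render(HomePage()))   // <h1>home</h1>
```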


right, i’m already in the process of refactoring this stuff into existentials, and i have to say, they just make a lot more sense for what i’m trying to model semantically. i suppose i am just a wee bit uncomfortable with them since the mantra around here has been “existentials = bad” for so many years.

it would be really interesting if someone were to do a study into the performance of existentials with a modern toolchain and either dispel this myth or confirm what we suspected all along. :slight_smile:

so, i’m well aware that copying existentials is slow - indeed, this is part of the reason i am “afraid” of existentials. but what exactly makes it so slow anyway? i did a survey of my enum payload structs: all of them contain at least one array/object field, many of them contain several, and my understanding is that those all need to be retained and released individually when the struct is copied and destroyed. are there any reasons why retaining one existential box would be worse than retaining many stored array/object properties?

i don’t know if we want a maximum per se; i would still want graceful degradation in the case where the conforming type is too large to fit in the hinted size. rather, i would want to express something along the lines of “this existential should be able to store a SIMD4<Int> without indirection”.
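for reference, the concrete numbers behind that example (64-bit platform assumed):

```swift
// SIMD4<Int> is 4 × 8 = 32 bytes, which overflows the 24-byte
// (3-word) inline buffer of a standard existential container
print(MemoryLayout<SIMD4<Int>>.size)   // 32
print(MemoryLayout<Int>.size * 3)      // 24
```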

There's some amount of additional work necessary to check whether the contained type is stored inline or not, so that we can decide whether to use the contained type's value witnesses or the out-of-line box. I don't think that should be much more expensive than an enum that was notionally defined as

enum MyExistential {
  case inline<T: P>(T)
  case outOfLine(AnyObject)
}

Try it with good old classes...

class FixedPage {
    var location:URI { fatalError() }
    var title:String { fatalError() }
}

class HomePage: FixedPage { ... }

Or at least class-bound existentials:

protocol FixedPage: AnyObject { ... }

Although I don't think there would be any speed difference between using classes and class-bound existentials.
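One concrete layout difference, sketched with invented names: a class-bound existential is just an object reference plus a witness table, so it always fits inline, while an unconstrained existential carries the 3-word buffer (64-bit platform assumed):

```swift
protocol PageRef:AnyObject { var title:String { get } }
protocol PageValue         { var title:String { get } }

final class HomePage:PageRef
{
    let title:String = "home"
}

// class-bound: object reference + 1 witness table = 2 words
print(MemoryLayout<any PageRef>.size)    // 16
// unconstrained: 3-word buffer + metadata + 1 witness table = 5 words
print(MemoryLayout<any PageValue>.size)  // 40
```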

You're right that this has been a feature of the discourse surrounding existentials, but it's important to remember that if existentials were outright bad or useless under all circumstances, it would be silly for us to have them in the language at all.

IMO, the context behind the community discouraging the use of existentials is that they can be less optimal than generics, were (until recently) much more limited than generics, and did not interoperate well with generics, while at the same time being much more convenient to write than generic code.

People would have some data types conforming to a protocol interface, and write code like this:

func doSomething(_ data: MyProtocol) { // <- accidental existential!
  ...
}

And the community would often try to educate them about the subtleties of what their code was expressing, and steer them towards generics for better performance and greater capabilities (e.g. associated types work with generics, but adding one to a protocol would break all code that used that protocol as an existential. You'd also get P does not conform to P errors when trying to pass an existential to generic code, so the restrictions became viral).
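For contrast, the two shapes side by side (toy protocol, names invented):

```swift
protocol Describable { var summary:String { get } }
struct Point:Describable { var summary:String { "point" } }

// generic: the compiler can specialize a copy per concrete type
func describe<T>(_ value:T) -> String where T:Describable
{
    value.summary
}
// existential: one unspecialized implementation, argument boxed
func describeAny(_ value:any Describable) -> String
{
    value.summary
}

print(describe(Point()))     // point
print(describeAny(Point()))  // point
```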

We're in a much better place these days. You can limit your use of existentials to just heterogeneous value storage, and they can interoperate much more smoothly with generic code.

In terms of performance, if the compiler can statically determine the type inside an existential, it can specialise generic functions, so the performance is exactly the same as using generics. In cases where it can't statically determine the underlying type, I think you should be able to prompt specialisation by using @_specialize.
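A sketch of that attribute (underscored and unofficial, so subject to change; the function here is invented for illustration):

```swift
// @_specialize asks the compiler to emit specialized entry points
// for the listed bindings, usable even when the call site itself
// can't be statically specialized
@_specialize(where T == Int)
@_specialize(where T == Double)
func total<T>(_ values:[T]) -> T where T:Numeric
{
    values.reduce(0, +)
}

print(total([1, 2, 3]))     // 6
print(total([1.5, 2.5]))    // 4.0
```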

Note that @_specialize also supports layout constraints. We don't have a minimumStrideBytes constraint, but we do have _TrivialAtMost(SizeInBits) for maximum size (although... err... I don't think it actually works; every compiler version I tried crashed when using it :sweat_smile:)
