`std::byte` imported as uninhabited type

The following code is accepted, and I feel like it shouldn’t be:

import CxxStdlib

func f(_ x: std.byte) -> UInt8 {
  switch x {}
}

Granted, it does give a warning suggesting to use @unknown default, but I feel like this shouldn’t happen in the first place. The potential for damage is somewhat limited since this isn’t transitive: wrapping it in a struct does not make the entire struct uninhabited, only the one property. However, it is very possible to use this to break memory safety:

func unsafeWhoKnowsWhat<T>(_ x: std.byte) -> T {
  switch x {}
}

Try getting a String out of that and watch what happens. Or get a function out of that and call it. Or any number of other things that might crash at runtime but who knows for sure.

This is a very interesting edge-case that you found and indeed needs to be fixed. The problem here boils down to the fact that in Swift you're not allowed to have an enum with no cases and a raw type, but we're not diagnosing (or respecting) this when importing byte. I will need to think about possible solutions, but I just wanted to validate this as a bug.

For std::byte specifically, we probably want to map it to UInt8. The more general problem is harder to solve.
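
As a rough sketch of what mapping to UInt8 would mean for the original example: the `Byte` alias and the `describe` function below are made-up stand-ins for illustration, not what the importer actually produces. The point is only that a switch over an integer type cannot be empty, so the compiler forces a `default` branch.

```swift
// Hypothetical stand-in for std::byte if it were surfaced as UInt8.
// `Byte` is an illustrative name, not something the importer emits.
typealias Byte = UInt8

func describe(_ x: Byte) -> String {
    switch x {
    case 0:
        return "zero"
    default:
        // Required: UInt8 has 256 values, so `switch x {}` would not compile.
        return "nonzero (\(x))"
    }
}

print(describe(0))
print(describe(255))
```

With this shape there is no way to write a function that claims to return an arbitrary T without actually producing one.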

Hmm. Is std::byte an enum with no enumerators? Maybe we shouldn't be importing such types as enums in Swift.

Have you checked that this actually breaks memory safety? I would expect that it would just fail dynamically with a non-exhaustive match.

Yep, I believe it’s enum class byte : unsigned char {}. It has a raw type but no cases, which C++ is fine with but Swift doesn’t normally allow.

According to Godbolt, it simply deletes main if I try: Compiler Explorer

Strictly speaking, an enum class with an underlying type in C++ can have any value of the underlying type. If we want to accurately portray that in Swift in the general case, it seems like we should import enum classes as having unknown cases in addition to the explicitly declared cases, like we do C enums that don't explicitly promise to be exhaustive. Does clang or the C++ standard have an [[attribute]] or convention that's used to indicate an enum class that only uses its explicitly-declared cases?
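
To make the "unknown cases in addition to the declared cases" idea concrete, here is a minimal pure-Swift model. It assumes a hypothetical C++ `enum class Color : unsigned char { red, green }` and represents it as a struct over its raw value; `Color` and `name(of:)` are invented for this sketch, and it is not a claim about how the importer does or should lay this out.

```swift
// Hypothetical model of a C++ enum class whose undeclared raw values
// must stay representable on the Swift side.
struct Color: RawRepresentable, Equatable {
    var rawValue: UInt8
    init(rawValue: UInt8) { self.rawValue = rawValue }

    // The explicitly declared enumerators.
    static let red = Color(rawValue: 0)
    static let green = Color(rawValue: 1)
}

func name(of c: Color) -> String {
    switch c {
    case .red: return "red"
    case .green: return "green"
    // Any other raw value coming from C++ lands here instead of being UB.
    default: return "unknown(\(c.rawValue))"
    }
}

print(name(of: .green))
print(name(of: Color(rawValue: 7)))
```

The tradeoff is that switches over such a type always need a `default`, even when the C++ author really did mean the declared cases to be exhaustive.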

> If we want to accurately portray that in Swift in the general case, it seems like we should import enum classes as having unknown cases in addition to the explicitly declared cases, like we do C enums that don't explicitly promise to be exhaustive.

Or should we just import them as a typealias?

> Have you checked that this actually breaks memory safety? I would expect that it would just fail dynamically with a non-exhaustive match.

Yes, I don't think it violates memory safety, it traps. But it's still a serious problem.

That's what we do with unannotated C enums, but as you're well aware, there's always a tradeoff between being maximally expressive to all the corner cases of C vs. carrying over the idiomatic meaning. C codebases are (IME) a lot more wild-west-ish in how they treat enums, and Apple's ObjC frameworks had established idioms for declaring "real" enums, which is why we went with that approach in C. In C++ by contrast, using an enum class over an enum does seem like a reasonably strong signal that the type is intended to be used as an enum-like type in the common case, but it nonetheless seems risky to assume that the enum class's constants are always exhaustive, since there are people who use them as "strong typedefs" for the underlying type (as std::byte does) or still as bitfields with stricter typing.

Hmm. I'm somewhat of a skeptic about std::byte overall, but still, I feel like this might not match the design intent.

It looks like this is not true according to that godbolt link; the function has UB.

If I remove the -O flag, it does something. Not sure what exactly, but it involves a lot of register manipulation and a call to memset of all things.

Does it have UB? Or is the optimizer just smart enough to inline the unreachable?

If main unconditionally calls unreachable, it’s hard to say that isn’t UB.

The well-defined behavior I would expect this function to have (if we’re not going to statically reject it) is a reliable trap. Compiling it with no instructions and just running into the next function is not well-defined.
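
For comparison, this is roughly what a "reliable trap" looks like when written by hand. It is only an analogue of the behavior being asked for, not what the compiler emits for the empty switch; `value(for:zero:)` is a hypothetical helper, and UInt8 stands in for the imported type.

```swift
// Hand-written analogue of the desired behavior: the unexpected path
// stops the process instead of falling off the end of the function.
func value<T>(for raw: UInt8, zero: T) -> T {
    switch raw {
    case 0:
        return zero
    default:
        // fatalError returns Never, so no T is ever fabricated here.
        fatalError("unexpected raw value \(raw)")
    }
}

print(value(for: 0, zero: "ok"))
```

Because `fatalError` unconditionally stops execution, the generic return type is satisfied on every path without inventing a value.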

Shouldn't that fragment result in a "switch must be exhaustive" error?

It’s only a warning, not an error, in this case. std::byte has no explicit cases.

I mean that even if it did have existing cases, I'd still be able to pass arbitrary values from the C++ side (right?), so this would not be enough:

switch x {
    case .null: print("the only case specified so far")
    // Error: switch must still be exhaustive
}

Ah, I see. I think it compiles differently locally and in godbolt. Locally I'm getting a (hopefully reliable) trap. Anyway, I agree: unknown enum cases should result in a trap, and this snippet should produce an unknown enum case.

> Shouldn't that fragment result in a "switch must be exhaustive" error?

We are importing a type without fully understanding its semantics, so Swift can't help us out here. That's why I'm saying that this is a bug, and we need to figure out what the correct way to import this pattern is.

> I'd still be able to pass arbitrary values from the C++ side (right?)

Yes. This is a more general problem with foreign enums. But it seems especially bad in this case where the pattern is to only use the type for its payload.

I don't quite understand. If there were an (enforced) default case, one that would have to return something, this code could not end up in UB. So just issuing that error (or turning the warning into an error) should be a solution. No?

Yeah, I don't know why it's just a warning today. Maybe there were source-compat concerns when this was introduced for ObjC/C. I still think we may be able to find a better semantic representation for this pattern (which would side-step the issue entirely).