Recently I've been working on improving the Swift compiler's code generation for sources that include unavailable declarations. Declarations that are marked unavailable with the @available attribute are meant to be unreachable at runtime since the type checker only allows references to unavailable declarations inside declarations that are also unavailable:
@available(*, unavailable)
struct Unavailable {
    func foo() {
        print("Supposedly unreachable")
    }
}
Theoretically this means that machine code for unavailable declarations should not need to be emitted into the final binary. However, the existing compiler does not consider unavailability during code generation, and as a result unavailable declarations are compiled normally, bloating the resulting binaries. Fixing this could significantly improve the code size of large cross-platform libraries, but of course nothing is ever as simple as it seems. While investigating a fix I have found a long tail of type checking and code generation flaws that prevent the optimization from being viable. So far I've been able to resolve these issues by adding missing diagnostics to the type checker or by improving the correctness of generated code. Today, though, I discovered a flaw related to Codable synthesis that has potential implications for the errors thrown by decoders. I wanted to discuss how to address it with the community, since the best fix might involve introducing a new case to DecodingError.
The problem
Enum declarations can contain unavailable cases. If you add an associated value to such a case and allow the compiler to synthesize a Codable conformance for the enum, then with the Swift 5.8 compiler you're able to bypass type checking and create an instance of an unavailable type at runtime by crafting the right payload for decoding:
import Foundation

@available(*, unavailable)
struct Unavailable: Codable {}

enum EnumWithUnavailableCase: Codable {
    case a
    @available(*, unavailable)
    case b(Unavailable)
}

let json = #"{"b":{"_0":{}}}"#
let encoded = json.data(using: .utf8)!
let decoded = try JSONDecoder().decode(EnumWithUnavailableCase.self, from: encoded)
// Oops, prints "b(main.Unavailable())"
print(decoded)
It's especially easy to imagine how this issue might occur in a cross-platform app. The app could have an enum that represents platform-specific concepts, and some of the cases could be, for example, unavailable on iOS but available on macOS. The macOS and iOS variants of the app might communicate with each other using messages that include encoded representations of these platform-specific values, and unexpected behavior could occur in the iOS app after decoding one of these messages.
Regardless of whether we want to optimize code size, the behavior that is reproducible with the example above needs to be fixed since allowing unavailable code to be executed defies programmer expectations and introduces a kind of undefined behavior. The question is what should happen at runtime instead.
Potential solutions
When decoding an enum value, a hand-written Decodable conformance would probably throw an error if a value of an unavailable case were encountered. The DecodingError type in the standard library currently has .typeMismatch, .valueNotFound, .keyNotFound, and .dataCorrupted cases. Of these cases, the only one that seems vaguely appropriate to describe an unavailable value is .dataCorrupted but it feels a bit misleading. The data represents a value that is incompatible with the current runtime, rather than being fully unrecognized. Some programs might want to catch an error that is specific to this situation and ignore the corresponding value but otherwise treat the condition as non-fatal.
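For comparison, here is a rough sketch of what such a hand-written Decodable conformance might look like for the enum from the example above. It assumes the same keyed layout that the synthesized conformance uses (one key per case) and falls back to .dataCorrupted for the unavailable case, since that is the closest existing error. The debug descriptions are my own wording, not anything the compiler produces.

```swift
import Foundation

@available(*, unavailable)
struct Unavailable: Codable {}

enum EnumWithUnavailableCase {
    case a
    @available(*, unavailable)
    case b(Unavailable)
}

extension EnumWithUnavailableCase: Decodable {
    // Assumed to mirror the keys the synthesized conformance would use.
    private enum CodingKeys: String, CodingKey {
        case a, b
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Expect exactly one recognized key naming the case.
        guard container.allKeys.count == 1, let key = container.allKeys.first else {
            throw DecodingError.dataCorrupted(DecodingError.Context(
                codingPath: container.codingPath,
                debugDescription: "Expected exactly one known case key"))
        }
        switch key {
        case .a:
            self = .a
        case .b:
            // Never construct the unavailable case; report it with the
            // closest existing error instead.
            throw DecodingError.dataCorrupted(DecodingError.Context(
                codingPath: container.codingPath + [key],
                debugDescription: "Case 'b' is unavailable in this runtime"))
        }
    }
}

let unavailablePayload = Data(#"{"b":{"_0":{}}}"#.utf8)
do {
    _ = try JSONDecoder().decode(EnumWithUnavailableCase.self, from: unavailablePayload)
} catch DecodingError.dataCorrupted(let context) {
    print("Rejected:", context.debugDescription)
} catch {
    print("Unexpected error:", error)
}
```

With this conformance the crafted payload is rejected rather than decoded, but the caller cannot tell an unavailable case apart from genuinely malformed data without inspecting the debug description.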
My initial inclination is to propose the addition of a new DecodingError case to represent this failure:
public enum DecodingError: Error {
    // ...

    /// An indication that the data represents a value that is not representable
    /// at runtime because it has been marked unavailable with `@available`.
    ///
    /// As an associated value, this case contains the context for debugging.
    case unavailableValue(Context)
}
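To make the motivation concrete, here is a minimal sketch of how client code might skip unavailable values while still failing on truly corrupted data. Since the case above does not exist in the standard library today, the sketch uses a hypothetical UnavailableValueError type as a stand-in for it. The Message and LossyMessage types, the resize key, and the JSON shape are likewise made up for illustration.

```swift
import Foundation

// Hypothetical stand-in for the proposed DecodingError case.
struct UnavailableValueError: Error {
    let codingPath: [CodingKey]
}

enum Message: Decodable {
    case ping
    // Imagine a second case, `resize`, that is unavailable on this platform.

    private enum CodingKeys: String, CodingKey {
        case ping, resize
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        if container.contains(.ping) {
            self = .ping
        } else if container.contains(.resize) {
            // Recognized, but not representable in this runtime.
            throw UnavailableValueError(
                codingPath: container.codingPath + [CodingKeys.resize])
        } else {
            throw DecodingError.dataCorrupted(DecodingError.Context(
                codingPath: container.codingPath,
                debugDescription: "Unrecognized case"))
        }
    }
}

// Wrapper that maps only the "unavailable" failure to nil.
// Any other decoding error still propagates to the caller.
struct LossyMessage: Decodable {
    let value: Message?

    init(from decoder: Decoder) throws {
        do {
            value = try Message(from: decoder)
        } catch is UnavailableValueError {
            value = nil
        }
    }
}

let batch = Data(#"[{"ping":{}},{"resize":{"_0":2}},{"ping":{}}]"#.utf8)
let messages = try JSONDecoder()
    .decode([LossyMessage].self, from: batch)
    .compactMap(\.value)
print(messages.count) // prints 2
```

The important part is that the catch clause can match one specific error. With only .dataCorrupted available, the same wrapper would also swallow payloads that are actually malformed.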
I'd like to get some preliminary feedback from the community on this approach to the problem. Does a new error seem necessary for this case? Or are there any alternative solutions you'd consider?