Introduce an "ExpressibleByCase" protocol (to facilitate struct-based enums with payload)

Charles_Constant · September 11, 2020, 5:15am

Hello. Since forever, there has been disagreement over how enum payloads should work.

Rather than keep bike-shedding about enums, we should make structs able to declare cases. That means programmers can create enum-like datatypes that behave however they like. We add a magic protocol:

protocol ExpressibleByCase {
    init( `case`: String, arguments: (name: String?, value:Any) ... )
}

The programmer can now create an enum-like struct like this...

struct Param: ExpressibleByCase {
    case age(Date),  volunteer,  fullname(first:String,last:String) 
    init( `case`: String, arguments: (name: String?, value:Any) ... ) {
        /* code to store 'case' and 'arguments' how one sees fit*/
// ...

Usage at the call-site:

let foo: Param = .fullname(first:"Jane",last:"Doe")

...upon which, Swift calls our init method:

    init( `case`: String, arguments: (name: String?, value:Any) ... ) {
        print(`case`)    //  "fullname"
        print(arguments) //  [ ("first","Jane"), ("last","Doe") ]

Programmers can easily customize the rest on their own: how to store the case id and payload data, and whether and how to conform to Equatable and CaseIterable.
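To make the idea concrete, the compiler's part can be simulated by hand in today's Swift. The following is only a sketch: the protocol is declared as an ordinary protocol, the stored property names `caseName` and `payload` are invented for illustration, and the explicit initializer call stands in for what the pitch would have Swift generate from `.fullname(first:last:)`.

```swift
// Ordinary protocol standing in for the pitched "magic" protocol.
protocol ExpressibleByCase {
    init(`case`: String, arguments: (name: String?, value: Any)...)
}

struct Param: ExpressibleByCase {
    // Hypothetical storage; the pitch leaves this up to the programmer.
    let caseName: String
    let payload: [(name: String?, value: Any)]

    init(`case`: String, arguments: (name: String?, value: Any)...) {
        self.caseName = `case`
        self.payload = arguments
    }
}

// Hand-written equivalent of: let foo: Param = .fullname(first: "Jane", last: "Doe")
let foo = Param(case: "fullname",
                arguments: (name: "first", value: "Jane"), (name: "last", value: "Doe"))

print(foo.caseName)                        // fullname
print(foo.payload.map { $0.name ?? "_" })  // ["first", "last"]
```

Only the last step (turning the case expression into the initializer call) would need language support; everything else compiles as-is.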

Charles_Constant · September 11, 2020, 5:28am

A couple more thoughts...

We could also provide an alternative that isn't stringly-typed:

/* Swift supplies init args by the position in which they are declared */
protocol ExpressibleByCaseIndex {
    init( `case`: Int, arguments: Any ... )
}

Maybe that's better in terms of performance, and prevention of typos.

Also, my preference is to allow cases to be usable with, or without their arguments: so we allow both .foo and .foo(123) and let the programmer decide how to handle that in init. If we want to give the programmer the option to allow/forbid this, maybe we could use a new syntax for declaring cases. Eg:

struct Param: ExpressibleByCase {
    case 
        height(Int)?,  // accepts .height(123)  *or*  .height
        length(Int)    // accepts .length(123)  *only*

This would be the missing piece that makes the paradigm for accessing an enum payload (which the introduction of @dynamicCallable made possible) ergonomic.

Everyone has different tastes, but to mine, the struct paradigm is more convenient than Swift enums-with-payload.

jawbroken · September 11, 2020, 1:14pm

You can already get this exact call-site:

struct Param {
  static func age(date: Date) -> Param { Param() }
  static var volunteer: Param { Param() }
  static func fullname(first: String, second: String) -> Param { Param() }
}

let p1: Param = .age(date: Date())
let p2: Param = .volunteer
let p3: Param = .fullname(first: "Jane", second: "Doe")

and if you add static var age: Param then you can freely use either .age or .age(date:) and store them however you like. So you probably should explain what other enum-like functionality you want the resulting struct to have.

Charles_Constant · September 11, 2020, 1:45pm

The example you use is the basis of my comments in the thread I linked.

The trouble is that it's verbose and ugly... and, like, absurdly so! Your example is (understandably and mercifully) shorthand. As you probably realize, it leaves out two things:

That each case needs to init the struct with a unique id for its case. Without this id, there is no way to tell if two value types represent an equal case.
That each example func needs to pass along its arguments. This is so because we need to store the payload.

Here's a reasonable approximation of what a full example might look like (leaving out two init methods, to avoid introducing errors by retyping the String id twice):

struct Param {
  static let age = Param( "age" )
  static let volunteer = Param( "volunteer" )
  static let fullname = Param( "fullname" )

  static func age(date: Date) -> Param { Param( .age, date ) }
  static func fullname(first: String, second: String) -> Param { Param( .fullname, (first, second) ) }
}

Now there's even more machinery we need to add to make this a usable enumerable type (the inits, tests for equality, payload getter, code to make it conform to CaseIterable), but already, this is pretty ugly and repetitive.
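For a sense of scale, here is one way the remaining machinery could be filled in. This is a sketch under assumptions, not the code from the linked thread: the `id` and `payload` properties, the private init, and the `value(as:)` getter are all illustrative names, and equality is deliberately tag-only.

```swift
import Foundation

struct Param: Equatable, CaseIterable {
    let id: String      // unique per case
    let payload: Any?   // nil when the case is used bare

    private init(_ id: String, _ payload: Any? = nil) {
        self.id = id
        self.payload = payload
    }

    // Bare cases.
    static let age = Param("age")
    static let volunteer = Param("volunteer")
    static let fullname = Param("fullname")

    // Payload-carrying overloads reuse the id of the matching bare case,
    // so the String is typed only once.
    static func age(date: Date) -> Param { Param(Param.age.id, date) }
    static func fullname(first: String, second: String) -> Param {
        Param(Param.fullname.id, (first, second))
    }

    // Tag-only equality: the payload is ignored.
    static func == (lhs: Param, rhs: Param) -> Bool { lhs.id == rhs.id }

    static let allCases: [Param] = [.age, .volunteer, .fullname]

    // Payload getter: the caller supplies the expected type.
    func value<T>(as type: T.Type = T.self) -> T? { payload as? T }
}

let p: Param = .fullname(first: "Jane", second: "Doe")
let name: (String, String)? = p.value()
print(name?.0 ?? "none")   // Jane
```

Every new case costs a static let, usually a static func, and an entry in `allCases`, which is the repetition being complained about.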

Another concern I have is that we're relying on overloading a var with a func. If the data type were simple, maybe that would be okay, but I wouldn't want to debug it, if something goes wrong.

All this mess and confusion might be alright for a single DataType in one file somewhere, but I can't see anyone being happy with repeating this pattern over and over again. That's a shame, because it works great, at the call-site.

Charles_Constant · September 11, 2020, 1:52pm

One further consideration is that, were we to make this protocol official, we would have the option, down the road, for the Swift compiler to automatically code-complete payload retrieval. Since the payloads will be stored as Any 99% of the time, we need to typecast them at the call-site. Theoretically, if the code has a statement that checks which case a variable is, the compiler could check the init to retrieve the correct Type to which its payload can be cast. It's an easier model if the only cases the compiler need look up are labeled as 'case'. Without that, we have to cross-reference static vars with their overloaded static methods.

jawbroken · September 11, 2020, 2:11pm

Sure, there's a little overhead, but you equally left out a similar or worse mess inside your initialiser where you're going to have to parse a bunch of strings, try to cast all your arguments from Any, etc. It's all going to be roughly equivalent in the end if you want the ability to customise how these are stored and whether payloads are required or not (and not just magically generate a case tag and a bunch of optional variables). I think you're exaggerating the requirements a little though:

struct Param: Equatable {
  static let age = Param(.age)
  static func age(date: Date) -> Param { Param(.age, date: date) }
  static let volunteer = Param(.volunteer)
  static let fullname = Param(.fullname)
  static func fullname(first: String, second: String) -> Param { Param(.fullname, first: first, second: second) }
  
  enum ParamTag: Equatable { case age, volunteer, fullname }
  
  let tag: ParamTag
  let date: Date?
  let first: String?
  let second: String?
  
  init(_ tag: ParamTag, date: Date? = nil, first: String? = nil, second: String? = nil) {
    self.tag = tag; self.date = date; self.first = first; self.second = second
  }
}

There's some repetition but you get to fully customise what is allowed at the call site and how it is stored, and you don't have to invent new ways for people with different preferences to do this customisation.

Charles_Constant · September 11, 2020, 3:25pm

That's true only where checking sanity is concerned. Even that is a shame, because normally it would be Swift calling the init method, not custom code, and the init arguments would therefore be guaranteed valid. We don't have any access control more restrictive than "private", so I suppose that's just life.

Aside from a sanity check, the rest is simple. Here's a serviceable example (without sanity check) where the programmer opts to have the static func submit arguments as a Tuple:

@dynamicCallable
struct Param: ExpressibleByCase {

    let `case`: String
    let payload: Any?

    init( `case`: String, arguments: (name: String?, value:Any) ... ){
        self.case = `case`
        self.payload = arguments.first?.value
    }

    func dynamicallyCall<T>(withArguments args: [Int]) -> T? {
        payload as? T
    }

}

You're correct that the user must cast the payload to retrieve it. Granted, everybody has different tastes, but I find that having to cast the payload at the call-site is, in practice, no more onerous or confusing than having to cast in a switch statement, which we already have to do. In fact, in practice, aside from having to manually look up the Type, it's less onerous and less confusing. Anyway, as I hinted at earlier, in future Swift could probably do code completion, as long as there's been a check against the case, since it can look up the case's arguments on behalf of the coder.

Back to that "sanity check", which is admittedly a problem with my proposal, it seems like there might be some way to work around it. I don't know if there's any special variable like #file that could be used to tell who called it, or any kind of solution via access control 'private' or what. Since it's only an issue if the init is used in a funny way, it seems like a shame that that be a sticking point.

Hmm. One possible solution for the "sanity check" problem, albeit perhaps not a very efficient one, would be for Swift to automatically provide a list of case and argument identifiers.

Then the init could check if the arguments it is passed are present in that Swift-provided dict. Eg (assuming the dict is called 'schema'):

guard Self.schema[`case`]?.contains(argName) ?? false else { fatalError() }
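A fuller version of that check can be sketched in today's Swift by writing the table by hand. Everything here is an assumption for illustration: `schema` is hand-written rather than compiler-provided, it maps each case name to its argument labels in declaration order (nil meaning unlabeled), and the init is failable instead of calling fatalError() so that the rejection is easy to observe.

```swift
struct Param {
    // Hand-written stand-in for the table Swift would supply.
    static let schema: [String: [String?]] = [
        "age": [nil],
        "volunteer": [],
        "fullname": ["first", "last"],
    ]

    let caseName: String
    let payload: [(name: String?, value: Any)]

    init?(`case`: String, arguments: (name: String?, value: Any)...) {
        // Reject unknown cases and label lists that differ from the schema.
        guard let expected = Self.schema[`case`],
              arguments.map({ $0.name }) == expected
        else { return nil }
        self.caseName = `case`
        self.payload = arguments
    }
}

let ok = Param(case: "fullname",
               arguments: (name: "first", value: "Jane"), (name: "last", value: "Doe"))
let typo = Param(case: "fullname",
                 arguments: (name: "frist", value: "Jane"), (name: "last", value: "Doe"))

print(ok != nil)    // true
print(typo != nil)  // false
```

This only validates names and labels; checking that each value has the declared type would need the schema to carry type information as well.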

Edit: Oops. I just realized something. When I wrote this...

I was thinking of the way my existing code (ie: with the overloaded vars and funcs) works. Under the proposal, as I wrote it in this post, that would not be possible (because even if we change the init protocol to accept Any? to hand it a Tuple, instead of a Collection of Any, there's no such thing as a single element Tuple). And since there's not exactly a plethora of options for creating Swift tuples, the init would not be that simple :(

Ben_Cohen · September 11, 2020, 3:37pm

Hi @Charles_Constant – can you expand a bit on what you mean by this?

And more on why it's important to increase the complexity of the language by adding this feature?

What are the problems you are trying to solve, that you believe Swift programmers face today, that justify adding new syntax to the core language? Can they be solved today in other ways, and how are those ways deficient?

toph42 · September 11, 2020, 4:06pm

Isn’t it simple enough to just put an enum of your cases inside a member of the struct?

Charles_Constant · September 11, 2020, 4:10pm

Sure, by "disagreement over how enum payloads should work" I was thinking about various proposals to deal with if case let (caution: vulgar language). I don't have my finger on the Evolution pulse, though, so apologies if I mischaracterized the situation.

There are, specifically, two ends:

Get at an enum payload with greater ease and flexibility than using a "switch" statement, or "if case let" statement. Afaic, let payload: (Int,Double) = mycase() would be a pretty welcome replacement.
Permit each enum case to be compared with or without arguments. Eg:

.foo      ==  .foo(123) // true
.foo(123) ==  .foo(456) // true
.bar      ==  .foo(123) // false

.foo(123) === .foo(123) // true
.foo(123) === .foo(456) // false

I probably should have led with that information, hey? These two problems have been my top two annoyances in Swift for the longest time. Since I don't see existing Swift enums ever adopting these solutions to them (source-breaking, and perhaps just my impression, previous proposals seem not to go anywhere), I figure maybe we could just go with the building blocks to let people build their own?
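The comparison table above can be reproduced with an ordinary struct today, which may help pin down the intended semantics. This is a minimal sketch with assumed names (`Token`, `tag`, `payload`) and an Int-only payload so the payloads themselves are comparable; `==` compares tags only and an overloaded `===` also compares the payload.

```swift
struct Token: Equatable {
    let tag: String
    let payload: Int?

    static let foo = Token(tag: "foo", payload: nil)
    static let bar = Token(tag: "bar", payload: nil)
    static func foo(_ value: Int) -> Token { Token(tag: "foo", payload: value) }

    // Tag-only comparison.
    static func == (lhs: Token, rhs: Token) -> Bool { lhs.tag == rhs.tag }

    // Full comparison: tag and payload.
    static func === (lhs: Token, rhs: Token) -> Bool {
        lhs.tag == rhs.tag && lhs.payload == rhs.payload
    }
}

let bare: Token = .foo
let a: Token = .foo(123)
let b: Token = .foo(456)
let other: Token = .bar

print(bare == a)   // true
print(a == b)      // true
print(other == a)  // false
print(a === a)     // true
print(a === b)     // false
```

Note that generic code written against Equatable will see `a` and `b` as interchangeable here, which is worth keeping in mind when choosing tag-only `==`.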

Charles_Constant · September 11, 2020, 4:13pm

Hi Topher. The point is to have an alternative to an enum, though. That it be a struct is solely to allow the programmer to store the case name and its payload. Other than that, we still want enum syntax (ability to use switch statements, compare a variable against a case, and so on). Wrapping an enum in a struct is probably the worst of both worlds? I mean, you'd have the unpleasantness of extracting enum values plus the overhead of the wrapper? To be fair, there are other ways to get what I want (as I touched on earlier, I'm using one, already), but they are hard to understand, verbose and require code duplication.

toph42 · September 11, 2020, 4:26pm

But what if you focus on sugar to allow you to directly access cases of the wrapped enum on the wrapper struct rather than duplicate enum functionality on structs?

Charles_Constant · September 11, 2020, 4:32pm

I may be mistaken, but I think it amounts to the same thing? In the end, to make it possible for the same Type to accept both .foo and .foo(doub,str) we need to create both a "static var foo" and overload it with "static func foo(args...)". With that constraint (the need to write both a var and a func for every case), the code to create the Type is always going to be gnarly (in a bad way, not a 1980s way). That was largely what drove me to suggest this protocol. Now, a wrapped enum would help to avoid the id, but that also requires a lot of duplication, because now we have to create a static var in the struct (and the static func) and an enum case to use. That's safer than a String, but the result is complicated with a bunch of repetition.

CTMacUser · September 12, 2020, 8:24am

I've been thinking of making cases for structs for years now, but for another reason: strong type aliases. The enum and struct value types are practically interchangeable with the lack of restrictions on various members that can be defined within them. Having cases on product types is the only (major?) asymmetry. It's going to be less of a concern after protocols get tweaked to allow cases to be protocol witnesses for type-level properties/methods; something me and others pushed for since cases and type-level members have the same interaction model.

Hasn't it just been you, and only for the past week? For those that haven't seen the other thread, @Charles_Constant wants enum instance equivalence to be via the tag only, while we currently expect both the tag and all the payload properties (if any) to be considered.

But we can't have struct cases use a different matching philosophy. If they did, then switching an enum type to a struct type would be a non-transparent change. Worse, the difference is not syntactic, but only happens at runtime. This wouldn't be acceptable.

So, rethinking on how I would design it....

First, it wouldn't be a protocol, but a new built-in. It's just an extension of an existing declaration:

struct myStruct {
    /* Insert the actual members here */

    case mySingularCase {
        init {
            // This is a code block that returns an instance of `Self`.
        }
        match {
            // This is a code block that returns a `Bool`.
            // When `true`, a `switch` will use this case.
        }
    }
    case myPayloadCase(Int, whatever: Double) {
        init {
            // This is a code block that returns an instance of `Self`.
            // It can use `$0` and `$1` from the input tuple to create the result.
        }
        get.0 {
            // This is a code block that returns an `Int`.
            // It should use the instance's actual properties for the computation.
        }
        get.1 {
            // This is a code block that returns a `Double`.
        }
        match {
            // This is a code block that returns a `Bool`.
        }
    }
}

"match" is a new contextual keyword. Theoretically, multiple cases can match; the lexically-first match wins, just like in normal enum types. The init and match blocks are required, while the get.# blocks are required for each payload member.

I just came up with this for struct types, but realized that we could let class types participate in this too. They just need one more required block:

class myClass {
    /* Insert the actual members here. */

    case myCase(Bool) {
        init { /*...*/ }
        get.0 { /*...*/ }
        match { /*...*/ }
        set {
            // This is a code block that mutates the appropriate properties of `self`.
            // It can use `$0` to determine the new state(s).
        }
    }
}

For orthogonality, we could support enum types getting synthetic cases too! (They would support init/get.#/match.)

Synthetic cases can't share the same name. (That's both base name and payload signature; although I think there's still an outstanding bug where cases can't be made that differ only in the payload.) For enum types, a synthetic case can't share a name with a natural case.

Natural cases must be declared in their type's primary definition. I think keeping that for synthetic cases would leave the primary definition too crowded. I think that we at least should allow synthetic cases to be defined in extension blocks in the primary file too. Maybe in other files of the same module too. Double-maybe in other modules too (if the source type is public, of course). Triple-maybe synthetic cases from outside modules can themselves be publicized.

Synthetic case search is the lexical order in the primary definition first, then lexical order in extensions in the same file (flattening multiple extensions to one list). For other files (if we do that), it's flattened extensions lexical order, but the relative order between files is unspecified. (And same module is considered before outside extensions.)

I came up with match blocks last. They save us from having to make types with synthetic cases define an instance-level property that returns a token representing something akin to an enum case tag.

Maybe let payload = myEnumInstance as? case .myCase?

Here, we need to define a standard library EnumerationCase type. Its internals are opaque to users. It'll be the return to the global case(of:) function. This function will let you do tag-only comparisons. Note that once the no-shared-base-name bug for cases is fixed, .foo and .foo(Int) will be allowed at the same time, but will be different cases and therefore have distinct tags.

Charles_Constant · September 12, 2020, 12:53pm

Hi there. That's a novel solution, but it wouldn't work for my needs.

   case myCase(Bool) {
        init { /*...*/ }
        get.0 { /*...*/ }
        match { /*...*/ }
        set { /*...*/ }
    }

This, in particular, adds a lot of code. What if the struct has 30 cases? Worse yet, it's code that is impossible to move to a protocol extension (both because it involves a getter, and also because case names are different for every new enum type).

If we go with ExpressibleByCase, we get the ability to create custom protocols that conform to it ("it" being ExpressibleByCase). Such protocols could define an appropriate init, payload getter, and funcs to check equality. What this means, is that the entirety of the required code for a struct/enum might look like this:

struct Param: CustomExpressibleByCase {

    let raw: Case // typealias from proto for (case:String,payload:Any?) tuple

    case foo(Double), bar, baz(arg1:String,arg2:Int)

}

It might be less flexible (than your solution), but it provides a better experience (than existing Swift enums) at the call-site, while being nearly as succinct (as existing enums).

If that is the case (and it may well be) then I regret my wording. I recall discussion about enums after the Swift public beta, but I haven't really kept tabs on such discussion the past couple years. I made the lazy assumption that, since the syntax to access enum payloads is unchanged, it was still a hot topic.

toph42 · September 12, 2020, 2:06pm

You can’t even do that with enum now. I thought you were trying to get feature parity for cases in enums and structs with cases just so you could also have member vars, but this would mean that struct would have cases impossible for enum. Would you want enum to be able to have cases that are duplicates besides their associated values at the same time as adding this feature to struct?

Charles_Constant · September 12, 2020, 2:17pm

One problem is that that would be source-breaking.

Another problem is that, however unlikely it be that I manage to get my "overloaded case" struct proposal into Swift, it is astronomically less likely that I get an "overloaded case" enum proposal into the language. To get support for altering a preexisting feature, you have to overcome Stockholm Syndrome. Once someone is used to a feature, no matter how crap it is, it can seem like "actually that's a feature" (especially after someone has invested the energy to memorize "if case let" syntax)

If it weren't for those two problems; you bet! I'd scrap this struct proposal in a heart-beat, and support adding "overloaded cases" to enums, instead. The only benefit a struct version really adds is flexibility for a programmer to create a model that works for them.

toph42 · September 12, 2020, 8:12pm

Others here are saying that it should work and the fact that it doesn’t is a bug.

Are bug fixes allowed to be source-breaking?

Quick aside on Stockholm Syndrome. I recently learned it’s not an actual thing but a misogynist construct made up to discredit victims, so even though I colloquially understand your application of it here, maybe don’t continue using that metaphor.

That said, does that same inertia apply if said “feature” is a bug?

Charles_Constant · September 12, 2020, 9:36pm

Hmm, promising! It will take me a couple days to process that mentally.

Until now, I mainly knew the phrase via movies about the famous robbery. I avoid controversial language, both to spare feelings and avoid derailing the reader's attention. Thanks for alerting me.

CTMacUser · September 12, 2020, 11:25pm

I'm the one who brought up both case-naming bug and the source-breaking issues. Those aren't contradictory; @Charles_Constant disagrees on how case matching should work.

Let's say that enum instances are implemented as an (Int, Any) tuple, where the integer is the tag of the instance's case and the Any member stores the (possibly Void) payload.

enum MyEnum {
    case foo
    case bar
    case foo(Float)
}

Here, we're assuming the cases-must-have-unique-base-names bug is fixed. The first case gets a tag of 1, and the second gets a tag of 2. Swift would assign the third case to have a tag of 3, while @Charles_Constant wants it to have a tag of 1. In other words, the payload's signature, let alone its value, wouldn't be part of how two instances are considered in the same state or not. That would break the assumptions on how cases currently work.

If we add struct cases, we could introduce the payload-agnostic comparison as a point of difference from enum cases. But I say that wouldn't be useful; in fact, it would be dangerous. Switching a type from enum to struct would create obnoxiously different runtime behavior. (Differing enum cases with the same base name will become "equal" struct cases.) We shouldn't introduce that inconsistency.

(Even without backwards compatibility, my opinion is that tag-only comparison is generally a lot less useful than full comparison for most users, and there shouldn't be any mode where tag-only comparison is the default.)

If you just want case comparison, we can introduce an opaque EnumerationCase type that's Equatable (at least) and a global function case(of:) -> EnumerationCase? to extract the tag. To represent tags, we could use key-path syntax: \MyEnum.myCase. Flipping comparisons to tag-only would require users that want to fully compare two enumeration cases to get the payload as Any and guess the right type of the tuple to dereference.

(The case(of:) function would return nil for case-less types or when your code doesn't have the right access level. The latter would happen if the enum is non-frozen and new cases were introduced.)
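Until something like case(of:) exists, tag-only comparison on a regular enum can be had by writing the tag out by hand. The sketch below is an assumption-laden stand-in, not the proposed API: the nested `Tag` enum and the `tag` property are invented names, and the enum uses distinct base names because today's compiler requires them.

```swift
enum MyEnum {
    case foo(Float)
    case bar

    // Hand-written stand-in for the opaque tag that case(of:) would return.
    enum Tag: Equatable { case foo, bar }

    var tag: Tag {
        switch self {
        case .foo: return .foo
        case .bar: return .bar
        }
    }
}

let x = MyEnum.foo(1.5)
let y = MyEnum.foo(2.5)

print(x.tag == y.tag)           // true
print(x.tag == MyEnum.bar.tag)  // false
```

Full comparison stays available by making MyEnum itself Equatable, so both modes coexist without changing what `==` means on the enum.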