Brainstorm: Improving syntax for method implementations on enums

Background

There is some overlap between the functionality of polymorphism and that of enums.

Example

As a contrived example:

enum AnimalEnum {
    case dog
    case cat

    func makeNoise() {
        switch self {
            case .dog: print("Woof")
            case .cat: print("Meow")
        }
    }
}

vs.

protocol AnimalProtocol {
    func makeNoise()
}

struct Dog: AnimalProtocol {
    func makeNoise() { print("Woof") }
}

struct Cat: AnimalProtocol {
    func makeNoise() { print("Meow") }
}

(In this case, AnimalProtocol is almost certainly a better choice, but this is a synthetic example for demonstration only.)


They both provide the ability to express a "one choice out of many" situation, but with different trade-offs.

Trade-offs

There are some pros/cons to each:

Enums

  1. Exhaustivity vs Extensibility:

    1. :heavy_plus_sign: Enums prevent others from adding new cases, so client code can exhaustively handle all cases (particularly for @frozen enums).
    2. :heavy_minus_sign: Enums aren't extensible, so they restrict possibilities for client code to add their own behaviour.
  2. Syntax:

    1. :heavy_plus_sign: Enums explicitly state all cases in one place.
    2. :heavy_minus_sign: Practically every method of an enum needs some kind of pattern matching (equality checks, switch, if case, guard case, etc.) to know what to do. Enums force you to glue all method implementations together. E.g. makeNoise() contains the makeNoise() implementations of both the dog and cat cases. There's no way to say "put all the dog implementations here, all the cat implementations there", etc.
  3. Performance

    1. :heavy_plus_sign: Enums have a really efficient memory layout (assuming the cases have similar sizes), directly on the stack.
    2. :heavy_plus_sign: Direct calls to methods on an enum can always be statically dispatched, which unlocks a bunch of other optimizations (most importantly: inlining).
  4. :heavy_minus_sign: Cases of an enum aren't their own unique type. E.g. if you have an enum value that you know is .cat (because you checked it), you can't pass it to a function as a Cat, only as an AnimalEnum, where the knowledge of it necessarily being a .cat is lost.
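
To make point 4 concrete, here is a minimal sketch. The `purr` helper is hypothetical (it isn't part of the example above); it only makes sense for cats, but its parameter type has no way of saying so.

```swift
enum AnimalEnum {
    case dog
    case cat
}

// Hypothetical helper: meaningful only for cats, yet it has to accept any
// AnimalEnum, so the case check is repeated inside and failure is possible.
func purr(_ animal: AnimalEnum) -> String? {
    guard case .cat = animal else { return nil }
    return "Purr"
}

let pet = AnimalEnum.cat
if case .cat = pet {
    // We already know `pet` is a cat here, but that knowledge is not
    // carried by the type, so `purr` still returns an optional.
    print(purr(pet) ?? "not a cat")
}
```

With a standalone Cat type, the same helper could take a Cat and wouldn't need the check or the optional return.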

Polymorphism:

  1. Exhaustivity vs Extensibility:
    1. :heavy_minus_sign: Exhaustivity of "cases" (subtypes) can't be enforced, apart from the comparatively broad granularity of control that access levels grant you.
    2. :heavy_plus_sign: Very extensible. It's the go-to tool for introducing seams for extensibility.
      • :heavy_plus_sign: (in the case of subclasses) "cases" can be subclassed even further. So new "cases" that you define don't just have to be brand new types, they can be refinements of existing types.
  2. Syntax:
    1. :heavy_minus_sign: Finding all conforming types can be tricky. But luckily, it's seldom required.
    2. :heavy_plus_sign: "Cases" (what I'll call protocol-conforming types and subclasses from now on) can have standalone implementations, implemented across separate files, folders or even modules.
      • :heavy_plus_sign: Implementations of methods (and initializers, properties, subscripts) for every type are independent, so they're short, sweet and highly focused
  3. Performance
    1. :heavy_minus_sign: Protocols have a comparatively inefficient memory layout. Non-"class" protocols require boxing values into existential containers (which are a fixed size of 5 words). If the conforming values are structs that are small enough (of size x), they can be stored inline in the existential container (with 3 - x words wasted). Worse, if they're too large, they have to be heap allocated with a reference stored in the container. This wastes 2 words of storage, and can thrash the CPU cache and increase the latency of look-ups.
    2. :heavy_minus_sign: Polymorphism requires dynamic dispatch. There are many cases that can be optimized, but not the general case of subclassing a class, or conforming to a protocol, from a different module.
  4. :heavy_plus_sign: Cases are modelled by types, which can be passed around. Once I know I have a Cat, I can pass a Cat, and have full access to the API of Cat (not just those required by the AnimalProtocol).
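
The memory layout difference from the Performance points can be checked with MemoryLayout. This is a rough sketch, not a benchmark. `ClassBoundAnimal` is a hypothetical class-constrained protocol I've added only for comparison, and the sizes are what I'd expect from current Swift compilers rather than a language guarantee.

```swift
enum AnimalEnum { case dog, cat }

protocol AnimalProtocol { func makeNoise() }

// Hypothetical class-bound protocol, added for comparison only.
protocol ClassBoundAnimal: AnyObject { func makeNoise() }

// One machine word, measured as the size of a pointer.
let word = MemoryLayout<UnsafeRawPointer>.size

// A payload-free enum only needs enough bits to tell its cases apart.
let enumSize = MemoryLayout<AnimalEnum>.size

// A non-class existential: 3-word inline buffer + metadata + witness table.
let existentialWords = MemoryLayout<AnimalProtocol>.size / word

// A class-bound existential: object reference + witness table.
let classExistentialWords = MemoryLayout<ClassBoundAnimal>.size / word

print("AnimalEnum: \(enumSize) byte(s)")
print("AnimalProtocol existential: \(existentialWords) words")
print("ClassBoundAnimal existential: \(classExistentialWords) words")
```

On a 64-bit machine that works out to 1 byte for the enum versus 40 bytes for the AnimalProtocol existential, before counting any heap allocation for large conforming values.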

The problem:

This is point 2.2 under Enums in the trade-offs above. There's practically no way to write a useful method on an enum without switching on it, or doing some other kind of pattern matching. Syntactically, your enum method bodies become huge, and your method implementations don't have a single responsibility.
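
Here's a small sketch of how this scales once an enum has more than one member. `noise()` and `favoriteFood` are hypothetical members made up for illustration, and they return strings instead of printing so the results are easy to inspect.

```swift
enum AnimalEnum {
    case dog
    case cat

    // Dog and cat logic share one body...
    func noise() -> String {
        switch self {
        case .dog: return "Woof"
        case .cat: return "Meow"
        }
    }

    // ...and again here. Everything about .dog is now spread over two
    // switches, interleaved with everything about .cat.
    var favoriteFood: String {
        switch self {
        case .dog: return "Bone"
        case .cat: return "Fish"
        }
    }
}

print(AnimalEnum.dog.noise(), AnimalEnum.dog.favoriteFood)
print(AnimalEnum.cat.noise(), AnimalEnum.cat.favoriteFood)
```

Every new member adds another switch, and every new case touches every switch. There's no single place to read "everything a dog does".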

Case Study: Java

As a little known feature, Java's enums support abstract methods. Enums define abstract methods, which must be implemented by every case. This allows short, sweet and focused method bodies: if you see a method body in the DOG case, you know you're working with a DOG. No need to check for CAT, and clutter the implementation for DOG with a cohabiting implementation for CAT.

public enum Animal {
    DOG {
        @Override public void makeNoise() {
            System.out.println("Woof");
        }
    },
    CAT {
        @Override public void makeNoise() {
            System.out.println("Meow");
        }
    };

    public abstract void makeNoise();
}

Questions:

  1. Is this enough of a problem that we should do something about it? I think so.
  2. How should it be designed? idk.

Your Java example seems far worse than Swift. In general it seems like you'll only save lines with something like this if you have a lot of methods and a lot of cases. This also seems like a feature that would have implications beyond enums.

It's not a design I'm proposing, just an idea of what one solution to this problem looks like.

It probably won't even save lines, but that's not the point. It's that you don't have cohabitation of completely unrelated method implementations, just because they have to share the same switch "roof".

What do you have in mind?

I think you could do something like this:

protocol Functioning {
  func doFunction()
}

struct FirstProxy: Functioning {
  func doFunction() {
    print("first set")
  }
}

struct SecondProxy: Functioning {
  let aNum: Int
  func doFunction() {
    print("second - \(aNum) set")
  }
}

enum SomeExhaustiveEnum {
  case first(FirstProxy)
  case second(SecondProxy)
  var proxy: Functioning {
    switch self {
    // The payloads have different types, so each case needs its own binding.
    case .first(let proxy):
      return proxy
    case .second(let proxy):
      return proxy
    }
  }
}

Would something like this solve what you're looking for?

Then you can get the benefits of being exhaustive without a switch case on every single instance of a value for calling a function.
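
For what it's worth, here's roughly how a call site could look with that approach. It's only a sketch: the types are repeated so it stands alone, and I've assumed doFunction() returns a String instead of printing, purely so the result is easy to check.

```swift
protocol Functioning {
    func doFunction() -> String
}

struct FirstProxy: Functioning {
    func doFunction() -> String { "first set" }
}

struct SecondProxy: Functioning {
    let aNum: Int
    func doFunction() -> String { "second - \(aNum) set" }
}

enum SomeExhaustiveEnum {
    case first(FirstProxy)
    case second(SecondProxy)

    // The only switch in the type. A new case fails to compile until it is
    // handled here, so exhaustiveness is still enforced.
    var proxy: Functioning {
        switch self {
        case .first(let proxy): return proxy
        case .second(let proxy): return proxy
        }
    }
}

let values: [SomeExhaustiveEnum] = [
    .first(FirstProxy()),
    .second(SecondProxy(aNum: 3)),
]

// Call sites go through `proxy` and never switch themselves.
for value in values {
    print(value.proxy.doFunction())
}
```

The catch is that `proxy` hands back an existential, so you pay the boxing and dynamic dispatch costs from the Polymorphism list on every call.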

I'm on mobile and away from my laptop, so my apologies for any logic and formatting errors.

I think that you're looking for something like Scala's "sealed traits". They're basically a way to initially declare the full set of types that conform to a protocol, and none can conform afterwards.

I wonder if we would have some stronger capabilities with generic values being added to Swift?

I could see the day that this would be valid syntax:

enum AnimalEnum {
    case dog, cat

    func makeNoise() {
        // I don't know what I am yet!
    }

    func makeNoise() where self == .dog {
        // I am a dog
    }
}

Yeah, I've done something like this in the past. It's decent, but I think a better enum syntax could improve it.