TL;DR
- Public protocols can't have internal customization points, and it's a pity.
- The visitor pattern does not always scale well.
- Enums rock.
During the many years of development of GRDB, the SQLite library, I had to support many kinds of "expressions" (values, columns, compound expressions such as Column("a") < Column("b"), and many others).
I went through several refactorings, and settled on:
- A public but opaque type that wraps a private enum, with internal factory initializers and instance methods.
- A public protocol that can build the opaque type from any other type (user types, GRDB types, as well as standard types when relevant).
// Public opaque type
public struct SQLExpression {
    // Private implementation
    private var impl: Impl
    private enum Impl {
        case value(...)
        case column(...)
        ...
    }

    // Internal factory methods
    internal static func value(...) -> Self {
        self.init(impl: .value(...))
    }
    internal static func column(...) -> Self {
        self.init(impl: .column(...))
    }
    ...
}

extension SQLExpression {
    // A lot of internal instance methods
    internal func frobnicate(_ context: FrobnicationContext) -> Frobnication {
        switch impl { ... }
    }
}

// Public protocol that produces SQLExpression
public protocol SQLExpressible {
    var sqlExpression: SQLExpression { get }
}
// Public expressible types
// (the witness of a public protocol requirement must itself be public)
public struct Column: SQLExpressible {
    var name: String
    public var sqlExpression: SQLExpression { .column(name) }
}
extension Int: SQLExpressible {
    public var sqlExpression: SQLExpression { .value(self) }
}
The role of the factory initializers is mainly to prevent the creation of invalid values of the private enum. Some cases have invariants that can't be expressed by the enum type itself (such as a case whose payload is an array that must not be empty, for example).
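Here is a minimal sketch of how a factory method can guard such an invariant. The `sum` case, the `sql` property, and the way degenerate inputs are normalized are assumptions made for illustration; they are not GRDB's actual code.

```swift
// Hypothetical, simplified expression type (illustration only).
public struct SQLExpression {
    private enum Impl {
        case value(Int)
        case column(String)
        // Invariant the enum type cannot express: the array must not be empty.
        case sum([SQLExpression])
    }

    private var impl: Impl

    private init(impl: Impl) {
        self.impl = impl
    }

    static func value(_ value: Int) -> Self { Self(impl: .value(value)) }
    static func column(_ name: String) -> Self { Self(impl: .column(name)) }

    // The only way to build a `.sum`: empty and single-element inputs
    // are normalized, so an empty `.sum` can never exist.
    static func sum(_ operands: [SQLExpression]) -> Self {
        switch operands.count {
        case 0: return .value(0)
        case 1: return operands[0]
        default: return Self(impl: .sum(operands))
        }
    }

    // One internal method, implemented as a single exhaustive switch.
    var sql: String {
        switch impl {
        case .value(let value):
            return String(value)
        case .column(let name):
            return "\"\(name)\""
        case .sum(let operands):
            return "(" + operands.map { $0.sql }.joined(separator: " + ") + ")"
        }
    }
}
```

Because `Impl` and the initializer are private, code outside the type can only go through the factories, which is what makes the invariant hold.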
That's a lot of boilerplate, but I'm pretty happy with it.
The pollution of public apis is minimal (an sqlExpression property that does not bother anybody). All other apis have the desired visibility (public, internal, private).
What did I try before? A lot of painful solutions
Why not a public protocol with one type per case?
For two reasons:
- If you don't want to cast at runtime (case let value as Column, etc.), with the risk of forgetting a case, then you have to rely on customization points.
Public protocols cannot define internal customization points. So the frobnicate method above has to be public, as well as the FrobnicationContext and Frobnication types.
The generally accepted technique for public apis that should have remained internal is to prefix their names with underscores: _frobnicate, _FrobnicationContext... The problem is that those ugly underscores proliferate, and you end up with a lot of public types for no good reason.
None of those problems happen with the enum-based technique suggested at the beginning: all apis that should remain internal remain internal.
- In my particular case, I wanted to compare all the various implementations of _frobnicate across the types that implement it. Due to the file-scoping of private and fileprivate apis, all those implementations were scattered across many files, and it was difficult to compare them at a glance.
Again, this problem does not happen with the enum-based technique: all implementations are a big switch on the cases of the private enum, easy to navigate.
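To make the risk of the runtime-cast alternative (first reason above) concrete, here is a small sketch. The `Expression` protocol and the `Value` and `Sum` types are hypothetical and only serve the illustration.

```swift
// Hypothetical protocol-based design: one type per kind of expression.
protocol Expression {}
struct Column: Expression { var name: String }
struct Value: Expression { var value: Int }
struct Sum: Expression {
    var lhs: Expression
    var rhs: Expression
}

// Generates SQL by casting at runtime. Returns nil for unknown types.
func sql(_ expression: Expression) -> String? {
    switch expression {
    case let column as Column:
        return column.name
    case let value as Value:
        return String(value.value)
    // `Sum` was forgotten. A switch over a protocol value always needs
    // a `default`, so the compiler cannot point out the omission.
    default:
        return nil
    }
}
```

With this design, `sql(Sum(...))` compiles fine and only fails at runtime, whereas a switch over a private enum would stop compiling as soon as a case is missing.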
Why not the visitor pattern?
Some of the public-visibility problems described above still apply, but at least most of the methods that had to become public above can remain internal:
public protocol _SQLExpressionVisitor {
    mutating func visit(column: Column)
    mutating func visit(value: ...)
    ...
}
public protocol _SQLExpression {
    func _accept<Visitor: _SQLExpressionVisitor>(_ visitor: inout Visitor)
}
// Public expressible types
public struct Column: _SQLExpression {
    var name: String
    public func _accept<Visitor: _SQLExpressionVisitor>(_ visitor: inout Visitor) {
        visitor.visit(column: self)
    }
}
extension Int: _SQLExpression {
    public func _accept<Visitor: _SQLExpressionVisitor>(_ visitor: inout Visitor) {
        visitor.visit(value: self)
    }
}
// Internal apis
extension _SQLExpression {
    internal func frobnicate(_ context: FrobnicationContext) -> Frobnication {
        var visitor = Frobnicator(context: context)
        _accept(&visitor)
        return visitor.result
    }
}
private struct Frobnicator: _SQLExpressionVisitor {
    var context: FrobnicationContext
    mutating func visit(column: Column) { ... }
    mutating func visit(value: ...) { ... }
    ...
}
The visitor pattern is type-safe: there is no dynamic cast.
The amount of underscored apis is reduced.
All implementations for a given method can be located in the same file (the slight cost is that visited types must sometimes expose some private stuff).
So what's the problem with the visitor pattern?
The visitor pattern does not deal very well with type hierarchies. The GRDB SQLite wrapper does not only deal with SQL expressions, but also with selections, ordering terms, and various other members of the SQLite grammar. It happens that in order to deal with selections (such as the "*" in "SELECT * ..."), one must also deal with all expressions ("SELECT name, score + 1 ..."). This creates a hierarchy in the visitor protocols:
public protocol _SQLSelectionVisitor: _SQLExpressionVisitor {
    // All selection-specific visiting methods (plus inherited ones for expressions)
    mutating func visitAllColumns()
    ...
}
Practically speaking, selection visitors have the same implementation for most expression-visiting methods, with a few exceptions. Repeating the same code again and again smells, so you try to define a convenience default implementation for all expressions:
public protocol _SQLSelectionVisitor: _SQLExpressionVisitor {
    mutating func visitAllColumns()
    ...
    // Convenience for all expressions
    mutating func visit(expression: _SQLExpression)
}
// Default implementations
extension _SQLSelectionVisitor {
    mutating func visit(column: Column) { visit(expression: column) }
    mutating func visit(value: ...) { visit(expression: value) }
    ...
}
private struct SomeSelectionVisitor: _SQLSelectionVisitor {
    // Thanks to the convenience, all expressions are visited here.
    mutating func visit(expression: _SQLExpression) { ... }
}
But if one or two expression types require special processing, the compiler will not tell you, because the catch-all convenience method satisfies the compiler. It is thus very easy to forget. Of course, one can rely on tests, but the compiler reports the problem sooner than runtime tests do: nothing beats static checks.
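The pitfall can be reproduced with a trimmed-down sketch. All names below (`ExpressionVisitor`, `SelectionVisitor`, `ColumnNameCollector`) are hypothetical stand-ins for the underscored protocols above, reduced to two expression types.

```swift
// Hypothetical, reduced version of the visitor protocols.
protocol Expression {}
struct Column: Expression { var name: String }
struct Value: Expression { var value: Int }

protocol ExpressionVisitor {
    mutating func visit(column: Column)
    mutating func visit(value: Value)
}

protocol SelectionVisitor: ExpressionVisitor {
    mutating func visitAllColumns()
    // Catch-all convenience for all expressions
    mutating func visit(expression: Expression)
}

// Default implementations forward every expression to the catch-all.
extension SelectionVisitor {
    mutating func visit(column: Column) { visit(expression: column) }
    mutating func visit(value: Value) { visit(expression: value) }
}

// This visitor is supposed to record column names, but its author
// forgot to implement visit(column:). It compiles without a warning.
struct ColumnNameCollector: SelectionVisitor {
    var names: [String] = []
    mutating func visitAllColumns() { names.append("*") }
    mutating func visit(expression: Expression) { /* nothing to record */ }
}

var collector = ColumnNameCollector()
collector.visit(column: Column(name: "score"))
// The column went through the catch-all: `collector.names` is still empty.
```

Nothing in the compiler output hints that `visit(column:)` needed its own implementation here; only a test that inspects `names` would reveal it.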
So in the end I stopped adding the convenience catch-all method, and ended up with awfully long lists of identical or nearly identical methods.
Again, this problem does not happen with the technique of the private enum described at the beginning.
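As a rough sketch of how the enum technique handles the same hierarchy, a selection type can wrap its own private enum, with one case that carries an expression. The `SQLSelection` type and its cases are assumptions for illustration, not GRDB's actual definitions.

```swift
// Hypothetical, reduced expression type.
struct SQLExpression {
    private enum Impl {
        case column(String)
        case value(Int)
    }
    private var impl: Impl

    static func column(_ name: String) -> Self { Self(impl: .column(name)) }
    static func value(_ value: Int) -> Self { Self(impl: .value(value)) }

    var sql: String {
        // No `default`: adding a case to Impl breaks this switch
        // at compile time until the new case is handled.
        switch impl {
        case .column(let name): return name
        case .value(let value): return String(value)
        }
    }
}

// Hypothetical selection type: either "*" or any expression.
struct SQLSelection {
    private enum Impl {
        case allColumns
        case expression(SQLExpression)
    }
    private var impl: Impl

    static var allColumns: Self { Self(impl: .allColumns) }
    static func expression(_ expression: SQLExpression) -> Self {
        Self(impl: .expression(expression))
    }

    var sql: String {
        switch impl {
        case .allColumns: return "*"
        case .expression(let expression): return expression.sql
        }
    }
}
```

Each method is one exhaustive switch, so there is no catch-all that could silently swallow a case that needs special treatment.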
As a conclusion: enums made me happy, perhaps they will do the same to you.