i’ve been struggling to create a basic BSON encoding DSL in swift for several days now, and i just can’t shake the feeling that this is impossible and that swift is just really bad at DSLs.
here’s my problem:
i have a protocol BSONEncodable, which has some of the usual conformers:
protocol BSONEncodable
{
}
extension Int:BSONEncodable
{
}
extension String:BSONEncodable
{
}
extension Optional:BSONEncodable where Wrapped:BSONEncodable
{
}
i also have an encoding container, UniversalBSONDSL, which, for the purposes of this example, just looks like:
struct UniversalBSONDSL
{
    init()
    {
    }
    init(with populate:(inout Self) throws -> ()) rethrows
    {
        self.init()
        try populate(&self)
    }
}
UniversalBSONDSL models a document, and documents can contain other documents, so UniversalBSONDSL is itself BSONEncodable.
extension UniversalBSONDSL:BSONEncodable
{
}
finally, UniversalBSONDSL vends a key-value pair-based encoding interface through some instance subscripts. here’s a simplified schematic:
extension UniversalBSONDSL
{
    subscript<First, Second>(key:String) -> (First, Second)?
        where First:BSONEncodable, Second:BSONEncodable
    {
        get { nil }
        set { fatalError() }
    }
    subscript<Encodable>(key:String) -> Encodable?
        where Encodable:BSONEncodable
    {
        get { nil }
        set { fatalError() }
    }
}
it can be used like this:
let _:UniversalBSONDSL = .init
{
    $0["$abs"] = 1
}
let _:UniversalBSONDSL = .init
{
    $0["$add"] = (1, "$field")
}
where this all falls apart is the recursive case. you see, aggregation expressions (e.g. $abs) can contain other expressions, so there needs to be an easy way to nest these expressions. for example, we want to be able to encode something like the following JSON:
{
    $add: [ 1, { $abs: 1 } ]
}
and this is where swift completely falls flat, because even with SE-0299 (which extended static member lookup to generic contexts), this use case just doesn’t work.
to start, SE-0299 doesn’t work with inits, so even if you move the definition of init(with:) to a protocol extension block, it still doesn’t compile:
extension BSONEncodable where Self == UniversalBSONDSL
{
    init(with populate:(inout Self) throws -> ()) rethrows
    {
        self.init()
        try populate(&self)
    }
}
let bson:UniversalBSONDSL = .init
{
    $0["$add"] = (0, .init
    {
        _ in
    })
}
encodable.swift:77:9: error: generic parameter 'Encodable' could not be inferred
        $0["$add"] = (0, .init
        ^
encodable.swift:39:5: note: in call to 'subscript(_:)'
    subscript<Encodable>(key:String) -> Encodable?
    ^
encodable.swift:77:22: error: cannot assign value of type '(Int, _)' to subscript of type 'Encodable'
        $0["$add"] = (0, .init
                     ^~~~~~~~~
encodable.swift:77:27: error: cannot infer contextual base in reference to member 'init'
        $0["$add"] = (0, .init
but SE-0299 does work with static methods, at least superficially, probably because it is implemented in terms of things that return Self (which apparently init is not one of). so this miraculously does compile:
extension BSONEncodable where Self == UniversalBSONDSL
{
    static
    func document(with populate:(inout Self) throws -> ()) rethrows -> Self
    {
        try .init(with: populate)
    }
}
let bson:UniversalBSONDSL = .init
{
    $0["$add"] = (1, .document
    {
        _ in
    })
}
but now this is where i run into a lot of weirdness, because the minute i try to actually do something with the closure parameter, it stops compiling:
let bson:UniversalBSONDSL = .init
{
    $0["$add"] = (1, .document
    {
        $0["$abs"] = 1
    })
}
encodable.swift:86:9: error: generic parameter 'Encodable' could not be inferred
        $0["$add"] = (1, .document
        ^
encodable.swift:39:5: note: in call to 'subscript(_:)'
    subscript<Encodable>(key:String) -> Encodable?
    ^
encodable.swift:86:22: error: cannot assign value of type '(Int, _)' to subscript of type 'Encodable'
        $0["$add"] = (1, .document
                     ^~~~~~~~~~~~~
encodable.swift:86:27: error: cannot infer contextual base in reference to member 'document'
        $0["$add"] = (1, .document
                         ~^~~~~~~~
the closure parameter needs a type annotation:
let bson:UniversalBSONDSL = .init
{
    $0["$add"] = (1, .document
    {
        (bson:inout UniversalBSONDSL) in
        bson["$abs"] = 1
    })
}
but this syntax is just awful. and it’s really strange that using the $0 parameter breaks type inference, because when i hover over the _-bound one in VSCode i can see that the compiler really did choose the correct overload.
ideally, i would be able to write something like the following:
let bson:UniversalBSONDSL = .init
{
    // A
    $0["$abs"] = "$field"
    // B
    $0["$abs"] = .init
    {
        $0["$abs"] = "$field"
    }
    // C
    $0["$add"] = (1, "$field")
    // D
    $0["$add"] = (1, .init
    {
        $0["$abs"] = 1
    })
}
A works. B doesn’t work out of the box, but can be made to work in a scalable fashion by vending init(with:) on Optional<UniversalBSONDSL>.
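for reference, this is roughly what i mean by vending init(with:) on Optional<UniversalBSONDSL>. it is only a sketch under some assumptions: the rendered requirement and the fields storage are placeholders i made up so the example has something observable to print, only the single-value subscript is included, and the Optional conformance is left out to keep it short.

```swift
protocol BSONEncodable
{
    // hypothetical requirement, only here so the result can be inspected
    var rendered:String { get }
}
extension Int:BSONEncodable
{
    var rendered:String { "\(self)" }
}
extension String:BSONEncodable
{
    var rendered:String { "\"\(self)\"" }
}
struct UniversalBSONDSL
{
    // placeholder storage: one "key: value" string per encoded field
    var fields:[String] = []

    init()
    {
    }
    init(with populate:(inout Self) throws -> ()) rethrows
    {
        self.init()
        try populate(&self)
    }
}
extension UniversalBSONDSL:BSONEncodable
{
    var rendered:String { "{" + self.fields.joined(separator: ", ") + "}" }
}
extension UniversalBSONDSL
{
    subscript<Encodable>(key:String) -> Encodable?
        where Encodable:BSONEncodable
    {
        get { nil }
        set
        {
            if let value = newValue
            {
                self.fields.append("\(key): \(value.rendered)")
            }
        }
    }
}
// the same initializer, vended on the optional type, so that `.init { ... }`
// can be looked up when the contextual type is `Encodable?`
extension Optional where Wrapped == UniversalBSONDSL
{
    init(with populate:(inout UniversalBSONDSL) throws -> ()) rethrows
    {
        self = try .some(UniversalBSONDSL.init(with: populate))
    }
}

// case B from the list above
let nested:UniversalBSONDSL = .init
{
    $0["$abs"] = .init
    {
        $0["$abs"] = "$field"
    }
}
print(nested.rendered)
```

the only part that matters here is the Optional extension; everything else is scaffolding. it scales in the sense that it is written once per document type rather than once per subscript overload.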
C works, but D doesn’t, and the usual workaround (vending a concretely-typed subscript overload) isn’t effective, because aggregation operators can take up to four arguments, and each argument position can independently be generic or any one of the concrete types that need to be special-cased, so the number of subscript overloads required grows exponentially with the number of arguments.
in the absence of a consistently-behaving SE-0299, it seems the best we can do is either
$0["$add"] = (1, UniversalBSONDSL.init
{
    $0["$abs"] = 1
})
or
$0["$add"] = (1, .document
{
    (bson:inout UniversalBSONDSL) in
    bson["$abs"] = 1
})
and both of these syntaxes are awful when all we wanted to write was
{
    $add: [ 1, { $abs: 1 } ]
}
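in case anyone wants to poke at this, here is a self-contained version of the first workaround (spelling out UniversalBSONDSL.init) that should actually run. it is a sketch, not the real implementation: the rendered requirement and the fields storage are hypothetical stand-ins for real BSON output, and the Optional conformance is omitted.

```swift
protocol BSONEncodable
{
    // hypothetical requirement, not part of the real API
    var rendered:String { get }
}
extension Int:BSONEncodable
{
    var rendered:String { "\(self)" }
}
extension String:BSONEncodable
{
    var rendered:String { "\"\(self)\"" }
}
struct UniversalBSONDSL
{
    // placeholder storage: one "key: value" string per encoded field
    var fields:[String] = []

    init()
    {
    }
    init(with populate:(inout Self) throws -> ()) rethrows
    {
        self.init()
        try populate(&self)
    }
}
extension UniversalBSONDSL:BSONEncodable
{
    var rendered:String { "{" + self.fields.joined(separator: ", ") + "}" }
}
extension UniversalBSONDSL
{
    // two-argument operators, rendered as a two-element list
    subscript<First, Second>(key:String) -> (First, Second)?
        where First:BSONEncodable, Second:BSONEncodable
    {
        get { nil }
        set
        {
            if let pair = newValue
            {
                self.fields.append("\(key): [\(pair.0.rendered), \(pair.1.rendered)]")
            }
        }
    }
    // single-argument operators
    subscript<Encodable>(key:String) -> Encodable?
        where Encodable:BSONEncodable
    {
        get { nil }
        set
        {
            if let value = newValue
            {
                self.fields.append("\(key): \(value.rendered)")
            }
        }
    }
}

// the nested document has to name its type explicitly
let bson:UniversalBSONDSL = .init
{
    $0["$add"] = (1, UniversalBSONDSL.init
    {
        $0["$abs"] = 1
    })
}
print(bson.rendered)
```

with these placeholder definitions the printed text is {$add: [1, {$abs: 1}]}, which is the shape we were after; the complaint is purely about having to write UniversalBSONDSL.init at every level of nesting.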