Hello community!
I’d like to pitch an idea for a user-friendly way for functions to pull values from an arbitrary environment. Let me introduce the concept with a motivating example before I dig into syntax and semantics. Note that I intentionally removed many pieces of code from my examples, but I trust the context will remain clear.
Say you are writing a visitor (with the pattern of the same name) for an AST to implement an interpreter:
class Interpreter: Visitor {
    func visit(_ node: BinExpr) { /* ... */ }
    func visit(_ node: Literal) { /* ... */ }
    func visit(_ node: Scope) { /* ... */ }
    func visit(_ node: Identifier) { /* ... */ }
}
Although this design pattern is often recommended for AST processing, managing data as we go down the tree can be cumbersome. The problem is that we need to store all intermediate results as we climb up the tree in some instance member, because we can’t use the return type of the visit(_:) method, as we would do with a recursive function:
class Interpreter: Visitor {
    func visit(_ node: BinExpr) {
        node.lhs.accept(self)
        let lhs = accumulator!
        node.rhs.accept(self)
        let rhs = accumulator!
        /* ... */
    }
    func visit(_ node: Literal) { /* ... */ }
    func visit(_ node: Scope) { /* ... */ }
    func visit(_ node: Identifier) { /* ... */ }
    var accumulator: Int? = nil
    /* ... */
}
As our interpreter grows and more visit methods need to “return” a value, we’ll be forced to add more and more stored properties to its definition. Besides, the state of those properties is difficult to debug, as it can quickly become unclear what depth of the tree they should be associated with. In effect, all these properties act as global variables.
The problem gets even bigger when we need to pass variables to a particular execution of visit(_:). Not only do we need to add a stored property to represent each “argument”, but we also have to store them in stacks so that nested calls to a particular visit method each get their own “evaluation context”. Consider for instance the implementation of visit(_ node: Identifier), assuming that the language our AST represents supports lexical scoping:
class Interpreter: Visitor {
    /* ... */
    func visit(_ node: Scope) {
        symbols.append([:])
        for child in node.children {
            child.accept(self)
        }
        symbols.removeLast()
    }
    func visit(_ node: Identifier) {
        accumulator = symbols.last![node.name]!
    }
    var symbols = [[String: Int]]()
}
We could instead create another instance of our visitor to manage those evaluation contexts. But that would require us to explicitly copy all the variables associated with those contexts, which could be inefficient and error-prone.
In fact, this last point is also true when dealing with recursive functions. For instance, our visit(_ node: Identifier) method could be rewritten as:
func interpret(_ identifier: Identifier, symbols: [String: Int]) -> Int { /* ... */ }
so that its evaluation context is passed as a parameter. But this requires all other functions to pass this argument along as well, even if their own execution does not need it:
func interpret(_ binExpr: BinExpr, symbols: [String: Int]) -> Int {
    let lhs = interpret(binExpr.lhs, symbols: symbols)
    /* ... */
}
This technique of passing parameters through a function just so that another function called deeper in the stack can get its variable is actually quite common. Sadly, it clutters signatures with many parameters, which makes it more difficult to reason about what a particular function actually needs from its caller. Note also that it burdens the call stack, putting unnecessary values in every execution frame along the way.
The idea I’d like to pitch is to offer a mechanism to address this issue. Namely, I’d like a way to provide a function with an environment when using its parameters and/or return type is not an option, or when doing so would add unnecessary complexity to its signature (as illustrated above). While this mechanism would share similarities with how functions (and closures) are able to capture variables when they are declared, it would differ in that this environment would depend on the execution frames preceding a particular function call rather than on the function declaration/definition.
First, one would declare a contextual variable:
context var symbols: [String: Int]?
Such contextual variables could be seen as stacks whose elements have the type of the variable. In this particular example, the type of the context symbols would be [String: Int]. The optional is needed to explicitly represent the fact that a context may not always be set, but this could be inferred as well. One would be able to set the value of a contextual variable, effectively pushing a value on the stack it represents, before entering a new execution frame:
set symbols = [:] in {
    for child in node.children {
        child.accept(self)
    }
}
In the above example, the contextual variable symbols would represent an empty dictionary for all execution frames entered within the context scope (delimited by braces). Extracting a value from a context would boil down to reading an optional value:
guard let val = symbols?[node.name] else {
    fatalError("undefined symbol: \(node.name)")
}
accumulator = val
And as contextual variables would actually be stacks, one could push another value on top of them to set up another evaluation context. Hence, were we to call set symbols = [:] in { /* ... */ } again, the contextual variable symbols would represent another empty dictionary for as long as the new context is alive:
set symbols = ["foo": 1] in {
    set symbols = ["foo": 2] in {
        print(symbols!["foo"]!)
        // Prints 2
    }
    print(symbols!["foo"]!)
    // Prints 1
}
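The push/pop behaviour described above can be approximated in today's Swift with an explicit stack, which may help when experimenting with the idea. This is only a sketch under stated assumptions: the Context class and its with(_:_:) method are hypothetical names that are not part of the pitch, and the closure-based call merely stands in for the proposed set ... in { } statement.

```swift
// Hypothetical helper (not part of the pitch): emulates a contextual
// variable with an explicit stack of values.
final class Context<Value> {
    private var stack: [Value] = []

    /// The value of the innermost active scope, or nil if none is set.
    var current: Value? { stack.last }

    /// Pushes `value`, runs `body`, then pops the value again,
    /// even if `body` throws.
    func with<R>(_ value: Value, _ body: () throws -> R) rethrows -> R {
        stack.append(value)
        defer { stack.removeLast() }
        return try body()
    }
}

let symbols = Context<[String: Int]>()
var seen: [Int?] = []

seen.append(symbols.current?["foo"])          // no context set yet
symbols.with(["foo": 1]) {
    symbols.with(["foo": 2]) {
        seen.append(symbols.current?["foo"])  // innermost value
    }
    seen.append(symbols.current?["foo"])      // outer value is visible again
}
print(seen)
```

Unlike the proposed language feature, this emulation still requires every reader to have access to the Context instance, so it only illustrates the stacking semantics, not the implicit lookup.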
The advantage of that approach is threefold.
1. It lets us provide an environment to functions that can’t receive more parameters or return custom values. This is particularly useful when dealing with libraries that provide an entry point to define custom behaviour, but fix the API of the functions they expect (e.g. a visitor protocol). In those instances, capture by closure is not always possible/desirable.
2. In a large function hierarchy, it lets us provide deep functions with variables, without the need to pass them in every single function call on the off chance that a function deeper in the call graph may need them.
3. It better defines the notion of stacked environment, so that one can “override” an execution context, which is often desirable when processing recursive structures such as trees or graphs. In particular, it is very useful when not all functions require all data that are passed down the tree.
Using our contextual variables, one could rewrite our motivational example as follows:
class Interpreter: Visitor {
    func visit(_ node: BinExpr) {
        let lhs, rhs: Int
        set accumulator = nil in {
            node.lhs.accept(self)
            lhs = accumulator!
        }
        set accumulator = nil in {
            node.rhs.accept(self)
            rhs = accumulator!
        }
        switch node.op {
        case "+":
            accumulator = lhs + rhs
        case "-":
            accumulator = lhs - rhs
        default:
            fatalError("unexpected operator \(node.op)")
        }
    }
    func visit(_ node: Literal) {
        accumulator = node.val
    }
    func visit(_ node: Scope) {
        set symbols = [:] in {
            for child in node.children {
                child.accept(self)
            }
        }
    }
    func visit(_ node: Identifier) {
        guard let val = symbols?[node.name] else {
            fatalError("undefined symbol: \(node.name)")
        }
        accumulator = val
    }
    context var accumulator: Int?
    context var symbols: [String: Int]?
}
It is no longer unclear what depth of the tree the accumulator variable should be associated with. The mechanism is handled automatically, preventing the programmer from incorrectly reading a value that was previously set for another descent. Nor do we need to manage the stack of the symbols variable by hand, which was error-prone in our previous implementation.
The scope of contextual variables should not be limited to type declarations. One may want to declare them in the global scope of a module, so that they would be part of the API of a library. Imagine for instance a web framework library, using contextual variables to provide the context of a request handler:
// In the framework ...
public context var authInfo: AuthInfo
// In the user code ...
framework.addHandler(for: URL("/index")) {
    guard let user = authInfo?.user else {
        return Redirect(to: URL("/login"))
    }
    return Response("Welcome back \(user.name)!")
}
In that example, one could imagine that the framework would set the contextual authInfo variable with the authentication information it would parse from the request before calling the registered handlers.
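To make the framework side of this scenario more concrete, here is a rough emulation in today's Swift. Everything in it is an assumption made for illustration: the Framework class, its dispatch(path:userName:) method, the Reply enum and the User and AuthInfo structs are hypothetical stand-ins, paths are plain strings, and an explicit stack plays the role of the contextual variable.

```swift
// All names below are hypothetical stand-ins, not a real framework API.
struct User { let name: String }
struct AuthInfo { let user: User? }

enum Reply: Equatable {
    case redirect(to: String)
    case response(String)
}

final class Framework {
    private var handlers: [String: () -> Reply] = [:]
    // Explicit stack emulating `context var authInfo: AuthInfo?`.
    private var authInfoStack: [AuthInfo] = []

    /// Stand-in for reading the contextual variable from a handler.
    var authInfo: AuthInfo? { authInfoStack.last }

    func addHandler(for path: String, _ handler: @escaping () -> Reply) {
        handlers[path] = handler
    }

    /// Emulates `set authInfo = ... in { handler() }`: the value is pushed
    /// before the handler runs and popped when it returns.
    func dispatch(path: String, userName: String?) -> Reply? {
        guard let handler = handlers[path] else { return nil }
        authInfoStack.append(AuthInfo(user: userName.map { User(name: $0) }))
        defer { authInfoStack.removeLast() }
        return handler()
    }
}

let framework = Framework()
framework.addHandler(for: "/index") {
    guard let user = framework.authInfo?.user else {
        return .redirect(to: "/login")
    }
    return .response("Welcome back \(user.name)!")
}

let anonymous = framework.dispatch(path: "/index", userName: nil)
let known = framework.dispatch(path: "/index", userName: "Ada")
```

The handler signature stays fixed at () -> Reply, which is the point of the pitch: the authentication information reaches the handler without appearing in its parameter list.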
This idea is not exactly new. In fact, people familiar with Python may recognise some similarities with how "with statements" work. Hence, it is not surprising that things one is able to do with Python’s contexts would be possible to do with contextual variables as presented above. Consider for instance the following class:
class Connexion {
    init(to: URL) { /* ... */ }
    deinit {
        self.disconnect()
    }
    func disconnect() { /* ... */ }
}
Thanks to Swift’s memory management, creating an instance of Connexion as a contextual variable would automatically call its deinitializer when the context gets popped.
context var conn: Connexion
set conn = Connexion(to: URL("http://some.url.com")) in {
    /* ... */
} // the connection is disconnected here
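The same lifetime behaviour can be observed in today's Swift by emulating the context with an explicit stack. The sketch below rests on a few assumptions: the report closure and the connStack array are hypothetical additions used only to record and drive events, the URL is a plain string to avoid depending on Foundation, and the stack is assumed to hold the only reference to the connection.

```swift
final class Connexion {
    // Hypothetical hook used only to record what happens and when.
    private let report: (String) -> Void

    init(to url: String, report: @escaping (String) -> Void) {
        self.report = report
        report("connect \(url)")
    }

    deinit {
        self.disconnect()
    }

    func disconnect() { report("disconnect") }
}

var events: [String] = []
// Explicit stack emulating `context var conn: Connexion?`.
var connStack: [Connexion] = []

do {
    // Emulates `set conn = Connexion(...) in { ... }`.
    connStack.append(Connexion(to: "http://some.url.com") { events.append($0) })
    defer { connStack.removeLast() }
    events.append("body")
}
events.append("after")
print(events)
```

Popping the stack drops the last strong reference, so the deinitializer runs at the end of the scope, before the code that follows it.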
I see many other applications for such contextual variables, but I think this email is long enough.
I’m looking forward to your thoughts and feedback.
Best regards,
Dimitri Racordon
CUI, Université de Genève
7, route de Drize, CH-1227 Carouge - Switzerland
Phone: +41 22 379 01 24