Idea for enabling DSLs: bind to self in closures

( Originally proposed here: [proposal] bind input parameter to… | Apple Developer Forums )

Often, frameworks will wish to provide a domain-specific language for configuration or composition. Web frameworks, for instance, may define methods like

get("/") { request, result in … }

Testing frameworks (such as Quick/Nimble) may wish to define a terse and expressive syntax for defining behavioral tests:

describe("the 'Documentation' directory") {
      it("has everything you need to get started") {
            let sections = Directory("Documentation").sections
            expect(sections).to(contain("Organized Tests with Quick Examples and Example Groups"))
      }
}

While expressive, this has a big problem: describe, it, expect and contain must now be defined either as global functions (which cause namespace pollution and mandate global state) or as instance methods on a class (which requires class inheritance, and is limited by single inheritance).

You could have some sort of context object passed into the closure instead:

protocol SpecBuilder {
      func describe(description:String, inner:(QuickSpecContext)->())
}

protocol QuickSpecContext {
      func it(description:String, inner:(QuickSpecContext)->())
      func expect<T>(statement:@autoclosure ()->T, file: StaticString = __FILE__, line: UWord = __LINE__ ) -> Expectation<T>
}

var spec = QuickSpecBuilder(config)
spec.describe("the 'Documentation' directory") {
      context in
      context.it("has everything you need to get started") {
            context in
            let sections = Directory("Documentation").sections
            context.expect(sections).to(contain("Organized Tests with Quick Examples and Example Groups"))
      }
}

But this has significantly more noise. So my proposal is to allow a closure argument to be used as the current type instance, that is, to be able to redefine ‘self’ within a block.

var spec = QuickSpecBuilder(config)
spec.describe("the 'Documentation' directory") {
      self in
      it("has everything you need to get started") {
            self in
            let sections = Directory("Documentation").sections
            expect(sections).to(contain("Organized Tests with Quick Examples and Example Groups"))
      }
}

Resolution remains the same (lexical scope shadowing the type); this is merely shorthand to allow expressive grammars without requiring class inheritance or global functions. It also remains optional to use: the last two examples are based around the same protocols and should compile to the same code.
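For comparison, the context-passing form can already be written without any language change. The following is a minimal sketch in current Swift, not Quick's or Nimble's actual API: SpecBuilder, SpecContext, Report and the expect(_:toEqual:) signature are all hypothetical names chosen for illustration. It shows the state (the description path and the result log) travelling through the closure parameter rather than through globals.

```swift
// Hypothetical types for illustration; not the real Quick/Nimble API.
final class Report {
    var lines: [String] = []
}

final class SpecContext {
    // Descriptions inherited from the enclosing describe/it calls.
    let path: [String]
    // Shared reference, so nested contexts log to the same place.
    let report: Report

    init(path: [String], report: Report) {
        self.path = path
        self.report = report
    }

    func it(_ description: String, _ body: (SpecContext) -> Void) {
        body(SpecContext(path: path + [description], report: report))
    }

    func expect<T: Equatable>(_ actual: @autoclosure () -> T, toEqual expected: T) {
        let passed = actual() == expected
        report.lines.append("\(path.joined(separator: " ")): \(passed ? "ok" : "FAILED")")
    }
}

struct SpecBuilder {
    let report = Report()

    func describe(_ description: String, _ body: (SpecContext) -> Void) {
        body(SpecContext(path: [description], report: report))
    }
}

let spec = SpecBuilder()
spec.describe("the 'Documentation' directory") { context in
    context.it("has two sections") { context in
        context.expect(["Intro", "Usage"].count, toEqual: 2)
    }
}
print(spec.report.lines)
```

Every call inside the closures has to be prefixed with the context parameter, which is the noise the proposal aims to remove.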

I considered alternate syntaxes to express this, mostly alternatives on the bracketing of the closure itself to indicate binding a parameter to self. In the end, I decided:
1. When a block has multiple parameters, you would still need syntax to decide which, if any, is bound to self
2. The language complexity in having another syntax for expressing closures with different behavior may not be worth it
3. Code would be confusing for those not knowing the new syntax. “self in” is (comparatively) straightforward and descriptive

-DW

Another way to do this would be to support scoped imports, to make a set of top-level functions locally available without polluting the global namespace:

{
  import func QuickSpecBuilder.expect

  expect(sections).to(....)
}

Being able to elide self is already somewhat controversial, and a number of people find it makes code harder to read. I worry that allowing closures to change 'self' has the potential to be even more confusing. In Javascript, it's my understanding the ability to arbitrarily rebind 'this' is seen as a design flaw rather than a feature people regularly take advantage of.

-Joe

···

On Dec 3, 2015, at 10:28 PM, David Waite <david@alkaline-solutions.com> wrote:

Another way to do this would be to support scoped imports, to make a set of top-level functions locally available without polluting the global namespace:

{
  import func QuickSpecBuilder.expect

  expect(sections).to(....)
}

All of the matching methods (such as equal, contains, raiseError, raiseException) are also module functions, so you would likely just pull them all in. The Hamcrest Java project is very similar to Nimble, and they have a utility class generator (static methods calling other class static methods) specifically for reducing the number of static imports you have to put at the top of your test files.

The most common use of DSLs is to give an expressive grammar to builder patterns. The Ruby on Rails router has perhaps had more man hours put into it than any other DSL on the platform, and its primary purpose is to build
  - a dispatch table to match and parse parameters out of incoming URLs, and stuff them into methods on controller instances
  - a method (actually methods) for the reverse behavior of taking parameters and building the appropriate path, e.g. user_path(current_user)

These are often hierarchical via closures. The use of closures both allows specific context to be made available based on the outer configuration, and for the actual use of a function to be deferred until it is needed.

Another example would be a DSL providing inversion of control. This might define methods to resolve objects by name, with the closures defining the construction behavior. The closure would get as a parameter the context of the DSL lookup, to allow it to lazily retrieve its own dependencies. However, this can’t be the global reference to the DSL object, because you want to fail on circular dependencies. Instead, you get a state object representing the particular request which was made of the system.
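The inversion-of-control case can be sketched as follows. This is an illustration under assumed names (Container, Resolution, ResolutionError are hypothetical, not an existing library): each factory closure receives a per-request Resolution value rather than the container itself, and that value carries the chain of names being constructed so a circular dependency fails instead of recursing forever.

```swift
// Hypothetical names; a sketch only, not any existing container's API.
enum ResolutionError: Error, Equatable {
    case unknown(String)
    case circular([String])
}

final class Container {
    // Factories receive the per-request Resolution, never the Container.
    private var factories: [String: (Resolution) throws -> Any] = [:]

    func register(_ name: String, _ factory: @escaping (Resolution) throws -> Any) {
        factories[name] = factory
    }

    func resolve(_ name: String) throws -> Any {
        try Resolution(container: self, stack: []).resolve(name)
    }

    fileprivate func factory(for name: String) -> ((Resolution) throws -> Any)? {
        factories[name]
    }
}

struct Resolution {
    fileprivate let container: Container
    // Names currently being constructed for this request, outermost first.
    fileprivate let stack: [String]

    func resolve(_ name: String) throws -> Any {
        if stack.contains(name) {
            throw ResolutionError.circular(stack + [name])
        }
        guard let factory = container.factory(for: name) else {
            throw ResolutionError.unknown(name)
        }
        return try factory(Resolution(container: container, stack: stack + [name]))
    }
}

let container = Container()
container.register("greeting") { _ in "hello" }
container.register("message") { request in
    let greeting = try request.resolve("greeting")
    return "\(greeting), world"
}
// Two entries that depend on each other, to exercise the cycle check.
container.register("a") { request in try request.resolve("b") }
container.register("b") { request in try request.resolve("a") }

let message = try container.resolve("message") as? String

var cycle: ResolutionError?
do {
    _ = try container.resolve("a")
} catch let error as ResolutionError {
    cycle = error
}
```

Because the dependency chain lives in the value handed to each closure, no global or thread-local state is needed to detect the cycle.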

Yet another example is Quick itself - the ‘it’ and ‘expect’ functions are inheriting the ‘describe’ to aid in better error reporting, as well as any per-test invocation behavior (such as clearing out an in-memory database). This state can be passed in as an object, captured in lexical scope by the nested closure, or (in these cases) set and modified globally as needed. My concern is in people using global state to get the expressiveness they desire for their APIs.

With Nimble in particular, expect has state which currently (because it is a global function) has to be globally defined. This state also can’t be thread local, as your actual tests might involve multiple threads or callbacks.

Being able to elide self is already somewhat controversial, and a number of people find it makes code harder to read. I worry that allowing closures to change 'self' has the potential to be even more confusing. In Javascript, it's my understanding the ability to arbitrarily rebind 'this' is seen as a design flaw rather than a feature people regularly take advantage of.

Ruby has it as well, and while it is perhaps a bit more ‘magic’ than some would desire, you will not find Ruby developers lamenting its existence. Javascript’s inconsistent historical behaviors greatly contribute to the perception that rebinding is a design flaw.

However, the behavior I’m describing is different than either Ruby or Javascript:
- both Ruby and Javascript are late bound, which allows the JIT implementations to adapt to whatever type instance is given. Swift is not a late bound language, so it must know the types it is working with beforehand
- Ruby and Javascript allow the caller to set self/this. The choice to consider one of the parameters to have ‘self’ behavior in my proposal is entirely in the hands of the person writing the block of code being called, by choosing to name one of the passed in parameters ‘self’
- Due to late binding, it may be very difficult to determine what type your self/this is bound to in Ruby/Javascript. This is explicit in the signature of the closure in my proposal.
- Ruby and Javascript have no visual indicator that a block of code may be called with some other object as self/this. This is explicit in my proposal by naming one of the parameters in your closure ‘self’
- Finally, as a side-effect you can choose to bind a parameter to self independent of the API designer if you feel that makes your code more readable. Admittedly, I removed my example as the code was simplistic enough that $0 was more concise

That said, as proposed this is an expressiveness feature. I proposed it because I found I was uncomfortable with the number of global functions and amount of global state I was seeing in Swift modules - but could not come up with a way to make said code more robust/safe without negatively affecting expressiveness.

-DW

···

On Dec 4, 2015, at 10:36 AM, Joe Groff <jgroff@apple.com> wrote:

I like this idea. I'm very much against rebinding `self` because it
seems like an excellent source for confusion. Not only that, but the
actual underlying desire here isn't to remove `self` at all, but just to
introduce new functions into function resolution within a scope. And
this is precisely what adding imports in arbitrary scopes does (the only
downside being you need a line of code to add them, but that's not a big
deal). I know Rust allows this and it's pretty handy. I'd love to have
this feature even when not using a DSL.

-Kevin Ballard

···

On Fri, Dec 4, 2015, at 09:36 AM, Joe Groff wrote:

A few thoughts:

1. In a lot of situations they are not pure functions - they have state associated across them determined by the context in which your closure was called. So the import would not be of a static function, but of an input parameter, aka:
  it("…") {
    builder in
    import builder
    expect(sections).to{…}
  }

Assuming expect is the only function, this may very well be equivalent to

  it("…") {
    builder in
    let expect = builder.expect
    expect(sections).to{…}
  }

2. expect, in the case of Nimble, is an overloaded function. I assume import would bring in all overloads?
3. I like the idea of providing additional scope rather than overriding self, as you would likely need to bind self to a new name per my proposal (aka [my = self] self in…)
4. I like the idea of this having lexical scope, and import being usable outside of closures
5. Imports likely should generate conflicts at compile-time if they shadow defined functions if you can do wildcard imports. No need to have syntax to alias names - one should either change the code to not conflict or use the longer-form names
6. import could be an attribute:
  it("…") {
    @import builder in
    expect(sections).to{…}
  }
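The `let expect = builder.expect` equivalence from point 1 can be tried in current Swift. A minimal sketch, assuming a hypothetical Builder class with a single, non-generic expect method (the `it` function here is likewise made up for the example): referencing the method without calling it yields a function value that keeps the builder as its receiver, so the state stays with the parameter.

```swift
// Hypothetical Builder type, for illustration only.
final class Builder {
    // Per-closure state that would otherwise have to be global.
    private(set) var checked = 0

    func expect(_ value: Int) -> Bool {
        checked += 1
        return value > 0
    }
}

func it(_ description: String, _ body: (Builder) -> Void) -> Builder {
    let builder = Builder()
    body(builder)
    return builder
}

let used = it("counts expectations") { builder in
    // Partial application: `expect` keeps `builder` as its receiver.
    let expect = builder.expect
    _ = expect(1)
    _ = expect(2)
}
print(used.checked)
```

This only covers the simple case. With an overloaded expect, as raised in point 2, the local constant would need a type annotation to pick one overload, and a generic method cannot be bound to a local this way without fixing its type parameter.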

-DW

···

On Dec 4, 2015, at 3:51 PM, Kevin Ballard <kevin@sb.org> wrote:

All of the matching methods (such as equal, contains, raiseError,
raiseException) are also module functions, so you would likely just pull
them all in. The Hamcrest Java project is very similar to Nimble, and they
have a utility class generator (static methods calling other class static
methods) specifically for reducing the number of static imports you have to
put at the top of your test files.

They don't need to be, though. Protocol extensions offer an improved way of
handling this. I put together a proof-of-concept PR a short while ago:

Ruby has it as well, and while it is perhaps a bit more ‘magic’ than some
would desire, you will not find Ruby developers lamenting its existence.
Javascript’s inconsistent historical behaviors greatly contribute to the
perception that rebinding is a design flaw.

The Ruby community in the past has been somewhat divided on the use of
instance_eval/exec vs. yielding block arguments for clarity, though there
are many popular DSL-like libraries that take advantage of
instance_eval/exec and are expressive and easy to read at the cost of
opaqueness.

However, the behavior I’m describing is different than either Ruby or
Javascript:
- both Ruby and Javascript are late bound, which allows the JIT
implementations to adapt to whatever type instance is given. Swift is not a
late bound language, so it must know the types it is working with beforehand
...

Perhaps a bracketed annotation could assign an unbound type to a closure:

    let builder: [MyType] () -> Void = { ... }
    builder.bind(instance)() // or builder[instance](), or some other syntax

I agree that such a mechanism would allow for very expressive, type-safe
APIs. IDEs could take the guesswork out of "what is 'self'?". At the same
time, I understand apprehensions and aversions to both the complexity and
ambiguities associated with such a feature, and they may go against Swift's
principles at the moment.
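Something close to the "unbound closure" idea can be approximated today with a function that takes the instance and returns the body, which is also the shape Swift gives to unapplied method references. This is only a sketch with a hypothetical Router type; it does not provide the bracketed annotation or implicit member lookup suggested above, so the receiver still has to be named inside the body.

```swift
// Hypothetical Router type, for illustration only.
final class Router {
    private(set) var routes: [String] = []

    func get(_ path: String) {
        routes.append("GET " + path)
    }
}

// A closure awaiting its receiver: takes the instance, returns the bound body.
let builder: (Router) -> () -> Void = { router in
    return {
        router.get("/")
        router.get("/users")
    }
}

let router = Router()
builder(router)()

// An unapplied method reference has the same curried shape.
let unboundGet: (Router) -> (String) -> Void = Router.get
unboundGet(router)("/about")

print(router.routes)
```

The type of the receiver is explicit in the signature, which matches the type-safety argument, but the expressiveness gain of eliding the receiver is not available without a language change.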

···

On Fri, Dec 4, 2015 at 4:44 PM, David Waite <david@alkaline-solutions.com> wrote:

A few thoughts:

1. In a lot of situations they are not pure functions - they have state
associated across them determined by the context in which your closure
was called. So the import would not be of a static function, but of an
input parameter, aka:
  it("…") {
    builder in
    import builder
    expect(sections).to{…}
  }

Assuming expect is the only function, this may very well be equivalent to

  it("…") {
    builder in
    let expect = builder.expect
    expect(sections).to{…}
  }

I don't think we want to add `import builder`, importing methods that
are implicitly bound to some value seems like a dangerous can of worms
to open up.

I’m not sure about that; isn’t that exactly what ‘self’ is? If anything, it is deciding whether it is worth having two cans of worms open. And the semantics would likely be similar to self - it has to be a fixed value type or reference through the scope.

Import might be a poor overloading of an existing concept though. Alternative syntax based on the setup closure thread’s ‘with’ syntax examples:

  it("…") {
    builder in
    with builder {
      expect(sections).to{…}
    }
  }

but obviously I would like to not have the extra level of nesting, so probably more like:
  it("…") {
    with builder in
    expect(sections).to{…}
  }

Using .. syntax likely would be inappropriate for this example, since there may be additional business logic in between calls to expect.

If you need state (and don't want to encode that state as
thread-local variables, though of course Swift doesn't currently have
support for those outside of NSThread.threadDictionary), then personally
I don't think it's a big deal to require the explicit state argument.
You could even adopt a convention of using $ for the identifier here
(while $0, $1, etc are defined by the language, $ appears to be open for
use as an identifier), so that would look like

describe("foo") { $ in
   $.it("has bar") { $ in
       $.expect(sections).to(...)
   }
}

This is probably the simplest alternative to my proposal, and requires no language changes. However,
- IMHO the $0, $1, etc syntax is meant for when terseness is a benefit that outweighs readability. This is primarily because $ looks more like an operator than part of a parameter name, and the names themselves aren’t based on the signature of the method calling the closure. Even after a fair amount of Swift work, I stumble whenever I see $0, etc syntax. For this reason, using $ as the parameter name feels like it counteracts the expressiveness I was going for.
- Coming up with an alternative explicit name (builder? context?) for a passed parameter is hard, because the code is often not so much manipulating that state as it is operating within the context of that state. The term the language gives us for this is ‘self’, but that isn’t assignable/overridable. This could actually wind up making $ feel more like a keyword than an arbitrary parameter name choice.

And you can also do things like make the expectations actually be static
members of their return value, so you'd have code like

   $.expect(sections).to(.contain(bar))

I hadn’t considered that - your Swift-fu is strong :-)
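The static-member idea can be sketched like this. The Matcher, Expectation and Context types below are hypothetical stand-ins rather than Nimble's real types, and a plain parameter name is used where the thread writes `$`. Because `to` expects a Matcher<T>, the argument can be written with leading-dot syntax and the matcher no longer needs to be a global function.

```swift
// Hypothetical types for illustration; not Nimble's real API.
struct Matcher<T> {
    let matches: (T) -> Bool
}

// Matchers live as static members of the type that `to` expects.
extension Matcher where T: Sequence, T.Element: Equatable {
    static func contain(_ element: T.Element) -> Matcher {
        Matcher { $0.contains(element) }
    }
}

struct Expectation<T> {
    let actual: T

    func to(_ matcher: Matcher<T>) -> Bool {
        matcher.matches(actual)
    }
}

struct Context {
    func expect<T>(_ actual: T) -> Expectation<T> {
        Expectation(actual: actual)
    }
}

let context = Context()
let sections = ["Installation", "Organized Tests"]

// `.contain` is looked up on Matcher<[String]>, not in the global namespace.
let found = context.expect(sections).to(.contain("Organized Tests"))
let missing = context.expect(sections).to(.contain("Deployment"))
print(found, missing)
```

The constraint on the extension also means `.contain` is only offered where it makes sense, which a free function cannot express as directly.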

In the future, if Swift ever gains a fully-fledged macro system, then
maybe you'd be able to rewrite these scopes-with-state as macros that
carry implicit state. Or, heck, maybe someday we'll have higher-order
types AND a monadic system (and either custom syntax or macros to
simulate Haskell's `do` notation) and then we can just use monads to
represent the state :D

One could hope - although I don’t believe macro systems or monads/monoids as a concept (rather than an API influence like Optional) are conducive for learning languages.

5. Imports likely should generate conflicts at compile-time if they
shadow defined functions if you can do wildcard imports. No need to have
syntax to alias names - one should either change the code to not conflict
or use the longer-form names

I agree that imports should throw an error if they'd shadow something
defined locally (though perhaps they can still shadow things from other
imports?). But I would actually like an alias syntax so you can import
something under a different name, even if it's just restricted to naming
modules in some kind of `import qualified` syntax (that would require
the module name to use any member), e.g. `import qualified
LongModuleName as L; /* ... */ L.foo(bar)`.

I was thinking more that "import LongModuleName.foo as bar" perhaps would be abused.

6. import could be an attribute:
  it("…") {
    @import builder in
    expect(sections).to{…}
  }

What would that actually be an attribute of? Attributes aren't distinct
items, they modify something else. And why make this an attribute when
we have a perfectly good `import` keyword already?

Yeah, drop that idea.

-DW

···

On Dec 4, 2015, at 4:49 PM, Kevin Ballard <kevin@sb.org> wrote:
On Fri, Dec 4, 2015, at 03:32 PM, David Waite wrote: