[Pitch #2] Function builders

I can certainly understand the desire to have a simpler way to declare new function builders, particularly simple ones that only aggregate values and don't need to propagate types the way more advanced builders like SwiftUI's ViewBuilder do. However, I don't think that it makes sense to approach this as two different APIs from the language perspective. Rather, we should try to make the feature we have composable enough to enable the simple interface. I think we're almost there (more on that in a moment), but first I want to disagree with this bit:

Philosophically, function builders are intended to be more declarative in nature, similar to what fits well if the entire closure were a single expression. Arbitrary control-flow (like break, continue, etc.) goes against that philosophy. There is a short paragraph about this in future directions, but I sorta feel like it's not a good direction. And I certainly don't think we should do it for "high-level" function builders and not "low-level" ones.

Back to making the simple case simple. @anreitersimon showed an example that introduces a FunctionBuilderProtocol that opts in to all of the syntax that a function builder can support, wrapping it up in default implementations so a client can define a new, fully-featured function builder by only implementing buildFinalResult. I think your "Builder" example can be layered on top of that fairly easily.

Note that @anreitersimon's example doesn't work today because of an oversight in the way we did name lookup for the build functions. I just put up a bug fix to the function builders implementation that makes it work. I've also added this example to the proposal so others can find it more easily.
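For readers who want to see the shape of this, here is a minimal sketch of such a protocol. It is not @anreitersimon's actual code: the names SimpleBuilder, SumBuilder, sum and includeBonus are made up for illustration, and it is spelled with @resultBuilder (the name the feature later shipped under) rather than the pitched @_functionBuilder. It assumes a compiler in which build methods supplied by a protocol extension are found by lookup.

```swift
// Hypothetical protocol: every partial result is an array of expressions,
// so a conforming builder only has to say how to finish.
protocol SimpleBuilder {
    associatedtype Expression
    associatedtype FinalResult
    static func buildFinalResult(_ parts: [Expression]) -> FinalResult
}

// Default implementations opt in to blocks, if, if/else, switch and for..in.
extension SimpleBuilder {
    static func buildExpression(_ expression: Expression) -> [Expression] { [expression] }
    static func buildBlock(_ parts: [Expression]...) -> [Expression] { parts.flatMap { $0 } }
    static func buildOptional(_ part: [Expression]?) -> [Expression] { part ?? [] }
    static func buildEither(first part: [Expression]) -> [Expression] { part }
    static func buildEither(second part: [Expression]) -> [Expression] { part }
    static func buildArray(_ parts: [[Expression]]) -> [Expression] { parts.flatMap { $0 } }
}

// A client builder: the only method written by hand is buildFinalResult.
@resultBuilder
enum SumBuilder: SimpleBuilder {
    typealias Expression = Int
    typealias FinalResult = Int
    static func buildFinalResult(_ parts: [Int]) -> Int { parts.reduce(0, +) }
}

func sum(@SumBuilder _ body: () -> Int) -> Int { body() }

let includeBonus = true
let total = sum {
    1
    2
    if includeBonus { 10 }
    for i in 1...3 { i }
}
```

With includeBonus set to true, the collected values are 1, 2, 10, 1, 2, 3, which buildFinalResult then adds up.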

Perhaps some form of that protocol belongs in the standard library. That would address one of @davedelong's concerns as well:

@JoeyKL notes that having such a facility would cover a lot of use cases:

That we can define this as our own protocol on top of the existing proposal indicates that the language itself doesn't need another "simpler" mode. Rather, we can point people at this protocol (or something like it) as the way to eliminate boilerplate for a class of use cases.

Doug

[EDIT: Added link to the new discussion in the proposal of a "simple" function builder protocol, and response to @JoeyKL's comment that came in a moment after my original post.]

I was thinking about why the current design makes me somewhat uncomfortable, and I think I've partially figured it out. Of 28 uses of @_functionBuilder in projects linked in the awesome-function-builders repository:

  • 14 (half) implement only buildBlock, all with a common return type.

  • 9 implement methods other than buildBlock, but all with a common return type, and only to implement a simple monoid operation (or, in 1 case, a semigroup operation).

  • 5 implement methods with differing return types or special behavior in different methods (one, I think, unnecessarily; three in UI libraries, two of which are modeled after SwiftUI specifically; one for GraphQL queries).

I think the most obvious takeaway from this is that the function builder API should expose an easier way to implement function builders for the most common use case: building static data structures under a monoid operation. This could be accomplished either by defaulting to buildArray when no buildBlock is specified, or by splitting the API into one implementation explicitly for loops and one single-method way to define a full builder that accepts control flow.
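To make the "monoid" case concrete, here is a rough sketch of a generic builder that aggregates under any monoid. The Monoid protocol, MonoidBuilder, text, formal and greeting are all hypothetical names for illustration, not part of the proposal or the standard library, and the sketch uses the @resultBuilder spelling the feature later shipped under.

```swift
// Hypothetical protocol describing the monoid a builder aggregates under.
protocol Monoid {
    static var empty: Self { get }
    static func combine(_ lhs: Self, _ rhs: Self) -> Self
}

// Every build method has the same result type M; only `empty` and `combine` vary.
@resultBuilder
enum MonoidBuilder<M: Monoid> {
    static func buildBlock(_ parts: M...) -> M { parts.reduce(M.empty, M.combine) }
    static func buildOptional(_ part: M?) -> M { part ?? M.empty }
    static func buildEither(first part: M) -> M { part }
    static func buildEither(second part: M) -> M { part }
    static func buildArray(_ parts: [M]) -> M { parts.reduce(M.empty, M.combine) }
}

// Strings form a monoid under concatenation.
extension String: Monoid {
    static var empty: String { "" }
    static func combine(_ lhs: String, _ rhs: String) -> String { lhs + rhs }
}

func text(@MonoidBuilder<String> _ body: () -> String) -> String { body() }

let formal = false
let greeting = text {
    if formal { "Dear " } else { "Hi " }
    for name in ["Ann", "Bo"] { name + " " }
    "!"
}
```

The builder body is the same boilerplate for every monoid, which is the part a default implementation could absorb.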

It's a reasonable request. We could instantiate a builder instance at the beginning of the body, then call instance methods on it rather than static methods. The biggest semantic question is how we should initialize the builder. Do we call init() always, or do we let you specify what arguments to pass to the initializer, e.g., @MyBuilder("something") () -> T? What kinds of use cases does this enable that cannot be handled by the existing design?

Doug

I guess that's a good argument for adding such a protocol to the standard library, if it's the way the feature is usually used.

I've been lumping this idea in with the general notion of "virtualized ASTs", where the builder isn't really collecting values---it's describing the control flow and providing closures to compute some information, and relying on the function builder to decide when/how to execute those closures.

I'm not necessarily opposed to the overall idea, but I don't think we should go at it piecemeal by adding support for only for..in loops. How would we virtualize an if statement, or a switch? Do we virtualize each expression statement, and how does that differ from what one does today with buildExpression(_: @escaping @autoclosure () -> T)?

To me, that feels like an entire layer on top of this already-large proposal, and one that has not been explored deeply enough yet.

Doug

I was thinking in the line of property wrappers where the parens invoke a matching initialiser and omitting the parens invokes the “sensible initialiser”. For function builders specifically, @mybuilder would be shorthand for @mybuilder(), and a diagnostic would be reported if no zero-parameter initialiser is declared or accessible. The shorthand also makes all existing function builder uses (SwiftUI, etc.) source-compatible at the call-site.

I haven’t pondered about exact uses yet, but I presume there could be DSLs where the builder is configured for the specific container it’s being used for. I’ll try a (probably contrived) example:

struct Division<Content : Element> : Element {
    init(@ElementBuilder(containerStyle: .block) content: () -> Content) { … }
    // …
}

struct Heading<Content : Element> : Element {
    init(@ElementBuilder(containerStyle: .inline) content: () -> Content) { … }
    // …
}
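Since none of this is supported syntax, here is the expansion written out by hand, as a sketch of what an instance-based builder could mean. ContainerStyle, the string-based ElementBuilder and the exact expansion are assumptions made purely for illustration.

```swift
// Hypothetical configuration carried by the builder instance.
enum ContainerStyle { case block, inline }

struct ElementBuilder {
    let containerStyle: ContainerStyle

    // Instance methods instead of the static ones the proposal uses.
    func buildExpression(_ text: String) -> [String] { [text] }

    func buildBlock(_ parts: [String]...) -> String {
        // The configuration decides how the children are joined.
        let separator = containerStyle == .block ? "\n" : " "
        return parts.flatMap { $0 }.joined(separator: separator)
    }
}

// What `@ElementBuilder(containerStyle: .inline) { "Hello"; "world" }`
// might expand to under this assumed design:
let builder = ElementBuilder(containerStyle: .inline)
let v0 = builder.buildExpression("Hello")
let v1 = builder.buildExpression("world")
let heading = builder.buildBlock(v0, v1)
```

The only difference from the static design is that the transformation creates one builder value up front and routes every call through it.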

I'm quite on edge about introducing that much configurability to the builder, especially since it seems to be a marginal gain attached to a free footgun. Scenarios like this come up relatively often:

@ViewBuilder1 func foo() -> some View { ... }
func bar<T: View>(_: @ViewBuilder2 () -> T) {}

bar(foo)

Now the foo passed into bar uses ViewBuilder1, but users could reasonably expect it to use ViewBuilder2. That's already possible with the current design, but I'm afraid it would easily be amplified by adding even more arguments to the builders.

Given the current model, which turns the builder body into a normal closure returning the Result type, it seems to work better when there is only minimal configuration for each kind of builder.


I'm somewhat confused about the meaning of this paragraph about expression statements:

If the function builder type declares the buildExpression function-building method, the transformation calls it, passing the expression-statement as a single unlabeled argument. This call expression is hereafter used as the expression statement. This call is type-checked together with the statement-expression and may influence its type.

Does it mean that the declaration (or the lack thereof) of buildExpression affects the type of the Expression? It seems to contradict the Type inference: function builder bodies section later.

@Douglas_Gregor could you clarify the intended behaviour? I'd expect the expression statement to also be type-checked separately from buildExpression (and buildBlock), but I'm not sure from the text above.

That's right. The expression is generally checked independently of the calls to the builder methods, and buildExpression is specifically an exception to that rule.
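A small sketch of that exception, using made-up names (SamplesBuilder, samples, values) and the @resultBuilder spelling the feature later shipped under. The parameter type of buildExpression acts as the contextual type of each expression statement, so an integer literal is checked as a Double here.

```swift
@resultBuilder
enum SamplesBuilder {
    // The parameter type gives each expression statement a contextual type of Double.
    static func buildExpression(_ value: Double) -> [Double] { [value] }
    static func buildBlock(_ parts: [Double]...) -> [Double] { parts.flatMap { $0 } }
}

func samples(@SamplesBuilder _ body: () -> [Double]) -> [Double] { body() }

let values = samples {
    1      // checked as Double because of buildExpression; on its own it would be Int
    2.5
}
```

Without buildExpression, the literal 1 would be checked independently, default to Int, and fail to match a buildBlock that accepts Double.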

Yeah, something like this could fit. I have a couple of questions about the intended behavior here.

As I understand it, the goal is to effectively leave the declarations unchanged, but allow the declaration to produce a "partial result" (per the proposal's terminology) as well. So, your schema example would be translated into something like this:

let thingSchema = Schema(id: "thing")
let v0 = Builder.buildDeclaration(thingSchema)
let otherSchema = Schema(id: "other")
let v1 = Builder.buildDeclaration(otherSchema)

let v2 = Builder.buildExpression(Table(schemas: thingSchema, otherSchema) { ... })
let v3 = Builder.buildExpression(Table(schema: thingSchema) { ... })
let v4 = Builder.buildExpression(Table(schema: otherSchema) { ... })
return Builder.buildBlock(v0, v1, v2, v3, v4)

Does that look correct?

So, my questions:

  1. If multiple variables are bound by the declaration, should there be multiple buildDeclaration calls (one per variable, with the value actually bound to that variable) or should there be a single buildDeclaration call with the type of the initializer? E.g., given

    let (a, b) = buildTwoTables()
    

    is that going to produce

    let v0 = Builder.buildDeclaration(a)
    let v1 = Builder.buildDeclaration(b)
    

    or

    let v1 = Builder.buildDeclaration((a, b))
    

    ?

  2. Should this feature allow one to affect the type of the declaration? For example, if I have

    let a = getIntValue()
    

    would it be possible for buildDeclaration to make the type of a something other than Int, e.g., RefCell<Int>?

  3. Should this feature make it possible to reason about the identity of the declarations? I could certainly imagine a DSL where one wants to be able to define variables and then reason about them as entities, rather than as the values they produce. For example, if we had something like this:

    let a = getSomeValue()
    let b = getSomeValue()
    a >= 2 * b // form a relationship between the two variables              
    

    We might want to work with a and b via key-paths, or some other mechanism for identity. I'm not sure how to do this transformation, short of having the function builder be able to wrap up a and b in a property wrapper that it gets to initialize.

Doug

This is roughly in line with what property wrappers do.

In my mind, this extension to non-static methods is something that we could fit in, but we should have a couple of solid use cases before we consider adding it. In part, I'm wondering if we'll need more out of it---say, the ability to reference self if we're in a method---before it can actually satisfy enough new use cases. It has the advantage of being a pure extension that we could add whenever, so if we don't do it now, but the use cases keep coming, we can add it later.

For now, I've added it to "Future Directions" as a potential extension so it doesn't get lost, and we can ruminate on use cases.

Doug

[EDIT: Add link to the new section added to future directions]

What's the reason for this behaviour? My first instinct was to have the Expression be independent of buildExpression as well, since it can be hard to read if the type of Expression depends on the builder implementations, including buildExpression.


A convex optimizer is most definitely one of the use cases for this. It's common there to create variables and reason about their relationship. I can see it using the context variable that takes care of such things, though:

solve { context in
  let a = context.newVariable()
  let b = context.newVariable()
  a >= 2 * b

  Minimize(a + b)
}

Maybe it's possible to do something like this:

solve {
  let a = Double.self
  let b = Double.self
  a >= 2 * b

  Minimize(a + b)
}

We'd likely also want the ability to opt in to or out of this feature.


I think you can already pass most data to the function as the arguments. So the only real difference would be the ability to change the resulting hierarchy based on the builder initialization.

Thanks a lot Doug for having a look!

Yes, correct. Keep them as-is, but allow them to produce a partial result is exactly right.

Yes, this looks perfect :+1:


I would say (1), here's some thoughts why:

// You'll probably know better than myself what these things really de-sugar to but based on just user experience (and Scala, which has the same syntax, I suppose, so caveat here that I may be used to something from that language still -- where val (a, b) is exactly just syntax sugar for destructuring a tuple and declaring a and b):

It would feel quite normal for this to be equivalent to two bindings; I.e. let v0, let v1. This just "happens to be a pattern that declares two values" in my mind -- not sure if that's how Swift actually internally represents this and in what phase though :thinking: (I checked language reference guide but it does not really go into detail if this is sugar or "special").

I'm trying to imagine why one would want the (2) option... but I don't really see it. The reason you have let (a, b) is because you want to use a and b, not because you somehow get the entire (a, b) (in normal Swift) so a function builder is kind of the same right?

If one really wants to have functions which yield "my special has a few fields thing", it seems that's better served with a small struct rather than the pattern let / pseudo tuple?

I currently don't see a need for changing the produced type.

One could imagine use cases etc., but this seems like if some use case pops up and really needs it it could be potentially added? If one has some DSL where one would need "only Repr<T> can be here" it's not too bad to make functions work with Repr<T>s... But again, I could try hard to imagine some use case, but I don't think there's an obvious one (this reminded me of "Lifted" SQL representations and things like that where one yields Repr when a query should return a T etc... but again, not something I've thought through and does not seem like it would have to be added right away?)

That's quite interesting... and I guess that's why the question about changing the remitted type :thinking:

At least in the DSLs and "declarative" "configuration" types of things I don't see a need for this.

This and the previous idea seem like they may be useful to some declarative efficient maths / querying specialized libraries...? I don't have a good idea how those would work nor the background on those though.

The prime use case is really configuration languages and some "markup languages" where I want to refer to "that thing" where "that thing" has some identifier (usually some .id: String, but I'd want to pass it typesafely -- so the Table needs a Schema to be passed in, that's about it).

Thanks for looking into these!

Overall, the simplest version of this would be sufficient to make configuration langs happy / expressible nicely using function builders I feel (judging by my PoC I did of such declaration based mini language).

Ah, to also mention a weird case myself what this might lead to:

let a = Schema() {} 
a // whoops!

so technically this could then end up producing the schema partial result "twice"...

In reality I don't think this is a problem though, the function builder is aware what it's getting after all, and if it wants to avoid this it can always de-dup or "replace the previous one with the same ID with the latest one" or whichever it decides... So I don't think this weird case is a deal breaker, it seems fine to me.

Note also, not all function builders will want to use the buildDeclaration (!), so only the ones which do can get into this trouble.

There's also this pattern to deal with:

let (a, b) = (1, "foo")

Some builders can probably prevent this at compile time by not implementing the relevant buildExpression.

let a = ...
a // <-- Error, no matching `buildExpression`
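A runnable sketch of that guard with builders as they exist today, where local declarations simply pass through. Schema, Table, ConfigBuilder and config are hypothetical stand-ins for the configuration DSL discussed above, and the sketch uses the @resultBuilder spelling the feature later shipped under.

```swift
struct Schema { let id: String }
struct Table { let schema: Schema }

@resultBuilder
enum ConfigBuilder {
    // Only Table expressions are accepted; there is deliberately no overload for Schema.
    static func buildExpression(_ table: Table) -> [Table] { [table] }
    static func buildBlock(_ parts: [Table]...) -> [Table] { parts.flatMap { $0 } }
}

func config(@ConfigBuilder _ body: () -> [Table]) -> [Table] { body() }

let tables = config {
    let thing = Schema(id: "thing")   // declaration: passes through untouched
    Table(schema: thing)              // expression: collected via buildExpression
    // thing                          // would not compile: no buildExpression takes a Schema
}
```

Uncommenting the last line turns the accidental "declare it twice" case into a compile-time error instead of a duplicated partial result.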

In what way do you suggest this is very different?

I'd still desugar and handle the same as "declare a with 1", "declare b with "foo"", or is there something I'm missing?

I see, interesting, that's quite great :+1:

So some types can be "please declare me", makes a lot of sense... like a "schema" etc.

It's just not clear to me what the intended behaviour is. It could also be declaring the tuple, and use each part of it via a and b.

Of course; especially when taking into consideration how ‘break’ and ‘continue’ would interact with a ‘buildForIn’ method (as @Lantua pointed out). Things would get complicated really quickly. This is a proposal on its own - as you pointed out - and FYI I think it should be included as a Future Direction.

Maybe a more static alternative that allows for more control of the input data would be adequately powerful for features:

static func buildArray<Element>(
    _ componentsAndElements: [(Element, Component)]
) -> Result

In the above there would be a stronger association between the Element for which the given Component was yielded. Obviously, this is just a draft and a totally different proposal as well.

Overall, I’ve had some time to play with Function Builders and I have to say they are a wonderful feature that really helps Swift, and its plethora of libraries, stand out.

Yes, that's certainly a way to look at it. The downside is that the syntax available in builder blocks is limited to the syntax that function builders currently support. But short term, this is probably the way to go.

I have created a demo implementation of how I think this could work:
https://github.com/lassejansen/EventedBuilder

This is the point at which one can have the function builder itself influence how expressions are type-checked. This post shows how buildExpression is used to address a use case that's otherwise not possible with function builders.

Doug

Thanks Doug for your detailed reply!

Do you mean by this that function builders as proposed here should be seen as generating a single expression or that a Swift feature in general that enables the creation of DSLs should resolve to a single expression?

I'm not sure I agree here. From the perspective of a DSL consumer I'm just writing Swift. While I know in the back of my head that the HTML DSL implementation uses function builders, I don't think of it as creating one big expression while writing code. My intuition is more like "If I put a free expression here it will go into the output stream".

Missing break and continue would be no deal breaker though; in the end they can always be replaced by another nested if. The more interesting question to me is whether things like throw and await (if it comes) would be supported. If we stick with the HTML example: an approach I've used a few times in the past with Ruby on Rails is "Russian Doll Caching", i.e. parts of an HTML page are cached in e.g. Redis and the remaining parts are rendered dynamically. It would probably look something like this in Swift:

try await body {
  try await div {
    try await cache( key: "static content") {
      p {
        "Some static content"
      }
    }
    div {
      "Some dynamic content"
    }
  }
}

Would that be possible with function builders?

While this example does remove some complexity, it's still more complicated than what I had in mind, and it also supports only one type, if I see it correctly.

Sticking to the HTML example from the proposal, my latest iteration of how I would like to use a DSL building feature is this:

protocol HTMLNode {}

struct HTMLElement: HTMLNode {

  // We want to pick up free expressions of 2 types: HTMLElement and String
  typealias Builder = EventedBuilder2<HTMLElement, String>

  let name: String
  var childNodes: [HTMLNode] = []

  init(name: String, @Builder _ content: () -> () = { }) {
    self.name = name
    Builder.evaluate( content) { element in
      childNodes.append( element)                // Handle HTMLElement
    } on: { text in
      childNodes.append( TextNode(text: text))   // Handle String
    }
  }
}

struct TextNode: HTMLNode {
  let text: String
}

The full implementation is available here:
https://github.com/lassejansen/ExpressionBuilder

Outdated

It's quite hacky though, using thread-local storage and withoutActuallyEscaping to pass the buildExpression calls to the closures:
https://github.com/lassejansen/EventedBuilder/blob/master/Sources/EventedBuilder2.swift

Do you think it would make sense to make an API like this part of the standard library? Or at least provide a way to enable an evented API like this without using thread-local storage? I think changing the static build... functions to be instance methods could work, together with allowing to create the function builder types yourself.