[Pitch #2] Function builders

Out of curiosity, why not:

@functionBuilder
struct HTMLBuilder {

  static func buildExpression( _ node: HTMLNode) -> [HTMLNode] {
    return [node]
  }

  static func buildExpression( _ text: String) -> [HTMLNode] {
    return [Node(text: text)]
  }

  static func buildFinalResult(_ expressions: [HTMLNode]) -> [HTMLNode] {
    return expressions
  }
}

This immediately raises the question of why buildFinalResult should be required (it isn’t in the current proposal).

I have it from here: https://github.com/DougGregor/swift-evolution/blob/function-builders/proposals/XXXX-function-builders.md#function-building-methods

buildFinalResult(_ components: Component) -> Return is used to finalize the result produced by the outermost buildBlock call for top-level function bodies. It is only necessary if the DSL wants to distinguish Component types from Return types, e.g. if it wants builders to internally traffic in some type that it doesn't really want to expose to clients. If it isn't declared, the result of the outermost buildBlock will be used instead.

Isn't that the right one?

Yes, right, that would be even simpler.

Maybe I misread your question. If it was "Why don't we require at least one implementation of buildExpression, instead of requiring buildFinalResult?", then yes, you are right, that would probably be the simplest API of all.

The revised simple HTMLNode example would look like this then:

@functionBuilder
struct HTMLBuilder {

  static func buildExpression( _ node: HTMLNode) -> [HTMLNode] {
    return [node]
  }
}

HTMLNode + String:

@functionBuilder
struct HTMLBuilder {

  static func buildExpression( _ node: HTMLNode) -> [HTMLNode] {
    return [node]
  }

  static func buildExpression( _ text: String) -> [HTMLNode] {
    return [TextNode(text: text)]
  }
}
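
For reference, a minimal runnable sketch of the HTMLNode + String builder. It makes several assumptions that are not in the posts above: it uses the @resultBuilder spelling under which the feature was eventually accepted, it models HTMLNode as a small hypothetical struct (the thread never defines it), and it adds a variadic buildBlock to combine the per-statement arrays. The html(_:) helper is likewise made up for illustration.

```swift
// Hypothetical node type; the thread never shows HTMLNode's definition.
struct HTMLNode: Equatable {
    var text: String
}

@resultBuilder
struct HTMLBuilder {
    // Lift a single node into the component type.
    static func buildExpression(_ node: HTMLNode) -> [HTMLNode] {
        [node]
    }

    // Lift a bare string by wrapping it in a node.
    static func buildExpression(_ text: String) -> [HTMLNode] {
        [HTMLNode(text: text)]
    }

    // Assumption: combine statements with a variadic buildBlock,
    // which works because every component has the same type.
    static func buildBlock(_ components: [HTMLNode]...) -> [HTMLNode] {
        components.flatMap { $0 }
    }
}

// Hypothetical entry point that applies the builder to a closure.
func html(@HTMLBuilder _ content: () -> [HTMLNode]) -> [HTMLNode] {
    content()
}

let nodes = html {
    HTMLNode(text: "title")
    "plain text"
}
```

Both statements end up as HTMLNode values in one flat array, so clients can mix nodes and strings freely.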

This is, in fact, the motivation. I'll update the proposal accordingly.

Thinking about it further, I agree with you. buildDo is really quite out of step with the rest of the builder functions---it's not needed to aggregate values (buildBlock does that), and it's specific to one rarely-used statement kind. I'll remove it outright.

I can see this as an implementation fallback---if you've implemented the more-general buildEither(first:) and buildEither(second:) but not buildOptional(_:), we can do the above transformation. That makes function builders slightly more convenient to implement. I don't like striking buildOptional entirely, because I don't want to push more type-based overloading into the builders like you mention here:

buildArray enables the for..in loop. While you could use it for selection statements (if/else/switch) as a fallback---heck, you could use it instead of buildBlock as a fallback---it feels wrong to conflate these. buildEither is about selection, buildArray is about looping, buildBlock is the basic aggregation.
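
To make the division of labour concrete, here is a small sketch of a non-type-preserving builder where each statement kind maps to its own method. LineBuilder and report(verbose:count:) are hypothetical names, and the code uses the @resultBuilder spelling from the accepted version of the feature.

```swift
@resultBuilder
enum LineBuilder {
    // Every expression statement becomes a one-element component.
    static func buildExpression(_ line: String) -> [String] { [line] }

    // Basic aggregation of the statements in a block.
    static func buildBlock(_ parts: [String]...) -> [String] {
        parts.flatMap { $0 }
    }

    // Selection: if/else picks exactly one branch.
    static func buildEither(first part: [String]) -> [String] { part }
    static func buildEither(second part: [String]) -> [String] { part }

    // An `if` without `else` may produce nothing.
    static func buildOptional(_ part: [String]?) -> [String] { part ?? [] }

    // Looping: one component per iteration of for..in.
    static func buildArray(_ parts: [[String]]) -> [String] {
        parts.flatMap { $0 }
    }
}

@LineBuilder
func report(verbose: Bool, count: Int) -> [String] {
    "header"
    if verbose {
        "details"
    } else {
        "summary"
    }
    for i in 0..<count {
        "item \(i)"
    }
    if count == 0 {
        "empty"
    }
}

let lines = report(verbose: true, count: 2)
```

Here the if/else goes through buildEither, the loop through buildArray, and the trailing if through buildOptional, with buildBlock only ever doing aggregation.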

If we're trying to minimize the effort needed for a non-type-preserving function builder, I think the "simple" function builder protocol is a better approach than fine-tuning the fallback mechanisms for the language proposal.

It feels like a refactoring action: show the translation of a transformed function into one that has explicit calls into the function builder and an explicit return with the result. I agree that this would help a lot.

That the playground transform happens up at the language level, rather than in a lower-level pass, has been a significant source of issues. If we were to change the playground transform, it wouldn't be toward integrating more in the language---it would be to move it later in the compilation pipeline, after semantic analysis, so we could be 100% sure it didn't affect the semantics of the program. Perhaps we can draw some inspiration from the idea of the playground transform, but function builders won't be a replacement for it, no matter how expressive they become.

Doug

The idea behind buildDo was that it might combine well with attributes to allow site-specific customization of building. But we haven't added that, and there's no real reason to vs. just using some sort of DSL-specific combinator. So I'm fine with just dropping it.

I am still convinced that the buildBlock functions that take a fixed amount of parameters are a bad design choice for the following reasons:

  1. Without variadic generics, these are either limited to homogeneous types or a fixed number of arguments. For example in SwiftUI, every function builder closure can have at most 10 subviews. In some projects of mine, I have found this limitation to be annoying, especially when building things other than views (e.g. a Layer builder for neural network layers)
  2. Related to the first: Without variadic generics, there will be overload hell
  3. Even with variadic generics, their expressiveness is more limited than other approaches.
  4. They can only be type checked in one big block, not at each line in code.

My opinion therefore is that function builders should behave more like a state machine. They should be in a state, consume an expression and produce a new state. This goes beyond the stateful function builder future direction in the proposal. IMO, function builders should be completely based on this approach.

In code, this could look like the following:

@functionBuilder
enum ViewBuilder {
    static func initialState() -> EmptyView {
        EmptyView()
    }
    static func next<State: View, Input: View>(state: State, input: Input) -> TupleView<(State, Input)> {
        ...
    }
}

Methods like buildIf would pre-process inputs, such that they can then be consumed by next.

Then, a function builder closure would work the following way:

VStack {
    Text("Hello")
    Text("World")
}
VStack {
    let s0 = ViewBuilder.initialState()
    let s1 = ViewBuilder.next(state: s0, input: ViewBuilder.buildExpression(Text("Hello")))
    let s2 = ViewBuilder.next(state: s1, input: ViewBuilder.buildExpression(Text("World")))
    return s2
}
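
A runnable approximation of this fold, as a sketch only. The initialState/next pair above is not part of any shipped compiler, so this uses buildPartialBlock instead, which as far as I know is available from Swift 5.7 and folds statements pairwise in a similar way. It seeds the fold with the first statement rather than with an initial state. Text, Pair, FoldBuilder and stack(_:) are hypothetical stand-ins for the SwiftUI types in the post.

```swift
// Hypothetical stand-ins for Text and TupleView.
struct Text {
    var value: String
}

struct Pair<Head, Tail> {
    var head: Head
    var tail: Tail
}

@resultBuilder
enum FoldBuilder {
    // Seed the fold with the first statement.
    static func buildPartialBlock<First>(first: First) -> First {
        first
    }

    // Consume one more statement and produce the next state.
    static func buildPartialBlock<State, Input>(
        accumulated: State, next: Input
    ) -> Pair<State, Input> {
        Pair(head: accumulated, tail: next)
    }
}

// Hypothetical container function that applies the builder.
func stack<Content>(@FoldBuilder _ content: () -> Content) -> Content {
    content()
}

let result = stack {
    Text(value: "Hello")
    Text(value: "World")
    Text(value: "!")
}
// The static type of `result` is Pair<Pair<Text, Text>, Text>.
```

Each statement is type-checked against the accumulated state so far, and no arity limit is baked into the builder.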

This would make function builders much more flexible in the following ways:

  • The 10 view limit will be gone.
  • Type checking happens at each step when an input is consumed.
  • With the state machine approach, function builders can express more complex constraints. For example, generic constraints could be added such that a function builder parses a regular or context free language by modelling a pushdown automaton:
protocol ParenthesisState {}
struct EmptyState: ParenthesisState {}
struct NestedState<Child: ParenthesisState>: ParenthesisState {}

protocol Symbol {}
struct OpenParenthesis: Symbol {}
struct CloseParenthesis: Symbol {}

@functionBuilder
enum ParenthesisBuilder {
    static func initialState() -> EmptyState
    static func next<State: ParenthesisState>(state: State, input: OpenParenthesis) -> NestedState<State>
    static func next<Substate: ParenthesisState>(state: NestedState<Substate>, input: CloseParenthesis) -> Substate
}

This would then yield the following:

// return type specifies that PDA accepts with empty stack
@ParenthesisBuilder var expression: EmptyState {
    // let s0 = ParenthesisBuilder.initialState()
    OpenParenthesis() 
    // let s1: NestedState<EmptyState>
    OpenParenthesis()
    // let s2: NestedState<NestedState<EmptyState>>
    CloseParenthesis()
    // let s3: NestedState<EmptyState>
    CloseParenthesis()
    // let s4: EmptyState
    // state type matches specified return type → automaton is in accepting state
}

This builder would for example ensure that every OpenParenthesis() has a matching CloseParenthesis().

This is simply not possible with the buildBlock based function builders and would also not be possible if we take variadic generics into account.
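
The type-level bookkeeping can be checked today by writing the fold calls out by hand. This is a sketch under assumptions: ParenthesisFold is a hypothetical plain enum (no builder attribute), the method bodies are filled in, and an init() requirement is added to the state protocol so that the close-parenthesis step can produce its result.

```swift
// Assumption: states are default-constructible so `next` can return them.
protocol ParenthesisState {
    init()
}
struct EmptyState: ParenthesisState {}
struct NestedState<Child: ParenthesisState>: ParenthesisState {}

struct OpenParenthesis {}
struct CloseParenthesis {}

enum ParenthesisFold {
    static func initialState() -> EmptyState { EmptyState() }

    // Opening a parenthesis pushes one level onto the type-level stack.
    static func next<State: ParenthesisState>(
        state: State, input: OpenParenthesis
    ) -> NestedState<State> {
        NestedState()
    }

    // Closing is only defined for a nested state, and pops one level.
    static func next<Substate: ParenthesisState>(
        state: NestedState<Substate>, input: CloseParenthesis
    ) -> Substate {
        Substate()
    }
}

let s0 = ParenthesisFold.initialState()
let s1 = ParenthesisFold.next(state: s0, input: OpenParenthesis())
let s2 = ParenthesisFold.next(state: s1, input: OpenParenthesis())
let s3 = ParenthesisFold.next(state: s2, input: CloseParenthesis())
// The annotation plays the role of the declared return type:
// it only type-checks if the stack is empty again.
let s4: EmptyState = ParenthesisFold.next(state: s3, input: CloseParenthesis())
```

Calling next(state: s0, input: CloseParenthesis()) has no matching overload, so an unbalanced close is rejected at compile time.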

I have questions about the build* synthesis. So we have a few default implementations:

  • for buildExpression when Expression == Component
  • for buildFinalResult when Component == Result
  • for buildLimitedAvailability (using buildBlock)
  • for buildDo (using buildBlock), should we add it later

Given that we can have multiple Component, is one buildExpression generated for each Component? Does the compiler still synthesize these functions even if the type has explicit buildExpression but not of the matching Component type? The same with buildFinalResult.

(4) isn't actually true. The type inference model described in the proposal effectively type-checks each argument to buildBlock independently (they are separate statements in the transformed function). Under the hood, this is accomplished via one-way constraints.

(1) isn't something we should contort new features around. I, too, want variadic generics yesterday, and the 10-view limit in SwiftUI is annoying. But it's temporary: once we get variadic generics, the limit goes away. SwiftUI was intentionally architected so it could take advantage of variadic generics here without breaking the ABI.

Beyond that, it's been widely agreed on this thread that most function builders aren't type-preserving: they don't need the parameters of buildBlock to have different types. That means they fall into the case where normal variadic function parameters work fine. So this problem, while highly visible because of SwiftUI, is unlikely to be widespread across many different function builders.

I agree with (3), that the state-machine formulation you describe is more expressive. It's a different transformation entirely, one that feeds the results from prior statements into later statements. It's a functional fold operation, whereas the current transformation is more like map. To make the case that your design is better than the proposed one, you need use cases to support the additional expressivity: ones that are clear, and cannot be expressed with the current proposal. You also need a mental model that people can use to understand how the DSL applies. And the hard part of the argument is the subjective one: that the use cases you've provided are important enough, numerous enough, and understandable enough to justify the change to the model that you're proposing.

Personally, I am biased against this change, because I think the resulting model is harder to understand, and that the use cases don't outweigh the additional complexity.

Doug

There is no synthesis of default implementations; we either form calls to these build functions (if they exist), or we don't. Let's take a little SwiftUI example from the proposal:

@ViewBuilder var body: some View {
  ScrollView {
    if #available(iOS 14.0, *) {
      LazyVStack(...)
    } else {
      Stack(...)
    }
  }
}

If you implemented all of buildFinalResult, buildExpression, and buildLimitedAvailability, you'd have translated code that looks like this:

var body: some View {
  let v0 = ViewBuilder.buildExpression(ScrollView {
    let v2 /*inferred type */
    if #available(iOS 14.0, *) {
      let v3 = ViewBuilder.buildExpression(LazyVStack(...))
      let v4 = ViewBuilder.buildBlock(v3)
      let v5 = ViewBuilder.buildLimitedAvailability(v4)
      v2 = ViewBuilder.buildEither(first: v5)
    } else {
      let v6 = ViewBuilder.buildExpression(Stack(...))
      let v7 = ViewBuilder.buildBlock(v6)
      v2 = ViewBuilder.buildEither(second: v7)
    }
    let v8 = ViewBuilder.buildBlock(v2)
    return ViewBuilder.buildFinalResult(v8)
  })
  let v1 = ViewBuilder.buildBlock(v0)
  return ViewBuilder.buildFinalResult(v1)
}

For any one of those you don't implement, just drop the call and reference the variable, e.g., without buildExpression, v0 gets initialized directly with the ScrollView instance:

let v0 = ScrollView {

Without buildLimitedAvailability, v5 just becomes v4 (in practice, we wouldn't create v5 at all):

let v5 = v4

And without buildFinalResult, you just return the variable without the call, e.g., the final return:

return v1
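
One way to observe which calls get formed is a builder whose component records the calls that produced it. This is a sketch with hypothetical names (Traced, TracingBuilder, total(flag:)), using the @resultBuilder spelling of the accepted feature; it is not the compiler's own output.

```swift
// Hypothetical component that carries a value and a log of builder calls.
struct Traced: Equatable {
    var value: Int
    var calls: [String]
}

@resultBuilder
enum TracingBuilder {
    static func buildExpression(_ value: Int) -> Traced {
        Traced(value: value, calls: ["buildExpression(\(value))"])
    }

    static func buildBlock(_ parts: Traced...) -> Traced {
        Traced(value: parts.reduce(0) { $0 + $1.value },
               calls: parts.flatMap { $0.calls } + ["buildBlock"])
    }

    static func buildEither(first part: Traced) -> Traced {
        Traced(value: part.value, calls: part.calls + ["buildEither(first:)"])
    }

    static func buildEither(second part: Traced) -> Traced {
        Traced(value: part.value, calls: part.calls + ["buildEither(second:)"])
    }

    static func buildFinalResult(_ part: Traced) -> Traced {
        Traced(value: part.value, calls: part.calls + ["buildFinalResult"])
    }
}

@TracingBuilder
func total(flag: Bool) -> Traced {
    1
    if flag {
        2
    } else {
        3
    }
}

let traced = total(flag: true)
```

Deleting buildExpression or buildFinalResult from the builder (and adjusting buildBlock to take the raw values) removes the corresponding entries from the log, which matches the "just drop the call" rule described above.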

Doug

I see, thanks, that makes more sense. What would happen if the return type matches no buildFinalResult (and some are declared), but coincidentally matches the Component from the block? A similar situation can also happen with buildExpression.

I think the transformation should be forced to use buildFinalResult if at least one is declared. It would then reject a Component return type if the API author defines some buildFinalResult but not buildFinalResult(_: Component) -> Component; a Component return type is unlikely to be expected in that scenario. The same goes for buildExpression. I haven't thought about buildLimitedAvailability enough to be sure, though.

You'll get an error. If a buildFinalResult is declared, we'll form a call to it; if that call fails to type check, it's an error in the program, whether the mistake is in the function builder itself (wrong signature, etc.) or in the function body (the DSL was used incorrectly and the type checker caught it). The same holds for any other build function. buildExpression is perhaps the most interesting, because you can use it to say what values are allowed in your DSL. SwiftUI's ViewBuilder, for example, requires all values to conform to View.
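
As an illustration of using buildExpression to restrict what a DSL accepts, here is a sketch with a hypothetical Shape protocol standing in for View. It assumes the @resultBuilder spelling and a compiler that supports the `any` keyword (Swift 5.6 or later, as far as I know).

```swift
// Hypothetical protocol playing the role that View plays for ViewBuilder.
protocol Shape {
    var area: Double { get }
}

struct Square: Shape {
    var side: Double
    var area: Double { side * side }
}

struct Rectangle: Shape {
    var width: Double
    var height: Double
    var area: Double { width * height }
}

@resultBuilder
enum ShapeBuilder {
    // Only values conforming to Shape are accepted as statements.
    static func buildExpression<S: Shape>(_ shape: S) -> [any Shape] {
        [shape]
    }

    static func buildBlock(_ parts: [any Shape]...) -> [any Shape] {
        parts.flatMap { $0 }
    }
}

func totalArea(@ShapeBuilder _ shapes: () -> [any Shape]) -> Double {
    shapes().reduce(0) { $0 + $1.area }
}

let area = totalArea {
    Square(side: 2)
    Rectangle(width: 3, height: 4)
    // "circle"  // would not type-check: String does not conform to Shape
}
```

The rejected statement fails at the buildExpression call for that one line, rather than somewhere inside an aggregated buildBlock call.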

Doug

Technically that could be done at buildBlock, right? Or do you mean that it allows for fast bail-out since it'd fail at the per-statement type-checking? That's another interesting technique that's only allowed by the buildExpression type checking rule. :face_with_monocle:

You can do this at buildBlock, but buildExpression is more powerful because you can affect how the expression is type-checked.

Doug

I did not know the details regarding the type checking system with forward constraints, so my point (4) can therefore be disregarded to some extent. Is this forward type checking specific to function builders, or is it a general Swift feature? If it is the former, a fold-based approach would eliminate the need for this kind of special implementation: making each step a separate expression that follows the previous one would enforce this without any special type checking logic.

Also, besides type checking, the enforcement of runtime constraints could be easier to follow in the debugger. If the buildBlock does some checks that require multiple views and that check fails, the assertion error would occur in the buildBlock function with no indication where something went wrong.
With the fold approach, these checks could occur in each next call, thereby failing closer to the cause of the issue.

The fold-based approach would not necessarily be more complex. In most use cases, the builder would be provided by library authors anyways and not enforce any special constraints, thereby behaving like the buildBlock approach without the 10 view limit. If no type system juggling (like in the PDA example) takes place, we would simply generate a state that describes all the views that have been captured so far.
A function builder would only have to have three or four methods instead of ten or more for all the different overloads of buildBlock.

The stronger expressivity of the fold-based approach could be useful in the following situations:

  • A builder could enforce limits on what elements can be added after another element has been added. For example, they could enforce having only one NavigationView in a view builder or only one <head> in a HTML builder.
  • In a HTML table, they could enforce a constraint such that every row has the same number of cells.
  • In a layer builder for neural networks, they could enforce a constraint such that the output type of a layer matches the input type of the next one without a limit on the number of layers. (If Variadic generics can express something like (LayerType...).(x-1).Output == (LayerType...).x.Input, this would also work with them).
  • Diagnostics could be improved: By specifying an @available attribute for some overloads of the next function, a warning or error can be emitted for a specific element of a function builder closure (like "X should not be used in this place").

Also, regarding variadic generics: I don't believe it would be a great design choice to limit the expressiveness of an existing feature like function builders based on a future feature that could still be very far off, might be rejected in the evolution process for whatever reason, or might be blocked by implementation problems.

We added it to make function builder type checking perform adequately, but it's a general implementation mechanism that's fairly important for closures in general. For example, I'm using it to explore multi-statement closure type inference.

That is incorrect. Your approach still depends on one-way constraints to avoid propagating information "backwards" through the closure. Without them, you'll have the same problem Swift 5.1 had, which was fixed in Swift 5.1.1 with one-way constraints.

Yes, but there are other ways to express this structural property without embedding it in the type system, e.g., multiple trailing closures to separate the different bits.

I expect that variadic generics can do this with buildBlock as it is today.

Embedding these constraints deep in overloaded buildNext functions in the type checker is unlikely to improve diagnostics.

It's not a limitation, it's boilerplate. Yes, it's annoying to write 10 different versions of buildBlock, but it isn't a problem with expressiveness.
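
For readers who haven't written one, this is the kind of boilerplate in question: a type-preserving builder needs one generic buildBlock overload per arity. The sketch below uses hypothetical names (TupleBuilder, tuple(_:)), stops at three overloads, and uses the @resultBuilder spelling of the accepted feature.

```swift
@resultBuilder
enum TupleBuilder {
    // One overload per supported number of statements.
    static func buildBlock<A>(_ a: A) -> A {
        a
    }

    static func buildBlock<A, B>(_ a: A, _ b: B) -> (A, B) {
        (a, b)
    }

    static func buildBlock<A, B, C>(_ a: A, _ b: B, _ c: C) -> (A, B, C) {
        (a, b, c)
    }
}

// Hypothetical helper that applies the builder to a closure.
func tuple<T>(@TupleBuilder _ body: () -> T) -> T {
    body()
}

let pair = tuple {
    1
    "two"
}
// The static type of `pair` is (Int, String).
```

A fourth statement in the closure would fail to compile until a four-argument overload is added, which is the source of limits like the 10-view one.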

Doug

Hi Douglas,

Is there a way to have a debug mode that outputs the translated code you showed? Some way to see this intermediate step would be great.

Thank you,
Chéyo

I don't think there's any way to get at the actual source of the builder-transformed code (since we don't have a general utility for going backwards from AST to source code), but if you run swiftc -dump-ast file.swift, the generated output is post-typecheck (and so also post-builder-transform). If there's a better way to view the output of the function builder transformation, I'd love to know as well!

@beccadax has a tracing function builder. As for the compiler, if someone went ahead and implemented pretty-printing for expressions, you could use its AST-printing facilities to see what's happening.

Doug

Hey. I didn't read the entire conversation, only the proposal itself. Is there any reason why declarations are left alone by the transformation? The only reason I can find in the proposal is: "This allows developers to factor out subexpressions freely to clarify their code, without affecting the function builder transformation." IMHO it should be left to the DSL in question to decide whether to allow people to factor out declarations. There are enough other ways to factor out code, but I could imagine reasonable use cases and an appropriate buildDeclaration for such a transform...

Edit:

To elaborate: Say, we declare a variable a. What the transformation has to do now is to wrap the entire scope where a is available into a closure and pass a somehow transformed a to that closure. That is, all you need is a

static func buildDeclaration(expr: GivenType,
                             continuation: @escaping (NewType) -> Component) 
-> Component

In the DSL, the declared variable will then be inferred to have type NewType rather than GivenType.

Use case:

static func buildDeclaration<T, U, E: Error>(expr: Result<T, E>,
                                             continuation: (T) -> Result<U, E>)
-> Result<U, E> {
    expr.flatMap(continuation)
}

and now, assuming we use a function builder with the above method to create a constructor for Result itself, the call site could look like this:

Result(42) { int in
    let a = someFuncReturningResultA(int)
    let b = someFuncReturningResultB(a)
    someFuncReturningResultC(int, a, b)
} // Result<C, Error>

Edit edit:

Of course, you can support multiple such buildDeclaration functions as long as lookup is unique. In above scenario, you may want to have another buildDeclaration like

static func buildDeclaration<T, U, E: Error>(expr: T,
                                             continuation: (T) -> Result<U, E>)
-> Result<U, E> {
    continuation(expr)
}

which is essentially the default-implementation if there's no type-transformation from expr to the input of continuation, i.e. if we define that declaring a buildDeclaration that does some type transformation just doesn't override the default implementation, we may not even need to write this extra code.
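
Since buildDeclaration is only an idea, the closest thing that runs today is the hand-written desugaring. In the sketch below buildDeclaration is a free function rather than a builder method, and DemoError and half(_:) are hypothetical helpers invented for the example.

```swift
// Hypothetical error type and fallible function for the example.
struct DemoError: Error, Equatable {}

func half(_ x: Int) -> Result<Int, DemoError> {
    x % 2 == 0 ? .success(x / 2) : .failure(DemoError())
}

// Same shape as the proposed method, written as a plain function.
func buildDeclaration<T, U, E: Error>(
    expr: Result<T, E>,
    continuation: (T) -> Result<U, E>
) -> Result<U, E> {
    expr.flatMap(continuation)
}

// What `let a = half(42); let b = half(a + 1); success(a + b)`
// would turn into: each declaration wraps the rest of the scope.
let chained = buildDeclaration(expr: half(42)) { a in
    buildDeclaration(expr: half(a + 1)) { b in
        Result<Int, DemoError>.success(a + b)
    }
}

// A failing step short-circuits the remaining continuations.
let failed = buildDeclaration(expr: half(3)) { a in
    Result<Int, DemoError>.success(a)
}
```

Inside the closures, a and b have the unwrapped type Int, which is the "NewType rather than GivenType" inference described above.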

You may have ended up reading outdated documents @Anachron.

This is the "pitch thread", while the feature now known as "result builders" has already been accepted (after revisions): SE-0289 (review #2): Result Builders - #141 by compnerd, so refer to that thread for the status quo.

What you are mentioning with buildDeclaration sounds to me exactly like the feature I mentioned early on when working on some internal DSLs, and it's right now in the "future directions" section:

The full motivation post is here: Function builders and "including" let declarations in built result