Add @ArrayBuilder to the standard library

I'm not sure. The consume keyword might also be required.

let $__builder… = ArrayBuilder.buildPartialBlock(accumulated: consume $__builder…, next: $__builder…)
//                                                            ^~~~~~~

  • If this is destined for the standard library, then should the build… functions and @resultBuilder attribute be added directly to the existing Array type?

  • Otherwise, as suggested by @coenttb, should a nested Array.Builder enum be considered?

I feel like it would be weird for Array itself to be a result builder. @Array being a result builder attribute feels unusual.

For call sites, @ArrayBuilder seems much more idiomatic than @Array.Builder. It seems best to follow the same general structure and naming scheme of @ViewBuilder (the most well-known result builder).

Do you have benefits in mind beyond the spelling of the attribute itself?

Adding the buildPartialBlock methods to Array certainly simplifies the back deployment discussion (no new types), but it feels particularly unusual to me.

Like others, I tend to feel that helpful utilities like this (as opposed to fundamental building blocks) should go into packages rather than the standard library. As other folks have mentioned, putting things like this in the standard library opens up a whole conversation about back-deployment that, if the type can't be back-deployed for some reason, means that many real applications can't start using it until they're willing to shed older OS versions. I wish we would take more advantage of the package ecosystem instead of defaulting to "thing is useful so thing should be in stdlib".

That aside, this would be really useful (in any distribution form) for a couple of reasons:

  • We encourage keeping things mutable for only the shortest duration necessary, and composing arrays that are any more complex than a static literal makes that uglier to express. A result builder makes it easy to write more complex dynamic arrays that combine single elements, arrays of elements, and conditions.

  • There have been numerous threads in the past about wanting to support something like this:

    let flags = [
      "abc",
      "def",
      #if os(Linux)
      "ghi",
      #endif
    ]
    

    and using a result builder makes this just work and we can punt on the compiler-conditional lists question for a bit longer.

I would also expect the implementation of the array builder to perform no worse than creating the array the imperative way (in terms of time complexity, number of allocations, etc.). If it couldn't do that, I wouldn't want us to accept it.

One improvement I would make: the usage pattern shown involves computed variables, so they would be re-evaluated every time they're read. I'd rather see an extension that adds an initializer to Array that takes an @ArrayBuilder closure. Then, you can write this:

let someArray: [String] = Array {
  "abc"
  "def"
  if someCondition {
    "ghi"
  }
}
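
For anyone following along, here is a minimal sketch of what such a builder and initializer could look like. The `ArrayBuilder` type below is only my assumption about the shape of the pitched type (it uses `buildBlock` rather than `buildPartialBlock` for brevity), and `includeVerbose` and `extraFlags` are hypothetical names used purely for illustration. It also shows `#if` inside the closure, as in the second bullet above.

```swift
// Hypothetical, simplified builder; not the pitched implementation.
@resultBuilder
enum ArrayBuilder<Element> {
  // A single element becomes a one-element array.
  static func buildExpression(_ element: Element) -> [Element] { [element] }
  // An existing array is spliced in as-is.
  static func buildExpression(_ elements: [Element]) -> [Element] { elements }
  // Concatenate the partial results of every statement in the block.
  static func buildBlock(_ parts: [Element]...) -> [Element] { parts.flatMap { $0 } }
  // `if` without `else` contributes nothing when the condition is false.
  static func buildOptional(_ part: [Element]?) -> [Element] { part ?? [] }
}

extension Array {
  // The initializer suggested above: it takes a builder closure and runs it once.
  init(@ArrayBuilder<Element> _ build: () -> [Element]) {
    self = build()
  }
}

let includeVerbose = true        // hypothetical condition
let extraFlags = ["-x", "-y"]    // hypothetical array to splice in

let flags: [String] = Array {
  "abc"
  "def"
  if includeVerbose {
    "--verbose"
  }
  extraFlags
  #if os(Linux)
  "ghi"
  #endif
}
```

Because `flags` is a stored `let`, the closure runs once rather than on every read.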

As I was typing that out, I discovered that the following work:

let someArray: [String] = Array { ... }
let someArray: [String] = .init { ... }
let someArray = Array<String> { ... }

But what I really wanted to write, and what I think a lot of people would reach for, would be this:

let someArray = [String] { ... }

which doesn't compile. It looks like the logic that tries to treat a called single-element array literal expression (e.g., [Int](repeating: 0, count: 10)) as an initializer call doesn't work for the case of just a trailing closure, so you have to write this:

let someArray = [String]() { ... }

which is unsatisfying. Might be nice to fix that, if it wouldn't break source somewhere else.

It would also be a shame if builders like this were just limited to arrays. I could see a dictionary one being useful as well, but there's no obvious syntax for a dictionary element in isolation (if we didn't want to abuse an operator for this, we'd have to introduce a special key-value pair type).

To go even further, I think the ultimate end state would be to be able to use this anywhere that an array literal is allowed (so, anything that supports ExpressibleByArrayLiteral), but that would require a whole slew of compiler changes that go beyond what's being pitched here.

It isn’t necessarily an either-or thing. Wouldn’t it be cool if the standard library itself were assembled from Swift packages which were also independently obtainable? I would prefer that over a world where development of any new project begins by assembling a tower of dependencies, which is how life is in some other language ecosystems.

My understanding is that that boat has mostly sailed for Apple platforms since Swift is in the OS and used by other system frameworks. Unless the distribution model you describe was made possible, we should probably limit discussions to what's actually realistically feasible today.

Without going into details about how the OS-packaged stdlib is built, that doesn’t preclude the actual toolchain compilation step from invoking swift build. For standalone toolchain builds, even on Apple platforms, that could involve cloning the package from source control or the package could already be stored locally as part of setting up the source tree.

Sure, but that's a far cry from end users actually being able to use those packages as dependencies when the symbols might collide with those in the OS. Anyway, I think a discussion about lifting the limitations of standard library modularization/back-deployment would be much better served by its own thread so this one can stay focused on the pitched idea.

The syntax looks pretty nice. What I'm thinking about is performance.

Creating an Array this way might cause lots of CoW operations.
The second point is that when creating an Array manually, we can quite often call reserveCapacity to eliminate copies of the underlying buffer as new elements are added.

Also, creating an array via a literal (e.g. [first, second]) is more performant than reserving capacity for the elements and appending them one by one. Ideally, performance should be at least no worse than reserving capacity and making repeated append calls.

Creating a Set with a similar builder would also be useful as a general-purpose feature.
Thinking further, next candidates are Deque and OrderedSet.

My assumption is that once we open this door, we will quite soon realize that such a feature is useful for many other collection types.
As a feature it will evolve over time.

Also, using a macro implementation may offer better performance. A naive approach that comes to mind is to check whether each optional value is nil and eagerly reserve capacity for the non-nil ones. Collections have a count property, which can be accumulated. Once the total capacity is known, it can be reserved and the elements added without intermediate allocations.
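
To make the idea concrete, here is a hand-written sketch of roughly what such an expansion might do. This is not a macro and not a proposed API; `makeArguments`, `verboseFlag`, and `inputs` are hypothetical names I made up for illustration.

```swift
// Hypothetical example: count first, reserve once, then append.
func makeArguments(verboseFlag: String?, inputs: [String]) -> [String] {
  // Pass 1: work out how many elements will be produced.
  var total = 1                          // "format" is always included
  if verboseFlag != nil { total += 1 }   // optional element, counted only if present
  total += inputs.count                  // collections contribute their count

  // Pass 2: reserve once, then append without intermediate reallocations.
  var result: [String] = []
  result.reserveCapacity(total)
  result.append("format")
  if let verboseFlag { result.append(verboseFlag) }
  result.append(contentsOf: inputs)
  return result
}
```

Whether this beats a plain array literal in practice would need benchmarking; the sketch only shows the counting-then-reserving shape.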

While the overall raw idea is good, providing a robust implementation needs time, and it should be battle-tested.

For all of these reasons, I suggest landing it in swift-collections.

There's no need to make a var result = accumulated here; consuming parameters are mutable. It is enough to write:

accumulated.append(next)
return accumulated

Even so, accumulated will still be copied and then mutated, whereas an inout parameter would be mutated in place.
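
For context, a minimal sketch of how that might look in a full builder, assuming Swift 5.9 or later for the `consuming` modifier. The `ArrayBuilder` here is my own simplified stand-in, not the pitched implementation, and `numbers()` is a hypothetical function used only to exercise it. Whether the copy mentioned above is actually elided depends on the optimizer; the sketch makes no claim about that.

```swift
// Hypothetical, simplified builder using buildPartialBlock.
@resultBuilder
enum ArrayBuilder<Element> {
  // A single element becomes a one-element array.
  static func buildExpression(_ element: Element) -> [Element] { [element] }

  // The first statement in a block starts the accumulation.
  static func buildPartialBlock(first: [Element]) -> [Element] { first }

  // `accumulated` is a consuming parameter, so it can be mutated directly
  // without declaring a separate `var result`.
  static func buildPartialBlock(
    accumulated: consuming [Element],
    next: [Element]
  ) -> [Element] {
    accumulated.append(contentsOf: next)
    return accumulated
  }
}

@ArrayBuilder<Int>
func numbers() -> [Int] {
  1
  2
  3
}
```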

I started a discussion in the swift-collections subforum about whether folks think an @ArrayBuilder would be a good fit for the collections package: @ArrayBuilder in swift-collections?

I think there is at least one language improvement that we should make regardless:

And potentially another, pending investigation on the source compatibility side:

The result builder inferred generics change is new language functionality so probably requires an evolution proposal. Since I have an implementation for that one, I will submit a proposal for it.

I’ll also investigate what it would take to support let someArray = [String] { ... }. It’s possible that’s “just a bug fix” that doesn’t require an evolution proposal. If we decide it does require an evolution proposal, we could bundle them together.

It seems to be possible to add different implementations for copyable arrays and sets. (And perhaps also for noncopyable containers in the future.)

@resultBuilder
public enum ArrayBuilder<Result: ExpressibleByArrayLiteral> {}

extension ArrayBuilder where Result: RangeReplaceableCollection {
  /* buildPartialBlock, etc. */
}

extension ArrayBuilder where Result: SetAlgebra {
  /* buildPartialBlock, etc. */
}

What I meant was more about imagining a builder closure being used syntactically anywhere that an array literal was allowed, since ultimately ExpressibleByArrayLiteral just passes an array masquerading as variadics into the initializer. In other words,

let s: SomeArrayExpressibleType = {
  "abc"
  "def"
  if foo { "ghi" }
}

But the idea is kind of crazy and I definitely haven't thought it all the way through.

I don’t think it’s a good idea to support this officially given how poorly result builders compose with leading dot syntax.

let colors = [Color].build {
    .red
    .green
    .blue // parses as .red.green.blue instead of .red; .green; .blue
}

This isn’t a big problem with RegexBuilder and ViewBuilder since their element types are provided by their libraries, but doing this with an arbitrary type is asking for confusion and frustration, and works against Swift’s promises of “no semicolons needed” and “simple things that compose”.
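
For comparison, the workaround available today is to spell out the type on every line, which compiles but gives up the leading dot syntax. A minimal sketch, where `Color`, `ArrayBuilder`, and the `build` method are all hypothetical stand-ins rather than any existing API:

```swift
enum Color { case red, green, blue }

// Hypothetical, simplified builder.
@resultBuilder
enum ArrayBuilder<Element> {
  // A single element becomes a one-element array.
  static func buildExpression(_ element: Element) -> [Element] { [element] }
  // Concatenate the partial results of every statement in the block.
  static func buildBlock(_ parts: [Element]...) -> [Element] { parts.flatMap { $0 } }
}

extension Array {
  // Hypothetical entry point matching the `[Color].build { ... }` spelling above.
  static func build(@ArrayBuilder<Element> _ body: () -> [Element]) -> [Element] {
    body()
  }
}

// Each line starts with the type name, so each is parsed as its own statement.
let colors = [Color].build {
  Color.red
  Color.green
  Color.blue
}
```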


Perhaps, instead, we should consider extending the existing array literal syntax.

let arguments = [
    "format",
    "--in-place",

    if recursive {
        "--recursive",
    }
    // similar syntax for switch, for, etc.
]

The above syntax is just as expressive as a result builder, while still fully supporting leading dot syntax and remaining compatible with if and switch expressions. Because it’s not limited by the constraints of result builders, it could use information like the number of always-included elements and the number of elements included for each condition to optimally grow the array, minimizing the number of intermediate reallocations. In the future, we could even add an ExpressibleByConditionalizedArrayLiteral protocol (name can be bikeshedded later) to let programmers use this kind of literal with their own types.

I think much more useful than an ArrayBuilder would be a SequenceBuilder. You can almost make it using just standard library types; all that’s additionally needed is Chain2Sequence and an Either[Sequence] type. I made one here, though I’d admittedly probably do some things differently if I were to start from scratch. When I benchmarked it, it performed marginally but noticeably better than a version that just used Arrays.

That being said, I think this would better belong in swift-collections or swift-algorithms; probably the latter.

EDIT: I made a better version here: swift-fun/Sources/SequenceBuilder/SequenceBuilder.swift at develop · junebash/swift-fun · GitHub

I am a huge +1 for CollectionBuilder. The collection types mentioned are reason enough on their own. In addition, it would be awful to leave users on their own as soon as they want to switch from [T] to some more appropriate type, just because they opted for a cool initialization syntax.

We can have ArrayBuilder as `typealias ArrayBuilder<T> = CollectionBuilder<[T]>` if we want.
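
A rough sketch of what I have in mind, assuming a builder that is generic over any RangeReplaceableCollection. The names `CollectionBuilder`, `asArray`, and `asContiguousArray` are hypothetical, and this is a simplified illustration rather than a tuned implementation:

```swift
// Hypothetical builder, generic over the collection being built.
@resultBuilder
enum CollectionBuilder<C: RangeReplaceableCollection> {
  // A single element becomes a one-element collection of the target type.
  static func buildExpression(_ element: C.Element) -> C {
    C(CollectionOfOne(element))
  }
  // The first statement in a block starts the accumulation.
  static func buildPartialBlock(first: C) -> C { first }
  // Each later statement is appended to what has been built so far.
  static func buildPartialBlock(accumulated: C, next: C) -> C {
    var result = accumulated
    result.append(contentsOf: next)
    return result
  }
  // `if` without `else` contributes an empty collection when false.
  static func buildOptional(_ part: C?) -> C { part ?? C() }
}

@CollectionBuilder<[Int]>
func asArray(includeThree: Bool) -> [Int] {
  1
  2
  if includeThree { 3 }
}

// Switching the target type only changes the annotations, not the body.
@CollectionBuilder<ContiguousArray<Int>>
func asContiguousArray(includeThree: Bool) -> ContiguousArray<Int> {
  1
  2
  if includeThree { 3 }
}
```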

I would argue that this is a separate task: the topic calls for a resultBuilder-based way to initialize arrays (and, possibly, other collections), while what you’re offering is basically a new (possibly opaque, composite) Sequence type that can be built via resultBuilder.

I would personally love to see both added.

I would argue that it’s not a separate task. Any conformer to RangeReplaceableCollection and OptionSet, including Array and Set, could trivially add a new initializer using a SequenceBuilder.

extension RangeReplaceableCollection { // same signature for `OptionSet`
  init(@SequenceBuilder<Element> _ build: () -> some Sequence<Element>) {
    self.init(build())
  }
}

If we have SequenceBuilder, we really don’t need ArrayBuilder.

CollectionBuilder may be slightly more tricky, but might also be possible without a lot of additional code.

Let me put this in another way: we can have CollectionBuilder<T> where T: RangeReplaceableCollection, but we can’t have SequenceBuilder<T> where T: Sequence (or any other standard sequence-like protocol for that matter).

Sure, we can add non-lazy content-copying initializers for both, but that’d be just sweeping this fact under the carpet. Those are two different things that achieve slightly different goals: one builds up a user-provided type, and the other initializes a library-provided one.

(and we probably could use both; SequenceBuilder for at least the laziness, and CollectionBuilder for at least the performance)

I suppose my question is… why do you want that? I really don’t see much of a benefit that couldn’t be achieved with the sequence builder, beyond maybe one level of indentation in certain contexts.

My primary concern here is performance. Initializing from a sequence necessarily requires a full iteration and element-by-element copying, whereas constructing the target collection directly from its subsequences may be able to take advantage of guarantees provided by the collection’s implementation. For example, when building a contiguous Array of BitwiseCopyable elements, entire blocks could reasonably be memcpy’d together. Similarly, for rope-like data structures, no copying might be required at all.
