Support repeating initializers with closures not just values

Erica_Sadun · July 22, 2018, 6:20pm

Several collections support this API or a close facsimile:

public init(repeating repeatedValue: Element, count: Int)

However, if you create Array(repeating: UIView(), count: 4), all four elements of the array point to the same view. Because UIView is a reference type, it makes more sense to generate four distinct views. Similarly, you might want 4 random integers generated with Int.random(in: 1 ... 100) or however that API ends up.
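To make the aliasing concrete, here is a minimal sketch. It uses a hypothetical `Box` class in place of UIView so it runs without UIKit; the name and its `value` property are assumptions for illustration only.

```swift
// Hypothetical reference type standing in for UIView.
final class Box {
    var value = 0
}

// Box() is evaluated once, so all four slots hold the same instance.
let shared = Array(repeating: Box(), count: 4)
shared[0].value = 42
let allAliased = shared.allSatisfy { $0.value == 42 }

// Today's workaround: map over a range and ignore the element,
// which calls Box() once per slot.
let distinct = (0..<4).map { _ in Box() }
distinct[0].value = 42
let othersUntouched = distinct.dropFirst().allSatisfy { $0.value == 0 }
```

Mutating one element of `shared` is visible through every index, while the elements of `distinct` are independent objects.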

I'd like to pitch a protocol that supports repeated initializers, both for a repeated element and for () -> Element. Something like this (I warn you, there are some issues here). The goal is to ensure that any type that conforms to the protocol guarantees that if you can initialize with n copies of a value type, then you can initialize with n instances of a reference type or n applications of a closure:

protocol RepeatingInitializable: Collection {

  /// See also: `Repeat.swift`: A collection whose elements are all identical.
  /// https://github.com/apple/swift/blob/master/stdlib/public/core/Repeat.swift
  // Note: protocol requirements cannot carry an access modifier such as `public`.
  init(repeating repeatedValue: Element, count: Int)

  /// Allow reference types to generate new instances and value types
  /// to use a closure to create potentially distinct values
  ///
  /// A collection whose elements are all created by an identical generator
  init(repeating repeatedGenerator: @autoclosure () -> Element, count: Int)
}

Some points:

Unfortunately, the current API places repeating first and the count second. This means adding trailing closures is problematic.
If an autoclosure is used on the second API, then I think you can discard the first API entirely, but then you'd have to limit calls to very simple uses. I'd like to think this is a potential "feature".
Right now, the Repeated struct and repeatElement global function rely on one-item repeated n times and they are public APIs.

Thoughts?

Ben_Cohen · July 22, 2018, 6:44pm

I don't think we need to add a protocol in order to add this feature. init(repeating:count:) is currently expressible as an extension on RangeReplaceableCollection (just create an empty one, then append the same element n times). The closure version has the same implementation needs.

The key question to ask when introducing a new protocol is: what new generic algorithms would you be able to write that you couldn't before? Are there algorithms you might write generically that operate on some repeating-initializable thing that isn't also range-replaceable? The only important case of such a type I can think of is Repeated itself (which is immutable). Are there algorithms that might need to generically operate on both range-replaceable collections and Repeated?

Erica_Sadun · July 22, 2018, 6:52pm

My thinking was that it provided a semantic guarantee that both initializer styles could be used. The only way I could think to provide that guarantee was a protocol.

Ben_Cohen · July 22, 2018, 7:09pm

Range-replaceable provides those guarantees, since both can be expressed in terms of its core requirements.

So this is similar to the question of breaking out things like init() and append(_:) into separate protocols, i.e. what types are out there that have the capability, but not the ability to be full range-replaceable collections? And what kind of algorithms would you write generically that operated on them?

Nobody1707 · July 22, 2018, 7:13pm

How about this? It doesn't conflict with the current init, and it has a clear use case (collections of reference types).

extension RangeReplaceableCollection {
    @inlinable
    public init(generating generatedValue: @autoclosure () -> Element, count: Int) {
        self = .init()
        reserveCapacity(count)
        for _ in 0..<count {
            append(generatedValue())
        }
    }
}

Erica_Sadun · July 22, 2018, 7:41pm

Fair enough. Skip the protocol.

What about the problem space and the use-case? Does it meet the bar for something that is appropriate to the language?

Ben_Cohen · July 22, 2018, 8:00pm

Yes, I think so. It isn't limited to use with reference types, either. I find myself writing (0..<n).map { _ in doSomething() } a lot, and I could see this replacing that in a more readable/discoverable form. Plus if it were on RangeReplaceableCollection it would expand that technique to other collection types.*

I wouldn't use an @autoclosure though. The difference between these two things seems really subtle and confusing:

// an array of 4 random bools
let allDifferent = Array(generating: Bool.random(), count: 4)
// an array of 1 random bool, 4 times
let allSame = Array(repeating: Bool.random(), count: 4)

Whereas if it weren't an autoclosure, you would be forced to pass in a function (such as Whatever.init, if you wanted distinct reference type instances):

let allDifferent = Array(generating: Bool.random, count: 4)

* we could add a map-like function to RRC too, that produced any kind of RRC from another sequence, of course.
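A minimal sketch of the non-autoclosure form, for illustration. The `generating:` label is taken from the snippet above; the `rethrows` handling, the precondition, and the `Token` class are assumptions, and none of this is standard library API.

```swift
extension RangeReplaceableCollection {
    // Hypothetical initializer; not part of the standard library.
    // Takes a plain function, so the call site must pass something callable.
    init(generating generator: () throws -> Element, count: Int) rethrows {
        precondition(count >= 0, "count must be non-negative")
        self.init()
        reserveCapacity(count)
        for _ in 0..<count {
            append(try generator())   // invoked once per element
        }
    }
}

// Hypothetical reference type used only for this example.
final class Token {}

// Pass the function itself, not the result of calling it.
let flags = Array(generating: Bool.random, count: 4)

// Initializers work as generators too, giving distinct instances.
let tokens = Array(generating: Token.init, count: 3)

// Because the initializer lives on RangeReplaceableCollection,
// other conforming types get it as well.
let buffer = ContiguousArray(generating: { Token() }, count: 2)
```

Writing `Bool.random()` here would fail to compile, because a `Bool` is not a `() -> Bool`. That compile-time difference is what separates this form from `repeating:` at the call site.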

Erica_Sadun · July 22, 2018, 8:42pm

Yes, exactly this.

What do you think about adding the following, so as not to break the current API (which can be deprecated and eventually replaced) and to allow trailing closures?

// new, redirects to existing init
public init(count: Int, of repeatingValue: Element)

// new
public init(count: Int, of repeatingGenerator: () -> Element)

hlovatt · July 22, 2018, 9:10pm

I like the forms with count first and I would also add:

public init(count: Int, of repeatingGenerator: (Int) throws -> Element) rethrows { ... }

Also note throws/rethrows.

beccadax · July 22, 2018, 10:13pm

Then you can’t have a repeating array of closures.

Why not leave the existing version alone and make the new one:

init(count: Int, repeating: () throws -> Element) rethrows

That way they’d have distinguishable signatures.
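A sketch of how the two signatures could coexist, assuming the `init(count:repeating:)` spelling above is added as an extension on RangeReplaceableCollection (a hypothetical placement; the body is an assumption as well).

```swift
extension RangeReplaceableCollection {
    // Hypothetical initializer sketched from the signature above;
    // not part of the standard library.
    init(count: Int, repeating generator: () throws -> Element) rethrows {
        self.init()
        reserveCapacity(count)
        for _ in 0..<count {
            append(try generator())
        }
    }
}

// Existing API: the closure itself is the repeated element.
let closures: [() -> Int] = Array(repeating: { 7 }, count: 3)
let results = closures.map { $0() }

// Sketched API: the closure is called once per element,
// and the count-first order allows trailing-closure syntax.
var next = 0
let values: [Int] = Array(count: 3) {
    next += 1
    return next
}
```

The argument order alone tells the two apart, so a repeating array of closures stays expressible through the existing initializer.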

timv · July 22, 2018, 10:33pm

Not very on-topic, but non-empty collections! The standard library doesn't have them, but they're certainly useful.

Erica_Sadun · July 22, 2018, 10:52pm

I have therefore started holding speculative paint swatches up to bike sheds because I hate the two functions having different orders to their signatures. It's aesthetically displeasing.

Warning: swatches follow.

init(with count: Int, copiesOf repeatedValue: Element)
init(with count: Int, valuesFrom generatedValue: () throws -> Element) rethrows

// and

RangeReplaceableCollection(repeating:, value:)
RangeReplaceableCollection(repeating:, generatedValue:)

RangeReplaceableCollection(producing:, repetitionsOf:)
RangeReplaceableCollection(producing:, callsOf:)

RangeReplaceableCollection(count:, value:)
RangeReplaceableCollection(count:, generatedValue:)

RangeReplaceableCollection(repeat:, value:)
RangeReplaceableCollection(repeat:, generatedValue:)

RangeReplaceableCollection(accumulate:, ofValue:)
RangeReplaceableCollection(accumulate:, ofGeneratedValue:)
 
RangeReplaceableCollection(produce:, ofValue:)
RangeReplaceableCollection(produce:, ofGeneratedValue:)
 
RangeReplaceableCollection(count:, repeatingValue:)
RangeReplaceableCollection(count:, repeatingGenerator:)

jawbroken · July 22, 2018, 11:39pm

Does this provide the 0-based index to the closure? If so, this is the version I've often been interested in.

Karl · July 23, 2018, 12:20am

For the non-closure version, I'd prefer:

extension RRC {
  init(repeating value: Element, times: Int) { ... }
}

Array(repeating: 0, times: 99) // reads more like fluent English

I have a problem with the closure-taking initialiser in this pitch. Since the values may be non-identical, I don't really understand what it's "repeating". Really, what I think you're looking for is what @Ben_Cohen alluded to - a way to initialise an Array (or any RRC) using a count + closure.

I would recommend:

Add Index parameter to the closure (it's often vital)
Remove any mention of "repeating"

extension RRC {
  init(count: Int, initialisingWith elementCreator: (Index) throws -> Element) rethrows { ... }
}

Array(count: 10) { $0 * 2 }
Array(count: 10) { _ in Bool.random() }

Ben_Cohen · July 23, 2018, 1:10am

Once you do this, you really are in the territory of building a less flexible version of map. It may be better just to rip the band-aid off:

extension Sequence {
  // note, requires type context to fix R, and without type context will default 
  // to the version that returns an Array
  func map<R: RangeReplaceableCollection>(
    _ transform: (Element) throws -> R.Element
  ) rethrows -> R {
    var result = R()
    result.reserveCapacity(underestimatedCount)
    for x in self { try result.append(transform(x)) }
    return result
  }
}

Ben_Cohen · July 23, 2018, 1:16am

But isn't the point of non-empty collections that they use the type system to prevent you from creating empty ones? Which wouldn't work here, since 0 is a perfectly valid value for count (when determined dynamically). So even if a non-empty collection implemented the protocol, it would have to trap on 0. So when calling any code generic over this protocol, you'd have to be very careful to check that the algorithm didn't permit this possibility. Once you've gone that far, you may as well just implement all of RRC and trap on situations that leave the collection empty.

I'm a big fan of the non-empty collection idea, but I don't think it's appropriate as something that would go into the standard library, or as something that standard library protocol design should try and navigate around.

Erica_Sadun · July 23, 2018, 1:44am

It's an interesting point now in the discussion. I'd like to summarize.

I would like to see Swift adopt a variation of init(repeating repeatedValue: Element, count: Int) that allows the caller to initialize a collection using a closure instead of repeating the same value n times. This approach benefits anyone building a collection of reference types, where each instance in the collection represents a distinct identity, and anyone using a closure to generate a value. It would replace the awkward syntax of (0 ..< n).map { _ in doSomething() } just as the existing init replaces (0 ..< n).map { _ in value }.

This naturally fits RangeReplaceableCollection (thank you @Ben_Cohen)
This should not disallow the creation of a repeating array of closures (thank you @beccadax)
This should probably not use an index-driven closure argument, as such becomes a "less flexible version of map" (thank you @Ben_Cohen)

I have not included a design or any other details in this summary. Having reached this point, I'd like to know whether this is an idea with sufficient merit to proceed in Swift Evolution. I do not wish to waste anyone's time on design, bikeshedding, or creating a more detailed proposal without first answering this fundamental question. I would greatly appreciate feedback specifically as to the idea's value decoupled from any further speculation about how it might be realized.

Thank you.

jawbroken · July 23, 2018, 2:04am

I'm not sure if I really understand this argument. People who want to e.g. construct a specific Array will probably more naturally reach for an initialiser instead of mapping over a sequence of indices that they have to create themselves. And in some sense map itself is already a less flexible version of reduce, but being less flexible isn't a bad thing if it has clarity or brevity benefits.

allevato · July 23, 2018, 3:48am

Back in some earlier discussions a year or so ago, I was somewhat opposed to this as it seemed redundant compared to mapping over a Range, but I've warmed up to it since then. Specifically, I can accept the argument that for the version of the initializer that takes a value to repeat, using (0..<n).map { _ in value } obfuscates the intent somewhat (even if it's theoretically correct), because the only thing that matters is the count and not the actual values in the range that are being ignored.

I do agree that the proposed generator closure should not take an index argument. The moment you introduce semantically important indices to the problem, you are doing precisely a map operation, and it should be expressed that way. We're talking about initializing any RangeReplaceableCollection, not just those with 0-based integer indices, so it would seem odd to couple the closure's index argument to one specific integer range 0..<n. If the user needs to produce a collection of elements that are constructed based on a specific range of values, they should just map over that range, which already gives them the flexibility to compute values based on 0..<n, 1...n, or someIndex..<someOtherIndex.
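For illustration, the flexibility of mapping over a range can be sketched with existing standard library calls only; the variable names here are made up for the example.

```swift
let n = 4

// Values derived from 0, 1, 2, 3.
let zeroBased = (0..<n).map { $0 * 2 }

// Values derived from 1, 2, 3, 4.
let oneBased = (1...n).map { $0 * $0 }

// Values derived from a sub-range of another collection's indices.
let word = Array("swift")
let tail = word.indices.dropFirst(2).map { word[$0] }

// When only the count matters, the range elements are ignored.
let flags = (0..<n).map { _ in Bool.random() }
```

In the first three cases the range values carry meaning, which is the situation where map is the natural spelling. The last case is the one a count-plus-generator initializer would replace.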

hlovatt · July 23, 2018, 5:04am

Whilst many people, including myself, are used to writing:

let xs = (0 ..< n).map { index in
    ...
}

I think there would be a lot of people, particularly new to Swift/programming, who would discover more easily and prefer:

let xs = Array(n) { index in
    ...
}

Mainly because they are looking for array initialisers and not thinking about mapping a range. It is so much more obvious that it is creating an array. Also suppose you don't want an array:

let xs = Set(n) { index in
    ...
}

In the future hopefully both:

map will become more flexible about its return type.
There will be more collections.

If either or both of these points above is true then a specific init on Collection would be of value.