Padding Arrays

It would be nice to be able to pad Arrays till a certain minimum length:

let arr = [10,20,30]
let res = arr.padding(repeating: 0, inLength: 7)
print(res) // -> [10,20,30,0,0,0,0]

Incase the original array is already greater or equal to the minimum length specified, in this case 7, then it won't do anything:

let arr = ["A","B","C","D","E","F","G","H"]
let res = arr.padding(repeating: "-", inLength: 7)
print(res) // -> ["A","B","C","D","E","F","G","H"]

This will be useful when the array is of an arbitrary length but we need atleast some minimum length to work with, so padding the array with default values is a conventional way to handle such a scenario.


The basic implementation of this can be as simple as:

extension Array {
    func padding(repeating element: Element, inLength length: Int) -> [Element] {
        guard self.count < length else { return self }
        
        let paddingCount = length - self.count
        let result = self + Array(repeating: element, count: paddingCount)
        return result
    }
}

You can even pull this off in a one-liner:

array + Array(repeating: thing, count: max(targetCount - array.count, 0))
3 Likes

Foundation framework already provides a great API to achieve this with Strings:

Do you have a real-life example where else this might be useful for non-string arrays?

Building a NxN grid of enum values out of a ragged input (which might be text that you could pre-pad...but still). I vaguely recall having done that for some rogue-like I started writing, but have not actually finished. I just used Array.repeating and min though.

There's a bug here. That min should be a max.

1 Like

I have seen String's padding(toLength:withPad:startingAt:). Infact the first version of my pitch content mentioned it but I find the startingAt variable problematic as it can be misunderstood very easily (atleast for me).
I also felt that to stay consistent with the more intuitive Array(repeating:count:) declaration a better inspiration.

Real-life example: Custom Graph; Bar Chart
I have a 7-day bar chart widget in SwiftUI. Lets say I have only 3 data points from the current week. My SwiftUI layout does a simple ForEach on the datasource, if I don't pad the array then it will show only 3 bars. However, I always want to show 7 bars. The first 3 with real data points and the remaining with default values. This way my SwiftUI code and layout remains consistent.

I am sure this can apply to any area where a consistent length of data points are required.

For more flexibility it would be a nice option to specify whether to pad from front or back.

1 Like

I think making sure that there are exactly 7 elements and at least 7 elements are quite different things. At least different enough that it should be treated differently from API standpoint.

Yes, for sure. padding is conventionally for exact length so in that case the API can simply read:

arr.padding(repeating: 0, length: 7)

In my initial example, I was anyways looking for N data points but was just thinking ahead that if I do get more than N then I need not pad and just consume all.
If I ever need to clamp then I can simply use prefix on it like: arr.padding(repeating: 0, length: 7).prefix(14). That's atleast 7 and atmost 14.

But looking at the usage of padding in other languages as well as in Swift String, it's clamps to N with padding if necessary.

Not always so great though, eg:

import Foundation
print("ab๐Ÿ‘ฉโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ฆcd".padding(toLength: 7, withPad: "_", startingAt: 0))
// Will print "ab๐Ÿ‘ฉโ€๐Ÿ‘ฉ"
1 Like

Length here is measured in Unicode scalars, I believe.

Yes (I just thought it might be worth pointing out that potential gotcha using an example).

An idea for the in-place version:

extension RangeReplaceableCollection {

    /// If not already at least the given length, appends enough copies of a
    /// given element to reach that length.
    public mutating func pad(toLength count: Int, with element: Element) {
        append(contentsOf: repeatElement(element, count: Swift.max(0, count - self.count)))
    }

    /// Ensures the collection is the given length, either by truncating a
    /// suffix or appending copies of the given element.
    public mutating func setLength(to count: Int, extendingWith element: Element) {
        precondition(count >= 0)
        if let cutoff = index(startIndex, offsetBy: count, limitedBy: endIndex) {
            removeSubrange(cutoff...)
        } else {
            pad(toLength: count, with: element)
        }
    }

}

I have both only-a-minimum and always-exact-length versions. I thought the English reading sounded better with the updated length first and new value second.

I think the second should be setCount(...), since "length" doesn't appear anywhere else in the Collection API.

Also, it would be nice to have a version of this that accepts a closure as the "element" parameter so you can properly work with reference types:

var views: Array<UIView> = ...
views.setCount(to: 42, extendingWith: UIView()) // will append many references to the same view

vs

var views: Array<UIView> = ...
views.setCount(to: 42, extendingWith: { UIView() }) 
// will invoke this closure as many times as necessary, each time creating a new instance

Would @autoclosure take care of repeated callings? Or does that call its "closure" just once and reuse the result?

If we use a (regular) closure, what happens if it throws? Or are we mandating a never-throwing closure?

@autoclosure is just a closure, so it'll repeatedly be computed. Though I find it to be bad taste to call @autoclosure repeatedly. It does look like a single-time value from the call site.

Maybe you can rethrow?

No; I think it should be two separate methods. Otherwise, this code...

let view = UIView()
someViews.setCount(to: 42, extendingWith: view)

... does something very very different from this code:

someViews.setCount(to: 42, extendingWith: UIView())

The first one appends many references to the same view. The second one appends many instances of different views. That, IMO, would violate the principle of least surprise. Forcing the use of a closure if you want unique instances would make it clear that it's a different way of getting new values vs passing in a single value.

The problem is the integrity of the receiver. If a mutating function can throw, we should aim for all-or-nothing dynamics; either the method successfully changed the receiver's state, or it got rolled back by the time the catch detected the error. Here, padding a 4-element collection to 20 could throw halfway through and we get a junk state of a 12-element collection.

Hmm, should we track the original length, put a do-catch around the main code, then removeSubrange on the incompletely-done new elements? (Can't store the first new element's index because RangeReplaceableCollection routines may reset it. It "works" for Array, but not RRC in general.)

A new possibility:

extension RangeReplaceableCollection {

    /// If not already at least the given length, appends enough copies of a
    /// given element to reach that length.
    public mutating func pad(toLength count: Int, with element: Element) {
        append(contentsOf: repeatElement(element, count: Swift.max(0, count - self.count)))
    }

    /// Ensures the collection is the given length, either by truncating a
    /// suffix or appending copies of the given element.
    public mutating func setCount(to count: Int, extendingWith element: @autoclosure () -> Element) {
        precondition(count >= 0)
        if let cutoff = index(startIndex, offsetBy: count, limitedBy: endIndex) {
            removeSubrange(cutoff...)
        } else {
            pad(toLength: count, with: element())
        }
    }

    /// Ensures the collection is the given length, either by truncating a
    /// suffix or appending the results of repeatedly calling the given closure.
    public mutating func setCount(to count: Int, generatingExtensionsWith generator: () throws -> Element) rethrows {
        precondition(count >= 0)
        if let cutoff = index(startIndex, offsetBy: count, limitedBy: endIndex) {
            removeSubrange(cutoff...)
        } else {
            let currentCount = self.count
            do {
                for _ in currentCount..<count {
                    append(try generator())
                }
            } catch {
                let oldEnd = index(startIndex, offsetBy: currentCount)
                removeSubrange(oldEnd...)
                throw error
            }
        }
    }

}

I donโ€™t like the name โ€œsetCountโ€, and this seems pretty niche and hard to justify adding. IMO this works better as 2 more fundamental operations, each of which are more worthy of inclusion than they are combined:

collection.truncate(from: x)
collection.extend(to: x, with: defaultValue)

(Where truncate would be like our existing remove, except it wouldnโ€™t fail if the collection is smaller than x). You might also want variants for left/right extension.

3 Likes

This does not guarantee that the standard library remains concise.

Terms of Service

Privacy Policy

Cookie Policy