Chunked by sequence of counts?

The recent addition of chunks(ofCount:) to the Algorithms package is great. I wonder if we could consider a slight broadening to cover a repeating sequence of chunk sizes. My motivating example is that I wanted to write code somewhat like the following:

let str = "66ad8d6e211b33e890e8f8259126ee3a"

let uuidString = str
    .uppercased()
    .segments(ofLengths: [8, 4, 4, 4, 12])
    .joined(separator: "-")

assert(uuidString == "66AD8D6E-211B-33E8-90E8-F8259126EE3A")

I realized that this segments(ofLengths:) function was quite similar to chunks(ofCount:) in the Algorithms package, except that it accepted a series of lengths instead of a single chunk size.

I ended up implementing it as an extension on Collection, and realized that it might be viable for inclusion in the Algorithms package, in a way that the implementation could also be used by chunks(ofCount:). My implementation would need a little refining before I could contribute it, but mostly just things like adding @usableFromInline and such. Is there interest in including this kind of functionality in the package? If so, any thoughts on naming? I currently call it chunks(ofCounts:), but I've also considered chunks(ofLengths:).

A few more examples to clarify behavior:

// The chunking pattern repeats to the end of the input:
"abcdefg".chunks(ofCounts: [3, 1]) == ["abc", "d", "efg"]

// Passing a single count is equivalent to calling `chunks(ofCount:)`
"abcdefg".chunks(ofCounts: [3]) == ["abc", "def", "g"] 
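
For anyone who wants to try these examples, here is a minimal eager sketch of the behavior. It is an assumption-laden illustration, not the implementation mentioned above: it returns a plain array instead of a lazy wrapper type, and it assumes that an empty or non-positive list of counts should trap.

```swift
extension Collection {
    /// Hypothetical eager sketch: splits the collection into consecutive
    /// chunks whose sizes cycle through `counts` until the input ends.
    func chunks(ofCounts counts: [Int]) -> [SubSequence] {
        // Assumed preconditions; the thread does not specify these cases.
        precondition(!counts.isEmpty, "counts must not be empty")
        precondition(counts.allSatisfy { $0 > 0 }, "each count must be positive")

        var result: [SubSequence] = []
        var start = startIndex
        var i = 0
        while start != endIndex {
            // Advance by the current count, clamping the last chunk to endIndex.
            let count = counts[i % counts.count]
            let end = index(start, offsetBy: count, limitedBy: endIndex) ?? endIndex
            result.append(self[start..<end])
            start = end
            i += 1
        }
        return result
    }
}

let uuidString = "66ad8d6e211b33e890e8f8259126ee3a"
    .uppercased()
    .chunks(ofCounts: [8, 4, 4, 4, 12])
    .joined(separator: "-")
// uuidString == "66AD8D6E-211B-33E8-90E8-F8259126EE3A"
```

Because the last chunk is clamped rather than dropped, joining the chunks always reproduces the original elements.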

Thoughts?

From the naming perspective, I think the "repeating" aspect of this API is non-obvious enough that it should be called out in the name somehow, maybe chunks(ofRepeatingCounts:) or chunks(ofCycledCounts:) or similar.

FWIW, the other potential behaviors for extra elements I might expect for the chunks(ofCounts:) naming are:

  • A final chunk is included which contains all remaining elements.
  • A preconditionFailure is raised.

If these other behaviors are desirable, maybe it would be worth keeping the chunks(ofCounts:) naming and adding an onExcess parameter that lets you specify a strategy for handling excess elements?
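
To make the onExcess idea concrete, here is a rough sketch. The ExcessStrategy enum, its case names, and the eager array return type are all hypothetical; nothing like this exists in the Algorithms package today.

```swift
/// Hypothetical strategy for elements left over once `counts` is used up.
enum ExcessStrategy {
    case cycle      // start over from the first count
    case remainder  // put all remaining elements into one final chunk
    case fail       // trap via preconditionFailure
}

extension Collection {
    /// Hypothetical eager sketch of `chunks(ofCounts:onExcess:)`.
    func chunks(ofCounts counts: [Int], onExcess strategy: ExcessStrategy) -> [SubSequence] {
        // Assumed preconditions; the thread does not specify these cases.
        precondition(!counts.isEmpty, "counts must not be empty")
        precondition(counts.allSatisfy { $0 > 0 }, "each count must be positive")

        var result: [SubSequence] = []
        var start = startIndex
        var i = 0
        while start != endIndex {
            if i == counts.count {
                // The counts are exhausted but elements remain.
                switch strategy {
                case .cycle:
                    i = 0
                case .remainder:
                    result.append(self[start...])
                    return result
                case .fail:
                    preconditionFailure("collection has more elements than the counts cover")
                }
            }
            // Clamp the final chunk to endIndex if the input runs short.
            let end = index(start, offsetBy: counts[i], limitedBy: endIndex) ?? endIndex
            result.append(self[start..<end])
            start = end
            i += 1
        }
        return result
    }
}

let cycled = "abcdefghij".chunks(ofCounts: [3, 1], onExcess: .cycle)
// ["abc", "d", "efg", "h", "ij"]
let withRemainder = "abcdefghij".chunks(ofCounts: [3, 1], onExcess: .remainder)
// ["abc", "d", "efghij"]
```

With .fail, the same call would trap, since ten elements exceed the four covered by [3, 1].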

Another design would be for chunks(ofCounts:) to simply stop where the counts end, similar to prefix(_:). If cycling is needed, the caller could still pass a cycled (non-array) sequence for counts.
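
A sketch of what I mean, assuming counts can be any Sequence of Int. To keep it free of dependencies, the cycledCounts helper below is a hypothetical standard-library stand-in for cycled() from the Algorithms package.

```swift
extension Collection {
    /// Hypothetical sketch: produces one chunk per count and stops when
    /// either the counts or the elements run out. Leftover elements are dropped.
    func chunks<Counts: Sequence>(ofCounts counts: Counts) -> [SubSequence]
    where Counts.Element == Int {
        var result: [SubSequence] = []
        var start = startIndex
        for count in counts {
            // Needed so that an infinite sequence of counts still terminates.
            guard start != endIndex else { break }
            precondition(count > 0, "each count must be positive")
            let end = index(start, offsetBy: count, limitedBy: endIndex) ?? endIndex
            result.append(self[start..<end])
            start = end
        }
        return result
    }
}

/// Hypothetical helper: repeats `pattern` forever using only the standard library.
func cycledCounts(_ pattern: [Int]) -> UnfoldSequence<Int, Int> {
    precondition(!pattern.isEmpty, "pattern must not be empty")
    return sequence(state: 0) { (i: inout Int) -> Int? in
        let value = pattern[i]
        i = (i + 1) % pattern.count
        return value
    }
}

let truncated = "abcdefg".chunks(ofCounts: [3, 1])
// ["abc", "d"]
let repeated = "abcdefg".chunks(ofCounts: cycledCounts([3, 1]))
// ["abc", "d", "efg"]
```

The caller opts into cycling explicitly, so the method name would not need to mention it.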

Ooh, yeah, truncating after the counts are exhausted is a good option I missed. And SwiftAlgorithms.Cycle makes cycling super easy.

I originally considered truncating after exhausting the counts, but that seems to go against the general principle of the chunk* methods. The documented distinction between split() and chunked() is that chunked().joined() is guaranteed to return all the same elements. Stopping when the series ends would break that guarantee.
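
To illustrate the guarantee with the "abcdefg" and [3, 1] example, here are the two candidate results written out by hand (no chunking API is called, so nothing here depends on how the method ends up being implemented):

```swift
let input = "abcdefg"

// Cycling through [3, 1] until the input ends.
let cyclingChunks = ["abc", "d", "efg"]

// Stopping once [3, 1] is exhausted, in the style of prefix(_:).
let truncatedChunks = ["abc", "d"]

// Joining the cycling result reproduces the input.
let cyclingRoundTrips = cyclingChunks.joined() == input      // true

// Joining the truncated result does not: "efg" is lost.
let truncatedRoundTrips = truncatedChunks.joined() == input  // false
```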

(In the particular case of the UUID example I gave earlier, stopping early would be just fine. But it does seem to introduce inconsistency.)

Yesssss, thanks to the developers!
