Using ClosedRange<Int> as a Dictionary Key

toph42 · July 31, 2023, 5:09pm

Imagine a table of results based on ranges of inputs, such as the following:

Input	Result
1-4	Foo
5-8	Bar
9-10	Bad

I thought I could model this as a Swift Dictionary with the [ClosedRange<Int>:String] type and find the corresponding result from an Int, but that’s not doing what I wanted because ClosedRange<Int> ≠ Int.

let table: [ClosedRange<Int>:String] = [1...4:"Foo", 5...8:"Bar", 9…10:"Baz"]
let one = table[1] //Error: No exact matches in call to subscript.
let oneAsRange = table[1…1] //No error, but is nil
let fullKey = table[1…4] //"Foo" as expected, but I want to find the range key that my single input falls into

So I know I probably need to extend Dictionary where Key == ClosedRange<T> to add a new subscript that takes a T and returns a Value? for the key that contains the T, if any (or perhaps returns [Value] since there could be overlapping ranges that each contain the T and [] is just as useful as nil here), but I’m not quite sure how to tackle this.

toph42 · July 31, 2023, 5:37pm

I found I can use the following, but I’m thinking it may be better to get all, just in case there is an overlap (there shouldn’t be, in my case, but out of principle):

table.first(where: {$0.key.contains(1)})?.value //Optional("Foo")

Karl · July 31, 2023, 5:42pm

Sounds like a Dictionary is not the optimal data structure for your problem.

I created a data structure which you might find useful: the SegmentedLine. You'll also need the binary search function from here, but otherwise everything is in that one file.

I suggested adding it to swift-collection because I find it's not entirely uncommon that I need this kind of thing. I use it for Unicode databases - mapValues followed by combineSegments is really effective at generating simplified, compacted data.

A SegmentedLine is a one-dimensional space, where every location is assigned a value.

SegmentedLine is effective when entire regions are assigned the same value.
For example, we can build a simple number line to tag ranges of integers; in this case,
we're tagging each range with an optional string.
var line = SegmentedLine<Int, String?>(bounds: 0..<100, value: nil)

// After setting values <5 to "small" and values >10 to "large",
// the gap is left with its previous value, "medium".

line.set(0..<20,  to: "medium")
line.set(0..<5,   to: "small")
line.set(10..<60, to: "large")
print(line)
// | [0..<5]: "small" | [5..<10]: "medium" | [10..<60]: "large" | [60..<100]: nil |
The locations on a SegmentedLine do not have to be integers - they can be any Comparable type,
including dates, strings, Unicode scalars (for building character sets), or Collection indexes.

In the latter case, we can model a Collection's elements as a line from its startIndex to its endIndex,
allowing us to annotate regions of any Collection. In a way, it can be used as a generalized AttributedString.
let string = "Bob is feeling great"

// Create a SegmentedLine for the collection's contents.
// Start by setting a font attribute over the entire string.

var tags = SegmentedLine(
  bounds: string.startIndex..<string.endIndex,
  value: [Font.custom("Comic Sans")] as [Any]
)

// Set each word to a different color.
// Use 'modify' to append the attribute, but only for the region
// we're modifying.

for word: Substring in string.split(separator: " ") {
  tags.modify(word.startIndex..<word.endIndex) { attributes in
    attributes.append(Color.random())
  }
}

// Check the result.
// - ✅ Every segment still contains the font attribute.
// - ✅ Each word also contains its individual color attribute.

for (range, attributes) in tags.segments {
  print(#""\#(string[range])""#, "-", attributes)
}

// "Bob"     [Font.custom("Comic Sans"), Color.orange]
// " "       [Font.custom("Comic Sans")]
// "is"      [Font.custom("Comic Sans"), Color.green]
// " "       [Font.custom("Comic Sans")]
// "feeling" [Font.custom("Comic Sans"), Color.pink]
// " "       [Font.custom("Comic Sans")]
// "great"   [Font.custom("Comic Sans"), Color.yellow]

toph42 · July 31, 2023, 6:18pm

Thanks! This is very cool. I’m definitely going to check this out.

In the mean time, though, since this is just for a hobby project I’m just going to use the filter I found on Dictionary and extend it to make it do what I wanted:

table.filter({$0.key.contains(1)}).values

So I wrote this extension to add a subscript:

extension Dictionary where Key == ClosedRange<Int> {
    subscript(whereKeyContains subkey: Int) -> Dictionary.Values {
        return self.filter({$0.key.contains(subkey)}).values
    }
}

With that extension now I can just do this:

table[whereKeyContains: 1] //["Foo"]

One strange thing doing this, I expected I could generalize this further by changing the Int to ClosedRange’s associatedType Bounds but it wouldn’t let me, even when I added ,Bounds: Comparable to the where clause.

AlexanderM · July 31, 2023, 6:21pm

IDK how much performance matters, but in that case, using KeyValuePairs might be more optimal (the keys would be faster to iterate), and it removes the confusion around how it's expected to be used

toph42 · July 31, 2023, 6:25pm

It matters little here, but most of my hobby projects are so I can learn new things, so I would like to understand how that’s better for performance.

Edit: Oh! KeyValuePairs isn’t another way to look into the Dictionary, it’s another type altogether!

AlexanderM · July 31, 2023, 6:29pm

It depends on the exact structure of Dictionary's hash table, but if the keys aren't stored contiguously in memory (e.g. if separate chaining is used, which it's probably not, open addressing is more popular these days), then iterating over them can involve indirection and pointer chasing, which is worse for CPU cache performance.

KeyValuePairs just puts the keys in a contiguous array at a consistent stride, which is a much more predictable memory access pattern for CPUs to predict and prefetch

toph42 · July 31, 2023, 6:31pm

At this point everything that’s going into these data structures will be DictionaryLiteral constants, so this does seem like a great way to go here.

AlexanderM · July 31, 2023, 6:34pm

It might not be a good choice ultimately, but just FYI, KeyValuePairs conforms to ExpressibleByDictionaryLiteral, so you can write code like:

let looksLikeADict: KeyValuePairs<ClosedRange<Int>, String> = [
    1...4: "Foo",
    5...8: "Bar",
    9...10: "Baz",
]

toph42 · July 31, 2023, 6:37pm

Some operations that are efficient on a dictionary are slower when using KeyValuePairs . In particular, to find the value matching a key, you must search through every element of the collection.

The documentation indicates that perhaps this isn’t the best case for me.

AlexanderM · July 31, 2023, 6:38pm

you must search through every element of the collection.

That's precisely what code like this:

is already doing.

Dictionary would only be able to help accelerate lookups by exact key matches (in your example, searching with ClosedRange<Int>, not individual Ints).

tera · July 31, 2023, 8:52pm

I'd also consider a simpler:

switch x { case 1...4: "Foo"; case 5...8: "Bar"; case 9...10: "Baz"; default: fatalError() }