`RangeExpression`s with exclusive lower bounds

mattpolzin · June 28, 2019, 5:16am

I searched a fair bit of forum history and did not see previous discussion on this topic; apologies if I missed it.

Swift currently has a type and an elegant shorthand for any range with either no lower bound (..., ..<3, etc.) or an inclusive lower bound (1..., 1...5, etc.). It does not yet have a concise way to write ranges with exclusive lower bounds.

The range of integers greater than 1 and less than 5 can be written 2..<5 or (1+1)..<5, but neither is especially obvious, and with other Bound types things get even messier. For Double, you might write 1.0.nextUp..<5, but this is not intuitive, nor does it resemble how ranges with the same semantics are written for other Bound types.
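To make the workarounds above concrete, here is a minimal sketch using only existing standard library operators. The variable names are illustrative.

```swift
// Integers: the exclusive lower bound 1 has to be rewritten as the inclusive bound 2.
let ints = 2 ..< 5

// Double: nextUp is the next representable value above 1.0, so the literal 1.0
// still appears in the code but the spelling differs from the integer case.
let doubles = 1.0.nextUp ..< 5

print(ints.contains(1))       // false
print(ints.contains(2))       // true
print(doubles.contains(1.0))  // false
print(doubles.contains(1.5))  // true
```

Both ranges behave as intended, but neither spelling says "greater than 1" directly.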

I would love to see new RangeExpression types for ranges with exclusive lower bounds in the standard library and I personally find x<.., x<.<y, and x<..y to be intuitive ways to round off the range operators.

What do you all think? Am I missing a good reason to leave these last few range types out or is it just a matter of someone getting around to proposing and adding them?

Pampel · June 28, 2019, 11:30am

I've wondered the same thing, and like the proposal.

In some ways x>..y / x>.<y could be seen as nicer, but x<..y and x<.<y seem more correct from a mathematical point of view, so I'd vote for the latter.

xwu · June 28, 2019, 11:48am

This has come up before, but you’re right it’s hard to find on this list likely because it was so long ago.

One major barrier is that the obvious spellings (and almost all variations on them) are not valid in Swift, because operators in Swift that contain . must begin with .
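As a rough illustration of that rule, the sketch below declares a dot-leading operator, which the rule permits. The type LowerExclusiveRange and the spelling .<.< are hypothetical and chosen only for this example; neither is a proposed or existing standard library API.

```swift
// Hypothetical type, not part of the standard library:
// a range that excludes both its lower and its upper bound.
struct LowerExclusiveRange<Bound: Comparable> {
    let lowerBound: Bound  // excluded
    let upperBound: Bound  // excluded

    func contains(_ value: Bound) -> Bool {
        return lowerBound < value && value < upperBound
    }
}

// Permitted: the operator contains dots and also begins with one.
infix operator .<.< : RangeFormationPrecedence

// Not permitted under the rule described above, because the operator
// contains a dot but begins with "<":
// infix operator <.<

extension Comparable {
    // Hypothetical operator; no validation of the bounds is done in this sketch.
    static func .<.< (lhs: Self, rhs: Self) -> LowerExclusiveRange<Self> {
        return LowerExclusiveRange(lowerBound: lhs, upperBound: rhs)
    }
}

let open = 1.0 .<.< 5.0
print(open.contains(1.0))  // false
print(open.contains(1.5))  // true
print(open.contains(5.0))  // false
```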

Changing the rules to allow the desired operators is one approach but gets bogged down because in Swift any character must be used either for operators or for identifiers, and the divide among them is currently rather haphazard and in need of refinement. It’s a much larger topic to divide emoji and math symbols rationally among operator and identifier characters.

The alternative approach is to make these range operators a one-off exception, but that does complicate the language.

The question is whether such a complication can be justified to fill out the lesser used range types—they do have some role but, to be honest, several years after proposing to add them, I don’t miss them.

All of this is to say that there’s no point in bikeshedding the spelling here. What would progress the topic is coming up with compelling use cases. It’s not enough that it would “fill out” the ranges—what scenarios have you encountered where using such a range instead of the alternatives (manually comparing end points, for example) demonstrate that such ranges would make the language dramatically more expressive?

mattpolzin · June 28, 2019, 1:36pm

Thank you for all that context.

"The alternative approach is to make these range operators a one-off exception"

Although I am still interested in having the discussion, this is probably not a hill I am willing to die on.

I'd be interested to hear others' use-cases. My own use for having a complete set of range operators is to provide a concise non-lossy way to take advantage of ranges at API boundaries and when performing language-agnostic encodings.

Let's say (not so hypothetically) that I am both developing a Swift API and writing code that produces a JSON Schema specification. This specification uses 4 keywords (minimum, exclusiveMinimum, maximum, and exclusiveMaximum) to encode ranges into JSON. I would like to provide a Swifty initializer/interface for a type that encodes/decodes to that specification.

Without RangeExpression support, I find myself writing initializers like the following

init(minimum: (Double, exclusive: Bool)? = nil, maximum: (Double, exclusive: Bool)? = nil)

I actually find that fairly elegant, except when compared to

init(range: RangeExpression = ...)

Excuse the strawman syntax here, I am using an existential parameter that is not yet allowed as a stand-in for an implementation detail I am not putting up for discussion.

The problem with this more concise approach currently is that I have to choose between losing information (representing an exclusive lower bound by the next possible inclusive lower bound) or losing intuitive concision by providing a non-standard function and type as part of my API.

jrose · June 28, 2019, 3:52pm

Ignoring the spelling for a bit, what's the use case for exclusive-lower-bound ranges? I can see that it can be used sensibly with floating-point values (definitely), integers (sure, I guess), and indexes (questionable but implementable), but without a concrete use case I wouldn't necessarily want to commit to doing it.

mattpolzin · June 28, 2019, 4:46pm

My use-case is meta -- I am trying to write Swift that can properly produce schemas adhering to an existing specification (draft-wright-json-schema-validation-00) that allows for an exclusive minimum value to be defined. Note that I am writing a type that serializes to a schema following the rules of the JSON Schema specification, not serializing values that need to be valid according to some existing schema (... ignoring the subtlety that the schemas my type serializes can indeed be validated against the JSON Schema meta-schema).

I could come up with a hypothetical reason for needing to say "provide my JSON API with a number strictly greater than x" but I don't have one readily available. I just know that I need to be able to represent that reality because it is stated as part of the specification I am following.

Assuming that we are not debating whether someone could possibly want to write a schema that said "provide me a floating-point value greater than 2.0," I lose information if I try to use Swift RangeExpression types.

Assume I am trying to decode the following schema (and I am going to ignore maximums to focus on the problem being discussed):

{ "type": "number", "minimum": 2.0, "exclusiveMinimum": true }

So I create a type like

struct NumberSchema {
    let range: PartialRangeFrom<Double>
}

and I write my own encoding and decoding functions. The best I can do when decoding the example schema above is to make the range equivalent to 2.0.nextUp.... This will work if I then go to validate some data that is supposed to fit the schema, but I run into problems on encoding because I have thrown away some information. I will produce something like

{ "type": "number", "minimum": 2.0000000000000004, "exclusiveMinimum": false }

This is not accurate, so using an existing RangeExpression type was a non-starter even though in every respect except for representing exclusive-lower-bound there exists a RangeExpression type that is a perfect fit.
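The round trip described above can be sketched as follows. MinimumKeywords, init(decoding:), and encoded are hypothetical names introduced for this illustration; they stand in for whatever encoding and decoding functions the real type would have.

```swift
// Hypothetical model of the two JSON Schema keywords under discussion.
struct MinimumKeywords: Equatable {
    var minimum: Double
    var exclusiveMinimum: Bool
}

struct NumberSchema {
    let range: PartialRangeFrom<Double>

    // Decoding: PartialRangeFrom is always inclusive, so an exclusive minimum
    // can only be stored as the next representable value above it.
    init(decoding keywords: MinimumKeywords) {
        let lower = keywords.exclusiveMinimum ? keywords.minimum.nextUp : keywords.minimum
        range = PartialRangeFrom(lower)
    }

    // Encoding: the range no longer records that the bound was exclusive.
    var encoded: MinimumKeywords {
        return MinimumKeywords(minimum: range.lowerBound, exclusiveMinimum: false)
    }
}

let original = MinimumKeywords(minimum: 2.0, exclusiveMinimum: true)
let roundTripped = NumberSchema(decoding: original).encoded

print(roundTripped.minimum)      // 2.0000000000000004
print(roundTripped == original)  // false
```

Validation against the decoded range still works; it is only the re-encoded schema that differs from the input.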

No big deal, because I can just make my Swift type more closely resemble the schema. However, now I want to provide the convenience of writing the schema in Swift in the first place using this type (for all the reasons I like writing anything in Swift).

The initializer for this NumberSchema type is worlds more concise if it can take advantage of the expressiveness of built-in RangeExpression operators (as I was describing in my previous comment in this thread).

mattpolzin · June 28, 2019, 5:00pm

For what it's worth, I am aware of how niche my use-case is. Nevertheless, it is definitely a real use-case; I was not contriving an example, but rather genuinely found myself wanting to provide a Swift interface I could not write.

jrose · June 28, 2019, 6:12pm

For what it's worth, you can still have a type that uses the existing operators for the common cases and has a less pretty form for the full case in the meantime.
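One way this suggestion might look, as a sketch only: the existing range types get convenience initializers, and the tuple-based initializer from earlier in the thread remains as the full form. The stored properties and the Limit typealias are assumptions made for this example.

```swift
struct NumberSchema {
    // A bound value plus whether the bound itself is excluded.
    typealias Limit = (Double, exclusive: Bool)

    let minimum: Limit?
    let maximum: Limit?

    // Full form: currently the only way to express an exclusive minimum.
    init(minimum: Limit? = nil, maximum: Limit? = nil) {
        self.minimum = minimum
        self.maximum = maximum
    }

    // Common cases, written with the existing range operators.
    init(range: ClosedRange<Double>) {
        self.init(minimum: (range.lowerBound, exclusive: false),
                  maximum: (range.upperBound, exclusive: false))
    }

    init(range: Range<Double>) {
        self.init(minimum: (range.lowerBound, exclusive: false),
                  maximum: (range.upperBound, exclusive: true))
    }

    init(range: PartialRangeFrom<Double>) {
        self.init(minimum: (range.lowerBound, exclusive: false))
    }

    init(range: PartialRangeThrough<Double>) {
        self.init(maximum: (range.upperBound, exclusive: false))
    }

    init(range: PartialRangeUpTo<Double>) {
        self.init(maximum: (range.upperBound, exclusive: true))
    }
}

let closed = NumberSchema(range: 2.0...5.0)
let halfOpen = NumberSchema(range: 2.0..<5.0)
let exclusiveMinimum = NumberSchema(minimum: (2.0, exclusive: true))
```

The trade-off is the one discussed above: the last call is correct but does not read like the other two.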

SDGGiesbrecht · June 28, 2019, 9:22pm

Some possible workarounds in the event ease of reading is significantly more important than ease of input:

// U+2024 (One dot leader)
// Warning: Would break if NFKD or NFKC were applied.
infix operator <․․
infix operator <․<

// U+2027 (Hyphenation point)
infix operator <‧‧
infix operator <‧<

// U+22C5 (Dot operator)
infix operator <⋅⋅
infix operator <⋅<

// Or completely overhauling to mimic real mathematical notation:

// U+2212 (minus sign) [as the line segment]
// U+2219 (bullet operator) [as a closed endpoint]
// U+2218 (ring operator) [as an open endpoint]
infix operator ∙−∙
infix operator ∘−∙
infix operator ∙−∘
infix operator ∘−∘

// U+00D7 (multiplication sign) [standing in for the variable “x”]
// U+2264 (less‐than or equal to)
// U+003C (less‐than)
infix operator ≤×≤
infix operator <×≤
infix operator ≤×<
infix operator <×<

mattpolzin · June 29, 2019, 12:01am

Thanks for the suggestions. I'd say for me in this case, "intuitive entry" wins out over "ease of reading." Ideally that would mean the user can specify any possible range using familiar standardized operators, but barring that I think the consistent, albeit much less concise, approach is likely going to be my preference.

[EDIT] that being said, I want to want to use ∙−∙ and its relatives for how visually clean the result is!

CTMacUser · June 29, 2019, 6:30am

An other-side half-open range came up on discussions of linked lists. @tayyab first suggested >**, which I ran with.

GreatApe · June 29, 2019, 2:27pm

I think this operator makes total sense: since you can make the upper bound exclusive, why shouldn’t you be able to do the same to the lower bound? After all, in most (if not all) cases a ... operator would be enough for the upper bound too, just less clear. It is similarly less clear at the lower bound when, as the OP mentions, you want an open range greater than 1 but there is no 1 in the code.

The only reason I can see that we only have an exclusive version for the upper bounds is that you’ll often use this with the count of an array as upper bounds, and they have 0 based indexing.

I think symmetry is a very strong argument here, especially since there won’t be any clashes with existing operators or source breaking changes.

Lantua · June 29, 2019, 3:00pm

A small digression that mixing count with Index is a little dangerous. It works with Array and only Array (not even ArraySlice). Unless you're sure that you're dealing with Array, now and in the future, it's better to use startIndex and endIndex when subscripting.

The point remains that ..< exists because of startIndex and endIndex, though it applies more strongly to the exclusive upper bound, because of the asymmetry between startIndex (a valid index) and endIndex (one past the last valid index).

GreatApe · June 29, 2019, 4:00pm

True. Actually my example wasn’t very good. What I think people mostly use Array.count for, and I certainly do, is not in ranges but in comparisons, such as if i < views.count. Even then you should probably prefer views.indices.contains(i), but still.

xwu · June 29, 2019, 5:15pm

Again, these are not interchangeable for ArraySlice or many other types. If i is an index then the first comparison is incorrect, and if it is an offset from the first element then the second comparison is incorrect. It will work for an Array but, the moment you slice one, this becomes a problem.
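A small sketch of the difference, using a hypothetical array named views as in the post above:

```swift
let views = ["a", "b", "c", "d", "e"]
let slice = views[2...]  // ArraySlice that keeps the array's indices: 2, 3, 4

let i = 1

// For the Array, both checks agree.
print(i < views.count)            // true
print(views.indices.contains(i))  // true

// For the slice, they disagree.
print(slice.startIndex)           // 2
print(i < slice.count)            // true
print(slice.indices.contains(i))  // false
```

If i is meant as an index, the count comparison wrongly accepts it for the slice, since slice[1] is out of bounds. If i is meant as an offset from the first element, the indices check wrongly rejects it, since the element at offset 1 exists.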

GreatApe · June 29, 2019, 5:32pm

Well I am explicitly talking about arrays here. Plus my point was in support of what you are saying now.

idrougge · June 30, 2019, 12:06am

While at first I thought "why not just transform 1<..<10 into 2..<10?", I realised that you have a use case, and that one could just as well argue that 2..<10 may be replaced with 2...9.
This is useful (for some) without adding complexity to the language. If you already understand ..<, you don't need to consult any manual to understand <..<.

CTMacUser · June 30, 2019, 5:39am

That's trivial for Strideable types with an integer stride, but not so much for other types.
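A sketch of why the transformation is easy in the integer-stride case: the helper below, whose name is made up for this example, is constrained so that "the next value" is well defined for every conforming type. Double is Strideable but its Stride is Double, so the helper does not apply to it.

```swift
// Hypothetical helper: rewrites an exclusive lower bound as an inclusive one.
// Only available when the stride is an integer, so advancing by 1 yields
// the immediate successor of the lower bound.
func rangeExcludingLowerBound<Bound: Strideable>(
    _ lower: Bound, upTo upper: Bound
) -> Range<Bound> where Bound.Stride: SignedInteger {
    precondition(lower < upper, "lower bound must be less than upper bound")
    return lower.advanced(by: 1) ..< upper
}

let r = rangeExcludingLowerBound(1, upTo: 10)
print(r == 2 ..< 10)  // true

// rangeExcludingLowerBound(1.0, upTo: 10.0) does not compile,
// because Double.Stride is Double rather than a SignedInteger.
```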

GreatApe · July 1, 2019, 2:49pm

Is there any inherent difference between the lower bound and the upper bound, apart from how they are commonly used? I mean, why would it be harder to exclude the lower bound than the upper, which already works?

Unless there are strong technical reasons, I think the absence of <.. and <.< is quite odd, it's as if we had < but not >.

Lantua · July 1, 2019, 3:00pm

It's mentioned earlier that operators that contain dots (.) must begin with a dot.