Same thing happened to me. In this very same thread!
FWIW, it occurred to me that there's a general problem in implementations like yours. For Collection, you cannot index relative to .endIndex:
https://developer.apple.com/documentation/swift/collection/2949866-index
The offsetBy: parameter cannot be negative unless it's a BidirectionalCollection.
As I’ve shown above, you can implement it on top of sequence operations. That only means that slicing relative to endIndex has different performance characteristics depending on the underlying Collection. There is no problem.
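A rough sketch of that point, assuming a hypothetical helper named index(fromEnd:) (not a standard-library method): a forward-only Collection can still reach a position relative to endIndex by walking forward from startIndex, at the cost of computing count first.

```swift
extension Collection {
    // Hypothetical helper: the index `distance` elements before `endIndex`.
    // It only ever moves forward, so it does not need BidirectionalCollection.
    // `count` is O(n) unless the collection is random-access.
    func index(fromEnd distance: Int) -> Index {
        precondition(distance >= 0 && distance <= count, "offset out of bounds")
        return index(startIndex, offsetBy: count - distance)
    }
}

// AnyCollection erases Array down to a forward-only Collection.
let forwardOnly = AnyCollection([1, 2, 3, 4, 5])
let tail = Array(forwardOnly[forwardOnly.index(fromEnd: 2)...])
```

The semantics are the same for every Collection; only the cost differs.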
We have to make all this work on Sequence too! The slicing is not only for Collections.
Indeed, but it's not obvious that the Sequence solution should be a subscript rather than a function.
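For illustration only, here is what the function flavour could look like on a plain Sequence, built from the existing dropFirst and dropLast operations. The method name slice(skippingFirst:skippingLast:) is made up for this sketch.

```swift
extension Sequence {
    // Hypothetical function-style "slice" for any finite Sequence:
    // skip `n` elements from the front and `m` elements from the back.
    func slice(skippingFirst n: Int, skippingLast m: Int) -> [Element] {
        return Array(dropFirst(n).dropLast(m))
    }
}

// StrideThrough is a Sequence but not a Collection.
let oneToFive = stride(from: 1, through: 5, by: 1)
let middle = oneToFive.slice(skippingFirst: 1, skippingLast: 2)
```

Because a Sequence has no indices, the result here is an Array rather than a SubSequence, which is one reason a subscript may not be the obvious spelling.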
I'm still playing around, looking for a more obvious syntax that might make everyone happy without being too obscure. I'm wondering if the fault isn't in trying to use the range operators, which look terrible when juxtaposed with syntax that means offsetting. Perhaps a two-parameter subscript would be clearer:
c [offset-expression, offset-expression]
// for example:
c [2, -3] // meaning of '-3' still obscure
// or even:
c [start: 2, end: -3]
c [start: 2, start: 4]
c [end: -5, end: 0]
or something along that line.
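To make the start/end variant concrete, a minimal sketch, assuming a labelled subscript(start:end:) that does not exist in the standard library. It covers only that one label combination and requires BidirectionalCollection because of the negative offset.

```swift
extension BidirectionalCollection {
    // Hypothetical two-parameter subscript: `start` is a non-negative offset
    // from startIndex, `end` is a non-positive offset from endIndex.
    subscript(start lower: Int, end upper: Int) -> SubSequence {
        precondition(lower >= 0, "start offset must not be negative")
        precondition(upper <= 0, "end offset must not be positive")
        let from = index(startIndex, offsetBy: lower)
        let to = index(endIndex, offsetBy: upper)
        return self[from ..< to]
    }
}

let digits = [1, 2, 3, 4, 5, 6, 7]
let inner = Array(digits[start: 2, end: -3])
```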
Is this a typo or does it mean "from start+2 to start+4" ?
It was intended to mean “from start+2 to start+4”. I know it's not good, but we haven't found good yet.
Just checking.
Side note, one issue I have with any Swift ranges is that they are "start/end" based with no option for "start/length". For example, a language which shall not be named supports the following:
s(x : l)
where x is the starting position and l is the length of the "range". Whatever the solution to this issue ends up being, I'd be interested to see support for this "paradigm".
This seems like another case that would be well served by a ..<+ operator hack:
infix operator ..<+
extension Strideable {
    static func ..<+ (lhs: Self, rhs: Stride) -> Range<Self> {
        return lhs..<lhs.advanced(by: rhs)
    }
}
let r = 5..<+10
// r = 5..<15
I don't hate ..<+. Range issues that haven't really been fully addressed:
- open and closed for both ends: (2, 5) which is 2 ... 5, (2, 5] 2 ..< 5, [2, 5) 2 <.. 5, [2, 5], 2 <.< 5 or 2 <..<5 (Note Swift has leading period rules that are broken here)
- a better defined conception of to, through, and (I once proposed) towards to indicate up to but not including, up to and including, up to and including or past
- vector style ranges, with start, magnitude, and direction such as 5 +.. 5 or 5 ..+ 5 or 5 ...+ 5 or 5 ..< +5 or 5 ..< ++5
- ranges relative to known indices ((startIndex, endIndex]), allowing 5 ... -5 (which Wux has already prototyped, and is the core discussion of this thread)
- ranges representing a negative vector like for i in 2 downTo -5 or 2 ... --5 or 2 ..> --6 or just 2 ..> -6 or 2 ..< -6
Just throwing those out there.
-- E
The objection is going to be:
5..< +10
which means something else.
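A small sketch of why the spacing matters, reusing the ..<+ operator proposed earlier in the thread. The constant names are mine. Swift decides whether an operator is infix, prefix, or postfix partly from the whitespace around it, so the two spellings below are different expressions.

```swift
infix operator ..<+

extension Strideable {
    // Start/length range from the thread: lhs ..<+ n means lhs ..< lhs + n.
    static func ..<+ (lhs: Self, rhs: Stride) -> Range<Self> {
        return lhs ..< lhs.advanced(by: rhs)
    }
}

// No whitespace: the lexer reads `..<+` as a single infix operator.
let startAndLength: Range<Int> = 5..<+10

// Whitespace on both sides of `..<`: this is the standard half-open range
// operator applied to 5 and the prefix-plus expression +10.
let startAndEnd: Range<Int> = 5 ..< +10
```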
I stared at the design issue for a few hours yesterday, and concluded that, once you have ... or ..< along with + and - for offsetting, and [ and ] for subscripting, any other symbols render the entire thing inscrutable (and using any unary operators is toxic because of spacing issues).
To make sense, it needs alphanumeric-ish symbols for "start" and "end", as basically placeholders for the operator machinery to work on. So, something like this:
x [start+1..<end-3]
or:
x [startIndex+1..<endIndex-3]
is about as complex as you want to get. This could be done by sacrificing such symbols from the global namespace, but that seems like a lousy idea.
What's really needed is context-sensitive macros, where perhaps something along this line might be possible:
x [#start+1..<#end-3]
(except, of course, that no one knows what macros might look like, yet).
If we wanted to raise everything that’s not been addressed yet…
Python’s slice syntax [start:end] has an extended form [start:end:step], which makes it more akin to stride. But I think we should keep slicing (continuous intervals) distinct from striding. I.e. do not burden ourselves with striding in this design.
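As a quick illustration of the distinction, using only existing standard-library API (the variable names are mine): a slice is a contiguous view into the collection, while striding visits selected positions and produces a new sequence.

```swift
let values = Array(0 ..< 10)

// Slicing: a contiguous interval of the collection, sharing its indices.
let contiguous = values[2 ..< 8]

// Striding: every second position in the same interval, collected into a new array.
let everySecond = stride(from: 2, to: 8, by: 2).map { values[$0] }
```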
Same goes for the distinction between open, half-open and closed intervals. Swift covers that with stride. But the collections are built on half-open intervals: startIndex is in, endIndex is outside. These semantics also hold for slicing operations defined on Sequence. I’d therefore argue to exclude this from our design.
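A short sketch of both halves of that argument, again with standard-library calls only and names of my own choosing: stride already offers the to/through distinction, and a collection's full extent is the half-open range from startIndex up to endIndex.

```swift
// `to:` excludes the end value, `through:` includes it.
let upTo = Array(stride(from: 0, to: 4, by: 1))
let through = Array(stride(from: 0, through: 4, by: 1))

// A collection's own bounds are half-open: endIndex is one past the last element.
let letters = ["a", "b", "c", "d"]
let whole = letters[letters.startIndex ..< letters.endIndex]
```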
I think there are a few points to be considered given the discussion until now:
Debate on implicit vs. explicit relative bounds
Basically whether you can use 1 ..< -2 or need to do .startIndex + 1 ..< .endIndex - 2. I think that allowing implicit offset anchors, apart from being something you can't easily guess if you didn't already know about it (also from other languages), also makes things more confusing on the left-hand side: even if Index == Int, 1 is often not the same as .startIndex + 1, on purpose. With the implicit anchor, what would happen to that distinction?
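The ArraySlice case shows the distinction in a few lines. This is plain standard-library behaviour; only the variable names are mine.

```swift
let array = [10, 20, 30, 40]
let slice = array.dropFirst()   // ArraySlice that keeps the array's indices

// slice.startIndex is 1, not 0, so the literal index 1 is the first element,
// while startIndex + 1 is the second element.
let atLiteralOne = slice[1]
let atStartPlusOne = slice[slice.startIndex + 1]
```

An implicit anchor would have to pick one of these two readings for a bare 1.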
Using KeyPaths for anchors
I think the KeyPath idea is very cool and intuitive. The issue I have with it, though, is that it might be too free: you can't limit it to \.startIndex and \.endIndex. How would you compare two KeyPath anchors then?
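To illustrate the "too free" concern with existing API only (the names below are mine): any Int-valued property of the collection has the same key-path type as the two intended anchors, so the type system cannot tell them apart.

```swift
let numbers = [4, 8, 15]

// All three key paths share one type, so a parameter typed as
// KeyPath<[Int], Int> would accept \.count as readily as the two anchors.
let anchors: [KeyPath<[Int], Int>] = [\.startIndex, \.endIndex, \.count]
let resolved = anchors.map { numbers[keyPath: $0] }
```

Key paths are Hashable, so two anchors can be tested for equality, but as far as I know there is no ordering between them, which is the comparison problem raised above.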
Syntax vs common range subscript
The issue here is if the syntax is different enough to expose the potential weight of the offsetting. I'm unsure on this one, and don't mind too much either way. I suppose it depends a lot on the actual syntax of the bounds of it (e.g. implicit or explicit) as that would help signalling it as well.
Hopping on RangeExpression
Ideally that should be true. Though I find it hard to fit implementation-wise, as the ranges already conform where Bound == C.Index, and without having generic protocols you cannot conform in multiple ways. So you either make the existing ranges work with the anchor type, make new ranges, or change RangeExpression. I tried the first but the others are to be evaluated as well.
(Yet another) shot at an implementation example, more for testing out the syntax, is here. Of course this is highly inspired by the others that were posted in the thread.
struct SliceBound<Bound> where Bound: Comparable {
    // `indirect` as a workaround for "cyclic metadata dependency detected, aborting"
    // Cannot use `KeyPath`-based approach because it's not limited to start/end, and needs a reference collection
    indirect enum Base {
        case start
        case end
        case absolute(Bound)
    }
    let base: Base
    let offset: Int
    init(base: Base, offset: Int) {
        // A bound anchored at `.start` may only be offset forwards
        precondition({ if case .start = base { return false } else { return true } }() || offset >= 0)
        // A bound anchored at `.end` may only be offset backwards
        precondition({ if case .end = base { return false } else { return true } }() || offset <= 0)
        self.base = base
        self.offset = offset
    }
    static var start: SliceBound {
        return .init(base: .start, offset: 0)
    }
    static var end: SliceBound {
        return .init(base: .end, offset: 0)
    }
}
func + <Bound> (_ base: SliceBound<Bound>.Base, _ offset: Int) -> SliceBound<Bound> {
    return .init(base: base, offset: offset)
}
func - <Bound> (_ base: SliceBound<Bound>.Base, _ offset: Int) -> SliceBound<Bound> {
    return base + -offset
}
func + <Bound> (_ base: Bound, _ offset: Int) -> SliceBound<Bound> {
    return .init(base: .absolute(base), offset: offset)
}
func - <Bound> (_ base: Bound, _ offset: Int) -> SliceBound<Bound> {
    return base + -offset
}
extension SliceBound: Equatable {
    static func == (_ lhs: SliceBound, _ rhs: SliceBound) -> Bool {
        switch (lhs.base, rhs.base) {
        case (.start, .start),
             (.end, .end):
            return lhs.offset == rhs.offset
        case let (.absolute(lBase), .absolute(rBase)):
            // Can be specialised `where Bound: _Strideable`
            return lBase == rBase && lhs.offset == rhs.offset
        default:
            return false
        }
    }
}
extension SliceBound: Comparable {
    static func < (_ lhs: SliceBound, _ rhs: SliceBound) -> Bool {
        switch (lhs.base, rhs.base) {
        case (.start, .absolute),
             (.start, .end),
             (.absolute, .end):
            return true
        case (.absolute, .start),
             (.end, .start),
             (.end, .absolute):
            return false
        case (.start, .start),
             (.end, .end):
            return lhs.offset < rhs.offset
        case let (.absolute(lBase), .absolute(rBase)):
            // Can be specialised `where Bound: _Strideable`
            return (lBase, lhs.offset) < (rBase, rhs.offset)
        }
    }
}
extension SliceBound {
    func relative <C> (to collection: C) -> C.Index where C: Collection, C.Index == Bound, C.IndexDistance == Int {
        let index: C.Index = {
            switch base {
            case .start:
                return collection.startIndex
            case .absolute(let bound):
                return bound
            case .end:
                return collection.endIndex
            }
        }()
        return collection.index(index, offsetBy: offset)
    }
}
// Cannot implement `RangeExpression`. Ranges already conform when `Bound == Index`
extension Collection where IndexDistance == Int {
    subscript(relative bounds: Range<SliceBound<Index>>) -> SubSequence {
        let range = bounds.lowerBound.relative(to: self) ..< bounds.upperBound.relative(to: self)
        return self[range.relative(to: self)]
    }
    subscript(relative bounds: PartialRangeFrom<SliceBound<Index>>) -> SubSequence {
        let range = bounds.lowerBound.relative(to: self)...
        return self[range.relative(to: self)]
    }
    // TODO: Implement others
}
extension Comparable {
    static func ..< (_ lhs: Self, _ rhs: SliceBound<Self>) -> Range<SliceBound<Self>> {
        return (lhs + 0) ..< rhs
    }
    // TODO: Implement others
}
let c = 1 ... 5
let i = c.index(after: c.startIndex)
c[relative: i ..< (.end - 2)] // [2, 3]
c[relative: (.start + 1) ..< (.end - 2)] // [2, 3]
let arr = Array(c)
arr[relative: 1 ..< (.end - 2)] // [2, 3]
arr[relative: (.start + 1) ..< (.end - 2)] // [2, 3]
let slice = arr.dropFirst()
slice[relative: 1 ..< (.end - 2)] // [2, 3]
slice[relative: (.start + 1) ..< (.end - 2)] // [3]
slice[relative: (.end - 2)...] // [4, 5]
What about ..+< instead (two dots; the forum is converting this to 3?)
5..+<10
Don't love it, but...
It sounds like we are really trying to create an IndexExpression concept (which would interface with RangeExpression). Why not just do that explicitly?
This could either be implemented by static methods on the IndexExpression protocol (e.g. .start, .start(plus: 3)) or some sort of prefix/postfix operator (strawman: |.. and ..|). You could even just define operators between Int and an existing index expression (e.g. .start + 2). The resulting type would be able to return an index when given a concrete Collection.
The advantage is that you could then still use these in places that take an index instead of a range...
c[.end] //c.last!
c[.end - 2] //c[c.index(c.endIndex, offsetBy: -2)]
and of course the range stuff would work as well:
c[(.start + 2)...(.end - 3)]
c[(.end - 7)...(.end - 4)]
One other nice property is that you could actually store it in a variable:
var s:IndexExpression<IndexType> = .start + 2
c[s...(s+7)]
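One possible shape for this idea, as a sketch rather than a proposal. Everything here is hypothetical: the concrete IndexExpression struct (instead of a protocol), the resolve(in:) method, and the at: and from:to: subscript labels are assumptions made to keep the example self-contained. In this sketch .end resolves to endIndex itself, so it is not a valid position for element access.

```swift
// Hypothetical value type: an anchor plus an integer offset.
struct IndexExpression {
    enum Anchor { case start, end }
    let anchor: Anchor
    let offset: Int

    static let start = IndexExpression(anchor: .start, offset: 0)
    static let end = IndexExpression(anchor: .end, offset: 0)

    // Reify the expression against a concrete collection.
    func resolve<C: BidirectionalCollection>(in c: C) -> C.Index {
        switch anchor {
        case .start: return c.index(c.startIndex, offsetBy: offset)
        case .end: return c.index(c.endIndex, offsetBy: offset)
        }
    }
}

func + (lhs: IndexExpression, rhs: Int) -> IndexExpression {
    return IndexExpression(anchor: lhs.anchor, offset: lhs.offset + rhs)
}

func - (lhs: IndexExpression, rhs: Int) -> IndexExpression {
    return lhs + -rhs
}

extension BidirectionalCollection {
    // Single-element access through an index expression.
    subscript(at expression: IndexExpression) -> Element {
        return self[expression.resolve(in: self)]
    }
    // Half-open slice between two index expressions.
    subscript(from lower: IndexExpression, to upper: IndexExpression) -> SubSequence {
        return self[lower.resolve(in: self) ..< upper.resolve(in: self)]
    }
}

let sample = [10, 20, 30, 40, 50]
let stored = IndexExpression.start + 1   // can be kept in a variable and reused
let secondToLast = sample[at: IndexExpression.end - 2]
let interior = Array(sample[from: stored, to: IndexExpression.end - 1])
```

The same stored expression can be resolved against any other collection, which is the property pointed out above.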
I think you're right, that an IndexExpression protocol would make more sense than trying to shoehorn these concepts into RangeExpression. RangeExpression could then be a refinement of IndexExpression, so that IndexExpression only has the relative(to:) operation for reifying the expression on a collection, and RangeExpression has the contains behavior for pattern matching. We could then generalize the APIs that currently take RangeExpressions for collections to accept any IndexExpression.
.end seems like a hard sell because it's very different to endIndex (which is the index past the last element). I'm not sure what the solution is here, though. .first and .last could possibly be confused with the properties when you start adding and subtracting numbers, especially when the collection contains integers.
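Both concerns can be seen with existing API (variable names are mine): endIndex is one past the last element, and arithmetic on the last property is arithmetic on the element, not on its position.

```swift
let items = [10, 20, 30]

// endIndex is not a valid subscript; the last element sits one step before it.
let lastIndex = items.index(before: items.endIndex)

// With integer elements, "last minus one" already has a meaning:
// it subtracts from the element value rather than stepping back one position.
let elementArithmetic = items.last! - 1
let previousElement = items[items.index(before: lastIndex)]
```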
I think for index offsetting operations the most meaningful operators would be "<<" and ">>". The code would look like this:
c[.end] //c.last!
c[.end << 2] //c[c.index(c.endIndex, offsetBy: -2)]
c[(.start >> 2)...(.end << 3)]
c[(.end << 7)...(.end << 4)]
This looks pretty nice!
I prefer + and - over >> and << ...
I think for index offsetting operations the code will look like this:
c[.end] //c.last!
c[.end - 2] //c[c.index(c.endIndex, offsetBy: -2)]
c[(.start + 2)...(.end - 3)]
c[(.end - 7)...(.end - 4)]