Shorthand for Offsetting startIndex and endIndex

Letan · March 2, 2018, 12:59am

Same thing happened to me. In this very same thread!

QuinceyMorris · March 2, 2018, 11:41pm

FWIW, it occurred to me that there's a general problem in implementations like yours. For Collection, you cannot index relative to .endIndex:

https://developer.apple.com/documentation/swift/collection/2949866-index

The offsetBy: parameter cannot be negative unless it's a BidirectionalCollection.

palimondo · March 3, 2018, 9:44pm

As I’ve shown above, you can implement it on top of sequence operations. That only means that slicing relative to endIndex has different performance characteristics depending on the underlying Collection. There is no problem.

We have to make all this work on Sequence too! The slicing is not only for Collections.
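To make the point above concrete, here is a minimal sketch of slicing relative to the end without ever passing a negative offset. The helper name index(beforeEndBy:) is hypothetical and not part of the standard library; AnyCollection is used only because it is a forward-only Collection with a predictable element order.

```swift
extension Collection {
    /// Hypothetical helper: the index `distance` elements before `endIndex`,
    /// found by walking forward from `startIndex` only.
    /// `count` is O(n) unless the collection is random-access.
    func index(beforeEndBy distance: Int) -> Index {
        precondition(distance >= 0 && distance <= count, "offset out of bounds")
        return index(startIndex, offsetBy: count - distance)
    }
}

// AnyCollection is a Collection but not a BidirectionalCollection.
let forwardOnly = AnyCollection([1, 2, 3, 4, 5])

let from = forwardOnly.index(forwardOnly.startIndex, offsetBy: 1)
let to = forwardOnly.index(beforeEndBy: 2)
let middle = Array(forwardOnly[from..<to])

// The same slice expressed with sequence-style operations.
let viaSequenceOps = Array(forwardOnly.dropFirst(1).dropLast(2))
```

Both forms produce the same elements; the cost is an extra pass over the collection when it is not random-access.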

QuinceyMorris · March 3, 2018, 10:14pm

Indeed, but it's not obvious that the Sequence solution should be a subscript rather than a function.

I'm still playing around, looking for a more obvious syntax that might make everyone happy without being too obscure. I'm wondering if the fault isn't in trying to use the range operators, which look terrible when juxtaposed with syntax that means offsetting. Perhaps a two-parameter subscript would be clearer:

  c [offset-expression, offset-expression]
  // for example:
  c [2, -3] // meaning of '-3' still obscure
  // or even:
  c [start: 2, end: -3]
  c [start: 2, start: 4]
  c [end: -5, end: 0]

or something along that line.
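A rough sketch of how the labelled two-parameter form could be prototyped, assuming the subscript lives on BidirectionalCollection so that a negative offset from endIndex is allowed. Only the start:/end: pairing is shown; the parameter names and preconditions are illustrative choices, not something settled in this thread.

```swift
extension BidirectionalCollection {
    /// Hypothetical subscript: `lower` is an offset forward from `startIndex`,
    /// `upper` is a zero-or-negative offset back from `endIndex`.
    subscript(start lower: Int, end upper: Int) -> SubSequence {
        precondition(lower >= 0, "start offset must not be negative")
        precondition(upper <= 0, "end offset must not be positive")
        let from = index(startIndex, offsetBy: lower)
        let to = index(endIndex, offsetBy: upper)
        return self[from..<to]
    }
}

let letters = Array("abcdefgh")
// Skip two elements from the front and three from the back.
let inner = letters[start: 2, end: -3]
```

The labels remove the ambiguity of a bare -3, at the cost of one overload per label combination.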

fswarbrick · March 3, 2018, 10:45pm

Is this a typo or does it mean "from start+2 to start+4" ?

QuinceyMorris · March 3, 2018, 10:47pm

It was intended to mean “from start+2 to start+4”. I know it's not good, but we haven't found good yet.

fswarbrick · March 3, 2018, 10:52pm

Just checking.

Side note, one issue I have with any Swift ranges is that they are "start/end" based with no option for "start/length". For example, a language which shall not be named supports the following:
s(x : l)
where x is the starting position and l is the length of the "range". Whatever the solution to this issue ends up being, I'd be interested to see support for this "paradigm".

Ben_Cohen · March 4, 2018, 9:48pm

This seems like another case that would be well served by an operator ..<+ hack:

infix operator ..<+

extension Strideable {
  static func ..<+ (lhs: Self, rhs: Stride) -> Range<Self> {
    return lhs..<lhs.advanced(by: rhs)
  }
}

let r = 5..<+10
// r = 5..<15

Erica_Sadun · March 4, 2018, 10:32pm

I don't hate ..<+. Range issues that haven't really been fully addressed:

open and closed for both ends: [2, 5] which is 2 ... 5, [2, 5) 2 ..< 5, (2, 5] 2 <.. 5, (2, 5) 2 <.< 5 or 2 <..<5 (Note Swift has leading period rules that are broken here)
a better defined conception of to, through, and (I once proposed) towards to indicate up to but not including, up to and including, up to and including or past
vector style ranges, with start, magnitude, and direction such as 5 +.. 5 or 5 ..+ 5 or 5 ...+ 5 or 5 ..< +5 or 5 ..< ++5
ranges relative to known indices (startIndex, endIndex), allowing 5 ... -5 (which Wux has already prototyped, and is the core discussion of this thread)
ranges representing a negative vector like for i in 2 downTo -5 or 2 ... --5 or 2 ..> --6 or just 2 ..> -6 or 2 ..< -6

Just throwing those out there.

-- E

QuinceyMorris · March 4, 2018, 11:22pm

The objection is going to be:

    5..< +10

which means something else.

I stared at the design issue for a few hours yesterday, and concluded that, once you have ... or ..< along with + and - for offsetting, and [ and ] for subscripting, any other symbols render the entire thing inscrutable (and using any unary operators is toxic because of spacing issues).

To make sense, it needs alphanumeric-ish symbols for "start" and "end", as basically placeholders for the operator machinery to work on. So, something like this:

x [start+1..<end-3]

or:

x [startIndex+1..<endIndex-3]

is about as complex as you want to get. This could be done by sacrificing such symbols from the global namespace, but that seems like a lousy idea.

What's really needed is context-sensitive macros, where perhaps something along this line might be possible:

x [#start+1..<#end-3]

(except, of course, that no one knows what macros might look like, yet).

palimondo · March 5, 2018, 7:19am

If we wanted to raise everything that’s not been addressed yet…

Python’s slice syntax [start:end] has an extended form [start:end:step], which makes it more akin to stride. But I think we should keep slicing (contiguous intervals) distinct from striding. I.e. do not burden ourselves with striding in this design.

Same goes for the distinction between open, half-open and closed intervals. Swift covers that with stride. But the collections are built on half-open intervals: startIndex is in, endIndex is outside. These semantics also hold for slicing operations defined on Sequence. I’d therefore argue to exclude this from our design.
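For readers who want the distinction spelled out, a small example using only existing standard library API: slicing takes a contiguous half-open interval, while stride(from:to:by:) and stride(from:through:by:) cover stepping and the open/closed upper bound. The variable names are arbitrary.

```swift
let values = Array(0..<10)

// Slicing: a contiguous, half-open interval of indices.
let contiguous = Array(values[2..<5])

// Striding: every second index, upper bound excluded (`to:`).
let steppedTo = stride(from: 2, to: 8, by: 2).map { values[$0] }

// Striding with the upper bound included (`through:`).
let steppedThrough = stride(from: 2, through: 8, by: 2).map { values[$0] }
```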

DeFrenZ · March 5, 2018, 10:10am

I think there are a few points to be considered given the discussion until now:

Debate on implicit vs. explicit relative bounds

Basically whether you can use 1 ..< -2 or need to write .startIndex + 1 ..< .endIndex - 2. Implicit offset anchors are hard to guess if you don't already know about them (for instance from other languages). They also make the left-hand side more confusing: even if Index == Int, 1 is often not the same as .startIndex + 1, and that is on purpose. With an implicit anchor, what would happen to that distinction?
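The difference between an absolute index and an offset from startIndex is easy to see with ArraySlice, which keeps the indices of its base array. This is plain standard library behaviour; only the variable names are made up.

```swift
let numbers = [10, 20, 30, 40, 50]
let tail = numbers.dropFirst()   // ArraySlice covering indices 1...4

// The slice does not start at index 0.
let firstIndex = tail.startIndex

// Absolute index 1 and "startIndex + 1" name different elements.
let atAbsoluteOne = tail[1]
let atStartPlusOne = tail[tail.index(tail.startIndex, offsetBy: 1)]
```

An implicit anchor would have to pick one of these two meanings for a bare 1.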

Using `KeyPath`s for anchors

I think the KeyPath idea is very cool and intuitive. The issue I have with it, though, is that it might be too free: you can't limit it to \.startIndex and \.endIndex. How would you compare two KeyPath anchors then?

Syntax vs common range subscript

The issue here is if the syntax is different enough to expose the potential weight of the offsetting. I'm unsure on this one, and don't mind too much either way. I suppose it depends a lot on the actual syntax of the bounds of it (e.g. implicit or explicit) as that would help signalling it as well.

Hopping on `RangeExpression`

Ideally that should be true. Though I find it hard to fit implementation-wise, as the ranges already conform where Bound == C.Index, and without having generic protocols you cannot conform in multiple ways. So you either make the existing ranges work with the anchor type, make new ranges, or change RangeExpression. I tried the first but the others are to be evaluated as well.

(Yet another) shot at an implementation example, more for testing out the syntax, is here. Of course this is highly inspired by the others that were posted in the thread.

struct SliceBound<Bound> where Bound: Comparable {
	// `indirect` as a workaround for "cyclic metadata dependency detected, aborting"
	// Cannot use `KeyPath`-based approach because it's not limited to start/end, and needs a reference collection
	indirect enum Base {
		case start
		case end
		case absolute(Bound)
	}

	let base: Base
	let offset: Int
	init(base: Base, offset: Int) {
		precondition({ if case .start = base { return false } else { return true } }() || offset >= 0)
		precondition({ if case .end = base { return false } else { return true } }() || offset <= 0)
		self.base = base
		self.offset = offset
	}

	static var start: SliceBound {
		return .init(base: .start, offset: 0)
	}
	static var end: SliceBound {
		return .init(base: .end, offset: 0)
	}
}

func + <Bound> (_ base: SliceBound<Bound>.Base, _ offset: Int) -> SliceBound<Bound> {
	return .init(base: base, offset: offset)
}

func - <Bound> (_ base: SliceBound<Bound>.Base, _ offset: Int) -> SliceBound<Bound> {
	return base + -offset
}

func + <Bound> (_ base: Bound, _ offset: Int) -> SliceBound<Bound> {
	return .init(base: .absolute(base), offset: offset)
}

func - <Bound> (_ base: Bound, _ offset: Int) -> SliceBound<Bound> {
	return base + -offset
}

extension SliceBound: Equatable {
	static func == (_ lhs: SliceBound, _ rhs: SliceBound) -> Bool {
		switch (lhs.base, rhs.base) {
		case (.start, .start),
			 (.end, .end):
			return lhs.offset == rhs.offset
		case let (.absolute(lBase), .absolute(rBase)):
			// Can be specialised `where Bound: _Strideable`
			return lBase == rBase && lhs.offset == rhs.offset
		default:
			return false
		}
	}
}

extension SliceBound: Comparable {
	static func < (_ lhs: SliceBound, _ rhs: SliceBound) -> Bool {
		switch (lhs.base, rhs.base) {
		case (.start, .absolute),
			 (.start, .end),
			 (.absolute, .end):
			return true
		case (.absolute, .start),
			 (.end, .start),
			 (.end, .absolute):
			return false
		case (.start, .start),
			 (.end, .end):
			return lhs.offset < rhs.offset
		case let (.absolute(lBase), .absolute(rBase)):
			// Can be specialised `where Bound: _Strideable`
			return (lBase, lhs.offset) < (rBase, rhs.offset)
		}
	}
}

extension SliceBound {
	func relative <C> (to collection: C) -> C.Index where C: Collection, C.Index == Bound, C.IndexDistance == Int {
		let index: C.Index = { switch base {
		case .start:
			return collection.startIndex
		case .absolute(let bound):
			return bound
		case .end:
			return collection.endIndex
		} }()
		return collection.index(index, offsetBy: offset)
	}
}

// Cannot implement `RangeExpression`. Ranges already conform when `Bound == Index`
extension Collection where IndexDistance == Int {
	subscript(relative bounds: Range<SliceBound<Index>>) -> SubSequence {
		let range = bounds.lowerBound.relative(to: self) ..< bounds.upperBound.relative(to: self)
		return self[range.relative(to: self)]
	}
	subscript(relative bounds: PartialRangeFrom<SliceBound<Index>>) -> SubSequence {
		let range = bounds.lowerBound.relative(to: self)...
		return self[range.relative(to: self)]
	}
	// TODO: Implement others
}

extension Comparable {
	static func ..< (_ lhs: Self, _ rhs: SliceBound<Self>) -> Range<SliceBound<Self>> {
		return (lhs + 0) ..< rhs
	}
	// TODO: Implement others
}

let c = 1 ... 5
let i = c.index(after: c.startIndex)
c[relative: i ..< (.end - 2)] // [2, 3]
c[relative: (.start + 1) ..< (.end - 2)] // [2, 3]

let arr = Array(c)
arr[relative: 1 ..< (.end - 2)] // [2, 3]
arr[relative: (.start + 1) ..< (.end - 2)] // [2, 3]

let slice = arr.dropFirst()
slice[relative: 1 ..< (.end - 2)] // [2, 3]
slice[relative: (.start + 1) ..< (.end - 2)] // [3]
slice[relative: (.end - 2)...] // [4, 5]

fswarbrick · March 5, 2018, 11:50pm

What about ..+< instead (two dots; the forum is converting this to 3?)

5..+<10

Don't love it, but...

Jon_Hull · March 7, 2018, 3:23pm

It sounds like we are really trying to create an IndexExpression concept (which would interface with RangeExpression). Why not just do that explicitly?

This could either be implemented by static methods on the IndexExpression protocol (e.g. .start, .start(plus: 3)) or some sort of prefix/postfix operator (strawman: |.. and ..|). You could even just define operators between Int and an existing index expression (e.g. .start + 2). The resulting type would be able to return an index when given a concrete Collection.

The advantage is that you could then still use these in places that take an index instead of a range...

c[.end] //c.last!
c[.end - 2] //c[c.index(c.endIndex, offsetBy: -2)]

and of course the range stuff would work as well:

c[(.start + 2)...(.end - 3)]
c[(.end - 7)...(.end - 4)]

Jon_Hull · March 7, 2018, 3:45pm

One other nice property is that you could actually store it in a variable:

var s:IndexExpression<IndexType> = .start + 2
c[s...(s+7)]

Joe_Groff · March 7, 2018, 4:19pm

I think you're right, that an IndexExpression protocol would make more sense than trying to shoehorn these concepts into RangeExpression. RangeExpression could then be a refinement of IndexExpression, so that IndexExpression only has the relative(to:) operation for reifying the expression on a collection, and RangeExpression has the contains behavior for pattern matching. We could then generalize the APIs that currently take RangeExpressions for collections to accept any IndexExpression.
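A minimal sketch of the idea, under several assumptions: OffsetIndex is a hypothetical concrete type standing in for the proposed IndexExpression protocol, the labelled from:through: subscript is a stand-in for range syntax, and .end resolves to endIndex (one past the last element), so the last element is end - 1 here. None of this is an agreed design.

```swift
/// Hypothetical stand-in for an index expression: an anchor plus an offset.
struct OffsetIndex {
    enum Anchor { case start, end }
    let anchor: Anchor
    let offset: Int

    static let start = OffsetIndex(anchor: .start, offset: 0)
    static let end = OffsetIndex(anchor: .end, offset: 0)

    /// Reifies the expression as a concrete index of `collection`.
    func relative<C: BidirectionalCollection>(to collection: C) -> C.Index {
        switch anchor {
        case .start:
            return collection.index(collection.startIndex, offsetBy: offset)
        case .end:
            return collection.index(collection.endIndex, offsetBy: offset)
        }
    }
}

func + (lhs: OffsetIndex, rhs: Int) -> OffsetIndex {
    return OffsetIndex(anchor: lhs.anchor, offset: lhs.offset + rhs)
}

func - (lhs: OffsetIndex, rhs: Int) -> OffsetIndex {
    return lhs + (-rhs)
}

extension BidirectionalCollection {
    /// Single-element access through an index expression.
    subscript(position: OffsetIndex) -> Element {
        return self[position.relative(to: self)]
    }

    /// Closed slice between two index expressions.
    subscript(from lower: OffsetIndex, through upper: OffsetIndex) -> SubSequence {
        return self[lower.relative(to: self)...upper.relative(to: self)]
    }
}

let c = Array(1...10)
let last = c[OffsetIndex.end - 1]

// The expression can be stored and reused, as suggested above.
let s = OffsetIndex.start + 2
let window = c[from: s, through: s + 3]
```

Because the expression carries no collection, it can be built once and resolved against any collection later, which is the property that makes storing it in a variable possible.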

jawbroken · March 7, 2018, 11:58pm

.end seems like a hard sell because it's very different to endIndex (which is the index past the last element). I'm not sure what the solution is here, though. .first and .last could possibly be confused with the properties when you start adding and subtracting numbers, especially when the collection contains integers.

Chenyungui · March 8, 2018, 8:42am

I think the most meaningful operators for index offsetting would be "<<" and ">>". The code would look like this:

c[.end] //c.last!
c[.end << 2] //c[c.index(c.endIndex, offsetBy: -2)]
c[(.start >> 2)...(.end << 3)]
c[(.end << 7)...(.end << 4)]

hartbit · March 8, 2018, 9:48am

This looks pretty nice!

ckeithray · March 8, 2018, 3:29pm

I prefer + and - over >> and << ...

For index offsetting, the code would look like this:

c[.end] //c.last!
c[.end - 2] //c[c.index(c.endIndex, offsetBy: -2)]
c[(.start + 2)...(.end - 3)]
c[(.end - 7)...(.end - 4)]
