[pitch] Eliminate Collection.IndexDistance associated type

Hi Swift Evolution,

A pitch for review, aimed at simplifying generic Collection algorithms.

Online copy here:

https://github.com/airspeedswift/swift-evolution/blob/5d1ffda2e83f5b95a88d5ce3948c5fd0d59622f4/proposals/NNNN-eliminate-indexdistance.md

# Eliminate `IndexDistance` from `Collection`

* Proposal: [SE-NNNN](NNNN-eliminate-indexdistance.md)
* Authors: [Ben Cohen](https://github.com/airspeedswift\)
* Review Manager: TBD
* Status: **Awaiting review**
* Implementation: [apple/swift#12641](https://github.com/apple/swift/pull/12641\)

## Introduction

Eliminate the associated type `IndexDistance` from `Collection`, and modify all uses to the concrete type `Int` instead.

## Motivation

`Collection` allows for the distance between two indices to be any `SignedInteger` type via the `IndexDistance` associated type. While in practice the distance between indices is almost always
an `Int`, generic algorithms on `Collection` need to either constrain `IndexDistance == Int`, or write their algorithm to be generic over any `SignedInteger`.

Swift 4.0 introduced the ability to constrain associated types with `where` clauses
([SE-142](https://github.com/apple/swift-evolution/blob/master/proposals/0142-associated-types-constraints.md\)) and will soon allow protocol constraints
to be recursive ([SE-157](https://github.com/apple/swift-evolution/blob/master/proposals/0157-recursive-protocol-constraints.md\)). With these features,
writing generic algorithms against `Collection` is finally a realistic tool for intermediate Swift programmers. You no longer need to know to
constrain `SubSequence.Element == Element` or `SubSequence: Collection`, missing constraints that previously led to inexplicable error messages.

At this point, the presence of `IndexDistance` is of of the biggest hurdles that new users trying to write generic algorithms face. If you want to
write code that will compile against any distance type, you need to constantly juggle with explicit type annotations (i.e. you need to write `let i:
IndexDistance = 0` instead of just `let i = 0`), and perform `numericCast` to convert from one distance type to another.

But these `numericCasts` are hard to use correctly. Given two collections with different index distances, it's very hard to reason about whether your
`numericCast` is casting from the smaller to larger type correctly. This turns any problem of writing a generic collection algorithm into both a collection _and_
problem. And chances are you are going to need to interoperate with a method that takes or provides a concrete `Int` anyway (like `Array.reserveCapacity` inside
`Collection.map`). Much of the generic code in the standard library would trap if ever presented with a collection with a distance greater than `Int.max`.
Additionally, this generalization makes specialization less likely and increases compile-time work.

For these reasons, it's common to see algorithms constrained to `IndexDistance == Int`. In fact, the inconvenience of having to deal with generic index
distances probably encourages more algorithms to be constrained to `Index == Int`, such as [this
code](swift-package-manager/OutputByteStream.swift at 472c647dcad3adf4344a06ef7ba91d2d4abddc94 · airspeedswift/swift-package-manager · GitHub) in
the Swift Package Manager. Converting this function to work with any index type would be straightforward. Converting it to work with any index distance
as well would be much trickier.

The general advice from [The Swift Programming
Language](The Basics — The Swift Programming Language (Swift 5.7)) when writing Swift code is to encourage users to stick to using `Int` unless they have a special reason not to:

Unless you need to work with a specific size of integer, always use `Int` for integer values in your code. [...] `Int` is preferred, even when the values to be stored are known to be nonnegative. A consistent use of Int for integer values aids

code interoperability, avoids the need to convert between different number types, and matches integer type inference[.]

There are two main use cases for keeping `IndexDistance` as an associated type rather than concretizing it to be `Int`: tiny collections that might
benefit from tiny distances, and huge collections that need to address greater than `Int.max` elements. For example, it may seem wasteful to force a
type that presents the bits in a `UInt` as a collection to need to use a whole `Int` for its distance type. Or you may want to create a gigantic
collection, such as one backed by a memory mapped file, with a size great than `Int.max`. The most likely scenario for this is on 32-bit processors where a collection would be constrained to 2 billion elements.

These use cases are very niche, and do not seem to justify the considerable impedance to generic programming that `IndexDistance` causes. Therefore,
this proposal recommends removing the associated type and replacing all references to it with `Int`.

## Proposed solution

Scrap the `IndexDistance` associated type. Switch all references to it in the standard library to the concrete `Int` type:

protocol Collection {
	var count: Int { get }
	func index(_ i: Index, offsetBy n: Int) -> Index
	func index(_ i: Index, offsetBy n: Int, limitedBy limit: Index) -> Index?
	func distance(from start: Index, to end: Index) -> Int
}
// and in numerous extensions in the standard library

The one instance where a concrete type uses an `IndexDistance` other than `Int` in the standard library is `AnyCollection`, which uses `Int64`. This would be changed to `Int`.

## Source compatibility

This can be split into 2 parts:

Algorithms that currently constrain `IndexDistance` to `Int` in their `where` clause, and algorithms that use `IndexDistance` within the body of a
method, can be catered for by a deprecated typealias for `IndexDistance` inside an extension on `Collection`. This is the common case.

Collections that truly take advantage of the ability to define non-`Int` distances would be source-broken, with no practical way of making this
compatible in a 4.0 mode. It's worth noting that there are no such types in the Swift source compatibility suite.

## Effect on ABI stability

This removes an associated type and changes function signatures, so must be done before declaring ABI stability

## Alternatives considered

None other than status quo.

Gah, typo. That should read “This turns any problem of writing a generic collection algorithm into both a collection _and_ a numeric problem.”

Apologies to Nate Cook, from whom this important came.

···

On Nov 8, 2017, at 1:37 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

This turns any problem of writing a generic collection algorithm into both a collection _and_
problem.

I guess I am mildly +1. I can’t think of a case where this wouldn’t be some sort of Integer, but I am also open to arguments against. I am not sure what the utility of having a type for this besides Int is...

Thanks,
Jon

···

On Nov 8, 2017, at 1:37 PM, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

Hi Swift Evolution,

A pitch for review, aimed at simplifying generic Collection algorithms.

Online copy here:

https://github.com/airspeedswift/swift-evolution/blob/5d1ffda2e83f5b95a88d5ce3948c5fd0d59622f4/proposals/NNNN-eliminate-indexdistance.md

# Eliminate `IndexDistance` from `Collection`

* Proposal: [SE-NNNN](NNNN-eliminate-indexdistance.md)
* Authors: [Ben Cohen](https://github.com/airspeedswift\)
* Review Manager: TBD
* Status: **Awaiting review**
* Implementation: [apple/swift#12641](https://github.com/apple/swift/pull/12641\)

## Introduction

Eliminate the associated type `IndexDistance` from `Collection`, and modify all uses to the concrete type `Int` instead.

## Motivation

`Collection` allows for the distance between two indices to be any `SignedInteger` type via the `IndexDistance` associated type. While in practice the distance between indices is almost always
an `Int`, generic algorithms on `Collection` need to either constrain `IndexDistance == Int`, or write their algorithm to be generic over any `SignedInteger`.

Swift 4.0 introduced the ability to constrain associated types with `where` clauses
([SE-142](https://github.com/apple/swift-evolution/blob/master/proposals/0142-associated-types-constraints.md\)) and will soon allow protocol constraints
to be recursive ([SE-157](https://github.com/apple/swift-evolution/blob/master/proposals/0157-recursive-protocol-constraints.md\)). With these features,
writing generic algorithms against `Collection` is finally a realistic tool for intermediate Swift programmers. You no longer need to know to
constrain `SubSequence.Element == Element` or `SubSequence: Collection`, missing constraints that previously led to inexplicable error messages.

At this point, the presence of `IndexDistance` is of of the biggest hurdles that new users trying to write generic algorithms face. If you want to
write code that will compile against any distance type, you need to constantly juggle with explicit type annotations (i.e. you need to write `let i:
IndexDistance = 0` instead of just `let i = 0`), and perform `numericCast` to convert from one distance type to another.

But these `numericCasts` are hard to use correctly. Given two collections with different index distances, it's very hard to reason about whether your
`numericCast` is casting from the smaller to larger type correctly. This turns any problem of writing a generic collection algorithm into both a collection _and_
problem. And chances are you are going to need to interoperate with a method that takes or provides a concrete `Int` anyway (like `Array.reserveCapacity` inside
`Collection.map`). Much of the generic code in the standard library would trap if ever presented with a collection with a distance greater than `Int.max`.
Additionally, this generalization makes specialization less likely and increases compile-time work.

For these reasons, it's common to see algorithms constrained to `IndexDistance == Int`. In fact, the inconvenience of having to deal with generic index
distances probably encourages more algorithms to be constrained to `Index == Int`, such as [this
code](swift-package-manager/OutputByteStream.swift at 472c647dcad3adf4344a06ef7ba91d2d4abddc94 · airspeedswift/swift-package-manager · GitHub) in
the Swift Package Manager. Converting this function to work with any index type would be straightforward. Converting it to work with any index distance
as well would be much trickier.

The general advice from [The Swift Programming
Language](The Basics — The Swift Programming Language (Swift 5.7)) when writing Swift code is to encourage users to stick to using `Int` unless they have a special reason not to:

> Unless you need to work with a specific size of integer, always use `Int` for integer values in your code. [...] `Int` is preferred, even when the values to be stored are known to be nonnegative. A consistent use of Int for integer values aids
code interoperability, avoids the need to convert between different number types, and matches integer type inference[.]

There are two main use cases for keeping `IndexDistance` as an associated type rather than concretizing it to be `Int`: tiny collections that might
benefit from tiny distances, and huge collections that need to address greater than `Int.max` elements. For example, it may seem wasteful to force a
type that presents the bits in a `UInt` as a collection to need to use a whole `Int` for its distance type. Or you may want to create a gigantic
collection, such as one backed by a memory mapped file, with a size great than `Int.max`. The most likely scenario for this is on 32-bit processors where a collection would be constrained to 2 billion elements.

These use cases are very niche, and do not seem to justify the considerable impedance to generic programming that `IndexDistance` causes. Therefore,
this proposal recommends removing the associated type and replacing all references to it with `Int`.

## Proposed solution

Scrap the `IndexDistance` associated type. Switch all references to it in the standard library to the concrete `Int` type:

protocol Collection {
	var count: Int { get }
	func index(_ i: Index, offsetBy n: Int) -> Index
	func index(_ i: Index, offsetBy n: Int, limitedBy limit: Index) -> Index?
	func distance(from start: Index, to end: Index) -> Int
}
// and in numerous extensions in the standard library

The one instance where a concrete type uses an `IndexDistance` other than `Int` in the standard library is `AnyCollection`, which uses `Int64`. This would be changed to `Int`.

## Source compatibility

This can be split into 2 parts:

Algorithms that currently constrain `IndexDistance` to `Int` in their `where` clause, and algorithms that use `IndexDistance` within the body of a
method, can be catered for by a deprecated typealias for `IndexDistance` inside an extension on `Collection`. This is the common case.

Collections that truly take advantage of the ability to define non-`Int` distances would be source-broken, with no practical way of making this
compatible in a 4.0 mode. It's worth noting that there are no such types in the Swift source compatibility suite.

## Effect on ABI stability

This removes an associated type and changes function signatures, so must be done before declaring ABI stability

## Alternatives considered

None other than status quo.

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

+1 it seems like a generic type just for the sake of it. Many languages do without a specific type for distance and use Int and it is a better solution because it is simpler.

-- Howard.

···

On 9 Nov 2017, at 8:37 am, Ben Cohen via swift-evolution <swift-evolution@swift.org> wrote:

Hi Swift Evolution,

A pitch for review, aimed at simplifying generic Collection algorithms.

Online copy here:

https://github.com/airspeedswift/swift-evolution/blob/5d1ffda2e83f5b95a88d5ce3948c5fd0d59622f4/proposals/NNNN-eliminate-indexdistance.md

# Eliminate `IndexDistance` from `Collection`

* Proposal: [SE-NNNN](NNNN-eliminate-indexdistance.md)
* Authors: [Ben Cohen](https://github.com/airspeedswift\)
* Review Manager: TBD
* Status: **Awaiting review**
* Implementation: [apple/swift#12641](https://github.com/apple/swift/pull/12641\)

## Introduction

Eliminate the associated type `IndexDistance` from `Collection`, and modify all uses to the concrete type `Int` instead.

## Motivation

`Collection` allows for the distance between two indices to be any `SignedInteger` type via the `IndexDistance` associated type. While in practice the distance between indices is almost always
an `Int`, generic algorithms on `Collection` need to either constrain `IndexDistance == Int`, or write their algorithm to be generic over any `SignedInteger`.

Swift 4.0 introduced the ability to constrain associated types with `where` clauses
([SE-142](https://github.com/apple/swift-evolution/blob/master/proposals/0142-associated-types-constraints.md\)) and will soon allow protocol constraints
to be recursive ([SE-157](https://github.com/apple/swift-evolution/blob/master/proposals/0157-recursive-protocol-constraints.md\)). With these features,
writing generic algorithms against `Collection` is finally a realistic tool for intermediate Swift programmers. You no longer need to know to
constrain `SubSequence.Element == Element` or `SubSequence: Collection`, missing constraints that previously led to inexplicable error messages.

At this point, the presence of `IndexDistance` is of of the biggest hurdles that new users trying to write generic algorithms face. If you want to
write code that will compile against any distance type, you need to constantly juggle with explicit type annotations (i.e. you need to write `let i:
IndexDistance = 0` instead of just `let i = 0`), and perform `numericCast` to convert from one distance type to another.

But these `numericCasts` are hard to use correctly. Given two collections with different index distances, it's very hard to reason about whether your
`numericCast` is casting from the smaller to larger type correctly. This turns any problem of writing a generic collection algorithm into both a collection _and_
problem. And chances are you are going to need to interoperate with a method that takes or provides a concrete `Int` anyway (like `Array.reserveCapacity` inside
`Collection.map`). Much of the generic code in the standard library would trap if ever presented with a collection with a distance greater than `Int.max`.
Additionally, this generalization makes specialization less likely and increases compile-time work.

For these reasons, it's common to see algorithms constrained to `IndexDistance == Int`. In fact, the inconvenience of having to deal with generic index
distances probably encourages more algorithms to be constrained to `Index == Int`, such as [this
code](swift-package-manager/OutputByteStream.swift at 472c647dcad3adf4344a06ef7ba91d2d4abddc94 · airspeedswift/swift-package-manager · GitHub) in
the Swift Package Manager. Converting this function to work with any index type would be straightforward. Converting it to work with any index distance
as well would be much trickier.

The general advice from [The Swift Programming
Language](The Basics — The Swift Programming Language (Swift 5.7)) when writing Swift code is to encourage users to stick to using `Int` unless they have a special reason not to:

> Unless you need to work with a specific size of integer, always use `Int` for integer values in your code. [...] `Int` is preferred, even when the values to be stored are known to be nonnegative. A consistent use of Int for integer values aids
code interoperability, avoids the need to convert between different number types, and matches integer type inference[.]

There are two main use cases for keeping `IndexDistance` as an associated type rather than concretizing it to be `Int`: tiny collections that might
benefit from tiny distances, and huge collections that need to address greater than `Int.max` elements. For example, it may seem wasteful to force a
type that presents the bits in a `UInt` as a collection to need to use a whole `Int` for its distance type. Or you may want to create a gigantic
collection, such as one backed by a memory mapped file, with a size great than `Int.max`. The most likely scenario for this is on 32-bit processors where a collection would be constrained to 2 billion elements.

These use cases are very niche, and do not seem to justify the considerable impedance to generic programming that `IndexDistance` causes. Therefore,
this proposal recommends removing the associated type and replacing all references to it with `Int`.

## Proposed solution

Scrap the `IndexDistance` associated type. Switch all references to it in the standard library to the concrete `Int` type:

protocol Collection {
	var count: Int { get }
	func index(_ i: Index, offsetBy n: Int) -> Index
	func index(_ i: Index, offsetBy n: Int, limitedBy limit: Index) -> Index?
	func distance(from start: Index, to end: Index) -> Int
}
// and in numerous extensions in the standard library

The one instance where a concrete type uses an `IndexDistance` other than `Int` in the standard library is `AnyCollection`, which uses `Int64`. This would be changed to `Int`.

## Source compatibility

This can be split into 2 parts:

Algorithms that currently constrain `IndexDistance` to `Int` in their `where` clause, and algorithms that use `IndexDistance` within the body of a
method, can be catered for by a deprecated typealias for `IndexDistance` inside an extension on `Collection`. This is the common case.

Collections that truly take advantage of the ability to define non-`Int` distances would be source-broken, with no practical way of making this
compatible in a 4.0 mode. It's worth noting that there are no such types in the Swift source compatibility suite.

## Effect on ABI stability

This removes an associated type and changes function signatures, so must be done before declaring ABI stability

## Alternatives considered

None other than status quo.

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

I would love to pretend my accidentally a word there was a joke. But it wasn’t…

···

On Nov 8, 2017, at 1:55 PM, Ben Cohen <ben_cohen@apple.com> wrote:

Apologies to Nate Cook, from whom this important came.

Although there are historical architectures where the addressable space
exceeds the word size, IIUC, we'd be saying that collections don't support
more elements than there are addressable units in memory. Are there very
many use cases where a collection needs to have more elements than that? I
really don't think it's terribly common, and in any case it'd break a bunch
of the collection APIs (like `map`).

···

On Wed, Nov 8, 2017 at 4:19 PM, Jonathan Hull via swift-evolution < swift-evolution@swift.org> wrote:

I guess I am mildly +1. I can’t think of a case where this wouldn’t be
some sort of Integer, but I am also open to arguments against. I am not
sure what the utility of having a type for this besides Int is...

Thanks,
Jon

On Nov 8, 2017, at 1:37 PM, Ben Cohen via swift-evolution < > swift-evolution@swift.org> wrote:

Hi Swift Evolution,

A pitch for review, aimed at simplifying generic Collection algorithms.

Online copy here:

https://github.com/airspeedswift/swift-evolution/blob/
5d1ffda2e83f5b95a88d5ce3948c5fd0d59622f4/proposals/NNNN-
eliminate-indexdistance.md

# Eliminate `IndexDistance` from `Collection`

* Proposal: [SE-NNNN](NNNN-eliminate-indexdistance.md)
* Authors: [Ben Cohen](https://github.com/airspeedswift\)
* Review Manager: TBD
* Status: **Awaiting review**
* Implementation: [apple/swift#12641](https://github.com/apple/swift/pull/12641\)

## Introduction

Eliminate the associated type `IndexDistance` from `Collection`, and modify all uses to the concrete type `Int` instead.

## Motivation

`Collection` allows for the distance between two indices to be any `SignedInteger` type via the `IndexDistance` associated type. While in practice the distance between indices is almost always
an `Int`, generic algorithms on `Collection` need to either constrain `IndexDistance == Int`, or write their algorithm to be generic over any `SignedInteger`.

Swift 4.0 introduced the ability to constrain associated types with `where` clauses
([SE-142](https://github.com/apple/swift-evolution/blob/master/proposals/0142-associated-types-constraints.md\)) and will soon allow protocol constraints
to be recursive ([SE-157](https://github.com/apple/swift-evolution/blob/master/proposals/0157-recursive-protocol-constraints.md\)). With these features,
writing generic algorithms against `Collection` is finally a realistic tool for intermediate Swift programmers. You no longer need to know to
constrain `SubSequence.Element == Element` or `SubSequence: Collection`, missing constraints that previously led to inexplicable error messages.

At this point, the presence of `IndexDistance` is of of the biggest hurdles that new users trying to write generic algorithms face. If you want to
write code that will compile against any distance type, you need to constantly juggle with explicit type annotations (i.e. you need to write `let i:
IndexDistance = 0` instead of just `let i = 0`), and perform `numericCast` to convert from one distance type to another.

But these `numericCasts` are hard to use correctly. Given two collections with different index distances, it's very hard to reason about whether your
`numericCast` is casting from the smaller to larger type correctly. This turns any problem of writing a generic collection algorithm into both a collection _and_
problem. And chances are you are going to need to interoperate with a method that takes or provides a concrete `Int` anyway (like `Array.reserveCapacity` inside
`Collection.map`). Much of the generic code in the standard library would trap if ever presented with a collection with a distance greater than `Int.max`.
Additionally, this generalization makes specialization less likely and increases compile-time work.

For these reasons, it's common to see algorithms constrained to `IndexDistance == Int`. In fact, the inconvenience of having to deal with generic index
distances probably encourages more algorithms to be constrained to `Index == Int`, such as [this
code](swift-package-manager/OutputByteStream.swift at 472c647dcad3adf4344a06ef7ba91d2d4abddc94 · airspeedswift/swift-package-manager · GitHub) in
the Swift Package Manager. Converting this function to work with any index type would be straightforward. Converting it to work with any index distance
as well would be much trickier.

The general advice from [The Swift Programming
Language](The Basics — The Swift Programming Language (Swift 5.7)) when writing Swift code is to encourage users to stick to using `Int` unless they have a special reason not to:

> Unless you need to work with a specific size of integer, always use `Int` for integer values in your code. [...] `Int` is preferred, even when the values to be stored are known to be nonnegative. A consistent use of Int for integer values aids
code interoperability, avoids the need to convert between different number types, and matches integer type inference[.]

There are two main use cases for keeping `IndexDistance` as an associated type rather than concretizing it to be `Int`: tiny collections that might
benefit from tiny distances, and huge collections that need to address greater than `Int.max` elements. For example, it may seem wasteful to force a
type that presents the bits in a `UInt` as a collection to need to use a whole `Int` for its distance type. Or you may want to create a gigantic
collection, such as one backed by a memory mapped file, with a size great than `Int.max`. The most likely scenario for this is on 32-bit processors where a collection would be constrained to 2 billion elements.

These use cases are very niche, and do not seem to justify the considerable impedance to generic programming that `IndexDistance` causes. Therefore,
this proposal recommends removing the associated type and replacing all references to it with `Int`.

## Proposed solution

Scrap the `IndexDistance` associated type. Switch all references to it in the standard library to the concrete `Int` type:

protocol Collection {
	var count: Int { get }
	func index(_ i: Index, offsetBy n: Int) -> Index
	func index(_ i: Index, offsetBy n: Int, limitedBy limit: Index) -> Index?
	func distance(from start: Index, to end: Index) -> Int
}
// and in numerous extensions in the standard library

The one instance where a concrete type uses an `IndexDistance` other than `Int` in the standard library is `AnyCollection`, which uses `Int64`. This would be changed to `Int`.

## Source compatibility

This can be split into 2 parts:

Algorithms that currently constrain `IndexDistance` to `Int` in their `where` clause, and algorithms that use `IndexDistance` within the body of a
method, can be catered for by a deprecated typealias for `IndexDistance` inside an extension on `Collection`. This is the common case.

Collections that truly take advantage of the ability to define non-`Int` distances would be source-broken, with no practical way of making this
compatible in a 4.0 mode. It's worth noting that there are no such types in the Swift source compatibility suite.

## Effect on ABI stability

This removes an associated type and changes function signatures, so must be done before declaring ABI stability

## Alternatives considered

None other than status quo.

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution