[Idea] Add an (Index, Element) sequence to CollectionType


(Patrick Pijnappel) #1

-- Introduction

There should be a property on CollectionType that returns a sequence of
(Index, Element) tuples.
Currently enumerate() is often used instead, but it is not well suited to
the task and can lead to bugs.

-- Motivation

Using enumerate() instead of an (Index, Element) sequence has two main
problems.
Both arise because enumerate() returns a sequence of (n, Element) tuples,
where n is the element *number*, instead of a sequence of (Index, Element).

1) It doesn't work for collections not indexed by integers.

2) It doesn't do what you might expect in some cases, as indices do not
always start at 0.
For example ArraySlice's indices do not: array[2..<5] starts with index 2.
Consider the following code to take the 2nd half of the array and remove
all empty elements:

var array = [ "", "a", "b", "c", "", "d" ]
var secondHalf = array[array.count/2..<array.count]
for (index, element) in secondHalf.enumerate() {
  if element == "" {
  secondHalf.removeAtIndex(index)
  }
}

This code will crash (ignoring for a moment this should probably be using
filter).

-- Alternatives

The same effect can already be achieved using the following:

for index in collection.indices {
  let element = collection[index]
  // ...
}

However having a dedicated (Index, Element) sequence has the following
advantages:
a) It can help prevent people from using enumerate() inappropriately.
b) It is very common use case that deserves shortening.
c) It can be chained (e.g. to map).

-- Proposed Solution

Add a property/method on CollectionType that returns a sequence of (Index,
Element) tuples.
For example, using a property named indexed:

for (index, element) in collection.indexed {
  // ...
}

This should be the preferred idiom when you want both the index and the
element.

Note that enumerate() does still have valid roles to play:
- When you actually do want the element number, not the index.
- When you have a SequenceType, as it isn't indexed.

-- Implementation

The feature could be entirely implemented using existing constructs:

extension CollectionType {
  var indexed: AnySequence<(Index, Generator.Element)> {
    return AnySequence(indices.lazy.map { ($0, self[$0]) })
  }
}

Alternatively, a dedicated SequenceType and/or GeneratorType could be added.


Add forEachWithIndex to Array
(Lily Ballard) #2

What you're asking for can already be done with `zip(col.indices, col)`.
And in my experience the need for this sort of thing is rare enough that
there's no need to have a dedicated property for it in the stdlib. The
few times that I've needed this sort of thing, I've always just said

for index in col.indices { let elt = col[index] // ... }

and that's pretty simple. But if I ever did need to map it, I'd just use
the aforementioned zip() expression.

-Kevin Ballard

···

On Sun, Dec 27, 2015, at 12:08 AM, Patrick Pijnappel via swift-evolution wrote:

-- Introduction

There should be a property on CollectionType that returns a sequence
of (Index, Element) tuples. Currently enumerate() is often used
instead, but it is not well suited to the task and can lead to bugs.

-- Motivation

Using enumerate() instead of an (Index, Element) sequence has two main
problems. Both arise because enumerate() returns a sequence of (n,
Element) tuples, where n is the element *number*, instead of a
sequence of (Index, Element).

1) It doesn't work for collections not indexed by integers.

2) It doesn't do what you might expect in some cases, as indices do
   not always start at 0. For example ArraySlice's indices do not:
   array[2..<5] starts with index 2. Consider the following code to
   take the 2nd half of the array and remove all empty elements:

var array = [ "", "a", "b", "c", "", "d" ] var secondHalf =
array[array.count/2..<array.count] for (index, element) in
secondHalf.enumerate() { if element == "" {
secondHalf.removeAtIndex(index) } }

This code will crash (ignoring for a moment this should probably be
using filter).

-- Alternatives

The same effect can already be achieved using the following:

for index in collection.indices { let element = collection[index]
// ... }

However having a dedicated (Index, Element) sequence has the following
advantages:
a) It can help prevent people from using enumerate() inappropriately.
b) It is very common use case that deserves shortening.
c) It can be chained (e.g. to map).

-- Proposed Solution

Add a property/method on CollectionType that returns a sequence of
(Index, Element) tuples. For example, using a property named indexed:

for (index, element) in collection.indexed { // ... }

This should be the preferred idiom when you want both the index and
the element.

Note that enumerate() does still have valid roles to play:
- When you actually do want the element number, not the index.
- When you have a SequenceType, as it isn't indexed.

-- Implementation

The feature could be entirely implemented using existing constructs:

extension CollectionType { var indexed: AnySequence<(Index,
Generator.Element)> { return AnySequence(indices.lazy.map { ($0,
self[$0]) }) } }

Alternatively, a dedicated SequenceType and/or GeneratorType could
be added.

_________________________________________________
swift-evolution mailing list swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Brent Royal-Gordon) #3

What you're asking for can already be done with `zip(col.indices, col)`. And in my experience the need for this sort of thing is rare enough that there's no need to have a dedicated property for it in the stdlib. The few times that I've needed this sort of thing, I've always just said

for index in col.indices {
    let elt = col[index]
    // ...
}

and that's pretty simple. But if I ever did need to map it, I'd just use the aforementioned zip() expression.

I know the zip trick and have used it several times. In one project, I used it frequently enough that I created an extension. Personally, I would like to see a more convenient way to do this, although it's not that strong of a preference.

···

--
Brent Royal-Gordon
Architechies


#4

The original example contains a bug which is present on all looping version/alternative due to the mutating nature of the array. Using the zip implementation:

var array = [ "", "a", "b", "c", "", "d" ]
var secondHalf = array[array.count/2..<array.count]
for (index, element) in zip(secondHalf.indices, secondHalf) {
    if element == "" {
        secondHalf.removeAtIndex(index)
    }
}

The variable index cycles through 3,4 and 5; but in order to be able to remove the right element beyond the first removal, the variable index should have cycled through 3, 4 and 4 (as some elements got shifted after the first mutation). Mutating the array/list which one loops over is a risky business and is a nice source of bugs (including infinite loop). If this (Index, Element) is further investigated, it should consider that one may also want to do a insert(:atIndex:), and may expect the (Index, Element) to have proper Index but only for the original Element.

Dany St-Amant

···

Le 28 déc. 2015 à 01:06, Kevin Ballard via swift-evolution <swift-evolution@swift.org> a écrit :

What you're asking for can already be done with `zip(col.indices, col)`. And in my experience the need for this sort of thing is rare enough that there's no need to have a dedicated property for it in the stdlib. The few times that I've needed this sort of thing, I've always just said

for index in col.indices {
    let elt = col[index]
    // ...
}

and that's pretty simple. But if I ever did need to map it, I'd just use the aforementioned zip() expression.

-Kevin Ballard

On Sun, Dec 27, 2015, at 12:08 AM, Patrick Pijnappel via swift-evolution wrote:

-- Introduction

There should be a property on CollectionType that returns a sequence of (Index, Element) tuples.
Currently enumerate() is often used instead, but it is not well suited to the task and can lead to bugs.

-- Motivation

Using enumerate() instead of an (Index, Element) sequence has two main problems.
Both arise because enumerate() returns a sequence of (n, Element) tuples,
where n is the element *number*, instead of a sequence of (Index, Element).

1) It doesn't work for collections not indexed by integers.

2) It doesn't do what you might expect in some cases, as indices do not always start at 0.
For example ArraySlice's indices do not: array[2..<5] starts with index 2.
Consider the following code to take the 2nd half of the array and remove all empty elements:

var array = [ "", "a", "b", "c", "", "d" ]
var secondHalf = array[array.count/2..<array.count]
for (index, element) in secondHalf.enumerate() {
if element == "" {
secondHalf.removeAtIndex(index)
}
}

This code will crash (ignoring for a moment this should probably be using filter).

-- Alternatives

The same effect can already be achieved using the following:

for index in collection.indices {
  let element = collection[index]
  // ...
}

However having a dedicated (Index, Element) sequence has the following advantages:
a) It can help prevent people from using enumerate() inappropriately.
b) It is very common use case that deserves shortening.
c) It can be chained (e.g. to map).

-- Proposed Solution

Add a property/method on CollectionType that returns a sequence of (Index, Element) tuples.
For example, using a property named indexed:

for (index, element) in collection.indexed {
  // ...
}

This should be the preferred idiom when you want both the index and the element.

Note that enumerate() does still have valid roles to play:
- When you actually do want the element number, not the index.
- When you have a SequenceType, as it isn't indexed.

-- Implementation

The feature could be entirely implemented using existing constructs:

extension CollectionType {
  var indexed: AnySequence<(Index, Generator.Element)> {
    return AnySequence(indices.lazy.map { ($0, self[$0]) })
  }
}

Alternatively, a dedicated SequenceType and/or GeneratorType could be added.

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Lily Ballard) #5

What you describe isn't a bug in the implementation, it's a bug in the
expectation that you can mutate an array that you're iterating over.
With any other collection, the mutation would actually invalidate the
index. It so happens that Arrays use Int as their index, so they can't
really be considered invalid, but you should treat them the same way in
cases like this; mutating the array means the index range you're
iterating over is no longer representative of the array.

-Kevin Ballard

···

On Mon, Dec 28, 2015, at 06:30 PM, Dany St-Amant via swift-evolution wrote:

The original example contains a bug which is present on all looping
version/alternative due to the mutating nature of the array. Using the
zip implementation:

var array = [ "", "a", "b", "c", "", "d" ] var secondHalf =
array[array.count/2..<array.count] for (index, element) in
zip(secondHalf.indices, secondHalf) { if element == "" {
secondHalf.removeAtIndex(index) } }

The variable index cycles through 3,4 and 5; but in order to be able
to remove the right element beyond the first removal, the variable
index should have cycled through 3, 4 and 4 (as some elements got
shifted after the first mutation). Mutating the array/list which one
loops over is a risky business and is a nice source of bugs (including
infinite loop). If this (Index, Element) is further investigated, it
should consider that one may also want to do a insert(:atIndex:), and
may expect the (Index, Element) to have proper Index but only for the
original Element.

Dany St-Amant

Le 28 déc. 2015 à 01:06, Kevin Ballard via swift-evolution <swift- >> evolution@swift.org> a écrit :

What you're asking for can already be done with `zip(col.indices,
col)`. And in my experience the need for this sort of thing is rare
enough that there's no need to have a dedicated property for it in
the stdlib. The few times that I've needed this sort of thing, I've
always just said

for index in col.indices { let elt = col[index] // ... }

and that's pretty simple. But if I ever did need to map it, I'd just
use the aforementioned zip() expression.

-Kevin Ballard

On Sun, Dec 27, 2015, at 12:08 AM, Patrick Pijnappel via swift- >> evolution wrote:

-- Introduction

There should be a property on CollectionType that returns a sequence
of (Index, Element) tuples. Currently enumerate() is often used
instead, but it is not well suited to the task and can lead to bugs.

-- Motivation

Using enumerate() instead of an (Index, Element) sequence has two
main problems. Both arise because enumerate() returns a sequence of
(n, Element) tuples, where n is the element *number*, instead of a
sequence of (Index, Element).

1) It doesn't work for collections not indexed by integers.

2) It doesn't do what you might expect in some cases, as indices do
   not always start at 0. For example ArraySlice's indices do not:
   array[2..<5] starts with index 2. Consider the following code to
   take the 2nd half of the array and remove all empty elements:

var array = [ "", "a", "b", "c", "", "d" ] var secondHalf =
array[array.count/2..<array.count] for (index, element) in
secondHalf.enumerate() { if element == "" {
secondHalf.removeAtIndex(index) } }

This code will crash (ignoring for a moment this should probably be
using filter).

-- Alternatives

The same effect can already be achieved using the following:

for index in collection.indices { let element = collection[index]
// ... }

However having a dedicated (Index, Element) sequence has the
following advantages:
a) It can help prevent people from using enumerate()
   inappropriately.
b) It is very common use case that deserves shortening.
c) It can be chained (e.g. to map).

-- Proposed Solution

Add a property/method on CollectionType that returns a sequence of
(Index, Element) tuples. For example, using a property named
indexed:

for (index, element) in collection.indexed { // ... }

This should be the preferred idiom when you want both the index and
the element.

Note that enumerate() does still have valid roles to play:
- When you actually do want the element number, not the index.
- When you have a SequenceType, as it isn't indexed.

-- Implementation

The feature could be entirely implemented using existing constructs:

extension CollectionType { var indexed: AnySequence<(Index,
Generator.Element)> { return AnySequence(indices.lazy.map { ($0,
self[$0]) }) } }

Alternatively, a dedicated SequenceType and/or GeneratorType could
be added.

_________________________________________________
swift-evolution mailing list swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_________________________________________________
swift-evolution mailing list swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Wallacy) #6

FWIW: In cases like this, just remember call ".reverse()" and remove from
the back.

var array = [ "", "a", "b", "c", "", "d" ]
for (index, element) in array.enumerate().reverse() {
    if element == "" {
        array.removeAtIndex(index)
    }
}
print(array) // ["a", "b", "c", "d"]

zip is useful when we have ArraySlice as has been said before:

var array = [ "", "a", "b", "c", "", "d", "", "e", "" , "f", "g", ""]
var secondHalf = array[array.count/2..<array.count]
print(secondHalf) // ["", "e", "", "f", "g", ""]
for (index, element) in zip(secondHalf.indices, secondHalf).reverse() {
    if element == "" {
        secondHalf.removeAtIndex(index)
    }
}

print(secondHalf) // ["e", "f", "g"]

Anyway, it would not be correct to ".enumerate()" returns (Index, Element)
instead of (n, Element)?

I believe that the current behavior was thought when Slices had indices
starting with zero.

···

Em ter, 29 de dez de 2015 às 00:30, Dany St-Amant via swift-evolution < swift-evolution@swift.org> escreveu:

The original example contains a bug which is present on all looping
version/alternative due to the mutating nature of the array. Using the zip
implementation:

var array = [ "", "a", "b", "c", "", "d" ]
var secondHalf = array[array.count/2..<array.count]
for (index, element) in zip(secondHalf.indices, secondHalf) {
    if element == "" {
        secondHalf.removeAtIndex(index)
    }
}

The variable index cycles through 3,4 and 5; but in order to be able to
remove the right element beyond the first removal, the variable index
should have cycled through 3, 4 and 4 (as some elements got shifted after
the first mutation). Mutating the array/list which one loops over is a
risky business and is a nice source of bugs (including infinite loop). If
this (Index, Element) is further investigated, it should consider that one
may also want to do a insert(:atIndex:), and may expect the (Index,
Element) to have proper Index but only for the original Element.

Dany St-Amant

Le 28 déc. 2015 à 01:06, Kevin Ballard via swift-evolution < > swift-evolution@swift.org> a écrit :

What you're asking for can already be done with `zip(col.indices, col)`.
And in my experience the need for this sort of thing is rare enough that
there's no need to have a dedicated property for it in the stdlib. The few
times that I've needed this sort of thing, I've always just said

for index in col.indices {
    let elt = col[index]
    // ...
}

and that's pretty simple. But if I ever did need to map it, I'd just use
the aforementioned zip() expression.

-Kevin Ballard

On Sun, Dec 27, 2015, at 12:08 AM, Patrick Pijnappel via swift-evolution > wrote:

-- Introduction

There should be a property on CollectionType that returns a sequence of
(Index, Element) tuples.
Currently enumerate() is often used instead, but it is not well suited to
the task and can lead to bugs.

-- Motivation

Using enumerate() instead of an (Index, Element) sequence has two main
problems.
Both arise because enumerate() returns a sequence of (n, Element) tuples,
where n is the element *number*, instead of a sequence of (Index, Element)
.

1) It doesn't work for collections not indexed by integers.

2) It doesn't do what you might expect in some cases, as indices do not
always start at 0.
For example ArraySlice's indices do not: array[2..<5] starts with index 2.
Consider the following code to take the 2nd half of the array and remove
all empty elements:

var array = [ "", "a", "b", "c", "", "d" ]
var secondHalf = array[array.count/2..<array.count]
for (index, element) in secondHalf.enumerate() {
if element == "" {
secondHalf.removeAtIndex(index)
}
}

This code will crash (ignoring for a moment this should probably be using
filter).

-- Alternatives

The same effect can already be achieved using the following:

for index in collection.indices {
  let element = collection[index]
  // ...
}

However having a dedicated (Index, Element) sequence has the following
advantages:
a) It can help prevent people from using enumerate() inappropriately.
b) It is very common use case that deserves shortening.
c) It can be chained (e.g. to map).

-- Proposed Solution

Add a property/method on CollectionType that returns a sequence of (Index,
Element) tuples.
For example, using a property named indexed:

for (index, element) in collection.indexed {
  // ...
}

This should be the preferred idiom when you want both the index and the
element.

Note that enumerate() does still have valid roles to play:
- When you actually do want the element number, not the index.
- When you have a SequenceType, as it isn't indexed.

-- Implementation

The feature could be entirely implemented using existing constructs:

extension CollectionType {
  var indexed: AnySequence<(Index, Generator.Element)> {
    return AnySequence(indices.lazy.map { ($0, self[$0]) })
  }
}

Alternatively, a dedicated SequenceType and/or GeneratorType could be
added.

*_______________________________________________*
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(David Waite) #7

Anyway, it would not be correct to ".enumerate()" returns (Index, Element) instead of (n, Element)?

I believe that the current behavior was thought when Slices had indices starting with zero.

The behavior of enumerate is easiest to explain when you give everything a name and lay them all out on the table: In particular, there is a difference between a Counter, an Index, a Key, a Value, and an Element.

Enumerate always works in terms of adding a counter, not an index. It was perhaps better served as a global method, since one cannot really improve its default implementation.

The rest are as follows:

╔════════════╦═════════════════╦═══════════════╦═════════════════╦════════════════════╗
║ Type ║ Index* ║ Key ║ Value ║ Element** ║
╠════════════╬═════════════════╬═══════════════╬═════════════════╬════════════════════╣
║ Array ║ 0-based offset ║ N/A ║ N/A ║ Generic "T" ║
╠════════════╬═════════════════╬═══════════════╬═════════════════╬════════════════════╣
║ ArraySlice ║ non-zero offset ║ N/A ║ N/A ║ Generic "T" ║
╠════════════╬═════════════════╬═══════════════╬═════════════════╬════════════════════╣
║ Dictionary ║ DictionaryIndex ║ Generic "Key" ║ Generic "Value" ║ Tuple (Key, Value) ║
╚════════════╩═════════════════╩═══════════════╩═════════════════╩════════════════════╝

* Index is declared on CollectionType
** Element is declared on GeneratorType and referenced by SequenceType

That Array [T] does not behave like a Dictionary [Int:T] is possibly a sign that an AssociativeCollectionType is needed, something like:

protocol AssociativeCollectionType : CollectionType {
    typealias Key
    typealias Value
    typealias Element = (Key, Value)
    typealias KeySequenceType = AnySequence<Key>
    typealias ValueSequenceType = AnySequence<Value>
    
    var keys: KeySequenceType { get }
    var values: ValueSequenceType { get }
    subscript (key: Key) -> Value? { get set }
    func indexForKey(key:Key) -> Index

    mutating func removeValueForKey(key: Key) -> Value?
    mutating func updateValue(value: Value, forKey key: Key) -> Value?
}

Dictionary would support such a protocol directly. Array and ArraySlice (or even every CollectionType) might have a mapping function (lets bike shed “associative()” for now) to return an implementation of the interface, mapping:

- AssociativeCollectionType.Index = old Index
- AssociativeCollectionType.Key = old Index
- AssociativeCollectionType.Value = old Element
- AssociativeCollectionType.Element = old (Index, Element)

So maybe:

for (index, element) in someArray.associative() { … }

would do what the original idea requested: provide a (Index, Element) sequence for CollectionTypes.

-DW


(Dave Abrahams) #8

Anyway, it would not be correct to ".enumerate()" returns (Index, Element) instead of (n, Element)?

I believe that the current behavior was thought when Slices had indices starting with zero.

The behavior of enumerate is easiest to explain when you give everything a name and lay them all out on the table: In particular, there is a difference between a Counter, an Index, a Key, a Value, and an Element.

Enumerate always works in terms of adding a counter, not an index. It was perhaps better served as a global method, since one cannot really improve its default implementation.

The rest are as follows:

╔════════════╦═════════════════╦═══════════════╦═════════════════╦════════════════════╗
║ Type ║ Index* ║ Key ║ Value ║ Element** ║
╠════════════╬═════════════════╬═══════════════╬═════════════════╬════════════════════╣
║ Array ║ 0-based offset ║ N/A ║ N/A ║ Generic "T" ║
╠════════════╬═════════════════╬═══════════════╬═════════════════╬════════════════════╣
║ ArraySlice ║ non-zero offset ║ N/A ║ N/A ║ Generic "T" ║
╠════════════╬═════════════════╬═══════════════╬═════════════════╬════════════════════╣
║ Dictionary ║ DictionaryIndex ║ Generic "Key" ║ Generic "Value" ║ Tuple (Key, Value) ║
╚════════════╩═════════════════╩═══════════════╩═════════════════╩════════════════════╝

* Index is declared on CollectionType
** Element is declared on GeneratorType and referenced by SequenceType

That Array [T] does not behave like a Dictionary [Int:T] is possibly a sign that an AssociativeCollectionType is needed, something like:

protocol AssociativeCollectionType : CollectionType {
    typealias Key
    typealias Value
    typealias Element = (Key, Value)
    typealias KeySequenceType = AnySequence<Key>
    typealias ValueSequenceType = AnySequence<Value>
    
    var keys: KeySequenceType { get }
    var values: ValueSequenceType { get }
    subscript (key: Key) -> Value? { get set }
    func indexForKey(key:Key) -> Index

    mutating func removeValueForKey(key: Key) -> Value?
    mutating func updateValue(value: Value, forKey key: Key) -> Value?
}

What is the use-case for this protocol?

Dictionary would support such a protocol directly. Array and ArraySlice (or even every CollectionType) might have a mapping function (lets bike shed “associative()” for now) to return an implementation of the interface, mapping:

- AssociativeCollectionType.Index = old Index
- AssociativeCollectionType.Key = old Index
- AssociativeCollectionType.Value = old Element
- AssociativeCollectionType.Element = old (Index, Element)

So maybe:

for (index, element) in someArray.associative() { … }

would do what the original idea requested: provide a (Index, Element) sequence for CollectionTypes.

-DW

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

-Dave

···

On Dec 29, 2015, at 2:37 PM, David Waite via swift-evolution <swift-evolution@swift.org> wrote:


(Dave Abrahams) #9

The current behavior was intentional (and intentionally left as-is when array slices changed); indices are not integers for all collections, and there are many ways to get the indices with the elements, including zip(x.indices, x)

-Dave

···

On Dec 29, 2015, at 8:06 AM, Wallacy via swift-evolution <swift-evolution@swift.org> wrote:

I believe that the current behavior was thought when Slices had indices starting with zero.


(David Waite) #10

I actually suspect you have a better grasp of the pros and cons based on your prior experience, but here goes:

An AssociativeCollectionType would allow for alternate implementations from Dictionary, which provide space optimization, lookup performance, and functionality, to be used closer to the level of first-class citizenship that Dictionary has. This is similar IMHO to the status of say ArraySlice vs Array.

1. While there is a protocol for implementing your own collection, there is none for implementing your own associative collection. Without such a protocol, interfaces must be written to require Dictionary, and to get things into the Dictionary representation (rather than just exposing an associative interface) will require data copying as well as other probably smaller details like the value be Hashable

2. A single dictionary-like type means that features which may be desirable in an associative collection (preserving insertion order, sorting by key, atomicity guarantees, searching by closest match, computed/default/prototype-fallback value lookup) as well as problem-space optimized algorithms (hash vs tree vs splay vs skip list) would each either need to be features of Dictionary, or be second-class citizens.

3. People who wish to have arrays behave like dictionaries (and are willing to take the performance hit re: array index out of bounds returning nil instead of raising a fatal error) could get the appropriate interface to do so

4. Array is mostly described (excluding certain mutating operations) by CollectionType. Dictionary on the other hand only really uses CollectionType to allow iteration and to reference items in buckets (I assume). The Raison d'etre of Dictionary as a core type is just not represented via protocols the way Array is

Some of this though is that many CollectionType and SequenceType operations are useful when considering a Dictionary as a sequence in a functional style - but is not the default way I at least think of Dictionaries. For instance, how often will people want to filter a dictionary into a tuple array, vs mutating or forming a new dictionary with predicate-failing entries removed?

As least in my personal experience, the difference in relationship between Array and CollectionType (Array being a type of collection) and Dictionary and CollectionType (dictionaries can be represented by a collection) is a constant source of confusion.

Some the cons I’ve identified:
- Protocols today do not have generic arguments, making any function declaration wanting to take “any string to string associative collection” vs “[String:String]” be significantly more complex.

- It is more pragmatic and simpler from a type perspective to push code to use a single Array or Dictionary implementation rather than needing to support generic implementations, except when opted into. This is similar to the use of lazy, or the use of Int vs smaller or more specific types.

- Array has additional mutating methods which are not described by CollectionType or any other protocol. The ease of getting a new copy of an array and mutating it means code needing the ability to append (for instance) today will be declared using an explicit Array type. If this was desired to be fixed, it would require more protocols (ModifiableCollectionType?)

- Related to the previous point, there is no way to require value semantics via a protocol. If a ModifiableCollectionType was created today, there would need to be requirements on implementation outside what a protocol can enforce.

- An array could not implement AssociativeCollectionType directly, because the index value and key would both be Int and there is a subscript property for both, but with different semantics and a different return type. You would need to ‘wrap’ an array to get this effect

And one final thought - It might have make sense for Dictionary to not directly be a CollectionType at all, but rather have a property that exposes the Dictionary as a CollectionType, similar to how I propose a wrapping an Array to make it act like an associative array. If you were to take a cue from Java, this would be an “entries” property on Dictionary (or AssociativeCollectionType)

-DW

···

On Dec 31, 2015, at 9:53 AM, Dave Abrahams <dabrahams@apple.com> wrote:

What is the use-case for this protocol?


(Dave Abrahams) #11

What is the use-case for this protocol?

I actually suspect you have a better grasp of the pros and cons based on your prior experience, but here goes:

An AssociativeCollectionType would allow for alternate implementations from Dictionary, which provide space optimization, lookup performance, and functionality, to be used closer to the level of first-class citizenship that Dictionary has.

That’s pretty abstract. Note also that one shouldn’t design protocols without a batch of concrete models that you can examine to see how they should conform.

This is similar IMHO to the status of say ArraySlice vs Array.

1. While there is a protocol for implementing your own collection, there is none for implementing your own associative collection. Without such a protocol, interfaces must be written to require Dictionary, and to get things into the Dictionary representation (rather than just exposing an associative interface) will require data copying as well as other probably smaller details like the value be Hashable

2. A single dictionary-like type means that features which may be desirable in an associative collection (preserving insertion order, sorting by key, atomicity guarantees, searching by closest match, computed/default/prototype-fallback value lookup) as well as problem-space optimized algorithms (hash vs tree vs splay vs skip list) would each either need to be features of Dictionary, or be second-class citizens.

3. People who wish to have arrays behave like dictionaries (and are willing to take the performance hit re: array index out of bounds returning nil instead of raising a fatal error) could get the appropriate interface to do so

4. Array is mostly described (excluding certain mutating operations) by CollectionType. Dictionary on the other hand only really uses CollectionType to allow iteration and to reference items in buckets (I assume). The Raison d'etre of Dictionary as a core type is just not represented via protocols the way Array is

Some of this though is that many CollectionType and SequenceType operations are useful when considering a Dictionary as a sequence in a functional style - but is not the default way I at least think of Dictionaries. For instance, how often will people want to filter a dictionary into a tuple array, vs mutating or forming a new dictionary with predicate-failing entries removed?

As least in my personal experience, the difference in relationship between Array and CollectionType (Array being a type of collection) and Dictionary and CollectionType (dictionaries can be represented by a collection) is a constant source of confusion.

That sounds like a different issue. Are you suggesting dictionaries should have-a collection rather than be-a collection?

Some the cons I’ve identified:
- Protocols today do not have generic arguments, making any function declaration wanting to take “any string to string associative collection” vs “[String:String]” be significantly more complex.

I don’t see why you think that increases complexity. I don’t agree that it does. This also seems like a completely separate issue from whether we have AssociativeCollection.

- It is more pragmatic and simpler from a type perspective to push code to use a single Array or Dictionary implementation rather than needing to support generic implementations, except when opted into. This is similar to the use of lazy, or the use of Int vs smaller or more specific types.

True, and it will always be that way AFAICT. And again this seems like a completely separate issue.

- Array has additional mutating methods which are not described by CollectionType or any other protocol. The ease of getting a new copy of an array and mutating it means code needing the ability to append (for instance) today will be declared using an explicit Array type. If this was desired to be fixed, it would require more protocols (ModifiableCollectionType?)

- Related to the previous point, there is no way to require value semantics via a protocol. If a ModifiableCollectionType was created today, there would need to be requirements on implementation outside what a protocol can enforce.

- An array could not implement AssociativeCollectionType directly, because the index value and key would both be Int and there is a subscript property for both, but with different semantics and a different return type. You would need to ‘wrap’ an array to get this effect

And one final thought - It might have make sense for Dictionary to not directly be a CollectionType at all, but rather have a property that exposes the Dictionary as a CollectionType, similar to how I propose a wrapping an Array to make it act like an associative array. If you were to take a cue from Java, this would be an “entries” property on Dictionary (or AssociativeCollectionType)

I’m clearly not understanding something here. What you’re talking about are all real issues, but they seem unrelated to each other and without any particular focus. Sorry…

-DW

-Dave

···

On Dec 31, 2015, at 11:17 AM, David Waite <david@alkaline-solutions.com> wrote:

On Dec 31, 2015, at 9:53 AM, Dave Abrahams <dabrahams@apple.com <mailto:dabrahams@apple.com>> wrote:


(Dmitri Gribenko) #12

There is a protocol, RangeReplaceableCollectionType.

Dmitri

···

On Thu, Dec 31, 2015 at 9:17 PM, David Waite via swift-evolution < swift-evolution@swift.org> wrote:

- Array has additional mutating methods which are not described by
CollectionType or any other protocol. The ease of getting a new copy of an
array and mutating it means code needing the ability to append (for
instance) today will be declared using an explicit Array type. If this was
desired to be fixed, it would require more protocols
(ModifiableCollectionType?)

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/


(David Waite) #13

Interesting! I never saw any collections outside of the String views implement that, probably because the swift standard library docs don’t publish underscore-prefixed protocols.

-DW

···

On Dec 31, 2015, at 3:34 PM, Dmitri Gribenko <gribozavr@gmail.com> wrote:

On Thu, Dec 31, 2015 at 9:17 PM, David Waite via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

- Array has additional mutating methods which are not described by CollectionType or any other protocol. The ease of getting a new copy of an array and mutating it means code needing the ability to append (for instance) today will be declared using an explicit Array type. If this was desired to be fixed, it would require more protocols (ModifiableCollectionType?)

There is a protocol, RangeReplaceableCollectionType.

Dmitri

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com <mailto:gribozavr@gmail.com>>*/


(Dave Abrahams) #14

Interesting! I never saw any collections outside of the String views implement that, probably because the swift standard library docs don’t publish underscore-prefixed protocols.

The Array types all implement it.

-DW

- Array has additional mutating methods which are not described by CollectionType or any other protocol. The ease of getting a new copy of an array and mutating it means code needing the ability to append (for instance) today will be declared using an explicit Array type. If this was desired to be fixed, it would require more protocols (ModifiableCollectionType?)

There is a protocol, RangeReplaceableCollectionType.

Dmitri

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com <mailto:gribozavr@gmail.com>>*/

-Dave

···

On Dec 31, 2015, at 6:09 PM, David Waite <david@alkaline-solutions.com> wrote:

On Dec 31, 2015, at 3:34 PM, Dmitri Gribenko <gribozavr@gmail.com <mailto:gribozavr@gmail.com>> wrote:
On Thu, Dec 31, 2015 at 9:17 PM, David Waite via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:


(David Waite) #15

First, let me say that basic premise was intended to be “I would like the ability to have a protocol for dictionary-type objects rather than have it be a single implementation for several reasons, but I expect that there would also be several hurdles to doing so”. It is more of an idea than a design or proposal at this point, partly because I suspect some of the hurdles may either be blocking until other enhancements land in Swift, or that having a concrete dictionary type is just preferred based on the design goals of the language.

My interface was just a copying of the methods on Dictionary which I thought made sense for illustration purposes.

<rest inlnine>

An AssociativeCollectionType would allow for alternate implementations from Dictionary, which provide space optimization, lookup performance, and functionality, to be used closer to the level of first-class citizenship that Dictionary has.

That’s pretty abstract. Note also that one shouldn’t design protocols without a batch of concrete models that you can examine to see how they should conform.

Languages like Java and C# which are interface heavy come with a wide assortment of collections implementations, and third party implementations of even more. I’ve implemented core collection types (in Mono) and several of my own custom implementations over the years in these languages.

And yes, I would expect design and implementation to go hand-in-hand for just about any collections changes - although the current process for swift evolution does not integrate as well as I personally would like with iterative or prototype-driven design, that would be my personal preferred approach for any larger feature, especially one impacting the collections portion of the standard library.

Some of this though is that many CollectionType and SequenceType operations are useful when considering a Dictionary as a sequence in a functional style - but is not the default way I at least think of Dictionaries. For instance, how often will people want to filter a dictionary into a tuple array, vs mutating or forming a new dictionary with predicate-failing entries removed?

As least in my personal experience, the difference in relationship between Array and CollectionType (Array being a type of collection) and Dictionary and CollectionType (dictionaries can be represented by a collection) is a constant source of confusion.

That sounds like a different issue. Are you suggesting dictionaries should have-a collection rather than be-a collection?

Oh boy. Prefacing with “I am unsure if I want to get into that conversation due to the impact being too great for any bike shedding to ever be relevant”, mostly yes. If it is worth further discussion, it is probably worth doing so in a separate thread.

A Dictionary is as representable as a collection as a Set would be. The underlying implementation of both both could just as well be an array of optionals (I’ve implemented one as such).

However, there is a pattern in Swift collections of pushing down as much default implementation as possible balanced against the expense of a naive default implementation. As such, there is a complexity imported directly into Dictionary to deal with what I perceive as a secondary representation of the information in the dictionary - a list of key, value pairs.

Some the cons I’ve identified:
- Protocols today do not have generic arguments, making any function declaration wanting to take “any string to string associative collection” vs “[String:String]” be significantly more complex.

I don’t see why you think that increases complexity. I don’t agree that it does. This also seems like a completely separate issue from whether we have AssociativeCollection.

I was speaking to the declarative complexity and verbosity in today’s syntax:

  func process(data:[String:String])
vs.
  func process<ACT:AssociativeCollectionType where ACT.Key == String, ACT.Value == String>(data:ACT)

This would make it very difficult for one to have a custom implementation of an associative collection which accepted by third party code, simply because the API designers went with the simpler, more readable syntax.

- It is more pragmatic and simpler from a type perspective to push code to use a single Array or Dictionary implementation rather than needing to support generic implementations, except when opted into. This is similar to the use of lazy, or the use of Int vs smaller or more specific types.

True, and it will always be that way AFAICT. And again this seems like a completely separate issue.

I don’t know if it is an issue - it may very well be an explicit design goal to reduce decisions around types. I say that it *may* be an explicit design goal, because some of the cases cited may indeed be the result of the featureset of the current language.

I can’t say if the push toward using Int for everything would be lessened with an updated numerics system, or if operations like map return arrays rather than a new sequence because of current limitations in the generics system - I simply don’t know.

-DW

···

On Dec 31, 2015, at 2:27 PM, Dave Abrahams <dabrahams@apple.com> wrote:

On Dec 31, 2015, at 11:17 AM, David Waite <david@alkaline-solutions.com <mailto:david@alkaline-solutions.com>> wrote: