Proposal: Add a sequence-based initializer to Dictionary


(Nicola Salmoria) #1

To handle the case of duplicate keys, why not allow to pass in a 'combine'
function?
The function could default to a preconditionFailure to be consistent with
the DictionaryLiteral behavior, but be overridden by the caller as needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given sequence.
    init<S: SequenceType where S.Generator.Element == Generator.Element>(_
sequence: S, combine: (existing: Value, other: Value) -> Value = {
preconditionFailure("Sequence contains duplicate keys"); return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value),
forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New
York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

A brief draft is below... I had mostly written this up before I saw the

thread and Gwendal's similar contribution -- happy to hear feedback and
fold in comments/revisions!

Nate

---

Introduction

The Dictionary type should allow initialization from a sequence of (Key,

Value) tuples.

Motivation

Array and Set both have initializers that create a new instance from a

sequence of elements. The Array initializer is useful for converting other
sequences and collections to the "standard" collection type, but the Set
initializer is essential for recovering set operations after performing any
functional operations on a set. For example, filtering a set produces a
collection without any kind of set operations available:

let numberSet = Set(1 ... 100)
let fivesOnly = numberSet.lazy.filter { $0 % 5 == 0 }

"fivesOnly" is a LazyFilterCollection<Set<Int>> instead of a Set --

sending that back through the Set sequence initializer restores the
expected methods:

let fivesOnlySet = Set(numberSet.lazy.filter { $0 % 5 == 0 })
fivesOnlySet.isSubsetOf(numberSet) // true

Dictionary, on the other hand, has no such initializer, so a similar

operation leaves no room except for building a mutable Dictionary via
iteration or functional methods with dubious performance. These techniques
also don't support type inference from the source sequence, increasing
verbosity:

var viaIteration: [String: Int] = [:]
for (key, value) in evenOnly {
    viaIteration[key] = value
}

let viaFunction: [String: Int] = evenOnly.reduce([:]) { (cumulative,

keyValue) in

    var mutableDictionary = cumulative
    mutableDictionary[keyValue.0] = keyValue.1
    return mutableDictionary
}

Proposed solution

The proposed solution would add an initializer to Dictionary that accepts

any sequence of (Key, Value) tuple pairs, matching the Dictionary's element
type when treated as a sequence:

init<S: SequenceType where S.Generator.Element == Generator.Element>(_

sequence: S)

Instead of the techniques for recovering a Dictionary shown above, the

proposed initializer would allow a much cleaner syntax to be written:

let viaProposed = Dictionary(evenOnly)

Moreover, this new initializer would allow for some convenient uses that

aren't currently possible.

??? Initializing from an array of tuples:

let dictFromArray = Dictionary([("a", 1), ("b", 2), ("c", 3), ("d", 4)])

??? Initializing from a DictionaryLiteral (the type, not an actual

literal):

let literal: DictionaryLiteral = ["a": 1, "b": 2, "c": 3, "d": 4]
let dictFromDL = Dictionary(literal)

?? Initializing from a pair of zipped sequences (examples abound):

let letters = "abcdefghij".characters.lazy.map { String($0) }
let dictFromZip = Dictionary(zip(letters, 1...10))
// ["b": 2, "a": 1, "i": 9, "j": 10, "c": 3, "e": 5, "f": 6, "g": 7, "d":

4, "h": 8]

Potential pitfalls

One caveat is that the new initializer doesn't prevent using a sequence

with multiple identical keys. In such a case, the last key/value would
"win" and exist in the dictionary. Such an initialization is a compile-time
error with a dictionary literal, but succeeds under the new initializer:

let _ = ["z": 1, "z": 2, "z": 3, "z": 4]
// fatal error: Dictionary literal contains duplicate keys
Dictionary([("z", 1), ("z", 2), ("z", 3), ("z", 4)])
// ["z": 4]

This behavior is particularly troublesome when used in conjunction with a

mapping operation that modifies a dictionary's keys, since dictionaries
have no particular guaranteed order:

let overlapping = Dictionary(dictFromArray.lazy.map { (_, value) in ("z",

value) })

// ["z": ???]

While a pitfall, this behavior is less a symptom of the proposed API and

more an inherent problem with recovering a dictionary after modifying its
keys. The current ways of rebuilding a dictionary (as shown above) are just
as susceptible to silently dropping values. Moreover, the sequence-based
initializer for Set exhibits the same behavior, though slightly less
problematic in most cases:

let dividedNumbers = Set(numberSet.map { $0 / 20 })
// {4, 5, 2, 0, 1, 3}

Given the potential lossiness of the initializer, should it use a

parameter name for the sequence? I would suggest not, to match the syntax
of Array.init(_:slight_smile: and Set.init(_:), but a parameter like "collapsingKeys"
would make the risk clear to users.

Detailed design

The implementation is simple enough to show in the proposal:

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given

sequence.

    init<S: SequenceType where S.Generator.Element ==

Generator.Element>(_ sequence: S) {

        self.init()
        for (key, value) in sequence {
            updateValue(value, forKey: key)
        }
    }
}

(As part of the standard library, this could use the nativeUpdateValue

method.)

Impact on existing code

As a new API, this will have no impact on existing code.

Alternatives considered

As suggested in the thread below, a method could be added to SequenceType

that would build a dictionary. This approach seems less of a piece with the
rest of the standard library, and overly verbose when used with a
Dictionary that is only passing through filtering or mapping operations. I
don't think the current protocol extension system could handle a
passthrough case (i.e., something like "extension SequenceType where
Generator.Element == (Key, Value)").

Alternately, the status quo could be maintained. Which would be sad.

>
> Doesn’t Swift prefer initializers?
>
> So let’s build a Dictionary initializer that eats any sequence of (key,

value) pairs:

>
> extension Dictionary {
> init<S: SequenceType where S.Generator.Element == (Key,

Value)>(keyValueSequence s: S) {

> self.init()
> for (key, value) in s {
> self[key] = value
> }
> }
> }
>
> do {
> // From array of (key, value) pairs
> let input = [("foo", 1), ("bar", 2)]
> let d = Dictionary(keyValueSequence: input)
> print(d)
> }
> do {
> // From another dictionary
> let input = [1: "foo", 2: "bar"]
> let d = Dictionary(keyValueSequence: input)
> print(d)
> }
> do {
> // Reverse key and values
> let input = [1: "foo", 2: "bar"]
> let d = Dictionary(keyValueSequence: input.map { ($1, $0) })
> print(d)
> }
>
> Gwendal
>
>>
>> I'd prefer "mapToDict" otherwise it sounds like a dictionary gets

mapped, at least for me.

>>
>> -Thorsten
>>

<swift-evolution at swift.org <mailto:swift-evolution at swift.org>>:

>>>
>>> This solution looks great! How do you feel about “mapDict”?
>>>
>>> -Kenny
>>>
>>>
>>>>>
>>>>>
>>>>>
>>>>> I named the method(s) „toDict“ instead of „map“ because map

normally returns a collection which is either the same as the receiver or a
simple one.

>>>>> The second version is more general and allows to do things like
>>>>>
>>>>> let dict = ["Tom", "Dick", "Harry"].enumerate().toDict { (index,

value) in (index + 1, value) }

>>>>
>>>> Map would probably be a more correct mathematically speaking — but

it would be inconsistent with the naming convention already chosen for
Swift. So for Swift - toDict (or toDictionary) would be the best choice.

···

> On Jan 13, 2016, at 11:55 AM, Gwendal Roué via swift-evolution <swift-evolution at swift.org> wrote:
>> Le 13 janv. 2016 à 18:41, Thorsten Seitz via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> a écrit :
>>> Am 13.01.2016 um 17:13 schrieb Kenny Leung via swift-evolution
>>>>> On Jan 12, 2016, at 10:28 AM, Craig Cruden <ccruden at novafore.com <mailto:ccruden at novafore.com>> wrote:
>>>
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution at swift.org <mailto:swift-evolution at swift.org>
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org <mailto:swift-evolution at swift.org>
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution


(Alan Skipp) #2

I’ve been absorbed in the world of Monoids lately, so I find the suggestion below to be particularly brilliant. : )
It solves the issue of arbitrarily choosing the value for duplicate keys rather nicely. Only thing I’m not too sure about is the idea of failing by default on duplicate keys?

···

On 15 Jan 2016, at 10:18, Nicola Salmoria via swift-evolution <swift-evolution@swift.org> wrote:

To handle the case of duplicate keys, why not allow to pass in a 'combine' function?
The function could default to a preconditionFailure to be consistent with the DictionaryLiteral behavior, but be overridden by the caller as needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given sequence.
    init<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S, combine: (existing: Value, other: Value) -> Value = { preconditionFailure("Sequence contains duplicate keys"); return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value), forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

It’d be great if there was also an init that restricted the Values to Monoids, which would mean the combine function would be taken from the supplied Monoid values (I understand I’ve departed to fantasy island at this point, but one can dream : )

Al


(Donnacha Oisín Kidney) #3

I think the idea of a combine closure is great - it lets users of the initialiser decide on whether or not the want duplicate keys to be an error. Also, the default “silently keep the last value” behaviour can be provided as the default closure. I also think that a closure (probably) gives enough flexibility to do most of the interesting things you might do with dictionaries (a “frequencies” function, or “categorise”, for instance).

extension Dictionary {
  /// Creates a dictionary with the keys and values in the given sequence.
  /// Takes a closure which is called on the values of duplicate keys. The
  /// default behaviour is that the last value for a duplicate key is kept.
  init<S: SequenceType where S.Generator.Element == (Key,Value)>
    (_ sequence: S, combine: (Value,Value) -> Value = { (_,snd) in snd } ) {
    self.init()
    for (k,v) in sequence {
      self[k] = self[k].map { fst in combine(fst,v) } ?? v
    }
  }
}

extension SequenceType where Generator.Element: Hashable {
  func frequencies() -> [Generator.Element:Int] {
    return Dictionary(self.lazy.map { v in (v,1) }, combine: +)
  }
}

"hello".characters.frequencies() // ["e": 1, "o": 1, "l": 2, "h": 1]

···

On 15 Jan 2016, at 10:53, Alan Skipp via swift-evolution <swift-evolution@swift.org> wrote:

I’ve been absorbed in the world of Monoids lately, so I find the suggestion below to be particularly brilliant. : )
It solves the issue of arbitrarily choosing the value for duplicate keys rather nicely. Only thing I’m not too sure about is the idea of failing by default on duplicate keys?

On 15 Jan 2016, at 10:18, Nicola Salmoria via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

To handle the case of duplicate keys, why not allow to pass in a 'combine' function?
The function could default to a preconditionFailure to be consistent with the DictionaryLiteral behavior, but be overridden by the caller as needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given sequence.
    init<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S, combine: (existing: Value, other: Value) -> Value = { preconditionFailure("Sequence contains duplicate keys"); return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value), forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

It’d be great if there was also an init that restricted the Values to Monoids, which would mean the combine function would be taken from the supplied Monoid values (I understand I’ve departed to fantasy island at this point, but one can dream : )

Al

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution


(Nicola Salmoria) #4

I'm ambivalent about the preconditionFailure. Since there would otherwise
be silent loss of data, I think it fits Swift's "safe by default" paradigm.
It's also consistent with what the normal initialization from a
DictionaryLiteral does.
However, I can also see how it might be more convenient to just pick the
last value.

Nicola

···

On Fri, Jan 15, 2016 at 11:53 AM, Alan Skipp <al_skipp@icloud.com> wrote:

I’ve been absorbed in the world of Monoids lately, so I find the
suggestion below to be particularly brilliant. : )
It solves the issue of arbitrarily choosing the value for duplicate keys
rather nicely. Only thing I’m not too sure about is the idea of failing by
default on duplicate keys?

On 15 Jan 2016, at 10:18, Nicola Salmoria via swift-evolution < > swift-evolution@swift.org> wrote:

To handle the case of duplicate keys, why not allow to pass in a 'combine'
function?
The function could default to a preconditionFailure to be consistent with
the DictionaryLiteral behavior, but be overridden by the caller as needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given
sequence.
    init<S: SequenceType where S.Generator.Element == Generator.Element>(_
sequence: S, combine: (existing: Value, other: Value) -> Value = {
preconditionFailure("Sequence contains duplicate keys"); return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value),
forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New
York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

It’d be great if there was also an init that restricted the Values to
Monoids, which would mean the combine function would be taken from the
supplied Monoid values (I understand I’ve departed to fantasy island at
this point, but one can dream : )

Al


(Thorsten Seitz) #5

+1

-Thorsten

···

Am 15.01.2016 um 15:32 schrieb Donnacha Oisín Kidney via swift-evolution <swift-evolution@swift.org>:

I think the idea of a combine closure is great - it lets users of the initialiser decide on whether or not the want duplicate keys to be an error. Also, the default “silently keep the last value” behaviour can be provided as the default closure. I also think that a closure (probably) gives enough flexibility to do most of the interesting things you might do with dictionaries (a “frequencies” function, or “categorise”, for instance).

extension Dictionary {
  /// Creates a dictionary with the keys and values in the given sequence.
  /// Takes a closure which is called on the values of duplicate keys. The
  /// default behaviour is that the last value for a duplicate key is kept.
  init<S: SequenceType where S.Generator.Element == (Key,Value)>
    (_ sequence: S, combine: (Value,Value) -> Value = { (_,snd) in snd } ) {
    self.init()
    for (k,v) in sequence {
      self[k] = self[k].map { fst in combine(fst,v) } ?? v
    }
  }
}

extension SequenceType where Generator.Element: Hashable {
  func frequencies() -> [Generator.Element:Int] {
    return Dictionary(self.lazy.map { v in (v,1) }, combine: +)
  }
}

"hello".characters.frequencies() // ["e": 1, "o": 1, "l": 2, "h": 1]

On 15 Jan 2016, at 10:53, Alan Skipp via swift-evolution <swift-evolution@swift.org> wrote:

I’ve been absorbed in the world of Monoids lately, so I find the suggestion below to be particularly brilliant. : )
It solves the issue of arbitrarily choosing the value for duplicate keys rather nicely. Only thing I’m not too sure about is the idea of failing by default on duplicate keys?

On 15 Jan 2016, at 10:18, Nicola Salmoria via swift-evolution <swift-evolution@swift.org> wrote:

To handle the case of duplicate keys, why not allow to pass in a 'combine' function?
The function could default to a preconditionFailure to be consistent with the DictionaryLiteral behavior, but be overridden by the caller as needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given sequence.
    init<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S, combine: (existing: Value, other: Value) -> Value = { preconditionFailure("Sequence contains duplicate keys"); return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value), forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

It’d be great if there was also an init that restricted the Values to Monoids, which would mean the combine function would be taken from the supplied Monoid values (I understand I’ve departed to fantasy island at this point, but one can dream : )

Al

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Nate Cook) #6

Adding a `combine` closure with Donnacha's default seems like a pretty good solution for this, but giving a `try`-marked closure parameter a default messes with the rethrows behavior and requires a try on every call. I think it's important that when used by default, this initializer has the same behavior as looping over the sequence and setting values for keys. That is, it should replicate:

for (key, value) in sequence {
    newDictionary[key] = value
}

and use the last value for any duplicate keys, rather than failing or trapping.

To handle this properly we'd need two new initializers:

init<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S)

init<S: SequenceType where S.Generator.Element == Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value) rethrows

Perhaps we also need a mutating `mergeContentsOf` function with the same signatures as the initializers:

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S)

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element == Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value) rethrows

I'm pretty sure I would use all four of those, and that would bring Dictionary more into alignment with how you can use Array and Set. Would a slightly expanded proposal make sense?

Nate

···

On Jan 15, 2016, at 9:01 AM, Nicola Salmoria via swift-evolution <swift-evolution@swift.org> wrote:

I'm ambivalent about the preconditionFailure. Since there would otherwise be silent loss of data, I think it fits Swift's "safe by default" paradigm. It's also consistent with what the normal initialization from a DictionaryLiteral does.
However, I can also see how it might be more convenient to just pick the last value.

Nicola

On Fri, Jan 15, 2016 at 11:53 AM, Alan Skipp <al_skipp@icloud.com <mailto:al_skipp@icloud.com>> wrote:
I’ve been absorbed in the world of Monoids lately, so I find the suggestion below to be particularly brilliant. : )
It solves the issue of arbitrarily choosing the value for duplicate keys rather nicely. Only thing I’m not too sure about is the idea of failing by default on duplicate keys?

On 15 Jan 2016, at 10:18, Nicola Salmoria via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

To handle the case of duplicate keys, why not allow to pass in a 'combine' function?
The function could default to a preconditionFailure to be consistent with the DictionaryLiteral behavior, but be overridden by the caller as needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given sequence.
    init<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S, combine: (existing: Value, other: Value) -> Value = { preconditionFailure("Sequence contains duplicate keys"); return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value), forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

It’d be great if there was also an init that restricted the Values to Monoids, which would mean the combine function would be taken from the supplied Monoid values (I understand I’ve departed to fantasy island at this point, but one can dream : )

Al

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution


(Nicola Salmoria) #7

+1 for the mutating methods. That's a pretty common use case and would
nicely complement the Dictionary features.

As for the implementation details, I think all the examples we made are
inefficient, requiring two dictionary lookups to combine a value. In e.g.
C++ you can do something like this

auto it = m.find(x);
if (it != end(m)) {
   it->second = value;
}

In Swift we have

func indexForKey(key: Key) -> DictionaryIndex<Key, Value>?

but then there are only

subscript(position: DictionaryIndex<Key, Value>) -> (Key, Value) { get }
mutating func removeAtIndex(index: DictionaryIndex<Key, Value>) -> (Key,
Value)

It would probably be appropriate to have a new method

mutating func updateValue(value: Value, atIndex: DictionaryIndex<Key,

) -> Value

which would allow to efficiently combine the values with a single lookup.

Nicola

···

On Sat, Jan 16, 2016 at 8:28 AM, Nate Cook <natecook@gmail.com> wrote:

Adding a `combine` closure with Donnacha's default seems like a pretty
good solution for this, but giving a `try`-marked closure parameter a
default messes with the rethrows behavior and requires a try on every
call. I think it's important that when used by default, this initializer
has the same behavior as looping over the sequence and setting values for
keys. That is, it should replicate:

for (key, value) in sequence {
    newDictionary[key] = value
}

and use the last value for any duplicate keys, rather than failing or
trapping.

To handle this properly we'd need two new initializers:

init<S: SequenceType where S.Generator.Element == Generator.Element>(_
sequence: S)

init<S: SequenceType where S.Generator.Element == Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value)
rethrows

Perhaps we also need a mutating `mergeContentsOf` function with the same
signatures as the initializers:

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element
== Generator.Element>(_ sequence: S)

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element
== Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value)
rethrows

I'm pretty sure I would use all four of those, and that would bring
Dictionary more into alignment with how you can use Array and Set. Would a
slightly expanded proposal make sense?

Nate

On Jan 15, 2016, at 9:01 AM, Nicola Salmoria via swift-evolution < > swift-evolution@swift.org> wrote:

I'm ambivalent about the preconditionFailure. Since there would otherwise
be silent loss of data, I think it fits Swift's "safe by default" paradigm.
It's also consistent with what the normal initialization from a
DictionaryLiteral does.
However, I can also see how it might be more convenient to just pick the
last value.

Nicola

On Fri, Jan 15, 2016 at 11:53 AM, Alan Skipp <al_skipp@icloud.com> wrote:

I’ve been absorbed in the world of Monoids lately, so I find the
suggestion below to be particularly brilliant. : )
It solves the issue of arbitrarily choosing the value for duplicate keys
rather nicely. Only thing I’m not too sure about is the idea of failing by
default on duplicate keys?

On 15 Jan 2016, at 10:18, Nicola Salmoria via swift-evolution < >> swift-evolution@swift.org> wrote:

To handle the case of duplicate keys, why not allow to pass in a
'combine' function?
The function could default to a preconditionFailure to be consistent with
the DictionaryLiteral behavior, but be overridden by the caller as needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given
sequence.
    init<S: SequenceType where S.Generator.Element ==
Generator.Element>(_ sequence: S, combine: (existing: Value, other: Value)
-> Value = { preconditionFailure("Sequence contains duplicate keys");
return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value),
forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New
York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

It’d be great if there was also an init that restricted the Values to
Monoids, which would mean the combine function would be taken from the
supplied Monoid values (I understand I’ve departed to fantasy island at
this point, but one can dream : )

Al

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Thorsten Seitz) #8

+1
Splitting the init seems to be a good idea, though I would have thought that a non-throwing default would not require a try on a call using the default (I'm not near a Swift compiler currently, so can't check that right now).
I like the mutating functions!

-Thorsten

···

Am 16.01.2016 um 08:28 schrieb Nate Cook via swift-evolution <swift-evolution@swift.org>:

Adding a `combine` closure with Donnacha's default seems like a pretty good solution for this, but giving a `try`-marked closure parameter a default messes with the rethrows behavior and requires a try on every call. I think it's important that when used by default, this initializer has the same behavior as looping over the sequence and setting values for keys. That is, it should replicate:

for (key, value) in sequence {
    newDictionary[key] = value
}

and use the last value for any duplicate keys, rather than failing or trapping.

To handle this properly we'd need two new initializers:

init<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S)

init<S: SequenceType where S.Generator.Element == Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value) rethrows

Perhaps we also need a mutating `mergeContentsOf` function with the same signatures as the initializers:

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S)

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element == Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value) rethrows

I'm pretty sure I would use all four of those, and that would bring Dictionary more into alignment with how you can use Array and Set. Would a slightly expanded proposal make sense?

Nate

On Jan 15, 2016, at 9:01 AM, Nicola Salmoria via swift-evolution <swift-evolution@swift.org> wrote:

I'm ambivalent about the preconditionFailure. Since there would otherwise be silent loss of data, I think it fits Swift's "safe by default" paradigm. It's also consistent with what the normal initialization from a DictionaryLiteral does.
However, I can also see how it might be more convenient to just pick the last value.

Nicola

On Fri, Jan 15, 2016 at 11:53 AM, Alan Skipp <al_skipp@icloud.com> wrote:
I’ve been absorbed in the world of Monoids lately, so I find the suggestion below to be particularly brilliant. : )
It solves the issue of arbitrarily choosing the value for duplicate keys rather nicely. Only thing I’m not too sure about is the idea of failing by default on duplicate keys?

On 15 Jan 2016, at 10:18, Nicola Salmoria via swift-evolution <swift-evolution@swift.org> wrote:

To handle the case of duplicate keys, why not allow to pass in a 'combine' function?
The function could default to a preconditionFailure to be consistent with the DictionaryLiteral behavior, but be overridden by the caller as needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given sequence.
    init<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S, combine: (existing: Value, other: Value) -> Value = { preconditionFailure("Sequence contains duplicate keys"); return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value), forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

It’d be great if there was also an init that restricted the Values to Monoids, which would mean the combine function would be taken from the supplied Monoid values (I understand I’ve departed to fantasy island at this point, but one can dream : )

Al

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Thorsten Seitz) #9

Alternatively the update method could take a combine function.
But I like the idea with using the dictionary index as opposed to the key.

-Thorsten

···

Am 16.01.2016 um 11:28 schrieb Nicola Salmoria via swift-evolution <swift-evolution@swift.org>:

+1 for the mutating methods. That's a pretty common use case and would nicely complement the Dictionary features.

As for the implementation details, I think all the examples we made are inefficient, requiring two dictionary lookups to combine a value. In e.g. C++ you can do something like this

auto it = m.find(x);
if (it != end(m)) {
   it->second = value;
}

In Swift we have

func indexForKey(key: Key) -> DictionaryIndex<Key, Value>?

but then there are only

subscript(position: DictionaryIndex<Key, Value>) -> (Key, Value) { get }
mutating func removeAtIndex(index: DictionaryIndex<Key, Value>) -> (Key, Value)

It would probably be appropriate to have a new method

mutating func updateValue(value: Value, atIndex: DictionaryIndex<Key, Value>) -> Value

which would allow to efficiently combine the values with a single lookup.

Nicola

On Sat, Jan 16, 2016 at 8:28 AM, Nate Cook <natecook@gmail.com> wrote:
Adding a `combine` closure with Donnacha's default seems like a pretty good solution for this, but giving a `try`-marked closure parameter a default messes with the rethrows behavior and requires a try on every call. I think it's important that when used by default, this initializer has the same behavior as looping over the sequence and setting values for keys. That is, it should replicate:

for (key, value) in sequence {
    newDictionary[key] = value
}

and use the last value for any duplicate keys, rather than failing or trapping.

To handle this properly we'd need two new initializers:

init<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S)

init<S: SequenceType where S.Generator.Element == Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value) rethrows

Perhaps we also need a mutating `mergeContentsOf` function with the same signatures as the initializers:

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S)

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element == Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value) rethrows

I'm pretty sure I would use all four of those, and that would bring Dictionary more into alignment with how you can use Array and Set. Would a slightly expanded proposal make sense?

Nate

On Jan 15, 2016, at 9:01 AM, Nicola Salmoria via swift-evolution <swift-evolution@swift.org> wrote:

I'm ambivalent about the preconditionFailure. Since there would otherwise be silent loss of data, I think it fits Swift's "safe by default" paradigm. It's also consistent with what the normal initialization from a DictionaryLiteral does.
However, I can also see how it might be more convenient to just pick the last value.

Nicola

On Fri, Jan 15, 2016 at 11:53 AM, Alan Skipp <al_skipp@icloud.com> wrote:
I’ve been absorbed in the world of Monoids lately, so I find the suggestion below to be particularly brilliant. : )
It solves the issue of arbitrarily choosing the value for duplicate keys rather nicely. Only thing I’m not too sure about is the idea of failing by default on duplicate keys?

On 15 Jan 2016, at 10:18, Nicola Salmoria via swift-evolution <swift-evolution@swift.org> wrote:

To handle the case of duplicate keys, why not allow to pass in a 'combine' function?
The function could default to a preconditionFailure to be consistent with the DictionaryLiteral behavior, but be overridden by the caller as needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given sequence.
    init<S: SequenceType where S.Generator.Element == Generator.Element>(_ sequence: S, combine: (existing: Value, other: Value) -> Value = { preconditionFailure("Sequence contains duplicate keys"); return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value), forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

It’d be great if there was also an init that restricted the Values to Monoids, which would mean the combine function would be taken from the supplied Monoid values (I understand I’ve departed to fantasy island at this point, but one can dream : )

Al

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Andrew Bennett) #10

+1 I've been wishing the protocol had something like this.

It would be nice if a protocol extension had default implementations
similar to these:

init<S: SequenceType where S.Generator.Element == Generator.Element>(_
sequence: S)
{
    self.init(minimumCapacity: sequence.underestimateCount())
    self.mergeContentsOf(sequence)
}

init<S: SequenceType where S.Generator.Element == Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value = {
(_,x) in x }) rethrows
{
    self.init(minimumCapacity: sequence.underestimateCount())
    try self.mergeContentsOf(sequence, combine: combine)
}

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element ==
Generator.Element>
    (_ sequence: S)
{
    for (key, value) in sequence {
        self[key] = value
    }
}

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element ==
Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value = {
(_,x) in x }) rethrows
{
    for (key, value) in sequence {
        if let oldValue = self[key] {
            self[key] = try combine(oldValue, value)
        }
        else {
            self[key] = value
        }
    }
}

···

On Mon, Jan 18, 2016 at 5:04 PM, Thorsten Seitz via swift-evolution < swift-evolution@swift.org> wrote:

Alternatively the update method could take a combine function.
But I like the idea with using the dictionary index as opposed to the key.

-Thorsten

Am 16.01.2016 um 11:28 schrieb Nicola Salmoria via swift-evolution < > swift-evolution@swift.org>:

+1 for the mutating methods. That's a pretty common use case and would
nicely complement the Dictionary features.

As for the implementation details, I think all the examples we made are
inefficient, requiring two dictionary lookups to combine a value. In e.g.
C++ you can do something like this

auto it = m.find(x);
if (it != end(m)) {
   it->second = value;
}

In Swift we have

func indexForKey(key: Key) -> DictionaryIndex<Key, Value>?

but then there are only

subscript(position: DictionaryIndex<Key, Value>) -> (Key, Value) { get }
mutating func removeAtIndex(index: DictionaryIndex<Key, Value>) -> (Key,
Value)

It would probably be appropriate to have a new method

mutating func updateValue(value: Value, atIndex: DictionaryIndex<Key,
>) -> Value

which would allow to efficiently combine the values with a single lookup.

Nicola

On Sat, Jan 16, 2016 at 8:28 AM, Nate Cook <natecook@gmail.com> wrote:

Adding a `combine` closure with Donnacha's default seems like a pretty
good solution for this, but giving a `try`-marked closure parameter a
default messes with the rethrows behavior and requires a try on every
call. I think it's important that when used by default, this initializer
has the same behavior as looping over the sequence and setting values for
keys. That is, it should replicate:

for (key, value) in sequence {
    newDictionary[key] = value
}

and use the last value for any duplicate keys, rather than failing or
trapping.

To handle this properly we'd need two new initializers:

init<S: SequenceType where S.Generator.Element == Generator.Element>(_
sequence: S)

init<S: SequenceType where S.Generator.Element == Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value)
rethrows

Perhaps we also need a mutating `mergeContentsOf` function with the same
signatures as the initializers:

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element
== Generator.Element>(_ sequence: S)

mutating func mergeContentsOf<S: SequenceType where S.Generator.Element
== Generator.Element>
    (_ sequence: S, @noescape combine: (Value, Value) throws -> Value)
rethrows

I'm pretty sure I would use all four of those, and that would bring
Dictionary more into alignment with how you can use Array and Set. Would a
slightly expanded proposal make sense?

Nate

On Jan 15, 2016, at 9:01 AM, Nicola Salmoria via swift-evolution < >> swift-evolution@swift.org> wrote:

I'm ambivalent about the preconditionFailure. Since there would otherwise
be silent loss of data, I think it fits Swift's "safe by default" paradigm.
It's also consistent with what the normal initialization from a
DictionaryLiteral does.
However, I can also see how it might be more convenient to just pick the
last value.

Nicola

On Fri, Jan 15, 2016 at 11:53 AM, Alan Skipp <al_skipp@icloud.com> wrote:

I’ve been absorbed in the world of Monoids lately, so I find the
suggestion below to be particularly brilliant. : )
It solves the issue of arbitrarily choosing the value for duplicate keys
rather nicely. Only thing I’m not too sure about is the idea of failing by
default on duplicate keys?

On 15 Jan 2016, at 10:18, Nicola Salmoria via swift-evolution < >>> swift-evolution@swift.org> wrote:

To handle the case of duplicate keys, why not allow to pass in a
'combine' function?
The function could default to a preconditionFailure to be consistent
with the DictionaryLiteral behavior, but be overridden by the caller as
needed.

extension Dictionary {
    /// Creates a dictionary with the keys and values in the given
sequence.
    init<S: SequenceType where S.Generator.Element ==
Generator.Element>(_ sequence: S, combine: (existing: Value, other: Value)
-> Value = { preconditionFailure("Sequence contains duplicate keys");
return $1 } ) {
        self.init()
        for (key, value) in sequence {
            if let existing = updateValue(value, forKey: key) {
                updateValue(combine(existing: existing, other: value),
forKey: key)
            }
        }
    }
}

usage examples:

let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New
York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]

let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]

Nicola

It’d be great if there was also an init that restricted the Values to
Monoids, which would mean the combine function would be taken from the
supplied Monoid values (I understand I’ve departed to fantasy island at
this point, but one can dream : )

Al

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution