[Draft] Dictionary & Set Enhancements


(Nate Cook) #1

Hello everyone!

Here is a draft proposal to fill in some usability gaps in the standard library Dictionary and Set types. I’d love to hear any feedback from the list before submitting this as a PR.

Thanks!

Nate

Dictionary & Set Enhancements

- Proposal: SE-0000

  • Author: Nate Cook
  • Review Manager: TBD
  • Status: Awaiting review

Introduction

This proposal comprises a variety of commonly (and less commonly) suggested improvements to the standard library’s Dictionary type, from merging initializers to dictionary-specific filter and mapValues methods. The proposed additions to Dictionary, and the corresponding changes to Set, are detailed in the sections below.

···

1. Merging initializers and methods

The Dictionary type should allow initialization from a sequence of (Key, Value) tuples and offer methods that merge a sequence of (Key, Value) tuples into a new or existing dictionary, using a closure to combine values for duplicate keys.

  • First message of discussion thread
  • Initial proposal draft
  • Prior standalone proposal (SE-100)
    Array and Set both have initializers that create a new instance from a sequence of elements. The Array initializer is useful for converting other sequences and collections to the “standard” collection type, while the Set initializer is essential for recovering set operations after performing any functional operations on a set. For example, filtering a set produces a collection without any set operations available.
let numberSet = Set(1 ... 100)
let fivesOnly = numberSet.lazy.filter { $0 % 5 == 0 }

fivesOnly is a LazyFilterCollection<Set<Int>> instead of a Set—sending that back through the Set sequence initializer restores the expected methods.

let fivesOnlySet = Set(numberSet.lazy.filter { $0 % 5 == 0 })
fivesOnlySet.isSubsetOf(numberSet) // true

Dictionary, on the other hand, has no such initializer, so a similar operation leaves no room to recover dictionary functionality without building a mutable Dictionary via iteration or functional methods. These techniques also don’t support type inference from the source sequence, increasing verbosity.

let numberDictionary = ["one": 1, "two": 2, "three": 3, "four": 4]
let evenOnly = numberDictionary.lazy.filter { (_, value) in
    value % 2 == 0
}
var viaIteration: [String: Int] = [:]
for (key, value) in evenOnly {
viaIteration[key] = value
}
let viaReduce: [String: Int] = evenOnly.reduce([:]) { (cumulative, kv) in
    var dict = cumulative
dict[kv.key] = kv.value
    return dict
}

Beyond initialization, Array and Set both also provide a method to add a new block of elements to an existing collection. Array provides this via append(contentsOf:) for the common appending case or replaceSubrange(_:with:) for general inserting or replacing, while the unordered Set type lets you pass any sequence to unionInPlace(_:) to add elements to an existing set.

Once again, Dictionary has no corresponding API – looping and adding elements one at a time as shown above is the only way to merge new elements into an existing dictionary.

Proposed solution

This proposal puts forward two new ways to convert (Key, Value) sequences to dictionary form: a full-width, failable initializer and a set of merging APIs that handle input data with duplicate keys.

Sequence-based initializer

The proposed solution would add a new, failable initializer to Dictionary that accepts any sequence of (Key, Value)tuple pairs.

init?<S: Sequence where S.Iterator.Element == (key: Key, value: Value    )>(
_ keysAndValues: S)

Instead of the techniques for recovering a Dictionary instance shown above, the proposed initializer would allow a much cleaner syntax.

let viaProposed = Dictionary(evenOnly)!

Like Array.init(_:) and Set.init(_:), this is a full-width initializer. To ensure this, the initializer requires that each key in the supplied sequence is unique, and returns nil whenever that condition isn’t met. This model prevents accidentally dropping values for keys that might be duplicated, but allows easier recovery than the trap that results from duplicate keys in a dictionary literal.

The new initializer allows for some convenient uses that aren’t currently possible.

  • Initializing from a DictionaryLiteral (the type, not an actual literal)
  let literal: DictionaryLiteral = ["a": 1, "b": 2, "c": 3, "d": 4]
  let dictFromDL = Dictionary(literal)!
  • Swapping keys and values of an existing dictionary
  guard let reversedDict = Dictionary(dictFromDL.map { ($1, $0) }) else { throw Errors.reversalFailed }
  // [2: "b", 4: "d", 1: "a", 3: "c"]
  • Converting an array to an indexed dictionary (popular on the thread)
  let names = ["Cagney", "Lacey", "Bensen"]
  let dict = Dictionary(names.enumerated().map { (i, val) in (i + 1, val) })!
  // [2: "Lacey", 3: "Bensen", 1: "Cagney"]
  • Initializing from a pair of zipped sequences (examples abound)
  let letters = "abcdef".characters.lazy.map(String.init)
  let dictFromZip = Dictionary(zip(letters, 1...10))!
  // ["b": 2, "e": 5, "a": 1, "f": 6, "d": 4, "c": 3]

This particular use is currently blocked by SR-922. As a workaround, add .map {(key: $0, value: $1)}.

Merging initializer and methods

Creating a Dictionary from a dictional literal currently checks the keys for uniqueness, trapping on a duplicate. The sequence-based initializer shown above has the same requirements, failing and returning nil when encountering duplicate keys.

let duplicates: DictionaryLiteral = ["a": 1, "b": 2, "a": 3, "b": 4]
let letterDict = Dictionary(duplicates)
// nil

However, some use cases can be forgiving of duplicate keys, so this proposal includes a second new initializer. This initializer allows the caller to supply, along with the sequence, a combining closure that’s called with the old and new values for any duplicate keys.

init<S: Sequence    >(
merging keysAndValues    : S, resolvingCollisionsWith combine: (Value, Value) throws -> Value
) rethrows where S.Iterator.Element == (key: Key, value: Value)

This example shows how one could keep the first value of all those supplied for a duplicate key.

let letterDict2 = Dictionary(merging: duplicates, resolvingCollisionsWith: { (first, _) in first })
// ["b": 2, "a": 1]

Or the largest value for any duplicate keys.

let letterDict3 = Dictionary(merging: duplicates, resolvingCollisionsWith: max)
// ["b": 4, "a": 3]

At other times the merging initializer could be used to combine values for duplicate keys. Donnacha Oisín Kidney wrote a neat frequencies() method for sequences as an example of such a use in the thread.

extension Sequence where Iterator.Element: Hashable {
func frequencies() -> [Iterator.Element: Int        ] {
return Dictionary(merging: self.lazy.map { v in (v, 1) }, resolvingCollisionsWith: +    )
}
}
[1, 2, 2, 3, 1, 2, 4, 5, 3, 2, 3, 1].frequencies()
// [2: 4, 4: 1, 5: 1, 3: 3, 1: 3]

This proposal also includes new mutating and non-mutating methods for Dictionary that merge the contents of a sequence of (Key, Value) tuples into an existing dictionary, merge(contentsOf:) and merged(with:).

mutating func merge<S: Sequence    >(
contentsOf other    : S, resolvingCollisionsWith combine: (Value, Value) throws -> Value
) rethrows where S.Iterator.Element == (key: Key, value: Value)
func merged<S: Sequence    >(
with other    : S, resolvingCollisionsWith combine: (Value, Value) throws -> Value
) rethrows -> [Key: Value] where S.Iterator.Element == (key: Key, value: Value)

As above, there are a wide variety of uses for the merge.

// Adding default values
let defaults: [String: Bool] = ["foo": false, "bar": false, "baz": false]
var options: [String: Bool] = ["foo": true, "bar": false]
options.merge(contentsOf: defaults) { (old, _) in old }
// options is now ["foo": true, "bar": false, "baz": false]

// Summing counts repeatedly
var bugCounts: [String: Int] = ["bees": 9, "ants": 112, ...]
while bugCountingSource.hasMoreData    () {
bugCounts.merge(contentsOf: bugCountingSource.countMoreBugs(), resolvingCollisionsWith: +)
}

To simplify two common uses of the merging initializer, this proposal includes the addition of two new top-level functions to the standard library: first(_:_:) and last(_:_:), which return their first and last arguments, respectively. These new functions can be passed instead of a custom closure:

let firstWins = Dictionary(merging: duplicates, resolvingCollisionsWith: first)
// ["b": 2, "a": 1]

let lastWins = Dictionary(merging: duplicates, resolvingCollisionsWith: last)
// ["b": 4, "a": 3]

Alternatives Considered

The first(_:_:) and last(_:_:) functions may not pull their weight as top-level symbols. Instead, at the cost of additional overloads for the merging initializer and methods, we could allow users to specify the useFirst and useLast cases of a MergeCollisionStrategy enumeration.

extension Dictionary {
/// The strategy to use when merging a sequence of key-value pairs into a dictionary.
    enum MergeCollisionStrategy {
/// If there is more than one instance of a key in the sequence to merge, use
        /// only the first value for the dictionary.
        case useFirst
        /// If there is more than one instance of a key in the sequence to merge, use
        /// only the last value for the dictionary.
        case useLast
    }
init<S: Sequence        >(
merging keysAndValues        : S, resolvingCollisionsWith strategy    : MergeCollisionStrategy
)
// other merging overloads
}

In use, this overload would look similar to the functional version, but may aid in discoverability:

let firstWins = Dictionary(merging: duplicates, resolvingCollisionsWith: .useFirst)
// ["b": 2, "a": 1]

let lastWins = Dictionary(merging: duplicates, resolvingCollisionsWith: .useLast)
// ["b": 4, "a": 3]

2. Key-based subscript with default value

Another common challenge with dictionaries is iteratively making changes to key/value pairs that may or may not already be present. For example, to iteratively add count the frequencies of letters in a string, one might write something like the following:

let source = "how now brown cow"
var frequencies: [Character: Int] = [:]
for c in source.characters {
if frequencies[c] == nil {
frequencies[c] = 1
    } else {
frequencies[c]! += 1
    }
}

Testing for nil and assigning through the force unwrapping operator are awkward at best for such a common operation. Furthermore, the Optional<Value> return type of the current keyed subscript complicates efficiencies that could be gained for this type of modify action under a future ownership model.

Proposed solution

A keyed subscript with a default value neatly simplifies this usage. Instead of checking for nil, one can pass the default value along with the key as a default subscript parameter.

let source = "how now brown cow"
var frequencies: [Character: Int] = [:]
for c in source.characters {
frequencies[c, default: 0] += 1
}

The return type of this subscript is a non-optional Value. Note that accessing the subscript as a getter never stores the default value in the dictionary—the following two lines are equivalent:

let x = frequencies["a", default: 0]
let y = frequencies["a"] ?? 0

3. Dictionary-specific map and filter

The standard map and filter methods, while always useful and beloved, aren’t ideal when applied to dictionaries. In both cases, the desired end-result is frequently another dictionary instead of an array of key-value pairs—even with the sequence-based initializer proposed above this is an inefficient way of doing things.

Additionally, the standard map method doesn’t gracefully support passing a function when transforming only the values of a dictionary. The transform function must accept and return key/value pairs, necessitating a custom closure in nearly every case.

Assuming the addition of a sequence-based initializer, the current filter and map look like the following:

let numbers = ["one": 1, "two": 2, "three": 3, "four": 4]
let evens = Dictionary(numbers.lazy.filter { $0.value % 2 == 0 })!
// ["four": 4, "two": 2]
let strings = Dictionary(numbers.lazy.map { (k, v) in (k, String(v)) })!
// ["three": "3", "four": "4", "one": "1", "two": "2"]

Proposed solution

This proposal adds two new methods for Dictionary:

  1. A mapValues method that keeps a dictionary’s keys intact while transforming the values. Mapping a dictionary’s key/value pairs can’t always produce a new dictionary, due to the possibility of key collisions, but mapping only the values can produce a new dictionary with the same underlying layout as the original.
  let strings = numbers.mapValues(String.init)
  // ["three": "3", "four": "4", "one": "1", "two": "2"]
  1. A Dictionary-returning filter method. While transforming the keys and values of a dictionary can result in key collisions, filtering the elements of a dictionary can at worst replicate the entire dictionary.
  let evens = numbers.filter { $0.value % 2 == 0 }
  // ["four": 4, "two": 2]

Both of these can be made significantly more efficient than their Sequence-sourced counterparts. For example, the mapValues method can simply copy the portion of the storage that holds the keys to the new dictionary before transforming the values.


4. Visible dictionary capacity

As you add elements to a dictionary, it automatically grows its backing storage as necessary. This reallocation is a significant operation—unlike arrays, where the existing elements can be copied to a new block of storage en masse, every key/value pair must be moved over individually, recalculating the hash value for the key to find its position in the larger backing buffer.

While dictionaries uses an exponential growth strategy to make this as efficient as possible, beyond the init(minimumCapacity:) initializer they do not expose a way to reserve a specific capacity. In addition, adding a key/value pair to a dictionary is guaranteed not to invalidate existing indices as long as the capacity doesn’t change, yet we don’t provide any way of seeing a dictionary’s current or post-addition capacity.

Proposed solution

Dictionary should add a capacity property and a reserveCapacity(_:) method, like those used in range-replaceable collections. The capacity of a dictionary is the number of elements it can hold without reallocating a larger backing storage, while calling reserveCapacity(_:) allocates a large enough buffer to hold the requested number of elements without reallocating.

var numbers = ["one": 1, "two": 2, "three": 3, "four": 4]
numbers.capacity   // 6
numbers.reserveCapacity(20)
numbers.capacity   // 24

Because hashed collections use extra storage capacity to reduce the likelihood and cost of collisions, the value of the capacity property won’t be equal to the actual size of the backing storage. Likewise, the capacity after calling reserveCapacity(_:) will be at least as large as the argument, but usually larger. (In its current implementation, Dictionary always has a power of 2-sized backing storage.)


5. Dictionary.remove(at:) returns next index

The method dictionaries use to store their key/value pairs can make it challenging to sequentially remove elements in an efficient way. To demonstrate, considering the following hypothetical dictionary:

var dict = Dictionary<Int, Bool>(minimumCapacity: 5)
dict[3] = true
dict[7] = true
dict[1] = false

To add those three elements, dict performs the following steps:

  1. Uses 3.hashValue to select a bucket, choosing bucket 5.
  2. Stores 3 and true at position 5 in the key and value storage, respectively.
  3. Uses 7.hashValue to select a bucket, choosing bucket 2.
  4. Stores 7 and true at position 2.
  5. Uses 1.hashValue to select a bucket, choosing bucket 5. Collision! Advances to the next open space, bucket 6.
  6. Stores 1 and false at position 6.
    With these three elements, we have a storage layout depicted by the table below—7 and 3 are in their ideal buckets, but 1 is not:

bucket
0
1
2
3
4
5
6
7
key
7
3
1
value
T
T
F
To write an algorithm that removes each element from the dictionary, we would want to do something like this:

var i = dict.startIndex
while i != dict.endIndex {
let next = dict.index(after    : i)
dict.remove(at    : i)
i = next
}

This will remove (7, true) without incident, but when it removes (3, true), the dictionary will need to shift (1, false)back one slot so that it is found in bucket 5. This shift will invalidate the next index that has already been calculated. With the current index invalidation rules, there’s no way to do this efficiently.

Proposed solution

If the remove(at:) method returns the next valid index, this kind of algorithm is possible. remove(at:) already returns the key/value pair that was removed, so this would change the return type to a tuple:

@discardableResult
mutating func remove(at index: Index) ->
    (removed: (key: Key, value: Value), nextIndex: Index)

The code above can be rewritten as the following. When (1, false) is shifted back into bucket 5, there is no problem, since the method can return that same index.

var i = dict.startIndex
while i != dict.endIndex {
(_, i) = dict.remove(at: i)
}

This new capability could be used to implement an efficient in-place filter.


6. A grouped(by:) method for sequences

As a final Dictionary-related issue, grouping elements in a sequence by some computed key is a commonly requested addition that we can add as part of this omnibus proposal. Pass a closure that converts each value in a sequence to a hashable type T; the result of the method is a dictionary with keys of type T and values of type [Iterator.Element].

let names = ["Patti", "Aretha", "Anita", "Gladys"
]
// By first letter
names.grouped(by: { $0.characters.first! })
// ["P": ["Patti"], "G": ["Gladys"], "A": ["Aretha", "Anita"]]

// By name length
names.grouped { $0.utf16.count }
// [5: ["Patti", "Anita"], 6: ["Aretha", "Gladys"]]

7. Apply relevant changes to Set

As the Set and Dictionary types are similar enough to share large chunks of their implementations, it makes sense to look at which of these Dictionary enhancements can also be applied to Set. Of the symbols proposed above, the following additions would also be useful and appropriate for the Set type:

  • Add a Set-specific filter method that returns a new set—this would function essentially as a predicate-based intersection method.
  • Add a capacity property and reserveCapacity() method, for the reasons listed above.
  • Modify remove(at:) to return the index of the next entry, likewise.

Detailed design

With the exception of the proposed capacity property and method, the proposed additions to Dictionary, Set, and Sequence are available in this Swift Sandbox. Note that this prototype is not a proposed implementation; rather a way to try out the behavior of the proposed changes.

Collected in one place, these are the new APIs for Dictionary, Set, and Sequence:

struct Dictionary<Key: Hashable, Value    > {
typealias Element = (key: Key, value: Value
    )
// existing declarations

    /// Creates a new dictionary using the key/value pairs in the given sequence.
    /// If the given sequence has any duplicate keys, the result is `nil`.
    init?<S: Sequence>(_ keysAndValues: S) where S.Iterator.Element == Element

    /// Creates a new dictionary using the key/value pairs in the given sequence,
    /// using a combining closure to determine the value for any duplicate keys.
    init<S: Sequence        >(
merging keysAndValues        : S, resolvingCollisionsWith combine: (Value, Value) throws -> Value
    ) rethrows where S.Iterator.Element == Element

    /// Merges the key/value pairs in the given sequence into the dictionary,
    /// using a combining closure to determine the value for any duplicate keys.
    mutating func merge<S: Sequence        >(
contentsOf other        : S, resolvingCollisionsWith combine: (Value, Value) throws -> Value
    ) rethrows where S.Iterator.Element == Element

    /// Returns a new dictionary created by merging the key/value pairs in the
    /// given sequence into the dictionary, using a combining closure to determine
    /// the value for any duplicate keys.
    func merged<S: Sequence        >(
with other        : S, resolvingCollisionsWith combine: (Value, Value) throws -> Value
    ) rethrows -> [Key: Value] where S.Iterator.Element == Element

    /// Accesses the element with the given key, or the specified default value,
    /// if the dictionary doesn't contain the given key.
    subscript(key: Key, default defaultValue: Value) -> Value { get set }
/// Returns a new dictionary containing the key/value pairs that satisfy
    /// the given predicate.
    func filter(_ isIncluded: (Key, Value) throws -> Bool) rethrows -> [Key: Value
    ]
/// Returns a new dictionary containing the existing keys and the results of
    /// mapping the given closure over the dictionary's values.
    func mapValues<T>(_ transform: (Value) throws -> T) rethrows -> [Key
    : T]
/// The number of key/value pairs that can be stored by the dictionary without
    /// reallocating storage.
    var capacity: Int { get }
/// Ensures that the dictionary has enough storage for `capacity` key/value
    /// pairs.
    var reserveCapacity(_ capacity: Int
    )
/// Removes the key/value pair at the specified index.
    ///
    /// If you use `remove(at:)` while iterating through the contents of a
    /// dictionary, continue iterating using the index returned as `nextIndex`.
    /// Calling this method invalidates all other previously existing indices.
    ///
    /// - Returns: A tuple containing the removed key/value pair and the index
    ///   of the next pair in the dictionary.
    @discardableResult
    mutating func remove(at index: Index) ->
        (removed: (key: Key, value: Value), nextIndex: Index
)
}
extension Sequence {
/// Returns a dictionary where the keys are the groupings returned by
    /// the given closure and the values are arrays of the elements that
    /// returned each specific key.
    func grouped<Key: Hashable        >(
by grouping: (Iterator.Element) throws -> Key
    ) rethrows -> [Key: [Iterator.Element
]]
}
struct Set<Element: Hashable    > {
// existing declarations

    /// Returns a new set containing the elements that satisfy the given predicate.
    func filter(_ isIncluded: (Element) throws -> Bool) rethrows -> Set

    /// The number of elements that can be stored by the set without
    /// reallocating storage.
    var capacity: Int { get }
/// Ensures that the set has enough storage for `capacity` elements.
    var reserveCapacity(_ capacity: Int
    )
/// Removes the element at the specified index.
    ///
    /// If you use `remove(at:)` while iterating through the contents of a
    /// set, continue iterating using the index returned as `nextIndex`.
    /// Calling this method invalidates all other previously existing indices.
    ///
    /// - Returns: A tuple containing the removed element and the index
    ///   of the next element in the set.
    @discardableResult
    mutating func remove(at index: Index) -> (removed: Element, nextIndex: Index
)
}
/// Returns its first argument.
func first<T>(_ a: T, _ b: T) -> T
/// Returns its last argument.
func last<T>(_ a: T, _ b: T) -> T

Source compatibility

A significant majority of the proposed additions are purely additive and should impose no source compatibility burden. The modified return return type for the two remove(at:) methods, while additive in nature, will break source compatibility in the cases where the removed value is captured. In theory, a fix-it should be possible that would help in these cases.


(Sean Heber) #2

Oh yeah. Huge +1 from me as I’ve had to implement most of these things myself more than once.

Some notes:

grouped(by:) seems to be missing from the detailed design section and I think it would apply to Set as well, but it’s not mentioned there either.

The changes for remove(at:) seem odd since the motivation seems to be to remove all elements in the dictionary quickly, but Dictionary already has a removeAll(keepingCapacity:) function. I may be missing something.

l8r
Sean

···

On Mar 31, 2017, at 8:28 AM, Nate Cook via swift-evolution <swift-evolution@swift.org> wrote:

Hello everyone!

Here is a draft proposal to fill in some usability gaps in the standard library Dictionary and Set types. I'd love to hear any feedback from the list before submitting this as a PR.

Thanks!
Nate

Dictionary & Set Enhancements

  • Proposal: SE-0000
  • Author: Nate Cook
  • Review Manager: TBD
  • Status: Awaiting review

Introduction

This proposal comprises a variety of commonly (and less commonly) suggested improvements to the standard library's Dictionary type, from merging initializers to dictionary-specific filter and mapValues methods. The proposed additions to Dictionary, and the corresponding changes to Set, are detailed in the sections below.

  • Suggested Improvements:
    • Merging initializers and methods
    • Key-based subscript with default value
    • Dictionary-specific map and filter
    • Visible Dictionary capacity
    • Dictionary.remove(at:) returns next index
    • A grouped(by:) method for sequences
    • Relevant changes to Set
  • Detailed Design
    • Sandbox with Prototype
  • Source Compatibility

1. Merging initializers and methods

The Dictionary type should allow initialization from a sequence of (Key, Value) tuples and offer methods that merge a sequence of (Key, Value) tuples into a new or existing dictionary, using a closure to combine values for duplicate keys.

  • First message of discussion thread
  • Initial proposal draft
  • Prior standalone proposal (SE-100)
Array and Set both have initializers that create a new instance from a sequence of elements. The Array initializer is useful for converting other sequences and collections to the "standard" collection type, while the Set initializer is essential for recovering set operations after performing any functional operations on a set. For example, filtering a set produces a collection without any set operations available.

let numberSet = Set(1 ... 100
)

let fivesOnly = numberSet.lazy.filter { $0 % 5 == 0 }
fivesOnly is a LazyFilterCollection<Set<Int>> instead of a Set—sending that back through the Set sequence initializer restores the expected methods.

let fivesOnlySet = Set(numberSet.lazy.filter { $0 % 5 == 0
})
fivesOnlySet.
isSubsetOf(numberSet) // true
Dictionary, on the other hand, has no such initializer, so a similar operation leaves no room to recover dictionary functionality without building a mutable Dictionary via iteration or functional methods. These techniques also don't support type inference from the source sequence, increasing verbosity.

let numberDictionary = ["one": 1, "two": 2, "three": 3, "four": 4
]

let evenOnly = numberDictionary.lazy.filter { (_, value) in

    value
% 2 == 0

}
    
var viaIteration: [String: Int] = [:
]

for (key, value) in
evenOnly {
    viaIteration[key]

value
}
    
let viaReduce: [String: Int] = evenOnly.reduce([:]) { (cumulative, kv) in

var dict =
cumulative
    dict[kv.
key] = kv.value

return
dict
}

Beyond initialization, Array and Set both also provide a method to add a new block of elements to an existing collection. Array provides this via append(contentsOf:) for the common appending case or replaceSubrange(_:with:) for general inserting or replacing, while the unordered Set type lets you pass any sequence to unionInPlace(_:slight_smile: to add elements to an existing set.

Once again, Dictionary has no corresponding API -- looping and adding elements one at a time as shown above is the only way to merge new elements into an existing dictionary.

Proposed solution

This proposal puts forward two new ways to convert (Key, Value) sequences to dictionary form: a full-width, failable initializer and a set of merging APIs that handle input data with duplicate keys.

Sequence-based initializer

The proposed solution would add a new, failable initializer to Dictionary that accepts any sequence of (Key, Value)tuple pairs.

init?<S: Sequence where S.Iterator.Element == (key: Key, value: Value
)>(
    
_ keysAndValues: S)
Instead of the techniques for recovering a Dictionary instance shown above, the proposed initializer would allow a much cleaner syntax.

let viaProposed = Dictionary(evenOnly)!
Like Array.init(_:slight_smile: and Set.init(_:), this is a full-width initializer. To ensure this, the initializer requires that each key in the supplied sequence is unique, and returns nil whenever that condition isn't met. This model prevents accidentally dropping values for keys that might be duplicated, but allows easier recovery than the trap that results from duplicate keys in a dictionary literal.

The new initializer allows for some convenient uses that aren't currently possible.

  • Initializing from a DictionaryLiteral (the type, not an actual literal)

let literal: DictionaryLiteral = ["a": 1, "b": 2, "c": 3, "d": 4
]

let dictFromDL = Dictionary(literal)!
  • Swapping keys and values of an existing dictionary

guard let reversedDict = Dictionary(dictFromDL.map { ($1, $0
) })

else { throw Errors.reversalFailed
}

// [2: "b", 4: "d", 1: "a", 3: "c"]
  • Converting an array to an indexed dictionary (popular on the thread)

let names = ["Cagney", "Lacey", "Bensen"
]

let dict = Dictionary(names.enumerated().map { (i, val) in (i + 1, val) })!
// [2: "Lacey", 3: "Bensen", 1: "Cagney"]
  • Initializing from a pair of zipped sequences (examples abound)

let letters = "abcdef".characters.lazy.map(String.init
)

let dictFromZip = Dictionary(zip(letters, 1...10))!
// ["b": 2, "e": 5, "a": 1, "f": 6, "d": 4, "c": 3]
This particular use is currently blocked by SR-922. As a workaround, add .map {(key: $0, value: $1)}.

Merging initializer and methods

Creating a Dictionary from a dictional literal currently checks the keys for uniqueness, trapping on a duplicate. The sequence-based initializer shown above has the same requirements, failing and returning nil when encountering duplicate keys.

let duplicates: DictionaryLiteral = ["a": 1, "b": 2, "a": 3, "b": 4
]

let letterDict = Dictionary
(duplicates)

// nil
However, some use cases can be forgiving of duplicate keys, so this proposal includes a second new initializer. This initializer allows the caller to supply, along with the sequence, a combining closure that's called with the old and new values for any duplicate keys.

init<S: Sequence
>(
    
merging keysAndValues
: S,
    
resolvingCollisionsWith combine: (Value, Value) throws -> Value

)
rethrows where S.Iterator.Element == (key: Key, value: Value)
This example shows how one could keep the first value of all those supplied for a duplicate key.

let letterDict2 = Dictionary(merging: duplicates, resolvingCollisionsWith: { (first, _) in
first })

// ["b": 2, "a": 1]
Or the largest value for any duplicate keys.

let letterDict3 = Dictionary(merging: duplicates, resolvingCollisionsWith
: max)

// ["b": 4, "a": 3]
At other times the merging initializer could be used to combine values for duplicate keys. Donnacha Oisín Kidney wrote a neat frequencies() method for sequences as an example of such a use in the thread.

extension Sequence where Iterator.Element: Hashable
{
    
func frequencies() -> [Iterator.Element: Int
] {
        
return Dictionary(merging: self.lazy.map { v in (v, 1) }, resolvingCollisionsWith: +
)
    }
}
[
1, 2, 2, 3, 1, 2, 4, 5, 3, 2, 3, 1].frequencies
()

// [2: 4, 4: 1, 5: 1, 3: 3, 1: 3]
This proposal also includes new mutating and non-mutating methods for Dictionary that merge the contents of a sequence of (Key, Value) tuples into an existing dictionary, merge(contentsOf:) and merged(with:).

mutating func merge<S: Sequence
>(
    
contentsOf other
: S,
    
resolvingCollisionsWith combine: (Value, Value) throws -> Value

)
rethrows where S.Iterator.Element == (key: Key, value: Value
)

func merged<S: Sequence
>(
    
with other
: S,
    
resolvingCollisionsWith combine: (Value, Value) throws -> Value

)
rethrows -> [Key: Value] where S.Iterator.Element == (key: Key, value: Value)
As above, there are a wide variety of uses for the merge.

// Adding default values
let defaults: [String: Bool] = ["foo": false, "bar": false, "baz": false
]

var options: [String: Bool] = ["foo": true, "bar": false
]
options.
merge(contentsOf: defaults) { (old, _) in
old }

// options is now ["foo": true, "bar": false, "baz": false]
    
// Summing counts repeatedly
var bugCounts: [String: Int] = ["bees": 9, "ants": 112, ...
]

while bugCountingSource.hasMoreData
() {
    bugCounts.
merge(contentsOf: bugCountingSource.countMoreBugs(), resolvingCollisionsWith: +
)
}

To simplify two common uses of the merging initializer, this proposal includes the addition of two new top-level functions to the standard library: first(_:_:slight_smile: and last(_:_:), which return their first and last arguments, respectively. These new functions can be passed instead of a custom closure:

let firstWins = Dictionary(merging: duplicates, resolvingCollisionsWith
: first)

// ["b": 2, "a": 1]

let lastWins = Dictionary(merging: duplicates, resolvingCollisionsWith
: last)

// ["b": 4, "a": 3]

Alternatives Considered

The first(_:_:slight_smile: and last(_:_:slight_smile: functions may not pull their weight as top-level symbols. Instead, at the cost of additional overloads for the merging initializer and methods, we could allow users to specify the useFirst and useLast cases of a MergeCollisionStrategy enumeration.

extension Dictionary
{
    
/// The strategy to use when merging a sequence of key-value pairs into a dictionary.
    enum MergeCollisionStrategy
{
        
/// If there is more than one instance of a key in the sequence to merge, use
        /// only the first value for the dictionary.
        case useFirst

/// If there is more than one instance of a key in the sequence to merge, use
        /// only the last value for the dictionary.
        case useLast

    }
    
init<S: Sequence
>(
        
merging keysAndValues
: S,
        
resolvingCollisionsWith strategy
: MergeCollisionStrategy
    )
    
// other merging overloads
}
In use, this overload would look similar to the functional version, but may aid in discoverability:

let firstWins = Dictionary(merging: duplicates, resolvingCollisionsWith: .useFirst
)

// ["b": 2, "a": 1]

let lastWins = Dictionary(merging: duplicates, resolvingCollisionsWith: .useLast
)

// ["b": 4, "a": 3]

2. Key-based subscript with default value

Another common challenge with dictionaries is iteratively making changes to key/value pairs that may or may not already be present. For example, to iteratively add count the frequencies of letters in a string, one might write something like the following:

let source = "how now brown cow"
var frequencies: [Character: Int] = [:
]

for c in source.characters
{
    
if frequencies[c] == nil
{
        frequencies[c]
= 1

    }
else
{
        frequencies[c]
! += 1

    }
}

Testing for nil and assigning through the force unwrapping operator are awkward at best for such a common operation. Furthermore, the Optional<Value> return type of the current keyed subscript complicates efficiencies that could be gained for this type of modify action under a future ownership model.

Proposed solution

A keyed subscript with a default value neatly simplifies this usage. Instead of checking for nil, one can pass the default value along with the key as a default subscript parameter.

let source = "how now brown cow"
var frequencies: [Character: Int] = [:
]

for c in source.characters
{
    frequencies[c,
default: 0] += 1

}

The return type of this subscript is a non-optional Value. Note that accessing the subscript as a getter never stores the default value in the dictionary—the following two lines are equivalent:

let x = frequencies["a", default: 0
]

let y = frequencies["a"] ?? 0

3. Dictionary-specific map and filter

The standard map and filter methods, while always useful and beloved, aren't ideal when applied to dictionaries. In both cases, the desired end-result is frequently another dictionary instead of an array of key-value pairs—even with the sequence-based initializer proposed above this is an inefficient way of doing things.

Additionally, the standard map method doesn't gracefully support passing a function when transforming only the values of a dictionary. The transform function must accept and return key/value pairs, necessitating a custom closure in nearly every case.

Assuming the addition of a sequence-based initializer, the current filter and map look like the following:

let numbers = ["one": 1, "two": 2, "three": 3, "four": 4
]

let evens = Dictionary(numbers.lazy.filter { $0.value % 2 == 0 })!
// ["four": 4, "two": 2]
let strings = Dictionary(numbers.lazy.map { (k, v) in (k, String(v)) })!
// ["three": "3", "four": "4", "one": "1", "two": "2"]

Proposed solution

This proposal adds two new methods for Dictionary:

  • A mapValues method that keeps a dictionary's keys intact while transforming the values. Mapping a dictionary's key/value pairs can't always produce a new dictionary, due to the possibility of key collisions, but mapping only the values can produce a new dictionary with the same underlying layout as the original.

let strings = numbers.mapValues(String.init
)

// ["three": "3", "four": "4", "one": "1", "two": "2"]
  • A Dictionary-returning filter method. While transforming the keys and values of a dictionary can result in key collisions, filtering the elements of a dictionary can at worst replicate the entire dictionary.

let evens = numbers.filter { $0.value % 2 == 0
}

// ["four": 4, "two": 2]
Both of these can be made significantly more efficient than their Sequence-sourced counterparts. For example, the mapValues method can simply copy the portion of the storage that holds the keys to the new dictionary before transforming the values.

4. Visible dictionary capacity

As you add elements to a dictionary, it automatically grows its backing storage as necessary. This reallocation is a significant operation—unlike arrays, where the existing elements can be copied to a new block of storage en masse, every key/value pair must be moved over individually, recalculating the hash value for the key to find its position in the larger backing buffer.

While dictionaries uses an exponential growth strategy to make this as efficient as possible, beyond the init(minimumCapacity:) initializer they do not expose a way to reserve a specific capacity. In addition, adding a key/value pair to a dictionary is guaranteed not to invalidate existing indices as long as the capacity doesn't change, yet we don't provide any way of seeing a dictionary's current or post-addition capacity.

Proposed solution

Dictionary should add a capacity property and a reserveCapacity(_:slight_smile: method, like those used in range-replaceable collections. The capacity of a dictionary is the number of elements it can hold without reallocating a larger backing storage, while calling reserveCapacity(_:slight_smile: allocates a large enough buffer to hold the requested number of elements without reallocating.

var numbers = ["one": 1, "two": 2, "three": 3, "four": 4
]
numbers.
capacity // 6
numbers.reserveCapacity(20
)
numbers.
capacity // 24
Because hashed collections use extra storage capacity to reduce the likelihood and cost of collisions, the value of the capacity property won't be equal to the actual size of the backing storage. Likewise, the capacity after calling reserveCapacity(_:slight_smile: will be at least as large as the argument, but usually larger. (In its current implementation, Dictionary always has a power of 2-sized backing storage.)

5. Dictionary.remove(at:) returns next index

The method dictionaries use to store their key/value pairs can make it challenging to sequentially remove elements in an efficient way. To demonstrate, considering the following hypothetical dictionary:

var dict = Dictionary<Int, Bool>(minimumCapacity: 5
)
dict[
3] = true

dict[
7] = true

dict[
1] = false
To add those three elements, dict performs the following steps:

  • Uses 3.hashValue to select a bucket, choosing bucket 5.
  • Stores 3 and true at position 5 in the key and value storage, respectively.
  • Uses 7.hashValue to select a bucket, choosing bucket 2.
  • Stores 7 and true at position 2.
  • Uses 1.hashValue to select a bucket, choosing bucket 5. Collision! Advances to the next open space, bucket 6.
  • Stores 1 and false at position 6.
With these three elements, we have a storage layout depicted by the table below—7 and 3 are in their ideal buckets, but 1 is not:

bucket 0 1 2 3 4 5 6 7
key 7 3 1
value T T F
To write an algorithm that removes each element from the dictionary, we would want to do something like this:

var i = dict.startIndex
while i != dict.endIndex
{
    
let next = dict.index(after
: i)
    dict.
remove(at
: i)
    i

next
}

This will remove (7, true) without incident, but when it removes (3, true), the dictionary will need to shift (1, false)back one slot so that it is found in bucket 5. This shift will invalidate the next index that has already been calculated. With the current index invalidation rules, there's no way to do this efficiently.

Proposed solution

If the remove(at:) method returns the next valid index, this kind of algorithm is possible. remove(at:) already returns the key/value pair that was removed, so this would change the return type to a tuple:

@discardableResult
mutating func remove(at index: Index) ->

    (
removed: (key: Key, value: Value), nextIndex: Index)
The code above can be rewritten as the following. When (1, false) is shifted back into bucket 5, there is no problem, since the method can return that same index.

var i = dict.startIndex
while i != dict.endIndex
{
    (
_, i) = dict.remove(at
: i)
}

This new capability could be used to implement an efficient in-place filter.

6. A grouped(by:) method for sequences

As a final Dictionary-related issue, grouping elements in a sequence by some computed key is a commonly requested addition that we can add as part of this omnibus proposal. Pass a closure that converts each value in a sequence to a hashable type T; the result of the method is a dictionary with keys of type T and values of type [Iterator.Element].

let names = ["Patti", "Aretha", "Anita", "Gladys"
]

// By first letter
names.grouped(by: { $0.characters.first!
})

// ["P": ["Patti"], "G": ["Gladys"], "A": ["Aretha", "Anita"]]

// By name length
names.grouped { $0.utf16.count
}

// [5: ["Patti", "Anita"], 6: ["Aretha", "Gladys"]]

7. Apply relevant changes to Set

As the Set and Dictionary types are similar enough to share large chunks of their implementations, it makes sense to look at which of these Dictionary enhancements can also be applied to Set. Of the symbols proposed above, the following additions would also be useful and appropriate for the Set type:

  • Add a Set-specific filter method that returns a new set—this would function essentially as a predicate-based intersection method.
  • Add a capacity property and reserveCapacity() method, for the reasons listed above.
  • Modify remove(at:) to return the index of the next entry, likewise.

Detailed design

With the exception of the proposed capacity property and method, the proposed additions to Dictionary, Set, and Sequence are available in this Swift Sandbox. Note that this prototype is not a proposed implementation; rather a way to try out the behavior of the proposed changes.

Collected in one place, these are the new APIs for Dictionary, Set, and Sequence:

struct Dictionary<Key: Hashable, Value
> {
    
typealias Element = (key: Key, value: Value
)
    
// existing declarations
    
/// Creates a new dictionary using the key/value pairs in the given sequence.
    /// If the given sequence has any duplicate keys, the result is `nil`.
    init?<S: Sequence>(_ keysAndValues: S) where S.Iterator.Element == Element

/// Creates a new dictionary using the key/value pairs in the given sequence,
    /// using a combining closure to determine the value for any duplicate keys.
    init<S: Sequence
>(
        
merging keysAndValues
: S,
        
resolvingCollisionsWith combine: (Value, Value) throws -> Value

    )
rethrows where S.Iterator.Element == Element

/// Merges the key/value pairs in the given sequence into the dictionary,
    /// using a combining closure to determine the value for any duplicate keys.
    mutating func merge<S: Sequence
>(
        
contentsOf other
: S,
        
resolvingCollisionsWith combine: (Value, Value) throws -> Value

    )
rethrows where S.Iterator.Element == Element

/// Returns a new dictionary created by merging the key/value pairs in the
    /// given sequence into the dictionary, using a combining closure to determine
    /// the value for any duplicate keys.
    func merged<S: Sequence
>(
        
with other
: S,
        
resolvingCollisionsWith combine: (Value, Value) throws -> Value

    )
rethrows -> [Key: Value] where S.Iterator.Element == Element

/// Accesses the element with the given key, or the specified default value,
    /// if the dictionary doesn't contain the given key.
    subscript(key: Key, default defaultValue: Value) -> Value { get set
}
    
/// Returns a new dictionary containing the key/value pairs that satisfy
    /// the given predicate.
    func filter(_ isIncluded: (Key, Value) throws -> Bool) rethrows -> [Key: Value
]
    
/// Returns a new dictionary containing the existing keys and the results of
    /// mapping the given closure over the dictionary's values.
    func mapValues<T>(_ transform: (Value) throws -> T) rethrows -> [Key
: T]

/// The number of key/value pairs that can be stored by the dictionary without
    /// reallocating storage.
    var capacity: Int { get
}
    
/// Ensures that the dictionary has enough storage for `capacity` key/value
    /// pairs.
    var reserveCapacity(_ capacity: Int
)
    
/// Removes the key/value pair at the specified index.
    ///
    /// If you use `remove(at:)` while iterating through the contents of a
    /// dictionary, continue iterating using the index returned as `nextIndex`.
    /// Calling this method invalidates all other previously existing indices.
    ///
    /// - Returns: A tuple containing the removed key/value pair and the index
    /// of the next pair in the dictionary.
    @discardableResult

mutating func remove(at index: Index) ->

        (
removed: (key: Key, value: Value), nextIndex: Index
)
}

extension Sequence
{
    
/// Returns a dictionary where the keys are the groupings returned by
    /// the given closure and the values are arrays of the elements that
    /// returned each specific key.
    func grouped<Key: Hashable
>(
        
by grouping: (Iterator.Element) throws -> Key

    )
rethrows -> [Key: [Iterator.Element
]]
}

struct Set<Element: Hashable
> {
    
// existing declarations

/// Returns a new set containing the elements that satisfy the given predicate.
    func filter(_ isIncluded: (Element) throws -> Bool) rethrows -> Set

/// The number of elements that can be stored by the set without
    /// reallocating storage.
    var capacity: Int { get
}
    
/// Ensures that the set has enough storage for `capacity` elements.
    var reserveCapacity(_ capacity: Int
)
    
/// Removes the element at the specified index.
    ///
    /// If you use `remove(at:)` while iterating through the contents of a
    /// set, continue iterating using the index returned as `nextIndex`.
    /// Calling this method invalidates all other previously existing indices.
    ///
    /// - Returns: A tuple containing the removed element and the index
    /// of the next element in the set.
    @discardableResult

mutating func remove(at index: Index) -> (removed: Element, nextIndex: Index
)
}

/// Returns its first argument.
func first<T>(_ a: T, _ b: T) ->
T

/// Returns its last argument.
func last<T>(_ a: T, _ b: T) -> T

Source compatibility

A significant majority of the proposed additions are purely additive and should impose no source compatibility burden. The modified return return type for the two remove(at:) methods, while additive in nature, will break source compatibility in the cases where the removed value is captured. In theory, a fix-it should be possible that would help in these cases.

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Xiaodi Wu) #3

<snip>

Nate, in general, I like these changes very much. They would definitely be
worthwhile additions. The two things I'd like to comment on are as follows:

First, I wonder whether either `first(_:_:)` and `last(_:_:)` or the
proposed alternative holds their weight. Neither is more succinct than `{
$0 }` or `{ $1 }`, which does what it says very clearly *and* don't require
any discovery. I don't think users, in the moment, would be stumped as to
how to write such a closure, nor that any reader would be confused about
what the closure does:

let firstWins = Dictionary(merging: duplicates, resolvingCollisionsWith: {
$0 })

Second, although the proposed `Dictionary.filter` would be an improvement
on its own, I wonder if it is as valuable after implementation of part 1.
That is, with only implementation of part 1, you'd have the ability to
write `Dictionary(numbers.filter {...})!` to get a dictionary after filter.

There are optimization opportunities left on the floor here, certainly, but
I'd expect `Dictionary(numbers.lazy.filter {...})!` could recover those
opportunities if it matters. I just worry about the API confusion that
comes from two `filter` operations that return different types, the impact
on current source, as well as its inconsistency with `map` and `reduce`.

···

On Fri, Mar 31, 2017 at 8:28 AM, Nate Cook via swift-evolution < swift-evolution@swift.org> wrote:

Hello everyone!

Here is a draft proposal to fill in some usability gaps in the standard
library Dictionary and Set types. I'd love to hear any feedback from the
list before submitting this as a PR.

Thanks!
Nate


(Zachary Waldowski) #4

Snipped quoted proposal so as to not break the mailing list… By and
large, I'm in favor. I like the potential for bringing up Dictionary and
Set up to the implementation quality of Array and String.

I don't think `init(merging:resolvingCollisionsWith:)` et al hold their
weight for inclusion in the language, unless some notable optimization
opportunity can be surfaced out of it. Users frustrated at the lack of a
`merge` in Swift want it to be opinionated and "just do the right
thing," i.e., the same way as whatever other language they most recently
used. A properly annotated and indented-to-be-beautiful use of the closure-
based version will take up the same amount of code as doing the merge by
hand with your custom condition.

Sincerely,

  Zachary Waldowski

  zach@waldowski.me


(David Hart) #5

Hello,

This proposal is looking great! But we need to move fast as the time for proposal is pretty much over. Just a few points:

SR-922

Just wanted to point out again what Nate alluded to in the proposal: all functions which take as argument a Sequence can cause weird compilation errors because of the way the type-system handles tuple labels:

let a = [1, 2, 3, 4, 5]
let b = a.map({ ($0.description, $0 * 2) })
let c = Dictionary(b)
// error: generic parameter 'Key' could not be inferred

And requires specifying the tuple labels to compile:

let a = [1, 2, 3, 4, 5]
let b = a.map({ (key: $0.description, value: $0 * 2) })
let c = Dictionary(b)

There is a bug report about this issue SR-922 <https://bugs.swift.org/browse/SR-922> I had missed it when reading the proposal initially, so just wanted to point people to it again.

first and last

I also don’t think that first and last don’t pull their weight as global functions and I’m not a fan of MergeCollisionStrategy either. I’m perfectly okay with using { $0 } and { $1 }.

Thanks!
David.


(Jason Gregori) #6

I really like the merging methods and have already needed to write my own.
Zach, do you mind showing a comparison of what you're thinking?

Nate, do you mind throwing this up in a gist or something? My email client
isn't letting me see the whole thing.

Thanks,
Jason

···

On Fri, Mar 31, 2017 at 11:52 AM Zach Waldowski via swift-evolution < swift-evolution@swift.org> wrote:

Snipped quoted proposal so as to not break the mailing list… By and large,
I'm in favor. I like the potential for bringing up Dictionary and Set up to
the implementation quality of Array and String.

I don't think `init(merging:resolvingCollisionsWith:)` et al hold their
weight for inclusion in the language, unless some notable optimization
opportunity can be surfaced out of it. Users frustrated at the lack of a
`merge` in Swift want it to be opinionated and "just do the right thing,"
i.e., the same way as whatever other language they most recently used. A
properly annotated and indented-to-be-beautiful use of the closure-based
version will take up the same amount of code as doing the merge by hand
with your custom condition.

Sincerely,
  Zachary Waldowski
  zach@waldowski.me

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Nate Cook) #7

Yep, here it is -- didn't realize the length would cause such problems!

  https://gist.github.com/natecook1000/4bd8f20736eb99ed3c5a4cdcc41e9a5f

Nate

···

On Apr 1, 2017, at 12:01 PM, Jason Gregori via swift-evolution <swift-evolution@swift.org> wrote:

I really like the merging methods and have already needed to write my own. Zach, do you mind showing a comparison of what you're thinking?

Nate, do you mind throwing this up in a gist or something? My email client isn't letting me see the whole thing.

Thanks,
Jason

On Fri, Mar 31, 2017 at 11:52 AM Zach Waldowski via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
Snipped quoted proposal so as to not break the mailing list… By and large, I'm in favor. I like the potential for bringing up Dictionary and Set up to the implementation quality of Array and String.

I don't think `init(merging:resolvingCollisionsWith:)` et al hold their weight for inclusion in the language, unless some notable optimization opportunity can be surfaced out of it. Users frustrated at the lack of a `merge` in Swift want it to be opinionated and "just do the right thing," i.e., the same way as whatever other language they most recently used. A properly annotated and indented-to-be-beautiful use of the closure-based version will take up the same amount of code as doing the merge by hand with your custom condition.

Sincerely,
  Zachary Waldowski
  zach@waldowski.me <mailto:zach@waldowski.me>

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Ricardo Parada) #8

It looks very useful. I did not quite get the role of the DictionaryLiteral but everything looks very nice.

···

On Apr 1, 2017, at 3:58 PM, Nate Cook via swift-evolution <swift-evolution@swift.org> wrote:

Yep, here it is -- didn't realize the length would cause such problems!

  https://gist.github.com/natecook1000/4bd8f20736eb99ed3c5a4cdcc41e9a5f

Nate

On Apr 1, 2017, at 12:01 PM, Jason Gregori via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

I really like the merging methods and have already needed to write my own. Zach, do you mind showing a comparison of what you're thinking?

Nate, do you mind throwing this up in a gist or something? My email client isn't letting me see the whole thing.

Thanks,
Jason

On Fri, Mar 31, 2017 at 11:52 AM Zach Waldowski via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
Snipped quoted proposal so as to not break the mailing list… By and large, I'm in favor. I like the potential for bringing up Dictionary and Set up to the implementation quality of Array and String.

I don't think `init(merging:resolvingCollisionsWith:)` et al hold their weight for inclusion in the language, unless some notable optimization opportunity can be surfaced out of it. Users frustrated at the lack of a `merge` in Swift want it to be opinionated and "just do the right thing," i.e., the same way as whatever other language they most recently used. A properly annotated and indented-to-be-beautiful use of the closure-based version will take up the same amount of code as doing the merge by hand with your custom condition.

Sincerely,
  Zachary Waldowski
  zach@waldowski.me <mailto:zach@waldowski.me>

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Ricardo Parada) #9

Some thoughts after reviewing it a second time:

I am not sure I like having global functions first(_:_:slight_smile: and last(_:_:). Would it be possible to default the closure to do what last(_:_:slight_smile: would do? I'm assuming the most popular thing to do is to have the values from the second dictionary prevail.

// These two would be equivalent to passing last(_:_:slight_smile: function
let newDict = dict1.merged(with: dict2)
let newDict = dict1.merged(with: dict2) { $1 }

If someone wanted to do what first(_:_:slight_smile: would do, or some other algorithm then it would be quite simple, wouldn't it? For example:

// This would be equivalent to passing in the first(_:_) function
let newDict = dict1.merged(with: dict2) { $0 }

// Combine values of common keys by adding them
let newDict = dict1.merged(with: dict2) { $0 + $1 }

···

On Apr 1, 2017, at 3:58 PM, Nate Cook via swift-evolution <swift-evolution@swift.org> wrote:

Yep, here it is -- didn't realize the length would cause such problems!

  https://gist.github.com/natecook1000/4bd8f20736eb99ed3c5a4cdcc41e9a5f

Nate

On Apr 1, 2017, at 12:01 PM, Jason Gregori via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

I really like the merging methods and have already needed to write my own. Zach, do you mind showing a comparison of what you're thinking?

Nate, do you mind throwing this up in a gist or something? My email client isn't letting me see the whole thing.

Thanks,
Jason

On Fri, Mar 31, 2017 at 11:52 AM Zach Waldowski via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
Snipped quoted proposal so as to not break the mailing list… By and large, I'm in favor. I like the potential for bringing up Dictionary and Set up to the implementation quality of Array and String.

I don't think `init(merging:resolvingCollisionsWith:)` et al hold their weight for inclusion in the language, unless some notable optimization opportunity can be surfaced out of it. Users frustrated at the lack of a `merge` in Swift want it to be opinionated and "just do the right thing," i.e., the same way as whatever other language they most recently used. A properly annotated and indented-to-be-beautiful use of the closure-based version will take up the same amount of code as doing the merge by hand with your custom condition.

Sincerely,
  Zachary Waldowski
  zach@waldowski.me <mailto:zach@waldowski.me>

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Zachary Waldowski) #10

Right-favoring:

    for (key, value) in rhs {

        lhs[key] = value

    }

Left-favoring:

    for (key, value) in rhs where lhs[key] == nil {

        lhs[key] = value

    }

Throws an error:

    for (key, value) in rhs {

        guard lhs[key] == nil else { throw Foo() }

        lhs[key] = value

    }

The barrier to inclusion in the stdlib should not merely be if someone's
had to write it before, otherwise there's no reason not to include your
Instagram-but-for-cats client full-on in the OS.

Doing these as needed yourself are not unreasonably complex or error-
prone — unlike, say, Unicode — but they _are_ pretty domain specific.

I'm personally not in favor of additions to the standard library that
have an associated Options or Strategy type. "Well," you may respond,
"we don't want to add a bunch of overloads, either! That'd be
confusing!" And that's exactly what I'm getting at - either way you
slice it, every addition to the stdlib has a concrete impact on the
mental model of the language.

Moreover, I am wary of language additions that can't (or we don't want
to) be modeled in terms of Collection, because we risk limiting the
growth of the latter as we get better generics features.

Best,

  Zachary Waldowski

  zach@waldowski.me

···

On Sat, Apr 1, 2017, at 12:01 PM, Jason Gregori via swift-evolution wrote:

I really like the merging methods and have already needed to write my
own. Zach, do you mind showing a comparison of what you're thinking?

Nate, do you mind throwing this up in a gist or something? My email
client isn't letting me see the whole thing.

Thanks,

Jason

On Fri, Mar 31, 2017 at 11:52 AM Zach Waldowski via swift-evolution > <swift-evolution@swift.org> wrote:

__

Snipped quoted proposal so as to not break the mailing list… By and
large, I'm in favor. I like the potential for bringing up Dictionary
and Set up to the implementation quality of Array and String.

I don't think `init(merging:resolvingCollisionsWith:)` et al hold
their weight for inclusion in the language, unless some notable
optimization opportunity can be surfaced out of it. Users frustrated
at the lack of a `merge` in Swift want it to be opinionated and "just
do the right thing," i.e., the same way as whatever other language
they most recently used. A properly annotated and indented-to-be-
beautiful use of the closure-based version will take up the same
amount of code as doing the merge by hand with your custom condition.

Sincerely,

  Zachary Waldowski

  zach@waldowski.me

_______________________________________________

swift-evolution mailing list

swift-evolution@swift.org

https://lists.swift.org/mailman/listinfo/swift-evolution

_________________________________________________

swift-evolution mailing list

swift-evolution@swift.org

https://lists.swift.org/mailman/listinfo/swift-evolution


(Jason Gregori) #11

Thanks. I agree that adding first, last, and merge strategy seems like overkill. But I still really like having the merge methods. For me it's something that comes up often on almost every project I work on.

···

On Apr 3, 2017, at 8:18 AM, Zach Waldowski via swift-evolution <swift-evolution@swift.org> wrote:

Right-favoring:

    for (key, value) in rhs {
        lhs[key] = value
    }

Left-favoring:

    for (key, value) in rhs where lhs[key] == nil {
        lhs[key] = value
    }

Throws an error:

    for (key, value) in rhs {
        guard lhs[key] == nil else { throw Foo() }
        lhs[key] = value
    }

The barrier to inclusion in the stdlib should not merely be if someone's had to write it before, otherwise there's no reason not to include your Instagram-but-for-cats client full-on in the OS.

Doing these as needed yourself are not unreasonably complex or error-prone — unlike, say, Unicode — but they _are_ pretty domain specific.

I'm personally not in favor of additions to the standard library that have an associated Options or Strategy type. "Well," you may respond, "we don't want to add a bunch of overloads, either! That'd be confusing!" And that's exactly what I'm getting at - either way you slice it, every addition to the stdlib has a concrete impact on the mental model of the language.

Moreover, I am wary of language additions that can't (or we don't want to) be modeled in terms of Collection, because we risk limiting the growth of the latter as we get better generics features.

Best,
  Zachary Waldowski
  zach@waldowski.me

On Sat, Apr 1, 2017, at 12:01 PM, Jason Gregori via swift-evolution wrote:
I really like the merging methods and have already needed to write my own. Zach, do you mind showing a comparison of what you're thinking?

Nate, do you mind throwing this up in a gist or something? My email client isn't letting me see the whole thing.

Thanks,
Jason

On Fri, Mar 31, 2017 at 11:52 AM Zach Waldowski via swift-evolution <swift-evolution@swift.org> wrote:

Snipped quoted proposal so as to not break the mailing list… By and large, I'm in favor. I like the potential for bringing up Dictionary and Set up to the implementation quality of Array and String.

I don't think `init(merging:resolvingCollisionsWith:)` et al hold their weight for inclusion in the language, unless some notable optimization opportunity can be surfaced out of it. Users frustrated at the lack of a `merge` in Swift want it to be opinionated and "just do the right thing," i.e., the same way as whatever other language they most recently used. A properly annotated and indented-to-be-beautiful use of the closure-based version will take up the same amount of code as doing the merge by hand with your custom condition.

Sincerely,
  Zachary Waldowski
  zach@waldowski.me

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution