Proposal: Python's list, generator, and dictionary comprehensions

Jumhyn · October 22, 2022, 6:25pm

If further discussion is desired on this topic, would folks mind moving it to a new thread? Previous participants from back in 2015 may be getting unwanted notifications/emails from this thread.

Avi · October 22, 2022, 6:28pm

Swift has class inheritance because it was designed to inter-op with Objective-C. Hard to call ObjC a popular language.

Beyond that, I think you're being hyper-literal just to argue a point. If I had written that Swift does not take every feature from every popular language merely because it's popular, would that have been better understood?

NIH = Not Invented Here. It's an excuse often cited for why a group reinvents the metaphorical wheel.

AlexanderM · October 23, 2022, 1:47pm

Preface: I like Python; it’s the first language I suggest to new aspiring programmers.

As someone who writes Python somewhat often, I do almost always prefer comprehensions over the alternatives, but only because the alternatives suck.

Compare this comprehension:

numbers = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

result = [n * 10 for n in numbers if n % 2 == 0]
result.sort() # Sort operates in-place and returns `None`, so you can't chain it.

print(result)

...with how you would need to write it if you used map/filter:

result = list(map(lambda n: n * 10, filter(lambda n: n % 2 == 0, numbers)))
result.sort()

There's a bunch of things that suck about it:

map and filter are free functions (and not methods on something like an iterable), so you need to nest them rather than chain them.
The lambda keyword is kinda heavyweight, and there are no implicit parameter names.
The result is a lazy iterable, which you need to copy into a list.
Sort can't be chained

The only other alternative I could think of is to use manual loops, but I don't have to explain why that's the suckiest of all.

I think the Swift equivalent of this is simply better, full stop.

let numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

let result = numbers
    .filter { $0.isMultiple(of: 2) }
    .map { $0 * 10 }
    .sorted()

There's a clear linear flow
Unlike a comprehension, you can always know what's available to you by typing . and looking at the auto-completion results.

Another complaint about comprehensions is that the order of their syntax kinda jumps all over the place. Even for a seasoned Python dev, it can be a bit tricky to write a complex comprehension correctly the first time without looking at a reference.

Consider even this simple example, and how the control flow jumps around:


result = { n: n*10 for n in numbers if n % 2 == 0}
#             ^ 3  ^ 1              ^ 2
# 1. First you iterate (in the middle)
# 2. Then your predicate is evaluated to filter (at the end)
# 3. Lastly you transform the value (at the start)
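
For comparison, here is a minimal sketch of the same dictionary built with a Swift for-in loop and a where clause, where the clauses appear in the order they are evaluated. The names numbers and result simply mirror the Python example; this is an illustration, not a proposed syntax.

```swift
let numbers = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

var result: [Int: Int] = [:]

// 1. iterate, 2. filter, 3. transform: source order matches evaluation order.
for n in numbers where n % 2 == 0 {
    result[n] = n * 10
}

// Dictionary order is unspecified, so sort the keys before printing.
print(result.keys.sorted())  // [2, 4, 6, 8, 10]
```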

ebg · October 23, 2022, 2:37pm

In your example here, doesn't the Python list comprehension evaluate the map/filter in just one combined loop, vs. the two loops for Swift's .filter and .map?

AlexanderM · October 23, 2022, 4:06pm

IIRC, that's an implementation detail that's left unspecified, but if you want lazy behaviour (which isn't always faster btw, two sequential loops can be faster than 1 combined one for some data sizes ... it's complicated), you just tack on one more word:

let result = numbers
    .lazy
    .filter { $0.isMultiple(of: 2) }
    .map { $0 * 10 }
    .sorted()
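
The difference in evaluation order can be observed with a small sketch. The eagerLog and lazyLog arrays are hypothetical instrumentation added for illustration, and the lazy pipeline is driven by a plain for-in loop so that each closure runs once per element.

```swift
let numbers = [1, 2, 3, 4]

// Eager: filter visits every element (building an intermediate array)
// before map starts.
var eagerLog: [String] = []
let eager = numbers
    .filter { n -> Bool in eagerLog.append("filter \(n)"); return n.isMultiple(of: 2) }
    .map { n -> Int in eagerLog.append("map \(n)"); return n * 10 }
// eagerLog: filter 1, filter 2, filter 3, filter 4, map 2, map 4

// Lazy: nothing runs until the pipeline is iterated, and then the
// filter and map steps interleave element by element.
var lazyLog: [String] = []
let pipeline = numbers.lazy
    .filter { n -> Bool in lazyLog.append("filter \(n)"); return n.isMultiple(of: 2) }
    .map { n -> Int in lazyLog.append("map \(n)"); return n * 10 }

var lazyResult: [Int] = []
for value in pipeline {
    lazyResult.append(value)
}
// lazyLog: filter 1, filter 2, map 2, filter 3, filter 4, map 4
```

Both versions produce [20, 40]; only the order in which the closures run differs.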

ktoso · October 24, 2022, 5:04am

It is worth bringing up other languages which do have for comprehensions that don't read so weirdly. I agree with @AlexanderM's writeup a lot here: the order in which one has to read a Python comprehension is pretty weird.

In Scala, a for comprehension is rather nice, and reads like this:

// numbers = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
// result = [n * 10 for n in numbers if n % 2 == 0]

val numbers = List(10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

// Scala 2
val result =
    for (n <- numbers if n % 2 == 0)
        yield n * 10

// Scala 3
val result =
    for n <- numbers if n % 2 == 0
    yield n * 10

They can also nest nicely:

// Scala
def foo(n: Int, v: Int) =
   for i <- 0 until n
       j <- 0 until n if i + j == v
   yield (i, j)

foo(10, 10).foreach {
  (i, j) => print(s"($i, $j) ")  // prints (1, 9) (2, 8) (3, 7) (4, 6) (5, 5) (6, 4) (7, 3) (8, 2) (9, 1)
}

This also naturally extends to the "everything is an expression" approach, where for comprehensions return values like that, the same way an if is also an expression, etc.

So if anything, I'd rather explore a direction of making for more powerful like that, since it is already quite similar to Swift's powerful for + where, if only that were also allowed to yield.

// just an idea
let numbers = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

let evenNumbers = 
  for (async) n in numbers where n % 2 == 0 { // can even be async etc.
    yield n 
  }

Jean-Daniel · October 24, 2022, 7:31pm

What is the type of evenNumbers here? A Sequence<Int> or a List<Int>?

Would this be allowed?

let evenNumbers = 
  for n in 0... where n % 2 == 0 {
    yield n 
  }

ktoso · October 24, 2022, 10:06pm

In Scala the type is determined basically by the map/flatMap/filter signatures involved. A for comprehension ends up just calling those methods on the underlying collection type, so the result type here would depend on what a map does on 0...n. So arguably, for your example of filtering over the ClosedRange<Int> it'd still be a ClosedRange<Int>.
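
As a rough illustration of that desugaring, the nested Scala comprehension from the earlier post could be written by hand in Swift with flatMap, filter, and map. The function name pairs is made up for this sketch. Note that in Swift today these eager methods return Array, so the sketch does not preserve the source collection type the way the Scala translation can.

```swift
// Hand-written analogue of:
//   for i <- 0 until n
//       j <- 0 until n if i + j == v
//   yield (i, j)
func pairs(n: Int, v: Int) -> [(Int, Int)] {
    (0..<n).flatMap { i -> [(Int, Int)] in   // outer generator
        (0..<n)
            .filter { j in i + j == v }      // guard
            .map { j in (i, j) }             // yield
    }
}

let result = pairs(n: 10, v: 10)
print(result.count)  // 9
```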

Gerzer · November 23, 2022, 7:08am

+1 on this! Comprehensions may not be fit for large production codebases where clarity is paramount, but specifically in the realm of numerical programming and ML, they’re almost necessary. There’s a vision for Swift to become a top-tier language for ML and numerics, but without comprehensions, it’s still way too tedious to preprocess matrix and tensor data to feed into a model. This is especially apparent when experimenting in playgrounds as a stand-in for, say, Jupyter, where clarity is less of an issue.

abdel-17 · November 26, 2022, 1:36am

Again, map and filter do exactly that. It’s just different (and imo clearer) syntax.

Gerzer · November 26, 2022, 3:58am

I think that this is an example of the tension between reading and writing code. Comprehensions are easy to abuse to make code highly unreadable (though I would argue that when used judiciously, they can be more readable than even “map” and “filter”). In most cases, readability is most important, so you might want to avoid comprehensions. However, there are some contexts in which the ease of writing, say, complex matrix manipulation is more important than readability. That’s where comprehensions are invaluable. One of those contexts is the Jupyter notebook (or, in the Swift ecosystem, the playground). Right now, it’s far easier to do this in Python, which is a problem if we want to make Swift the go-to language for numerics.

abdel-17 · November 26, 2022, 11:44am

Can you provide an example? I fail to see how complex matrix manipulation would be any easier to do in Swift using comprehensions.

Gerzer · December 13, 2022, 12:17am

Here’s an example of a dictionary comprehension from real-world ML training code:

input_token_map = {word: index for index, word in enumerate(input_words)}

The shortest, cleanest Swift equivalent today would probably be this:

let inputTokenMap = Dictionary(uniqueKeysWithValues: zip(inputWords, inputWords.indices))

This depends on a relatively niche, specialized initializer for Dictionary that isn’t easily generalizable to other collection manipulations. The benefit of the comprehension is that you can use the same mechanism to do all sorts of complex transformations, which reduces cognitive overhead and iteration time. The code that you end up writing isn’t “good” in a production sense, but in the very specific use-case that I’m describing, flexibility and low iteration time are generally more important than readability and maintainability.

Whether it’s the goal of the Swift project to introduce language features for this niche use-case at the risk of adding what amounts to a harmful crutch most of the time is a different question. The existence of a numerics working group suggests that it’s indeed a valued goal.

abdel-17 · December 13, 2022, 12:25am

You sort of forgot that Swift also has an enumerated() method, so the equivalent code is let inputTokenMap = Dictionary(uniqueKeysWithValues: inputWords.enumerated()). I know this is still different, but it’s still just as concise as the Python comprehension.

Gerzer · December 13, 2022, 1:27am

enumerated() doesn’t work here because it generates pairs of the form (index, element). The Dictionary initializer expects pairs of the form (key, value), and in this scenario, we want the element to be the key and the index to be the value.
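
One way to make the enumerated() version work is to swap each pair explicitly. This is a minimal sketch with made-up sample words; like the zip version, it traps at runtime if a word appears more than once.

```swift
let inputWords = ["the", "quick", "brown", "fox"]

// enumerated() yields (offset, element); swap each pair so the word
// becomes the key and its position becomes the value.
let inputTokenMap = Dictionary(
    uniqueKeysWithValues: inputWords.enumerated().map { ($0.element, $0.offset) }
)

print(inputTokenMap["brown"] ?? -1)  // 2
```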

AlexanderM · December 13, 2022, 2:38am

I always found the initializers on Dictionary to be pretty clunky. I make myself keyed(by:) and grouped(by:) extensions in all my projects, so I would write this as:

let inputTokenMap = zip(inputWords, inputWords.indices).keyed(by: \.0)
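
The keyed(by:) extension itself isn't shown in the post. One plausible shape, purely as an assumption, is sketched below; the duplicate-key behaviour (last element wins) is also a guess. Under this definition the values are whole (word, index) pairs, so a mapValues step is added to get from word to index.

```swift
extension Sequence {
    /// Hypothetical helper: builds a dictionary keyed by `key(element)`.
    /// Assumption: when two elements share a key, the later one wins.
    func keyed<Key: Hashable>(by key: (Element) -> Key) -> [Key: Element] {
        var result: [Key: Element] = [:]
        for element in self {
            result[key(element)] = element
        }
        return result
    }
}

let inputWords = ["the", "quick", "fox"]

// Each value here is a (word, index) pair, keyed by the word.
let pairsByWord = zip(inputWords, inputWords.indices).keyed(by: { $0.0 })

// Keep only the index from each pair.
let inputTokenMap = pairsByWord.mapValues { $0.1 }
```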

tevamerlin · December 14, 2022, 9:14am

Gerzer:

The shortest, cleanest Swift equivalent today would probably be this:
let inputTokenMap = Dictionary(uniqueKeysWithValues: zip(inputWords, inputWords.indices))
This depends on a relatively niche, specialized initializer for Dictionary that isn’t easily generalizable to other collection manipulations.

You could use this:

let inputTokenMap = inputWords.enumerated().reduce(into: [:]) { $0[$1.element] = $1.offset }

or this:

let inputTokenMap = inputWords.indices.reduce(into: [:]) { $0[inputWords[$1]] = $1 }

which both use a very generalizable approach.

Swift already has ways to write concise, not-super-easily-readable code for this kind of manipulation. I feel like there is no need to add Python’s.

Gerzer · December 19, 2022, 6:04pm

I won’t die on the hill of Python-style comprehensions. I think that there’s a case to be made for them in some scenarios, but the downsides are obvious.

That said, a lack of readability is obviously not a goal for its own sake; rather, it’s more of an acceptable trade-off in certain niches. I still think that input_token_map = {word: index for index, word in enumerate(input_words)} is both more efficient (conceptually, not computationally) and, indeed, more readable than either of your reduce(into:_:) suggestions.

It sounds like you don’t think that the added benefits of comprehensions are sufficient to warrant introducing new syntax. I can respect that position.

1-877-547-7272 · December 20, 2022, 11:39am

I agree that the Python code in your post is more readable than the Swift code. However, I think this is due to a lack of API coverage in the Swift standard library for this use case — not because of anything inherent to comprehensions.

If you were to define a makeDictionary method on Sequence like this:

extension Sequence {
  func makeDictionary<K: Hashable, V>(
    uniqueKeys keys: (Element) -> K,
    values: (Element) -> V
  ) -> Dictionary<K, V> {
    return withoutActuallyEscaping(keys) { keys in
      return withoutActuallyEscaping(values) { values in
        return Dictionary(
          uniqueKeysWithValues: self
            .lazy
            .map { (keys($0), values($0)) })
      }
    }
  }
}

then the Python code

input_token_map = {word: index for index, word in enumerate(input_words)}

could be expressed in Swift as

let inputTokenMap = inputWords
  .enumerated()
  .makeDictionary(uniqueKeys: \.element, values: \.offset)

which seems (to me) just as conceptually efficient and easy to manipulate as the Python code. I also think it's easier to read, since (1) word and index don't need to be declared in the statement, and (2) the chained functions are evaluated from left to right and top to bottom, just like regular English.

This demonstrates one of the things that I think makes methods on Sequence much better than comprehensions — you can define your own methods. Unlike working with comprehensions, you aren't limited to what the language gives you.

anon9791410 · December 20, 2022, 2:24pm

This looks to me to be just yet another example of Swift not allowing for applying functions at the end of a chain.

Reversing a 2-element tuple is simple enough to understand that I'd do that too here instead of doing

{ ($0.element, $0.index) }

import Algorithms

inputWords.indexed().lazy.map(reverse)
  … Dictionary.init(uniqueKeysWithValues:)

infix operator …

public func … <Value, Transformed>(
  instance: Value,
  transform: (Value) -> Transformed
) -> Transformed {
  transform(instance)
}

/// Reverse the order of the elements in the tuple.
@inlinable public func reverse<T0, T1>(_ t0: T0, _ t1: T1) -> (T1, T0) {
  (t1, t0)
}