Proposal: Python's list, generator, and dictionary comprehensions

If further discussion is desired on this topic, would folks mind moving it to a new thread? Previous participants from back in 2015 may be getting unwanted notifications/emails from this thread.

5 Likes

Swift has class inheritance because it was designed to inter-op with Objective-C. Hard to call ObjC a popular language.

Beyond that, I think you're being hyper-literal just to argue a point. If I had written that Swift does not take every feature from every popular language merely because it's popular, would that have been better understood?

NIH = Not Invented Here. It's an excuse often cited for why a group reinvents the metaphorical wheel.

Preface: I like Python, it's the first language I suggest for new aspiring programmers.

As someone who writes Python somewhat often, I do almost always prefer comprehensions over the alternatives, but only because the alternatives suck.

Compare this comprehension:

numbers = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

result = [n * 10 for n in numbers if n % 2 == 0]
result.sort() # Sort operates in-place and returns `None`, so you can't chain it.

print(result)

...with how you would need to write it if you used map/filter:

result = list(map(lambda n: n * 10, filter(lambda n: n % 2 == 0, numbers)))
result.sort()

There's a bunch of things that suck about it:

  1. map and filter are free functions (and not methods on something like iterable), so you need to nest them rather than chain them.
  2. The lambda keyword is kinda heavyweight, and there are no implicit parameter names.
  3. The result is a lazy iterable, which you need to copy into a list.
  4. Sort can't be chained.

The only other alternative I could think of is to use manual loops, but I don't have to explain why that's the suckiest of all.
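For reference, here is a small sketch putting the three spellings side by side. It leans on the built-in sorted(), which returns a new list and accepts any iterable, so the copy-then-sort step can be folded into one expression. The variable names are made up for illustration.

```python
numbers = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

# Comprehension: a generator expression fed straight into sorted().
via_comprehension = sorted(n * 10 for n in numbers if n % 2 == 0)

# map/filter: nested free functions, read from the inside out.
via_map_filter = sorted(
    map(lambda n: n * 10, filter(lambda n: n % 2 == 0, numbers))
)

# Manual loop: the most verbose of the three.
via_loop = []
for n in numbers:
    if n % 2 == 0:
        via_loop.append(n * 10)
via_loop.sort()  # in-place, returns None

print(via_comprehension)  # [20, 40, 60, 80, 100]
```

sorted() removes complaints 3 and 4 from the list, but the nesting and the lambda noise of the map/filter version remain.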

I think the Swift equivalent of this is simply better, full stop.

let numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

let result = numbers
    .filter { $0.isMultiple(of: 2) }
    .map { $0 * 10 }
    .sorted()
  1. There's a clear linear flow
  2. Unlike a comprehension, you can always know what's available to you by typing . and looking at the auto-completion results.

Another complaint about comprehensions is that the order of their syntax kinda jumps all over the place. Even for a seasoned Python dev, it can be a bit tricky to write complex comprehensions the first time without looking at a reference.

Consider even this simple example, and how the control flow jumps around:


result = { n: n*10 for n in numbers if n % 2 == 0}
#             ^ 3  ^ 1              ^ 2
# 1. First you iterate (in the middle)
# 2. Then your predicate is evaluated to filter (at the end)
# 3. Lastly you transform the value (at the start)
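The ordering described above can be observed directly. This is a minimal sketch, assuming two hypothetical helpers (trace and source) that only record which part of the comprehension ran; the expanded loop at the end is the reading order the comprehension actually follows.

```python
events = []

def trace(label, value):
    """Record which part of the comprehension ran, then pass the value through."""
    events.append(label)
    return value

def source(items):
    """Yield items one by one, recording each iteration step."""
    for item in items:
        events.append("iterate")
        yield item

numbers = [10, 9]

result = {
    n: trace("transform", n * 10)
    for n in source(numbers)
    if trace("filter", n % 2 == 0)
}

print(events)  # ['iterate', 'filter', 'transform', 'iterate', 'filter']

# The same logic as a plain loop, written in evaluation order.
expanded = {}
for n in numbers:          # 1. iterate
    if n % 2 == 0:         # 2. filter
        expanded[n] = n * 10  # 3. transform
```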
11 Likes

In your example here, doesn't the Python list comprehension evaluate the map/filter in just one combined loop, vs. the two loops for Swift's .filter and .map?

IIRC, that's an implementation detail that's left unspecified, but if you want lazy behaviour (which isn't always faster btw, two sequential loops can be faster than 1 combined one for some data sizes ... it's complicated), you just tack on one more word:

let result = numbers
    .lazy
    .filter { $0.isMultiple(of: 2) }
    .map { $0 * 10 }
    .sorted()
3 Likes
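On the Python side of the comparison, the difference between one combined pass and two sequential passes can be made visible with a log. This is only a sketch of Python's behaviour (map and filter return lazy iterators in Python 3); it says nothing about how Swift implements .lazy. The helper names are illustrative.

```python
numbers = [1, 2, 3, 4]
log = []

def is_even(n):
    log.append(f"filter {n}")
    return n % 2 == 0

def times_ten(n):
    log.append(f"map {n}")
    return n * 10

# Lazy pipeline: nothing runs until the result is consumed,
# and each element flows through filter and map before the next one starts.
lazy_pipeline = map(times_ten, filter(is_even, numbers))
ran_before_consuming = list(log)   # still empty at this point
result = list(lazy_pipeline)
lazy_log = list(log)

# Eager two-pass version: the whole filter finishes before any mapping starts.
log.clear()
evens = [n for n in numbers if is_even(n)]
eager = [times_ten(n) for n in evens]
eager_log = list(log)

print(lazy_log)   # ['filter 1', 'filter 2', 'map 2', 'filter 3', 'filter 4', 'map 4']
print(eager_log)  # ['filter 1', 'filter 2', 'filter 3', 'filter 4', 'map 2', 'map 4']
```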

It is worth bringing up other languages which do have for comprehensions that don't read so weirdly. I agree with @AlexanderM's writeup a lot here: the order in which one has to read a Python comprehension is pretty weird.

In Scala, a for comprehension is rather nice, and reads like this:

// numbers = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
// result = [n * 10 for n in numbers if n % 2 == 0]

val numbers = List(10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

// Scala 2
val result =
    for (n <- numbers if n % 2 == 0) yield n * 10

// Scala 3
val result = 
    for n <- numbers if n % 2 == 0
        yield n * 10

They can also nest nicely:

// Scala
def foo(n: Int, v: Int) =
   for i <- 0 until n
       j <- 0 until n if i + j == v
   yield (i, j)

foo(10, 10).foreach {
  (i, j) => println(s"($i, $j) ")  // prints (1, 9) (2, 8) (3, 7) (4, 6) (5, 5) (6, 4) (7, 3) (8, 2) (9, 1)
}

This also naturally extends to "everything is an expression": for comprehensions return values like that, the same way an if is also an expression, etc.
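For comparison, here is the same nested example as a Python comprehension. It is a direct transliteration of the Scala foo above; note that the result expression comes first, which is the reading-order complaint raised earlier in the thread.

```python
def foo(n, v):
    # Outer loop over i, inner loop over j, keep pairs whose sum is v.
    return [(i, j) for i in range(n) for j in range(n) if i + j == v]

pairs = foo(10, 10)
print(pairs)
# [(1, 9), (2, 8), (3, 7), (4, 6), (5, 5), (6, 4), (7, 3), (8, 2), (9, 1)]
```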

So if anything, I'd rather explore a direction of making for more powerful like that, since it already is quite similar to Swift's powerful for + where, if only it also was allowed to yield.

// just an idea
let numbers = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

let evenNumbers = 
  for (async) n in numbers where n % 2 == 0 { // can even be async etc,
    yield n 
  }
12 Likes

What is the type of evenNumbers here? A Sequence<Int> or a List<Int>?

Would this be allowed?

let evenNumbers = 
  for n in 0... where n % 2 == 0 {
    yield n 
  }

In Scala the type is determined basically by the map/flatMap/filter signatures involved. A for comprehension ends up just calling those methods on the underlying collection type, so the result type here would depend on what a map does on 0...n. So arguably, for your example of filtering over the ClosedRange<Int> it'd still be a ClosedRange<Int>.
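Python faces the same question and answers it with two separate syntaxes: square brackets build an eager list, parentheses build a lazy generator. A rough sketch of why the distinction matters for an unbounded source, using itertools.count as a stand-in for the 0... range (this illustrates Python only, not what the Swift idea above would do):

```python
from itertools import count, islice

# Generator expression: lazy, so an unbounded source is fine.
even_numbers = (n for n in count(0) if n % 2 == 0)

# Only the requested prefix is ever computed.
first_five = list(islice(even_numbers, 5))
print(first_five)  # [0, 2, 4, 6, 8]

# The eager spelling, [n for n in count(0) if n % 2 == 0],
# would never finish, because it tries to materialise every element.
```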

+1 on this! Comprehensions may not be fit for large production codebases where clarity is paramount, but specifically in the realm of numerical programming and ML, they're almost necessary. There's a vision for Swift to become a top-tier language for ML and numerics, but without comprehensions, it's still way too tedious to preprocess matrix and tensor data to feed into a model. This is especially apparent when experimenting in playgrounds as a stand-in for, say, Jupyter, where clarity is less of an issue.

Again, map and filter do exactly that. It's just different (and imo clearer) syntax.

1 Like

I think that this is an example of the tension between reading and writing code. Comprehensions are easy to abuse to make code highly unreadable (though I would argue that when used judiciously, they can be more readable than even "map" and "filter"). In most cases, readability is most important, so you might want to avoid comprehensions. However, there are some contexts in which the ease of writing, say, complex matrix manipulation is more important than readability. That's where comprehensions are invaluable. One of those contexts is the Jupyter notebook (or, in the Swift ecosystem, the playground). Right now, it's far easier to do this in Python, which is a problem if we want to make Swift the go-to language for numerics.

Can you provide an example? I fail to see how complex matrix manipulation would be any easier to do in Swift using comprehensions.

Here's an example of a dictionary comprehension from real-world ML training code:

input_token_map = {word: index for index, word in enumerate(input_words)}

The shortest, cleanest Swift equivalent today would probably be this:

let inputTokenMap = Dictionary(uniqueKeysWithValues: zip(inputWords, inputWords.indices))

This depends on a relatively niche, specialized initializer for Dictionary that isn't easily generalizable to other collection manipulations. The benefit of the comprehension is that you can use the same mechanism to do all sorts of complex transformations, which reduces cognitive overhead and iteration time. The code that you end up writing isn't "good" in a production sense, but in the very specific use-case that I'm describing, flexibility and low iteration time are generally more important than readability and maintainability.

Whether it's the goal of the Swift project to introduce language features for this niche use-case at the risk of adding what amounts to a harmful crutch most of the time is a different question. The existence of a numerics working group suggests that it's indeed a valued goal.

You sort of forgot Swift also has an enumerated() method, so the equivalent code is let inputTokenMap = Dictionary(uniqueKeysWithValues: inputWords.enumerated()). I know this is still different, but it's still just as concise as the Python comprehension.

enumerated() doesn't work here because it generates pairs of the form (offset, element). The Dictionary initializer expects pairs of the form (key, value), and in this scenario, we want the element to be the key and the offset to be the value.
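The same pair-order issue exists in Python, which makes the point easy to check. A minimal sketch with made-up sample words: passing enumerate straight to dict maps index to word, and the comprehension is what swaps the pair.

```python
input_words = ["the", "quick", "brown", "fox"]

# enumerate yields (index, word), so dict() keys by index.
index_to_word = dict(enumerate(input_words))

# The comprehension from the thread swaps the pair: word becomes the key.
word_to_index = {word: index for index, word in enumerate(input_words)}

print(index_to_word)  # {0: 'the', 1: 'quick', 2: 'brown', 3: 'fox'}
print(word_to_index)  # {'the': 0, 'quick': 1, 'brown': 2, 'fox': 3}
```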

3 Likes

I always found the initializers on Dictionary to be pretty clunky. I make myself keyed(by:) and grouped(by:) extensions in all my projects, so I would write this as:

let inputTokenMap = zip(inputWords, inputWords.indices).keyed(by: \.0)
5 Likes

You could use this:

let inputTokenMap = words.enumerated().reduce(into: [:]) { $0[$1.element] = $1.offset }

or this:

let inputTokenMap = words.indices.reduce(into: [:]) { $0[words[$1]] = $1 }

which both use a very generalizable approach.

Swift already has ways to write concise, not-super-easily-readable code for this kind of manipulation. I feel like there is no need to add Python's.

I won't die on the hill of Python-style comprehensions. I think that there's a case to be made for them in some scenarios, but the downsides are obvious.

That said, a lack of readability is obviously not a goal for its own sake; rather, it's more of an acceptable trade-off in certain niches. I still think that input_token_map = {word: index for index, word in enumerate(input_words)} is both more efficient (conceptually, not computationally) and, indeed, more readable than either of your reduce(into:_:) suggestions.

It sounds like you don't think that the added benefits of comprehensions are sufficient to warrant introducing new syntax. I can respect that position.

I agree that the Python code in your post is more readable than the Swift code. However, I think this is due to a lack of API coverage in the Swift standard library for this use case, not because of anything inherent to comprehensions.

If you were to define a makeDictionary method on Sequence like this:

extension Sequence {
  func makeDictionary<K, V>(
    uniqueKeys keys: (Element) -> K,
    values: (Element) -> V
  ) -> Dictionary<K, V> {
    return withoutActuallyEscaping(keys) { keys in
      return withoutActuallyEscaping(values) { values in
        return Dictionary(
          uniqueKeysWithValues: self
            .lazy
            .map { (keys($0), values($0)) })
      }
    }
  }
}

then the Python code

input_token_map = {word: index for index, word in enumerate(input_words)}

could be expressed in Swift as

let inputTokenMap = inputWords
  .enumerated()
  .makeDictionary(uniqueKeys: \.element, values: \.offset)

which seems (to me) just as conceptually efficient and easy to manipulate as the Python code. I also think it's easier to read, since (1) word and index don't need to be declared in the statement, and (2) the chained functions are evaluated from left to right and top to bottom, just like regular English.

This demonstrates one of the things that I think makes methods on Sequence much better than comprehensions ā€” you can define your own methods. Unlike working with comprehensions, you aren't limited to what the language gives you.
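The "define your own helper" argument can be sketched on the Python side too. make_dictionary below is a hypothetical function mirroring the Swift extension above, not an existing library call; raising ValueError on a repeated key is my assumption, chosen to mimic the requirement of Dictionary(uniqueKeysWithValues:) that keys be unique.

```python
def make_dictionary(items, unique_keys, values):
    """Build a dict from items using one function for keys and one for values.

    Hypothetical helper; raises ValueError if two items produce the same key.
    """
    result = {}
    for item in items:
        key = unique_keys(item)
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = values(item)
    return result

input_words = ["the", "quick", "brown", "fox"]

# enumerate yields (index, word) pairs: pair[1] is the word, pair[0] the index.
input_token_map = make_dictionary(
    enumerate(input_words),
    unique_keys=lambda pair: pair[1],
    values=lambda pair: pair[0],
)

print(input_token_map)  # {'the': 0, 'quick': 1, 'brown': 2, 'fox': 3}
```

Being a free function, it cannot be chained the way the Swift method can, which is the same limitation noted for map and filter earlier in the thread.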

1 Like

This looks to me to be just yet another example of Swift not allowing for applying functions at the end of a chain.

Reversing a 2-element tuple is simple enough to understand that I'd do that too here instead of doing

{ ($0.element, $0.index) }

import Algorithms

inputWords.indexed().lazy.map(reverse)
  … Dictionary.init(uniqueKeysWithValues:)

infix operator …

public func … <Value, Transformed>(
  instance: Value,
  transform: (Value) -> Transformed
) -> Transformed {
  transform(instance)
}

/// Reverse the order of the elements in the tuple.
@inlinable public func reverse<T0, T1>(_ t0: T0, _ t1: T1) -> (T1, T0) {
  (t1, t0)
}
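The idea of applying a function at the end of a chain can be approximated in Python with a small helper. This is a sketch only: pipe and reverse are hypothetical names written for this example, standing in for the custom operator and the tuple-reversing function above.

```python
def pipe(value, *transforms):
    """Feed value through each transform in order, left to right."""
    for transform in transforms:
        value = transform(value)
    return value

def reverse(pair):
    """Reverse the order of the elements in a 2-tuple."""
    first, second = pair
    return (second, first)

input_words = ["the", "quick", "brown", "fox"]

input_token_map = pipe(
    enumerate(input_words),             # (index, word) pairs
    lambda pairs: map(reverse, pairs),  # (word, index) pairs
    dict,                               # applied last, at the end of the chain
)

print(input_token_map)  # {'the': 0, 'quick': 1, 'brown': 2, 'fox': 3}
```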