With this post, I hope to open a discussion of the design requirements for a library, similar to Python's pprint, that could eventually be incorporated into the standard library and inform the design of many parts of the Swift ecosystem.
Introduction
There are many contexts, from educational/research tools like Playgrounds and Colab Notebooks to industrial programming activities like debugging and logging, in which it's important to be able to easily visualize and understand Swift data structures. For consumption by actual humans, though, Swift's facilities for formatting data leave a lot to be desired. Take a trivial example:
(0..<10).map { Array($0..<10) + (0..<$0) }
If you print this expression, you'll see:
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5, 6, 7, 8, 9, 0], [2, 3, 4, 5, 6, 7, 8, 9, 0, 1], [3, 4, 5, 6, 7, 8, 9, 0, 1, 2], [4, 5, 6, 7, 8, 9, 0, 1, 2, 3], [5, 6, 7, 8, 9, 0, 1, 2, 3, 4], [6, 7, 8, 9, 0, 1, 2, 3, 4, 5], [7, 8, 9, 0, 1, 2, 3, 4, 5, 6], [8, 9, 0, 1, 2, 3, 4, 5, 6, 7], [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]]
Now, if you happen to be in a context like a terminal where you get line-wrapping, this might be enough to give you a sense of what's going on. If the data were any longer, though, it would be a disaster.
Evaluate the expression in the REPL, and you get 122 lines of even less useful output:
$R0: [[Int]] = 10 values {
  [0] = 10 values {
    [0] = 0
    [1] = 1
    [2] = 2
    [3] = 3
    [4] = 4
    [5] = 5
    [6] = 6
    [7] = 7
    [8] = 8
    [9] = 9
  }
  [1] = 10 values {
    [0] = 1
    [1] = 2
    [2] = 3
    ...
The representations we get from LLDB, in the GUI and in the output of p or po, are similarly frustrating (the GUI is actually the worst for visualization: it makes me click 11 triangles to reveal the data). I usually end up typing p print(x) in the debugger to get something I can actually digest. Playgrounds? Don't get me started; the workaround is similar but far more necessary.
For contrast, now fire up ipython from the command line and evaluate the corresponding expression:
In [3]: [list(range(x, 10)) + list(range(0, x)) for x in range(0, 10)]
Out[3]:
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 0],
 [2, 3, 4, 5, 6, 7, 8, 9, 0, 1],
 [3, 4, 5, 6, 7, 8, 9, 0, 1, 2],
 [4, 5, 6, 7, 8, 9, 0, 1, 2, 3],
 [5, 6, 7, 8, 9, 0, 1, 2, 3, 4],
 [6, 7, 8, 9, 0, 1, 2, 3, 4, 5],
 [7, 8, 9, 0, 1, 2, 3, 4, 5, 6],
 [8, 9, 0, 1, 2, 3, 4, 5, 6, 7],
 [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]]
Brilliant! Not only can I take the whole data structure in at a glance, but I can see the relationships between adjacent rows.
Naturally, not every data structure is this simple to format usefully, but surely we can aspire to do much better than we do today.
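To make the target concrete, here is a minimal sketch of the simplest version of that behavior in Swift: print a nested array on one line if it fits, otherwise one row per line. The function name prettyDescription and the width parameter are hypothetical, invented for illustration; this is not an existing API and handles only two levels of nesting.

```swift
/// Hypothetical helper (not an existing API): returns the usual one-line
/// description if it fits within `width` characters, otherwise puts each
/// inner array on its own line.
func prettyDescription<T>(_ rows: [[T]], width: Int = 80) -> String {
    let flat = "\(rows)"
    if flat.count <= width { return flat }
    // Continuation lines get one leading space so that every row starts
    // in the same column as the first row (which follows the outer "[").
    return "[" + rows.map { "\($0)" }.joined(separator: ",\n ") + "]"
}

// The example data from the introduction.
let table = (0..<10).map { Array($0..<10) + Array(0..<$0) }
print(prettyDescription(table))
```

A real library would need to recurse into arbitrary types and choose line breaks more carefully; the sketch only shows that the basic width check is cheap.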
The Pitch
To be clear, I'm not proposing to change anything in the tools, language, or standard library in the near term; there's far too much to be explored and proven out in a separate package first. I propose to design a library that serves the same purpose as pprint, which is what ipython uses to generate the result above. I believe, if we get it right, similar principles could eventually be applied to improve standard library print, debuggers, playgrounds, and the REPL (of course, if Apple wants to run with some of these ideas and begin improving tools earlier, I'm sure nobody will complain).
As far as I know, there's no existing Swift library that serves the purpose. The point of this thread is to discuss the design of such a library, which features are essential, and what problems need to be solved. I'll kick it off with a few things to think about:
- References: Ideally one never formats an instance twice. When an instance is referenced from multiple points in a data structure, how is it presented?
- Abbreviation: in my use cases, data can be extremely large, and it's sometimes important to see the “shape” of the data without all of the detail.
  - How can we effectively abbreviate an array?
    - I might want to see the first few elements and then an ellipsis.
    - I'd still want to know the length; how do I print that?
  - I think this approach generalizes to most types
    - Are there data structures that need different treatment?
- Columns: is it important to see data structures on separate lines organized into columns?
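As a starting point for the abbreviation question, here is a minimal sketch of one possible answer: show the first few elements, then an ellipsis followed by the total count. The name abbreviatedDescription, the limit parameter, and the "(N total)" notation are all assumptions made up for illustration, not a proposed design.

```swift
/// Hypothetical helper (not an existing API): shows at most `limit` leading
/// elements; if anything was omitted, appends an ellipsis and the total
/// element count so the reader still learns the size of the collection.
func abbreviatedDescription<C: Collection>(_ items: C, limit: Int = 3) -> String {
    let shown = items.prefix(limit).map { "\($0)" }.joined(separator: ", ")
    if items.count <= limit { return "[\(shown)]" }
    return "[\(shown), ... (\(items.count) total)]"
}

print(abbreviatedDescription(Array(0..<1000)))
// prints "[0, 1, 2, ... (1000 total)]"
```

Because it is written against Collection, the same sketch applies unchanged to sets, dictionaries, and strings, though whether the leading elements are the most informative ones for those types is exactly the kind of question raised above.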
Thanks for your attention,
Dave