A new SIL transform needs to correctly handle all potential SIL programs without crashing or miscompiling (or breaking OSSA). @Michael_Gottesman @Andrew_Trick
Differentiating function bodies (as part of differentiable programming in Swift) needs to work for all supported Swift function bodies (represented in syntax trees and the AST, and eventually SIL and LLVM IR). @rxwei
Question
How do I test language coverage?
I can do my best to find bugs and check in unit tests to prevent regressions. Or I can exercise some foresight, preemptively think of untested buggy edge cases, and check in unit tests for those. Or I can add integration tests based on popular Swift community packages. Or I can...
There are many technology options to solve this testing use case. Which is best? Example criteria:
Flexibility and expressivity (what can I do with this tech)
Practicality (effort of implementing and maintaining it)
Maintainability (modularity, separation of concerns, is it written as a SwiftPM package)
Right now, we have the following technologies:
Swift Source Compatibility Suite: highly pragmatic and maintainable. Not expressive; limited to regression testing (it only works with existing code and doesn't preemptively discover bugs in new code).
Swift Evolve: pragmatic and maintainable. Very limited expressivity so far.
Naive fuzzing: see @practicalswift and their fuzzing tests. It mostly catches garbage Swift code that crashes the parser or name lookup.
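To make "naive" concrete, here is roughly what such a fuzzer amounts to (the token list and loop below are my own illustration, not @practicalswift's setup): random token soup, fed to the compiler. It should be clear why most findings stop at the parser or name lookup.

```python
import random

# Illustrative sketch of "naive" fuzzing: emit a random soup of Swift
# tokens. Almost every output is garbage that, at best, exercises the
# parser or name lookup, which is why this approach rarely reaches
# later compiler stages like type checking or SIL passes.
TOKENS = ["func", "let", "var", "{", "}", "(", ")", "->",
          "Int", "x", "=", "+", "0", "return"]

def random_token_soup(rng, length=20):
    """Concatenate random tokens into a (probably invalid) Swift program."""
    return " ".join(rng.choice(TOKENS) for _ in range(length))

rng = random.Random(0)
program = random_token_soup(rng)
print(program)
# A real fuzzing loop would write `program` to a file, invoke swiftc on
# it, and flag inputs on which the compiler itself crashes.
```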
Solution
I like property-based testing, at least for use case (3) above. @marcrasi and @shabalin implemented variants of generators of random Swift code, for different purposes: autodiff correctness (Marc) and autodiff performance for different language features (Denys).
Denys' work: generate a simple Python AST and generate Swift code from it with specific language features enabled. (not yet open-sourced, ask us if you're interested)
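Since that code isn't open-sourced yet, here is a minimal sketch of the same shape of generator (every name and type below is my own invention, not Denys' code): a tiny Python expression AST, randomly grown, then lowered to Swift source.

```python
import random
from dataclasses import dataclass

# Hypothetical sketch: a tiny Python "AST" for arithmetic expressions,
# plus a lowering step that prints Swift source.

@dataclass
class Lit:
    value: int

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str
    lhs: object
    rhs: object

def random_expr(rng, names, depth=3):
    """Grow a random expression tree over the given variable names."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice([Lit(rng.randint(0, 9)), Var(rng.choice(names))])
    return BinOp(rng.choice(["+", "*"]),
                 random_expr(rng, names, depth - 1),
                 random_expr(rng, names, depth - 1))

def to_swift(e):
    """Lower the Python AST to Swift source text."""
    if isinstance(e, Lit):
        return str(e.value)
    if isinstance(e, Var):
        return e.name
    return f"({to_swift(e.lhs)} {e.op} {to_swift(e.rhs)})"

def swift_function(rng, name="f"):
    """Wrap a random expression in a Swift function definition."""
    body = to_swift(random_expr(rng, ["x", "y"]))
    return f"func {name}(x: Int, y: Int) -> Int {{ return {body} }}"

print(swift_function(random.Random(0)))
```

The nice property of this split is that "which language features are enabled" becomes a knob on the Python side (which node types the generator may pick), while the lowering stays trivial.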
Are there good property-based testing libraries suitable for generating simple ASTs that are lowerable to Swift? SwiftCheck is hefty; I think it could be distilled to a simpler core with less abstraction and less cute syntax. @codafi
Any other solution ideas that hit a criteria-satisfaction sweet spot?
From my early experiments in this area: it's easy to combinatorially generate random syntactic trees, but it's quite hard to generate non-trivial random trees that pass all static checks and correspond to valid programs that do something remotely resembling real workloads.
This has been explored in greater detail in projects such as Csmith.
@shabalin just provided supporting details via DMs:
All this (random generation, aka property-based testing) sounds highly expressive, practical, and maintainable. A nice implementation with minimal complexity would be ideal.
There's still active research in this area. I enjoyed the Chalmers FP 2020 seminar "Backtracking Generators for Random Testing" on this topic (it explores the spectrum between brute-force enumeration and random exploration):
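As I understand the idea, a backtracking generator sits between those two extremes: make random choices, but when a choice cannot be completed (for example, it can't fit the remaining size budget), undo it and try another rather than discarding the whole tree and restarting. A toy sketch of my own (not the seminar's code):

```python
import random

# Toy sketch of a backtracking generator: at each node we shuffle the
# available productions and fall through to the next one when a choice
# turns out to be a dead end for the remaining size budget.

def gen_expr(rng, budget):
    """Return Swift source for an expression with exactly `budget` AST
    nodes, or None when the attempted splits cannot be completed."""
    if budget <= 0:
        return None
    choices = ["lit", "add"]
    rng.shuffle(choices)
    for choice in choices:  # backtracking point: try the next production on failure
        if choice == "lit" and budget == 1:
            return str(rng.randint(0, 9))
        if choice == "add" and budget >= 3:
            left_budget = rng.randint(1, budget - 2)
            lhs = gen_expr(rng, left_budget)
            rhs = gen_expr(rng, budget - 1 - left_budget)
            if lhs is not None and rhs is not None:
                return f"({lhs} + {rhs})"
        # Otherwise this choice is a dead end; fall through and retry.
    return None

print(gen_expr(random.Random(0), 3))
```

A fuller implementation would also backtrack over the size split itself; the point here is only the shape of the control flow.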
I got C-Reduce to work with Swift programs and a Swift-compiler-invoking interestingness test (after meeting and bugging John Regehr at LLVM Dev Meeting 2019).
It works okay. It is very "non-intelligent"; I found the reducer to be highly antagonistic (it tries as hard as possible to reduce to an empty file), which was fun to fight and outsmart. I would like to write a blog post about it sometime.
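For anyone curious before the blog post: a C-Reduce interestingness test is just a script that exits 0 when the shrunken candidate still reproduces the behavior you care about. A hypothetical sketch (the crash signature and compiler flags are made up), written so the compiler invocation is injectable. Checking for a specific crash signature rather than any non-zero exit is exactly what stops the antagonistic reducer from "winning" with an empty file that merely fails to compile.

```python
import subprocess

# Hypothetical sketch of a C-Reduce interestingness test for Swift.
# C-Reduce repeatedly shrinks the input and keeps a candidate only if
# this test reports it as interesting (exit code 0 when run as a script).

CRASH_SIGNATURE = "Stack dump"  # assumption: text the compiler prints on crash

def is_interesting(source_path, run_compiler):
    """Return True if compiling `source_path` still reproduces the crash.
    `run_compiler` maps a path to (exit_code, stderr_text), so a fake
    compiler can be injected for testing the check itself."""
    exit_code, stderr = run_compiler(source_path)
    # Require a crash, not just any diagnostic, or the reducer will
    # happily converge on an uncompilable (but boring) file.
    return exit_code != 0 and CRASH_SIGNATURE in stderr

def run_swiftc(path):
    """Invoke the real compiler (flags are illustrative)."""
    proc = subprocess.run(["swiftc", "-O", path], capture_output=True, text=True)
    return proc.returncode, proc.stderr

# As the actual interestingness script, one would end with:
#   sys.exit(0 if is_interesting("crash.swift", run_swiftc) else 1)
```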