[GSoC 2020] Proposal for regular expressions worthwhile/possible?

Hi, my name is William and I'm a sophomore studying computer science at Cornell. My areas of interest include computability theory, compilers, and iOS development.

Recently, Chris Lattner gave an interview on the Accidental Tech Podcast (Ep. 371, 1:49:00) in which he mentioned in passing that he thought regular expression support in Swift should be improved. Some possible features for regular expressions that I gleaned from reading [1] and [2]:

  • Compile-time checking that a regular expression is well-formed
  • Pattern matching of capture groups
  • Regex interpolation
  • Proper support for Swift strings and characters
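For context, here is a minimal sketch of what matching looks like today with Foundation's NSRegularExpression, which is where I think the first and last bullets come from. The helper name `firstYear(in:)` and the sample pattern are my own illustrations, not taken from [1] or [2]:

```swift
import Foundation

// Hypothetical helper for illustration: returns the first run of four digits in `text`.
func firstYear(in text: String) -> String? {
    // The pattern is an ordinary string, so a malformed pattern is only
    // detected at runtime, when this initializer throws.
    guard let regex = try? NSRegularExpression(pattern: "([0-9]{4})", options: []) else {
        return nil
    }
    // NSRegularExpression works in NSRange (UTF-16 offsets), so Swift string
    // indices have to be converted in both directions.
    let whole = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: whole),
          let range = Range(match.range(at: 1), in: text) else {
        return nil
    }
    return String(text[range])
}

// An unbalanced parenthesis compiles fine and fails only when the program runs.
let malformed = try? NSRegularExpression(pattern: "([0-9]{4}", options: [])
```

Here `firstYear(in: "GSoC 2020 proposal")` yields the capture group as an optional `String`, while `malformed` is `nil` because the bad pattern is rejected at runtime rather than by the compiler.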

This kind of task feels like it would be in my wheelhouse: I am comfortable with the theory behind regular expressions and, from a user's perspective, with Swift's type system, and I want to apply my knowledge of computability theory to something significant. I've also done a bit of hacking on LLVM and Clang.

However, as a student interested in contributing to Swift, I'd like to know: is it possible to make this a Google Summer of Code project? I ask for a few reasons:

  1. Regular expression support would touch many components of the compiler

    Other potential projects seem well localized to a certain area of the compiler or are purely additive. Regular expression support would presumably need to touch parsing, type-checking, and the standard library.

  2. Since it would be an addition to Swift proper, it would go through Swift Evolution

    It seems the other GSoC projects mostly extend the tooling or improve existing Swift features without changing syntax.

  3. The deadline for GSoC proposals is in four days, and two of those days fall on the weekend

    This is a tight deadline, and I fear it may be the biggest factor.

Any advice would be appreciated. I am interested in working on this even if I'm unable to submit a proposal on time, but the additional structure and funding offered by GSoC is certainly a motivation.

In my opinion, regular expressions are an area where exploring the design space with libraries first might be better than baking support into the stdlib and/or compiler right away. There are many different positions in that space in terms of performance, ergonomics, type safety, pre-compilation, and so on. I don't know what types people commonly use for regexes in Swift (apart from NSRegularExpression), but it might be useful to look at regex libraries in other languages for inspiration.