I’m noticing that more and more proposals use the source compatibility suite (SCS) to argue that “x% of Swift users do this” or “y is fine because nothing in the SCS breaks.” That worries me, because I always thought the SCS was meant as a partial safeguard against small compiler changes accidentally breaking source compatibility.
Swift compiler developers can now use Swift’s pull request testing system to test their changes against the source compatibility test suite, helping catch source compatibility regressions before they are merged.
But now the SCS itself is being used as an actual reason for making or not making changes in the language, which is a much more important role. The SCS doesn’t accept GPL-licensed projects, so there is a huge swath of projects and libraries out there that aren’t being tested. Major changes to the language are being discussed without considering how these projects will be impacted, on the assumption that all major (open source) users of Swift are covered. I don’t think we should allow the SCS to be used as a rationale in an evolution proposal.
The projects in the source compat suite are (of necessity!) a sample, not an exhaustive census. That's fine; sampling is fine. The relevant question here is whether they are a representative sample.
I find it hard to imagine that ruling out GPL licenses introduces any kind of selection bias for code style or language features. GPL code doesn’t look different from other kinds of open source code, apart from the license file.
There is one source of selection bias that is worth considering: the suite is overwhelmingly libraries, whereas I assume most actual Swift code out in the wild is user-facing apps of various kinds. Those two families of projects certainly have different styles and structures, and are likely to use language features in different proportions. If anything, I would guess that libraries overrepresent complex and/or esoteric features, so “feature X is never used in libraries” is probably a stronger statement than “feature X is often used in libraries.” Regardless, drawing sharp scientific conclusions from the compat suite isn’t a good idea; one would need a truly representative sample.
We aren't drawing sharp scientific conclusions in these forums. In most cases I’ve seen the suite mentioned, it was in the context of asking whether features are in widespread use at all in order to estimate the impact of a breaking change. This is an “order of magnitude” sort of question, one for which I think the suite in its current form is entirely adequate.
That said, getting more apps into the suite certainly wouldn’t hurt.
I'll second what Paul just said, and I'll add that the Core Team knows that the source compatibility suite isn't a representative sample of Swift code. Even the complete set of open-source Swift projects wouldn't be representative — open-source Swift projects tend to be small pure-Swift libraries, but Swift code on the whole is often part of a larger project, often a mixed-source project, and that project is usually an app. The source-compatibility suite is an easy-to-acquire piece of evidence about the impact of a change, but it's far from dispositive.
Still, the thing was made as a test suite to catch implementation mistakes, and the fact that it’s starting to factor into people’s decisions when designing language features is something we really need to think about.
The compatibility suite is not perfect, but it is the best source of real-world Swift code we have. It's biased towards frameworks rather than apps, to be sure. But in designing a language feature, I'm not sure that's even a problem. It is certainly less flawed than general GitHub searches, which tend to disproportionately surface sample, toy, or example code rather than real-world use cases.
In general, you should think of the compatibility suite as similar to a good suite of tests. No matter how good they are, tests only prove the presence of bugs, not their absence. In the same way, the compat suite proves the presence of compatibility issues, not their absence.
This means that passing the suite is necessary, but not sufficient, to justify a change that is known in theory to break source. If the suite reveals extensive breakage, then that would rule a change out. If it shows minimal or no breakage, it means you might be able to break source with a very strong justification (for example, in the face of a serious correctness issue encouraged by an API).
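To put a rough number on "necessary, but not sufficient", here is a back-of-the-envelope sketch. It assumes something this thread itself disputes, namely that the suite's projects are an independent random sample of Swift code. The symbols $n$ and $p$, the 95% level, and the value $n = 100$ are illustrative choices, not figures taken from the suite.

```latex
% p = unknown fraction of all projects that the change would break
% n = number of projects in the suite, none of which break
\Pr[\text{no breakage in } n \text{ sampled projects}] = (1 - p)^n

% Largest p still consistent with a clean run at the 95% level:
(1 - p)^n \ge 0.05
\quad\Longleftrightarrow\quad
p \le 1 - 0.05^{1/n} \approx \frac{3}{n}

% Hypothetical example, n = 100:
p \le 1 - 0.05^{1/100} \approx 0.03
```

Even under that generous assumption, a clean run over a hypothetical 100 projects only says that fewer than roughly 3% of projects would break. That fits the "order of magnitude" use described earlier in the thread, and it is a long way from showing that nothing breaks. Since the suite skews toward libraries, the bound does not carry over directly to apps or mixed-source projects.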
We should also strive to avoid, and to be skeptical of, bold claims made without evidence (for example, that there are huge swathes of GPL-licensed Swift code out there). We should therefore encourage, not discourage, use of evidence from the compatibility test suite in arguments for or against evolution proposals. We just need to be aware of exactly what finding or not finding compat suite evidence actually means.