Would it be possible to add this documentation to a different file in the docs directory, just not TypeChecker.md specifically?
What would help people understand the constraint system before diving into code?
I've been meaning to contribute to this area for a long time, but I don't even know where to start. If such topics are only documented in code comments, I don't even know what code to look at to find these comments. The available information (albeit quite scarce) in the docs directory is what helped me to at least get some superficial understanding of what could be going on. But I still can't say I know how to orient myself in the Sema codebase. Markdown files in docs at least are interlinked and have formatting rendered in-place when viewing on GitHub, as compared to code comments.
This would be different if the Swift codebase had Doxygen output hosted somewhere for viewing doc comments that are attached to code. Until that's implemented, docs is the only introductory source for beginners like me.
Let's split this discussion off into a separate compiler development thread. (I have no idea if/how I can do that, maybe @John_McCall can help?) This is a common problem with the constraint system: it's not very approachable, and it's difficult to get a high-level understanding of how it works, let alone the implementation details. It would be great to figure out how we can ease the learning curve for folks looking to contribute in this area.
What we really need is better documentation around the constraint solver. I wrote a blog post about the new diagnostic infrastructure a few years ago that provided some details about the constraint solver, but it wasn't the focus. If you are interested in contributing, I think this would be a good starting point: you'd learn how everything works and lower the bar for everyone else. I can help you out and answer questions.
I think, before diving into the code, it's most helpful to have an understanding of the three different phases (constraint generation, constraint solving, and solution application), be familiar with some of the constraint system concepts/terminology (e.g. "type variable", "constraint", "simplification"), and have a high-level understanding of the solving algorithm, which attempts type variable bindings, attempts to "simplify" constraints on that type variable, and backtracks if any of those constraints fail.
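To make the "attempt a binding, simplify, backtrack" loop a bit more concrete, here is a toy sketch in Python. It is not how the Swift compiler implements any of this: the `Equal` and `OneOf` constraint kinds, the `simplify` and `solve` functions, and the `$T0`-style naming are all hypothetical, chosen only to illustrate the general shape of the algorithm under heavily simplified assumptions (no subtyping, no overloads, no ranking of solutions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Equal:
    """Hypothetical constraint: both sides must resolve to the same type."""
    left: str
    right: str


@dataclass(frozen=True)
class OneOf:
    """Hypothetical constraint: the type variable must be one of `choices`."""
    var: str
    choices: tuple


def is_type_variable(name):
    # Toy convention: type variables are spelled "$T0", "$T1", ...
    return name.startswith("$T")


def resolve(name, bindings):
    # Concrete types resolve to themselves; unbound variables resolve to None.
    return bindings.get(name) if is_type_variable(name) else name


def simplify(constraint, bindings):
    """Return True (solved), False (failed), or None (cannot decide yet)."""
    if isinstance(constraint, Equal):
        left = resolve(constraint.left, bindings)
        right = resolve(constraint.right, bindings)
        if left is None or right is None:
            return None
        return left == right
    if isinstance(constraint, OneOf):
        bound = resolve(constraint.var, bindings)
        if bound is None:
            return None
        return bound in constraint.choices
    raise TypeError(f"unknown constraint kind: {constraint!r}")


def solve(type_vars, constraints, candidates, bindings=None):
    """Return a dict of bindings satisfying every constraint, or None."""
    bindings = dict(bindings or {})
    # Simplification step: a single failed constraint kills this branch.
    if any(simplify(c, bindings) is False for c in constraints):
        return None
    unbound = [v for v in type_vars if v not in bindings]
    if not unbound:
        return bindings  # every variable is bound and nothing failed
    var = unbound[0]
    for candidate in candidates:
        # Attempt a binding for the next type variable and recurse.
        solution = solve(type_vars, constraints, candidates,
                         {**bindings, var: candidate})
        if solution is not None:
            return solution
        # Falling through to the next candidate is the backtracking step.
    return None


# "Constraint generation" done by hand for something shaped like
# `let x: Double = a + b`, where both operands are numeric literals.
example_constraints = [
    OneOf("$T0", ("Int", "Double")),
    OneOf("$T1", ("Int", "Double")),
    Equal("$T0", "$T1"),
    Equal("$T1", "Double"),
]
solution = solve(["$T0", "$T1"], example_constraints,
                 ["Int", "Double", "String"])
# solution == {"$T0": "Double", "$T1": "Double"}
```

In this toy, the solver first tries `$T0 = Int`, finds that no binding for `$T1` satisfies both `Equal` constraints, backtracks, and then succeeds with `$T0 = Double`. "Solution application" would be the step that takes the resulting dictionary and writes the chosen types back onto the expression.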
Much of the detail that's currently in TypeChecker.md, such as listing out a bunch of specific kinds of constraints, is not actually helpful information to know upfront before diving into the code, and I think it actually blocks people from moving forward with contributing because they feel like they have to understand all of these details before getting into the code.
I think part of what's missing from TypeChecker.md is information about where to actually look for these things in the code. For example, when it talks about constraint generation, it could link to CSGen.cpp and provide instructions on where to set a breakpoint for folks who are getting into this code for the first time and want to step through it. Similarly, for constraint simplification, it could point to ConstraintSystem::simplifyConstraint in CSSimplify.cpp. I recall having to write out some of this information in forum comments a few times for interested GSoC students, for example:
This would be awesome!
And it's definitely something I would be interested in helping with as well. Although I'm not an expert, Sema diagnostics is my main area of contribution, so I'm up to help as well =]
Another piece we could add is a set of tips on how to "read" -debug-constraints output. It is really helpful, but can be intimidating for new contributors, especially when a lot of disjunctions are involved.