RFC: "Require" pre-commit review for all PRs that change the compiler

Andrew_Trick · March 25, 2018, 6:28am

In the same spirit as a code review, I’d ask “What could go wrong?” with this idea.

Ok, I'll bite. There are a lot of broad generalizations being thrown around. It all sounds great in principle. The reality is that there are different development scenarios, and not all work well with a mechanically enforced commit barrier that takes flexibility and judgment away from developers.

Incremental development is more important. New prototypes will be
developed on a feature branch, less incrementally, less
transparently, and less collaboratively. The author will avoid
surrounding cleanup because that causes merge overhead. Ideally,
someone would be free to review your prototype as you work on it. I
think that would lead to much better designs and sometimes save the
author a lot of time. Unfortunately, the reality is that often no
one is available to do this, but the developer is still on a
schedule.
Synchronous reviews are disruptive and cause priority inversion
(similar to Swift evolution). For me, reviews often take precedence
over planned work because I know someone is being held up. This
means spending more time reviewing features that are lower priority
than important planned work that, at least in my case, I'm already
spending close to zero time on. There are simply too many
disruptions and inefficiencies as it is.
Inefficiency. It takes me much longer to review a series of patches
than it does to simply review a section of code. If I'm really
reviewing the code, it can sometimes take much longer than just
writing the code myself--it takes more time to explain how to write
code than to write it. When it comes to developing new piece of
code, I would strongly prefer to collaborate via source editors and
commits rather then web forms.
Disincentive to make partial improvements. (I see this all the
time). There's always a better way to write the code and more that
should be done. There are often serious problems with the original
code that are being perpetuated by a change. The reviewer is
obligated to point that out. The committer is already several levels
deep in cleanup only tangential to their real task. Faced with an
inevitable review battle, the partial improvement loses out to time
pressure.
Disincentive to be online/available most of the time. I've noticed,
in past teams, that the much more productive, disciplined engineers
than myself stay out of chat rooms because they are too
disruptive. I had to do the same with LLVM, primarily because it
greatly increased the likelihood of being hit up for inopportune code
reviews (as opposed to important questions).

My understanding is that Swift has the same review policy as LLVM. It has always worked well for LLVM. It's just not working well for Swift. The reasons have nothing to do with the policy though. Real problems:

A very small number of people who understand the code well enough to
truly review changes, but are far too busy with critical,
time-sensitive work to review every PR.
In reality, neither the person making the change nor the reviewer
will have a very deep understanding of the code. That puts both in a
bad position of declaring code "reviewed" when it's just being
rubber stamped.
I'll bet every Swift developer would love to have their PR fully
reviewed precommit. In reality, almost every commit feels like an
emergency at the time. There are always branch deadlines, critical
features being irreversibly punted, outstanding bugs that aren't
getting fixed. Enforcing a review policy doesn't stand up to this
pressure.

Changing the policy won't change the reality, or fix anything by itself. But I do think it might help expose and force solutions to those real problems. So, for that reason I think a strict(er) review policy would improve the status quo. At the very least it might help initially establish a culture that would make the policy unnecessary.

DaveZ · March 25, 2018, 1:39pm

Hi @Andrew_Trick,

With the exception the last paragraph, I wholeheartedly agree with everything you wrote. In the last paragraph though, it seems like you throw your hands up and say, "despite all of these problems, let's try the policy anyway, and hope that it goes away later". Why do you think the outcome will be different this time?

Andrew_Trick · March 25, 2018, 9:35pm

I think it will be extremely difficult to adopt a strict policy, and we'll run into problems. But I'm also not content with the status quo and see some genuine efforts in this thread of exploring ways to make forced precommit review work. I don't think it's ideal to have a mechanically enforced policy (I never agree with artificial barriers to development except when they catch likely mistakes using a tool, like checking into the wrong branch). However, I'm anxious to see changes in how we review each others work.

I would feel a lot better about it if there was a way to indicate "rubber stamp" vs. "reviewed" so we could at least be honest when, for whatever reason, in depth review needs to be deferred.

DaveZ · March 26, 2018, 2:34am

If mandatory pre-commit reviews are going to happen, then I'd like to suggest a way to keep the change latency down and differentiate between full reviews, partial reviews, and "coward" reviews: change the culture around merging pull requests. For example, imagine that a pull request author thinks their changes are ready and the mandatory pre-merge tests have passed. If the reviewer is happy with the final pull request, they have three choices:

Full review — merge after approval. If you're semi-confident that you can fix basic build bot fall out from the pull request, then click 'merge' after clicking 'review'. This doesn't mean that you must fix any and all fall out, but just that you will try to fix basic failures if the original author is busy/away/etc.
Partial review — approved with an actionable comment, but no merge. If you're more likely to immediately revert the pull request than investigate failures after merging, then please write a note why you're not confident enough to merge the change. For example, "I don't understand how the XYZ subsystem works, but the rest of the changes look reasonable". This change is actionable because now the pull request author just needs to find somebody that understands XYZ to complete the review.
"Coward" review — approved without an actionable comment or merging.

I imagine that merging other people's changes might feel scary at first, but it really isn't. Nobody wants to burn their time and their karma by creating plausible-but-junky pull request on a regular basis, and there is no shame in reverting a change. Also, we do have mandatory pre-merge tests. As a merger, you do have a reasonable degree of confidence that things will probably work.

Finally, please note that pull requestors can always request that reviewers not merge for whatever reason. Maybe because weird branches are involved, or maybe because the author would rather deal with the potential fall out directly.

Xi_Ge · March 26, 2018, 6:06pm

I'm totally for this for code on the critical paths, like parser, sema, or sourcekit hot requests (cursor-info, etc). By critical, I mean those code paths that Swift or Xcode users will always hit. However, we do have other areas that are less so, such as sourcekit documentation-generation, swift-api-digester, etc. For these internal tools, I prefer we avoid pre-commit review.

griotspeak · April 2, 2018, 8:15pm

I like the idea and will offer myself as a potential reviewer even of things I haven't touched. I won't be terribly helpful at first but it's an alright way to become acquainted with the codebase as well.

rudkx · April 3, 2018, 12:39am

First, thank you Jordan for raising the issue.

To answer the question of whether we should have mandatory pre-merge reviews, it seems important to first consider the following questions:

What outcome(s) are we hoping to achieve by doing this that are not being achieved now?
In light of (1), what do we consider a minimum bar in terms of time/effort/knowledge-level for reviewing code?

I have worked on projects with mandatory pre-merge review policies and can share some concerns based on those past experiences.

The sort of 'gaming' of the system that @David_Ungar2 mentions, as well as the sort of priority inversion that @Andrew_Trick mentions can both occur in practice. Some contributors end up always asking for reviews from people who they know will glance at the code briefly and approve it unless there are glaring issues, and some reviewers dedicate substantial amounts of time to reviewing the code (including fetching and building the changes, pulling up one or more of the test cases in the debugger and stepping through the compiler changes, etc.).

This project has a fairly high rate of change, and speed of reviews is definitely a concern as others have noted. There are also issues around holidays/vacations where it is entirely possible a selected reviewer will not see a review request until weeks after it has been made, potentially frustrating the person asking for the review (who is unaware of the absence of the reviewer).

What I have typically seen in past projects with pre-merge review processes is that as milestone deadlines approach, time waiting on a review increases as people are attempting to get changes in before the deadline. As a result, review quality begins to decline and more "rubber stamping" starts to happen. It seems important to address these potential pitfalls as well.

As to answering the question of how to match up reviewers to changes when we have contributors who do not know who to ask for a review, I would suggest we seriously consider adopting the CODEOWNERS feature on GitHub.

I would love to see less knowledge siloing but it is not clear that this is a necessary consequence of requiring pre-merge reviews so much as a happy coincidence if more people feel empowered to take the time to learn some new code in order to give a basic once-over looking for simple mistakes that a code author may have inadvertantly missed.

Do not take any of this as a necessary objection to the idea of pre-merge reviews. I very much support the idea of having more eyes on the code, and encouraging more people to participate in development in whatever way they feel they can contribute (which could be reviewing code, writing test cases, reducing test cases in existing JIRAs, as well as of course writing new code). I just think it's important to have clear expectations both for contributors asking for reviews and for reviewers, and to attempt to address some of these potential drawbacks before adopting the policy change.

DaveZ · April 3, 2018, 3:03pm

Hi @rudkx,

For whatever it may be worth, the Linux kernel review policy is effectively the GitHub "CODEOWNERS" model, albeit without GitHub. Code review is done through a parallel hierarchy of code owners and branches where merging implies review.

For example, if you want to get a change into the "example" subsystem, then you send a pull request to the "example" owner. If they merge the change into their "example-next" branch, that implies review. Later, the code owner sends a pull request of "example-next" to Linus, and if he merges the branch, then that implies that Linus is okay with the batched set of changes.

The Linux review model is not perfect either, and biggest complaints I've seen are about the owners being a bottleneck and/or overworked.

If we step back a bit, I think that change review policies tend to fall on a spectrum, where at one end is the strong process model (Linux, GitHub features, etc) and at the other end is the strong culture model (LLVM, etc) which involves post-commit review, no shame reverts, and pre-commit consensus building for nontrivial changes.

I have three observations to make about the review policy spectrum:

The middle of the spectrum is potentially – but not always – the worst of both endpoints. If expertise can exist, then the endpoints work best because expert reviewers need to pay attention, either through a strong process or through a strong culture. In contrast and if expertise cannot exist, then the middle of the spectrum works well because consensus building is the only goal that matters.
I feel that the original proposal falls into the middle of the spectrum, but the middle of the spectrum isn't right for Swift. There is too much subjectivity about what can skip review, who is qualified to review, and what if anything is required of code owners. I assume that under the original proposal, code owners still need to do post-commit review given that the surface level reviews might be of random quality.
Fundamentally, all points on the review policy spectrum are about trying to solve the scaling problems inherent with large projects.

Personally speaking and for most software projects, I like the ends of the spectrum, despite the flaws. Expertise matters and if the problem we're experiencing today is that experts don't feel like they have the time for post-commit review or pre-commit consensus building, then maybe the mandatory-owner-reviewer model is the way to go.

If I might put my ex-Apple hat on for a second, I think the Apple culture is a hybrid of the review policy spectrum end points. At the level of individual changes, the mandatory-owner-review model is similar to Apple's culture of "DRIs" (Directly Responsible Individuals). At the same time, Apple has a "trust-but-verify" / consensus building model between subsystems/teams. I don't see why this hybrid approach cannot work for Swift. The only question is where the lines are drawn.

PS – On the topic of large project scaling and as an ex-Apple person, I always liked how easy it was for individuals to setup mailing lists at Apple. This IT feature encouraged people to self-organize into focused discussions and discouraged "firehose" mailing lists that don't scale in the short term and are counterproductive in the long term. I think that if Swift is going to have subsystems with subsystem owners, then a subsystem forums make a lot of sense from a scaling and productivity perspective.

DaveZ · April 3, 2018, 9:05pm

Given that the conversation has wandered, I'd like to try and summarize the thread and offer a poll to get lurkers off the fence.

1 – The status quo is fine.
2 – Something should be done, but I don't know what.
3 – Review is expected. Exceptions are allowed. Details TBD. "Please be reasonable".
4 – Review is required. Details TBD. "Please don't rubber stamp".
5 – Allow code owners to opt into enforced code review via GitHub's CODEOWNERS feature.
6 – Require code owners to use GitHub's CODEOWNERS feature.
7 – Other. Please see my comments.

0 voters

jrose · April 3, 2018, 11:16pm

I've been mostly off away from discussing this, but the "what do we hope to gain?" question is a good one. To be clear, the case I am most concerned about is the code owners themselves. For example, Doug recently did a major change that redid the implementation of typealiases in order to solve a handful of problems around generic typealiases. The diff was about +1200/-1200 lines, and while most of it was mechanical, nobody read it except Doug. And apart from whether post-commit review is good enough, Doug is the code owner for most of these components, so it's unlikely anyone will read the important parts until they run across an issue.

(I asked Doug if I could use his PR as an example and he said yes.)

Now, this is mostly the way we've always done things in LLVM/Clang and Swift. It also seems like lousy software engineering practice: it's more likely the code has bugs because there's no fresh pair of eyes, and when Doug goes on vacation, there's no backup to ask about the new code.

So my goal with this change is to make sure at least two people have looked at every new line of code, even when a code owner writes code in "their own component". That's really it at this point. It might be a cursory glance that fails to catch much. It might lead to rubber-stamp reviews. But it's a start.

Michael_Gottesman · April 3, 2018, 11:32pm

@jrose @DaveZ One of my concerns here is that it is unclear to me if we have fully explored the problem space beyond what other people have experienced in person already. For instance: why not have something like "enforced post-commit review" before a commit is cherry-picked into a release branch? I think that such a system would enable people to work quickly without requiring coordination in between the committer/reviewer, but also have the properties that Jordan is seeking. Here is a quick strawman:

All engineers would have a queue of patches to review.
Just like today, we do post-commit review so commit velocity does not slow down. When a PR is merged, the PR is added to a code-owners queue.
All code owners would have a queue of patches for review. All commits must be reviewed by a code owners (or a delegate for the code owner) before it is able to be cherry-picked into a release branch. This would be enforced by github.
To provide flexibility for the code owner, the code owner would have the ability to delegate a review to another individual's queue.

I think this would:

Give people time to actually review the code and thus prevent rubber stamping.
Via the delegation system provide the ability for code owners to delegate review to other engineers without needing to rush the other engineer.
Ensure that all commits are read by multiple engineers.

Again, that is just off the top of my head.

EDIT: when I said cherry-picking, I was talking about cherry-picking as well as branching/rebranching a release branch. I elaborated on that below.

jrose · April 3, 2018, 11:39pm

To be clear, I think it's important that this happen on the master branch too, not just for cherry-picks. We're already pretty good about reviews for cherry-picks because of the release process (and the similar Apple-internal process), but everything that gets worked on for the next release doesn't have that. But you're right that a stricter pre-commit policy is not the only possible answer.

Michael_Gottesman · April 3, 2018, 11:41pm

Sorry for being unclear, I am not talking about cherry-picks to release branches. I am talking about branching a new release-branch or rebranching a release branch. In both of those cases, we do not perform mandatory review.

Michael_Gottesman · April 3, 2018, 11:43pm

So, under the scheme that I was laying out, the queue would need to be empty before the rebranch or preparation of a new release branch. The intention is that review would not wait until that time period, but rather would happen in the background. We could have a burn down chart to make sure that no one is getting too many reviews and re-allocate reviewer tasks as appropriate.

DaveZ · April 4, 2018, 12:00am

Hi @jrose,

What do you see the primary purpose of code review as being? Is it to raise the average quality of commits and teach project norms? Or to diffuse expertise? The typealias refactoring example seems to be the latter.

I'm all for better code quality, and if we have a more formalized review process, then I want people to take it seriously. But, as I outlined earlier in this email thread, I think code review is possibly the worst way to go about expertise transfer. I can certainly reiterate the the better approaches if you want.

Also, you can't prove that nobody looked at it. I skimmed part of it. And ya, when I saw that it was largely mechanical, I moved on.

Personally speaking, I'm not bothered by the existence of experts and the light-if-any review changes they make. Even if people like Doug go on vacation, there are still tons of smart people that are fully capable of debugging random code. It just take longer.

jrose · April 4, 2018, 12:07am

If I have to pick between the three, it's to raise the average quality of commits. I've been asking for reviews on nearly every pull request for the last few months anyway, and it has definitely raised the quality of my commits. (You could argue that says more about me failing to learn my weak points than a general lesson, but I don't think I'm particularly unusual in that regard.)

But of the three, the one I'm most concerned about is spreading expertise, because we are suffering terribly from that within Apple when trying to screen new bugs. There are a lot of components where only two people are up to speed enough to fix something in less than a day when there is an emergency. That's just bad.

DaveZ · April 4, 2018, 1:50am

It sounds like what's driving this RFC is not code review per se (which is good) or review policies (which have tradeoffs), but the need to expedite bug screening and fix turnaround time. That's tough, really tough.

Code review policies cannot change what people value, and as long as expertise is valued, then experts will be triage/fix bottlenecks.

To argue that more programmers should be able to triage/fix a greater variety of bugs is a tacit argument against expertise itself.

Panajev · April 4, 2018, 5:43am

I think that The GitHub model of feature branches, that are always up to date with master and get through CI checks, code review, and QA/testing before being merged into master to be the best compromise between over complex branching flows and just merging onto master before code review and testing happens. What I really think it is dangerous and lead to more bugs is to do post merge review and testing. What I think it is best is pre merge code review and testing :).

https://guides.github.com/introduction/flow/

(Hot fixes and GitHub flow: https://hackernoon.com/a-branching-and-releasing-strategy-that-fits-github-flow-be1b6c48eca2 ... not really that important for this project as we do not release with the cadence of an npm module)

If you really must relax rules for some big feature you need engineers to collaborate while they work on their independent sub feature you could have a relaxed featureA_integrationBranch as a bridge between featureA1_Branch and featureA2_Branch and then do the bulk of testing and review before merging featureA_integrationBranch into maste.

tkremenek · April 4, 2018, 7:06am

I suspect if you asked each of the people who voted on your poll in favor of pre-commit code review what they think is the main driver for having pre-commit code review they would come up with slightly different weighing of the reasons.

Speaking for myself, I feel the main value of pre-commit code review is not about expertise transfer. There might be some of that, but if that is the primary goal then a more effective way to accomplish that goal is to have more people work on parts of the codebase they are unfamiliar with and build up expertise by implementing changes. As part of that effort, those individuals would have their changes reviewed by someone with more expertise. Thus pre-commit code review can be an ingredient in expertise transfer, but not sufficient on its own.

There are a few things that stand out in Slava's gist that resonate with me:

To ensure that coding conventions and defensive programming practices are actually enforced.

To ensure that as code evolves, a consistent guiding philosophy is followed even if multiple people work on it or the original authors leave.

To have an extra pair of eyes comb over each piece of code, to catch both obvious and subtle problems.

If I had to summarize these points, it's about making sure we maintain a consistently high level of craft in our changes that are culturally reinforced continuously as part of our development. That's something CI testing can help with, but code review complements that. Is the code in question being written in a complicated way? Will the code be maintainable, and is cleanly factored? Do the tests accommodating the change look sufficient and well-written? These are all things that could essentially come out of post-commit review, but the dynamic there can be very different. Once a change has landed it has landed, and it's not just a minor matter of reverting code that should be sufficiently altered if it has issues as more code can quickly pile on top of it.

Let me get to an important point you raised earlier in the thread:

I don't think this is actually paradoxical. My read is that people are stumbling upon code changes they didn't know about — possibly by dealing with bugs that are collateral damage from a change — and observing that more review of such code changes could have mitigated those issues from showing up in the first place. Thus by dealing with the fallout of inadequate code review people are observing that potentially the current amount of code review is inadequate.

I also do share many of @Andrew_Trick and @rudkx's concerns about potential priority inversions and the pressure to have more rubber stamping as deadlines approach. I also appreciate @Andrew_Trick's concerns about "Inefficiency" in how code review can happen, and the importance of incremental development. No approach will be perfect, but I do think we can approach these concerns directly when evaluating potential pre-commit review strategies or improvements to the post-commit review strategy we have today. I also don't expect code review to be a panacea; even a rubber stamp review might be better in some ways than no review at all, but I think in practice we can do better.

Here's another point worth addressing:

I share this concern. I also think that as the Swift project matures it is important for us to allocate more time in our development work to improving the quality of the code base via various means — more code review, more cleanups and architectural refactoring, etc. — so that the codebase doesn't get crushed under its own proverbial weight. Maybe more code review will slow down the rate of change, but if that leads to higher quality code that may mean the project moves faster in the long run. But it is all about striking the right balance.

Whatever we do here will have tradeoffs. I can easily see that some implementations of a pre-commit code review policy would be truly onerous and counterproductive, adding process to that project that has little value. I can also easily see some advantages to adopting a pre-commit review policy that — even if it is fairly shallow — could yield some tangible benefits with modest overhead. We probably won't know for sure what works for the Swift project unless we try something out and see what works.

DaveZ · April 4, 2018, 1:16pm

Hi @tkremenek and @jrose,

Thanks for the thorough and thoughtful commentary. Might I suggest a way to push this forward?

Create a pull request where the goal is document the norms and goals of code review in Swift.
Update .github/PULL_REQUEST_TEMPLATE.md to give people a heads up that the norms and goals are being formalized. Add a link to the pull request.
Once the dust settles on the "code review norms/goals" pull request and it is merged, then update .github/PULL_REQUEST_TEMPLATE.md once more to strongly encourage people to follow the code review norms/goals of the project, and link to the newly finished code review document.

Good luck,
Dave