LLVM monorepo transition update

I would like to provide an update to the post from last month (LLVM monorepo transition), where I announced that Apple is working on the LLVM project monorepo transition plan for Swift.

Since then, Duncan (@dexonsmith), Mishal (@mishal_shah) and I have been working on creating the prototype monorepo that can be used by Swift. As a result of that work, the following four repositories have been created on github.com/apple:

More details follow on these repositories and how to work with them.

Prototype of the llvm-project monorepo

llvm-project-v1 is a (read-only) prototype of the new monorepo history, where commits from the "split" repositories have been interleaved into merged histories on top of the upstream at llvm.org.

General branch organization

There are three main branch prefixes:

  • The prefix llvm.org/ is used to republish branches from github.com/llvm/llvm-project. They are exact replicas of upstream and are read-only.
  • The prefix apple/ is used for branches that do not depend on the swift repository. These currently exclude all LLDB changes from upstream.
  • The prefix swift/ is used for branches that do depend on the swift repository. For example, the branch swift/master depends on swift’s master branch.
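
The three prefixes can be read as a simple lookup. The sketch below is only an illustration: the helper name describe_branch and the description strings are made up for this example and are not part of any tool in apple-llvm-infrastructure-tools.

```python
# Hypothetical helper: the prefixes come from the list above, while the
# function name and the description strings are illustrative only.
PREFIXES = {
    "llvm.org/": "read-only replica of an upstream llvm-project branch",
    "apple/": "downstream branch that does not depend on the swift repository",
    "swift/": "downstream branch that depends on the swift repository",
}


def describe_branch(branch: str) -> str:
    """Return a short description of a monorepo branch based on its prefix."""
    for prefix, description in PREFIXES.items():
        if branch.startswith(prefix):
            return description
    return "unknown prefix"


print(describe_branch("swift/master"))
```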

There are no historical tags (yet), but we’d likely want them to have naming hierarchies within these prefixes to facilitate custom refspecs.

Branches downstream of master

There are two branches downstream of llvm.org/master:

Stable branches

There are three (active) stable branches:

  • apple/stable/20190104 interleaves commits from each non-LLDB split repo’s swift-5.1-branch branch, and is named by the date of the branch point.
  • swift/swift-5.1-branch depends on swift’s swift-5.1-branch, and interleaves commits from LLDB’s swift-5.1-branch with generated merges from apple/stable/20190104.
  • swift/master depends on swift’s master branch, and interleaves commits from each split repo’s stable branch. This is downstream of swift/swift-5.1-branch.
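
Because the apple/stable/ branches are named by the date of the branch point, that date can be recovered from the branch name. A minimal sketch, assuming the suffix is written as YYYYMMDD (which is how 20190104 reads); the function name is hypothetical:

```python
from datetime import date, datetime

STABLE_PREFIX = "apple/stable/"


def stable_branch_point(branch: str) -> date:
    """Parse the branch-point date out of an apple/stable/<date> branch name.

    Assumes the date suffix uses the YYYYMMDD layout.
    """
    if not branch.startswith(STABLE_PREFIX):
        raise ValueError(f"not an apple/stable/ branch: {branch}")
    suffix = branch[len(STABLE_PREFIX):]
    return datetime.strptime(suffix, "%Y%m%d").date()


print(stable_branch_point("apple/stable/20190104").isoformat())
```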

There is also a suite of stable branches going back to swift/swift-5.1-branch.

Making the history canonical

It's unlikely the histories in llvm-project-v1 will become canonical since v1 will have problems that should be addressed before the monorepo is canonical. The idea is to regenerate it, fixed, as llvm-project-v2. Once all the blocking issues are resolved, the specific llvm-project-vX will be renamed to llvm-project. The old split repos will be archived past that point.

Using the prototype for development

Even though llvm-project-v1 is read-only, it can be used for development. Changes can be pushed back to the split repos using git apple-llvm push.

Development workflows are still a work in progress. If you decide to try them out, be aware they could be full of bugs and that the branches in llvm-project-v1 may be overwritten with force pushes.

There is some documentation for the tools and workflows in the docs directory in apple-llvm-infrastructure-tools. For example:

There’s also some documentation for using the monorepo with Swift’s update-checkout script:

You can generate the HTML pages for the documentation by running the make html target in the docs directory.

Getting involved

We'll send another update when the blocking issues are resolved, and we think the monorepo is ready for everyone to switch over.

In the meantime, early adopters are encouraged to test it out, give feedback, and/or contribute to the tools.

Thanks for the status update and for taking on a potentially difficult task!

First, some "big" questions (then minor feedback):

  • Why is LLDB the exception to this work?
  • Will the apple/* branches represent the open-source version of Apple's ObjC++ compiler? (Sans unreleased work, of course.)
  • Why republish the LLVM branches as llvm.org/* when the "git way" is to tell people to git remote add the canonical LLVM repository (if they care)? At best, the branch aliases are noise. At worst, they risk getting out of sync and creating confusion. What genuine problem is this solving?
  • Will there not be an active branch named stable? Having to chase date-stamped branches like apple/stable/20190104 seems alien and weird.
  • Repositories are "free". Why not have two repositories? One for the apple/* branches (sans apple/ prefix), and one for the swift/* branches (sans swift/ prefix)?
  • Last but most important, what is the plan for the Swift repository? Personally, I'd strongly consider leaning into the "monorepo" workflow and merging swift itself into the repository while we're at it.

Now, for the minor feedback:

  • https://git.llvm.org/git/monorepo-root.git doesn't seem to work:
    $ git clone https://git.llvm.org/git/monorepo-root.git
    Cloning into 'monorepo-root'...
    fatal: repository 'https://git.llvm.org/git/monorepo-root.git/' not found
    
  • This is minor: https://github.com/apple/llvm-project-v1 is "a WIP downstream fork". Why not have it marked as such on GitHub? Said differently, if we're going to get the history "right", why not get the "forkage" right on GitHub?

Overall, I support this history rewriting work (because it is effectively forced by llvm.org abandoning the "split" git repositories rather than "subtree merging" them into the monorepo). I also really appreciate that git history rewriting invites "mission creep" at best and lost/broken history at worst.

Good luck!
Dave

As for your point about merging the Swift repository into the "monorepo": I don't think that Swift is ready for that yet. I really do prefer the unified builds and would love to see the standalone builds go away. Unfortunately, they still pose some problems: chiefly, Swift has abandoned LLVM. It has its own (poor) implementation of CMake (in CMake), with its own installation mechanism and distribution mechanism. It does not really fit well into the LLVM style of development. I've been trying to get that resolved, but have had considerable trouble getting the necessary changes into the tree.

The work to do this basically amounts to replacing the custom component system with CMake's component system (which Alex has done some work towards), and integrating with LLVM's distribution targets (which I had some patches for but they caused too much internal conflict for Apple and ended up getting reverted a couple of times). If we can get that work completed, I think that Swift will very easily slip into the same setup and enable a robust unified build (which is both more efficient and easier to maintain).

@Michael_Gottesman has some very strong opinions on this (particularly the custom installation system) and it would probably be good to have him chime in on this.

That's certainly quite a provocative way to say "Swift's build system doesn't integrate nicely into LLVM's build system patterns".

That said, I agree that it would be nice in principle to unify the build systems.

Hi @compnerd – My memory is hazy in this regard, but I feel fairly confident that this is an accident of history. Swift, like LLVM, used to have both Makefiles and CMakeLists.txt, and the Makefiles were the primary build system at first. Naturally, this meant that the CMake build system tended to limp along and was maintained by people that weren't experts at CMake (or build systems for that matter). But unlike LLVM, we didn't have the time or resources to clean up the CMake setup after the Makefiles were removed. As time passed, the hacky CMake setup became harder to unwind and easier to defend as intentional.

Hey compnerd.

I think the words "Swift has abandoned LLVM" are a bit extreme. I would say instead that we are paying down historical debt that has accrued over time. As we have discussed, I think the future world is one where we have swift's own specific build/requirements layered on top of LLVM's build. I appreciate your enthusiasm for the expedient completion of this (as you know, I share that enthusiasm). That being said, it isn't as simple as "let's just commit the stuff now and break all of the things". Instead, this process is more akin to changing the engine on an 18-wheeler without causing an accident and spilling the goods all over the freeway. That is, we cannot cause huge problems for the rest of the project as a result of this change!

With that in mind, let me respond line by line:

  1. As we have discussed, the path forward for the custom cmake implementation is already clear: we split out the stdlib build, sink the badness into the stdlib build, and then use pure LLVM cmake goodness for swift's host side stuff. Then we have a few options forward for the stdlib build (I am not wedded to a specific solution) that will then let us eliminate the old implementation.

  2. The installation/distribution mechanism issue you have alluded to is solved by overlaying the swift component system on top of LLVM's, which we both agree is possible and nips the LLVM vs Swift installation system in the bud.

  3. I don't know what you mean by LLVM style development. Elaborate?

Thanks for your great feedback! I hope that I answered all the questions you had below:

  • LLDB is not an exception to this work: the contents of swift-lldb are preserved in the monorepo. However, it is one of the reasons why the branch naming scheme is now different. swift-lldb requires a Swift checkout in order to build. We wanted the branches in the apple/* namespace to be standalone, i.e. buildable without a Swift checkout. That's why the apple/* branches contain the LLDB contents from llvm.org. The swift-lldb history and contents are mapped into the swift/* branch namespace in the monorepo, which makes the fact that those branches are not standalone more obvious.

  • The branches in the apple/* namespace are meant to be direct replacements for the existing upstream-with-swift branches or the release branches in the split swift-{llvm,clang} repositories. I can't say they fully represent an open-source release of Apple clang yet, but I think we will be moving towards that direction.

  • At the moment the llvm.org/* branches are there to simplify the logistics for the auto merger that we're working on.

  • stable is too ambiguous of a branch name. Engineers will work in different contexts, and will care about different stable branches (e.g. swift releases vs Apple clang). I think it would be better to focus on creating the tools to support workflows where people wouldn't have to care about the particular branch names as much.

  • Having more than one repository is definitely an interesting idea, but not something we really considered. However, even if there were two repositories, the branch naming prefixes would still be useful. They are there to make the separation between different contexts more clear. The goal is to give engineers as much context as possible when working with the monorepo, because with the llvm.org monorepo and the GitHub transition it will become much easier to get confused about the context and make unneeded mistakes.

  • The Swift monorepo is out of scope of this transition. We are constrained by time as llvm.org is switching over soon, and want to ensure that the github.com/apple/llvm-project transition happens before then.

  • You're right, I missed the fact that there's no llvm.org git mirror for the monorepo root (https://git.llvm.org/git/monorepo-root.git) in the post. The master branch on github.com/apple/llvm-monorepo-root is created from the llvm.org svn repository itself (https://llvm.org/svn/llvm-project/monorepo-root/trunk).

  • We haven't considered marking it as an official fork so far. I'll try to look into how feasible that would be.

Hi @Alex_L,

Thanks for the thorough reply! Here are some follow up questions and observations:

  • "The branches in the apple/* namespace are meant to be direct replacements for the existing upstream-with-swift branches or the release branches in the split swift-{llvm,clang} repositories. I can't say they fully represent an open-source release of Apple clang yet, but I think we will be moving towards that direction." – I think that if we're not careful, people will assume that these branches are the open-source version of Apple's clang. If they're not "Apple clang", then why not call them swift-support/* or swift-supplement/*?
  • "stable is too ambiguous of a branch name." – That's fine. Can we still have something without a date in the name? For example "swift-master-stable" or "swift-v5.1-stable". Let's please remember that the whole point of named branches is that people shouldn't need to chase version numbers, tags, or date stamps (like apple/stable/20190104) in order to do continuous development.
  • "I think it would be better to focus on creating the tools to support workflows where people wouldn't have to care about the particular branch names as much." – I see the appeal of this logic and there are certainly engineers that need or stand to benefit from such tools/scripts. That being said, I truly hope that these scripts don't become required.
  • As far as marking Apple repositories as official forks, we're admittedly starting "backwards" here because we already have the v1 repo. The workaround:
    1. Somebody with admin power over Apple's GitHub account needs to log in.
    2. Go to https://github.com/llvm/llvm-project.
    3. Click/tap the "fork" button and select the Apple account for the resulting fork (not one's personal account).
    4. From your favorite git client, push the entire "v1" repository into the Apple fork.
    5. From your favorite git client, delete unwanted branches/tags from the Apple fork.
    6. Delete the v1 repo and then rename the fork "v1".

If they're not "Apple clang", then why not call them swift-support/* or swift-supplement/*?

We want to discourage Swift-specific LLVM/Clang work outside of the upstream llvm-project. I think it would be great if one day apple/master became redundant as llvm.org/master had everything you needed to build Swift.

Can we still have something without a date in the name? For example "swift-master-stable" or "swift-v5.1-stable". Let's please remember that the whole point of named branches is that people shouldn't need to chase version numbers, tags, or date stamps (like apple/stable/20190104) in order to do continuous development.

We do have such branches already! We have swift/master, which is equivalent to today's stable, and swift/swift-5.1-branch, which is equivalent to today's swift-5.1-branch. We will have auto mergers that enforce the following branch hierarchy for llvm-project for the future branches as well:

apple/stable/YYYYMMDD -> swift/swift-X.Y-branch -> swift/master

For engineers working on Swift, it will become way easier to know which llvm-project branch you need to check out in order to build Swift with the monorepo. The rule is simple: if you check out branch X in swift, you need to check out branch swift/X in llvm-project.
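
As a rough sketch, the rule and the merge hierarchy above could be written down like this. The function names and the MERGE_CHAIN list are hypothetical, and the chain uses the 5.1 branches mentioned earlier in this thread as an example; none of this is taken from the actual automerger.

```python
from typing import List

# Example chain built from the 5.1 branches named in this thread.
# Merges flow from left to right (upstream to downstream).
MERGE_CHAIN = [
    "apple/stable/20190104",
    "swift/swift-5.1-branch",
    "swift/master",
]


def llvm_project_branch_for(swift_branch: str) -> str:
    """Apply the rule: branch X in swift pairs with swift/X in llvm-project."""
    return f"swift/{swift_branch}"


def downstream_of(branch: str) -> List[str]:
    """List the branches that would eventually receive a change landed on `branch`."""
    index = MERGE_CHAIN.index(branch)  # raises ValueError for unknown branches
    return MERGE_CHAIN[index + 1:]


print(llvm_project_branch_for("swift-5.1-branch"))
print(downstream_of("apple/stable/20190104"))
```

With this layout, a change that lands on the apple/stable/ branch reaches both swift/ branches, while a change that lands on swift/master goes no further.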

The apple/stable/* branches will be important for those who want to make certain changes to LLVM/Clang in order to propagate them into a particular Swift release branch or Swift's master branch (because we want to keep swift/* branches the same as apple/* branches, except for LLDB contents). The focus on improved tooling will help when integrating LLVM/Clang changes into specific Swift release branches (e.g. a tool can create a PR against a particular apple/stable/* branch once you specify the destination Swift release branch). When it comes to integrating new LLVM/Clang changes into master, I think it would be better if master was rebranched more often instead. The monorepo and the improved auto merging infrastructure should help to make rebranches easier.

That being said, I truly hope that these scripts don't become required.

They won't be required, no. They will, however, be very strongly preferred for some very specific workflows like auto merging upstream monorepo branches into the downstream monorepo branches.

The current branching structure, where swift/master-next sits atop llvm-project's master (via apple/master) and integrates all the Apple and Swift changes for all subprojects works great for our purposes. That's something that'll be preserved in subsequent prototypes, correct?

Thanks for the feedback! Yes, it will be preserved. We are also planning on setting up an automerger from apple/master into swift/master-next that will build llvm-project together with Swift's master-next branch and will block the merge until the build issues are resolved.

The automerger blocking merges until build issues are resolved would be amazing! Thanks :)

I would like to provide an additional update to this thread. Today, the following two new repositories have been created on github.com/apple:

The new llvm-project-v2 repository replaces the previous WIP downstream monorepo (https://github.com/apple/llvm-project-v1). The convention that's used for branches in the new repository is the same as in llvm-project-v1. If you've been using llvm-project-v1 for your work, please migrate to v2 before the end of next week. At the end of next week we'll stop updating llvm-project-v1, and will archive it soon after.

Hi again,

We will stop updating the v1 monorepo (https://github.com/apple/llvm-project-v1) today. Please use https://github.com/apple/llvm-project-v2 instead. We will delete https://github.com/apple/llvm-project-v1 and https://github.com/apple/llvm-project-v1-split at the end of next week.

Is it intentional that the swift/master-next automerge account (apple-llvm-mt) has spaces in its email address? E.g. look at 43c812bf60fc3e6c9e82ae9083908c2ce5affb05 ... the author email is mt @ apple-llvm.
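
For anyone who wants to look for other commits like this, here is a small sketch that flags author emails containing whitespace. It assumes the emails were collected beforehand, for example one per line from git log --format=%ae; the helper name is made up.

```python
from typing import Iterable, List


def emails_with_whitespace(emails: Iterable[str]) -> List[str]:
    """Return the author emails that contain any whitespace character."""
    return [email for email in emails if any(ch.isspace() for ch in email)]


# The first address is the one quoted above; the second is a made-up control.
sample = ["mt @ apple-llvm", "someone@example.com"]
print(emails_with_whitespace(sample))
```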