One of the major challenges faced by large companies customizing open-source projects for their specific use cases is staying aligned with upstream changes. At PlanetScale, we encountered this issue firsthand while managing our private modifications alongside the continual evolution of the open-source Vitess project.
In the early days, when our private changes were relatively small, we maintained the diff with a straightforward approach: each week, a GitHub Action running on a cron schedule cherry-picked all private changes onto the latest commits from the main branch. This worked initially, but it quickly became unsustainable as PlanetScale’s private diff grew larger and more sophisticated.
The situation became even more complicated when we decided to align with stable release versions of Vitess rather than the latest code from the main branch. This introduced an additional challenge: maintaining the private diff not just on the main branch but also across multiple release branches.
One of our early observations was that cherry-picking private commits onto multiple release branches produced the same conflicts again and again. To address this, we developed a tool that sequentially processes all relevant commits and replays them on top of the open-source (OSS) branch. The tool, aptly named git-replay, stores how specific conflicts were resolved during cherry-picks and reuses that information to resolve similar conflicts in the future.
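To make the idea of reusing conflict resolutions concrete, here is a minimal sketch in Python. It is not git-replay's actual implementation, which this post does not show. The `ResolutionCache` class, its method names, and the choice to key each resolution on a hash of the two conflicting sides are all assumptions made for illustration.

```python
import hashlib
import re

# Matches one standard Git conflict block: the "ours" side sits between
# the <<<<<<< and ======= markers, the "theirs" side between ======= and >>>>>>>.
CONFLICT_RE = re.compile(
    r"^<<<<<<< [^\n]*\n(.*?)^=======\n(.*?)^>>>>>>> [^\n]*\n?",
    re.DOTALL | re.MULTILINE,
)


class ResolutionCache:
    """Hypothetical store that remembers how a conflict was resolved."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(ours, theirs):
        # Key on the content of both sides only. Marker labels (branch
        # names, commit hashes) are ignored so that the same conflict is
        # recognized on a different release branch.
        h = hashlib.sha256()
        h.update(ours.encode())
        h.update(b"\0")
        h.update(theirs.encode())
        return h.hexdigest()

    def record(self, ours, theirs, resolution):
        """Remember the text a human chose for this pair of sides."""
        self._store[self._key(ours, theirs)] = resolution

    def resolve(self, text):
        """Replace every known conflict block in `text`.

        Returns the updated text and the number of blocks left unresolved.
        """
        unresolved = 0

        def replace(match):
            nonlocal unresolved
            resolution = self._store.get(self._key(match.group(1), match.group(2)))
            if resolution is None:
                unresolved += 1
                return match.group(0)  # leave unknown conflicts untouched
            return resolution

        return CONFLICT_RE.sub(replace, text), unresolved


cache = ResolutionCache()
# A resolution recorded while cherry-picking onto one release branch...
cache.record("x = 1\n", "x = 2\n", "x = 3\n")
# ...is applied when the same conflict shows up on another branch.
conflicted = "a\n<<<<<<< HEAD\nx = 1\n=======\nx = 2\n>>>>>>> abc123\nb\n"
resolved, remaining = cache.resolve(conflicted)
```

In this sketch a lookup hits only when both sides of the conflict match exactly. Handling conflicts that are merely similar, as described above, would need a looser matching rule than an exact hash.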