Organizations are distributed systems

submitted by
Style Pass
2024-12-30 19:30:09

The assertion that “corporate organizations are distributed systems” hovers between being a tautology—since it is literally true—and a hot take, given that human-driven systems are inherently more complex than those in computer science. Reflecting on my experience leading teams at Google and Vercel, I’ve come to see that while the analogy is far from perfect, it is undeniably useful.

Approval processes are like database transactions

Take, for example, approval processes. Approval processes are analogous to database transactions:

- When you get all the approvals, you ship.
- When any one approval is denied, you don’t ship.

At Google, one of my roles was serving as the “quality approver” for Google Search launches. Only after every other stakeholder (such as legal, privacy, and security) had approved a launch would I be asked to give my approval. This was a formal email-based process, which meant there was no good way to speed it up as the requester besides nagging the approvers. Some of my peers might respond to approval requests with questions after a week or two. And, again, this approval process wouldn't even start until after everyone else had already signed off on the launch. Personally, I aimed to respond within 30 minutes, often multitasking during my 8+ hours of daily meetings. But process durations of 2-4 weeks were not unusual, as you were asking very busy folks for work they weren't directly invested in. (For a slightly spicier take on this process, see my LinkedIn post.)

Clearly, such an approval process makes it difficult to ship with any kind of velocity. On the other hand, there are genuine risks associated with shipping to billion-user systems. Almost every approval process has a “bloody path” of postmortems behind it, established with the best intentions of avoiding past mistakes.

If we accept that many of the checks in the shipping process are necessary, can we still do better? This is where viewing the organization as a distributed system can help.

Optimizing team throughput via async IO

Some of the highest-velocity teams at Google understood that shipping can be approached like asynchronous I/O. Waiting on an approval? Work on the next task in the meantime. Of course, having more tasks in flight increases management and cognitive overhead. Many of us struggle with context-switching and multitasking, so the async I/O approach to increasing shipping velocity has its limits.

As CTO of Vercel, I manage a scaleup that is deeply committed to shipping velocity. However, we’re reaching a scale where many of the same concerns driving the approval pipeline at Google (such as legal, privacy, and security) are top of mind as well and require expert input. So, can you just ship things?
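To make the two analogies above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the approver names, the delays, and the helpers `request_approval`, `try_ship`, and `run_team` are all hypothetical and do not describe any real Google or Vercel tooling. Each launch behaves like a transaction (a single denial aborts it), while the team keeps several launches in flight at once instead of blocking on one.

```python
import asyncio

# Hypothetical approvers, asked in order. Delays are in seconds and
# merely stand in for the days or weeks a real approver might take.
APPROVERS = {"legal": 0.03, "privacy": 0.02, "security": 0.01, "quality": 0.01}


async def request_approval(approver: str, denied: frozenset) -> bool:
    """Simulate waiting on one approver; True means approved."""
    await asyncio.sleep(APPROVERS[approver])
    return approver not in denied


async def try_ship(launch: str, denied: frozenset) -> bool:
    """All-or-nothing, like a transaction: any single denial aborts the launch."""
    for approver in APPROVERS:
        if not await request_approval(approver, denied):
            return False  # "rollback": the launch does not ship
    return True  # "commit": every approver said yes


async def run_team(launches: dict) -> dict:
    """Async I/O analogy: keep all launches in flight rather than one at a time."""
    names = list(launches)
    outcomes = await asyncio.gather(*(try_ship(n, launches[n]) for n in names))
    return dict(zip(names, outcomes))


def demo() -> dict:
    # Assumed scenario: privacy denies feature-b, nobody objects to feature-a.
    return asyncio.run(run_team({
        "feature-a": frozenset(),
        "feature-b": frozenset({"privacy"}),
    }))
```

In this toy model the wall-clock time for the whole batch is roughly that of the slowest launch rather than the sum of all of them. What the sketch does not capture is the cost the text warns about: every additional launch in flight adds context-switching overhead for the humans involved.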

Optimistic locking for approval processes

Let’s return to the approval process and database transaction analogy. One technique to speed up transaction processing is optimistic locking. Interestingly, there’s a corporate equivalent to optimistic locking: moving from approvals to vetoes.

Vetoes are non-blocking by default. While a veto can delay your launch, in the absence of one, you ship without waiting for explicit approval. Organizationally, this shifts approvers from gatekeepers to active participants who intervene only when necessary.

In this model, if something ships with a legal issue that a traditional approver would have caught, who is responsible? The responsibility still lies with the legal expert, provided the team shipping the change followed all transparency procedures.

For this approach to work, those responsible for vetoing launches must know about them in the first place. Thus, an organization must foster a strong culture of communication around planned changes, ensuring no one is ever surprised. This is beneficial for many other reasons as well.

Communication is key

At Vercel, each product area publishes a weekly executive summary detailing progress, decisions, updates to the product roadmap, and upcoming launches. It is the responsibility of legal, privacy, and security experts to monitor these updates and dig deep on any concerns. For the vast majority of changes, these experts can quickly identify that there are no issues, minimizing the time spent.

By contrast, the traditional “pessimistic” approach to launch approvals introduces significant overhead at every step, even when the outcome is a trivial approval. And there is always the temptation to add just one more approval step as more risks are uncovered. Giving the experts veto power, but requiring them to exercise it with discretion, turns them from passive passengers into active drivers of progress.

In summary, techniques that improve the speed of distributed systems appear to be helpful in enhancing corporate organizations. Many organizations can significantly boost velocity while continuing to manage risk responsibly by adopting processes that deviate slightly from conventional approaches. Looking back at the last 30 years of my engineering career, we've made so much progress on understanding and managing distributed systems, yet the way we run organizations really hasn't improved much. Maybe the "optimistic locking" analogy is the only one that works for people organizations, but my bet would be that there is more out there that we can adopt for our teams.
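The contrast between the approval model and the veto model discussed above can be sketched in a few lines of Python. This is an illustration of the organizational analogy, not an implementation of database optimistic locking, and every name in it (`Launch`, `announced`, `pessimistic_can_ship`, `optimistic_can_ship`) is a hypothetical chosen for the example. The `announced` flag is an assumed stand-in for the transparency procedures, such as appearing in the weekly summary.

```python
from dataclasses import dataclass, field


@dataclass
class Launch:
    name: str
    announced: bool = False  # assumed proxy for "followed transparency procedures"
    vetoes: list = field(default_factory=list)  # experts who objected


def pessimistic_can_ship(approvals: dict, required: list) -> bool:
    """Gatekeeper model: blocked until every required approver has said yes."""
    return all(approvals.get(role, False) for role in required)


def optimistic_can_ship(launch: Launch) -> bool:
    """Veto model: proceed by default, and check for objections at ship time."""
    return launch.announced and not launch.vetoes


if __name__ == "__main__":
    required = ["legal", "privacy", "security"]
    launch = Launch("new-checkout", announced=True)

    # Nobody has responded yet: the two models disagree.
    print(pessimistic_can_ship({}, required))  # blocked, still waiting
    print(optimistic_can_ship(launch))         # free to ship

    # An expert steps in: the veto model now blocks as well.
    launch.vetoes.append("legal")
    print(optimistic_can_ship(launch))
```

Note that silence means opposite things in the two functions: in the pessimistic one it blocks the launch, in the optimistic one it lets the launch through. That is why the sketch refuses to ship anything that was never announced; without visibility, the absence of a veto carries no information.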
