ServiceRouter: Hyperscale and Minimal Cost Service Mesh at Meta

submitted by
Style Pass
2024-03-29 01:00:08

Many tech companies have distributed services deployed in the cloud in regions around the world. These systems often depend on each other, meaning that they need to determine which dependencies are where (service discovery) and route requests across the network (often performed via a “service mesh”). Inter-system communication also needs to be highly reliable and load balanced.

While there are well-known open source systems for routing traffic (e.g. Linkerd, Envoy, and Istio), there are a few interesting components of ServiceRouter:

To build a source of truth for routing decisions, which the paper calls the Routing Information Base (RIB), ServiceRouter gathers information from the cluster manager about which services are running where. Importantly, ServiceRouter can also handle stateful services (discussed in my previous paper review on ShardManager). For example, some services store a specific subset of data on a specific server, so knowing which servers run a service is not enough; the router also needs to know which server holds the requested data.
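To make the idea concrete, here is a minimal sketch of a routing table that tracks both where a service runs and which shards each server holds. The class names, fields, host names, and shard numbers are all hypothetical and chosen for illustration; this is not the data model from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Endpoint:
    """One server running a service (hypothetical structure)."""
    host: str
    region: str
    # Shards held by this server; empty for stateless services.
    shards: FrozenSet[int] = frozenset()


@dataclass
class RoutingInformationBase:
    """Toy stand-in for a RIB: service name -> known endpoints."""
    _endpoints: Dict[str, List[Endpoint]] = field(default_factory=dict)

    def register(self, service: str, endpoint: Endpoint) -> None:
        # In a real system this data would come from the cluster manager.
        self._endpoints.setdefault(service, []).append(endpoint)

    def lookup(self, service: str, shard: Optional[int] = None) -> List[Endpoint]:
        candidates = self._endpoints.get(service, [])
        if shard is None:
            # Stateless request: any server running the service will do.
            return list(candidates)
        # Stateful request: only servers holding the shard qualify.
        return [e for e in candidates if shard in e.shards]


rib = RoutingInformationBase()
rib.register("feed", Endpoint("host-a", "north-america"))
rib.register("kvstore", Endpoint("host-b", "north-america", frozenset({0, 1})))
rib.register("kvstore", Endpoint("host-c", "south-america", frozenset({2})))
```

Calling `rib.lookup("kvstore")` returns both servers, while `rib.lookup("kvstore", shard=2)` narrows the result to the one server that holds shard 2, which is the extra step stateful services require.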

As an input to the Routing Information Base, ServiceRouter also gathers information that allows it to make decisions about how services talk to each other across clusters (for example, monitoring the latency of traffic from North America to South America).
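One way such cross-region measurements could feed a routing decision is to rank candidate regions by observed latency from the caller's region. The sketch below assumes a simple lookup table of measured latencies; the function name, region names, and millisecond values are made up for illustration, and the paper's actual selection logic may differ.

```python
from typing import Dict, List, Tuple

# (source region, destination region) -> measured latency in milliseconds.
LatencyTable = Dict[Tuple[str, str], float]


def rank_regions(source: str, candidates: List[str],
                 latency_ms: LatencyTable) -> List[str]:
    """Order candidate regions from lowest to highest latency from `source`."""
    def cost(region: str) -> float:
        if region == source:
            # Assumption: same-region traffic is always the cheapest option.
            return 0.0
        # Regions with no measurement are treated as worst case.
        return latency_ms.get((source, region), float("inf"))

    return sorted(candidates, key=cost)


# Hypothetical measurements, not figures from the paper.
latency_ms = {
    ("north-america", "south-america"): 120.0,
    ("north-america", "europe"): 90.0,
}

order = rank_regions(
    "north-america",
    ["south-america", "europe", "north-america"],
    latency_ms,
)
```

With these sample numbers, `order` lists the local region first, then `europe`, then `south-america`. A caller could walk that list and pick the first region that has a healthy server for the target service.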
