The engineering behind autoscaling with HashiCorp's Nomad on a global serverless platform

Published 2024-05-14

Traditional approaches, such as provisioning fixed capacity in advance or scaling by hand, are not cost-effective: you either pay for resources you don't use, or you risk not having enough resources to handle the load.

Fortunately, there is a third way: horizontal autoscaling. Horizontal autoscaling is the process of dynamically adjusting the number of instances of a service based on the current load. This way, you only pay for the resources you use, and you can handle load spikes without any manual intervention.
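As a minimal sketch of the idea (the function name, bounds, and numbers here are illustrative, not the platform's actual code), a horizontal autoscaler repeatedly compares observed load against the load one instance is expected to handle and adjusts the instance count accordingly, clamped to a configured range:

```python
import math

def desired_instances(observed_load: float, target_per_instance: float,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Target-tracking rule: run just enough instances so that each one
    handles roughly `target_per_instance` worth of load."""
    desired = math.ceil(observed_load / target_per_instance)
    # Clamp to the allowed range so a spike (or a quiet period) never
    # pushes the service outside its configured bounds.
    return max(min_instances, min(max_instances, desired))

# 230 requests/s with a target of 50 requests/s per instance -> 5 instances.
print(desired_instances(230, 50))
```

The clamp is what makes this safe to run unattended: the fleet shrinks to the minimum during quiet hours and is capped during extreme spikes.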

As a cloud platform, we run our users' services inside Instances. Instances are the runtime units that execute a project's code. When an application is deployed on the platform, the code is built, packaged into a container image, and run as an Instance. Under the hood, Instances are Firecracker microVMs running on bare-metal servers.

We already covered the basics of autoscaling in the public preview announcement. To recap, on our platform, three criteria are used to determine if an application should be scaled up or down: the number of requests, the memory usage, and the CPU usage. As a user, you define a target for at least one of these criteria, and the autoscaler will scale the application up or down to keep the usage as close as possible to the target. These criteria must be configured per region, so you can have different autoscaling policies in different regions.
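The three criteria can be combined with a simple rule: compute, for each configured criterion, the instance count needed to bring average usage back to its target, then let the most demanding criterion win. The sketch below illustrates this under assumptions of ours (the criterion names, the `scale_decision` function, and the bounds are hypothetical, not the platform's actual implementation):

```python
import math

def scale_decision(current: int, metrics: dict, targets: dict,
                   min_instances: int = 1, max_instances: int = 20) -> int:
    """For each criterion with a configured target, estimate how many
    instances are needed so that per-instance usage returns to target;
    the highest estimate wins. Criteria without a target are ignored."""
    needed = [math.ceil(current * metrics[name] / target)
              for name, target in targets.items()]
    desired = max(needed) if needed else current
    return max(min_instances, min(max_instances, desired))

# A region with targets on requests/s and CPU; memory is not configured here.
# requests: ceil(3 * 80 / 50) = 5, cpu: ceil(3 * 90 / 75) = 4 -> 5 instances.
n = scale_decision(
    current=3,
    metrics={"requests_per_sec": 80.0, "cpu_percent": 90.0},
    targets={"requests_per_sec": 50.0, "cpu_percent": 75.0},
)
print(n)
```

Taking the maximum across criteria ensures that satisfying the CPU target never starves the service on requests, and vice versa; running this per region is what allows the different regional policies described above.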
