Vercel claims it's slashed AWS Lambda costs by up to 95 percent by reusing idle instances that would otherwise rack up charges while waiting on slow external services like LLMs or databases.
For the uninitiated, AWS Lambda is Amazon's serverless compute platform, handy for short bursts of work but costly for long-running or latency-prone tasks. Each request runs in its own environment and is billed for the full duration, even when the function is idle. At a small scale, the idle-time burn might be negligible, but at billions of invocations, it adds up fast.
By design, "for each concurrent request, Lambda provisions a separate instance of your execution environment," according to the cloud giant. Pricing is based on the number of function requests, the duration of each request, and the memory allocated to the function, which can be anywhere between 128 MB and 10,240 MB. No function can run for longer than 15 minutes. There is an open-source tool that measures execution time and cost for a function in order to optimize Lambda configuration.
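To make the pricing model concrete, here is a rough sketch of how requests, duration, and memory combine into a bill, and how much of that bill can be idle waiting. The two rates are assumptions for illustration only (real prices vary by region and architecture), and the function names `lambda_cost` and `idle_share` are hypothetical helpers, not part of any AWS SDK.

```python
# Assumed, illustrative rates; actual Lambda prices vary by region and architecture.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD per request (assumption)
PRICE_PER_GB_SECOND = 0.0000166667     # USD per GB-second (assumption)

# Limits quoted in the article.
MIN_MEMORY_MB = 128
MAX_MEMORY_MB = 10_240
MAX_DURATION_S = 15 * 60


def lambda_cost(requests: int, duration_s: float, memory_mb: int) -> float:
    """Estimate the cost in USD of `requests` invocations of equal duration."""
    if not MIN_MEMORY_MB <= memory_mb <= MAX_MEMORY_MB:
        raise ValueError("memory must be between 128 MB and 10,240 MB")
    if not 0 <= duration_s <= MAX_DURATION_S:
        raise ValueError("duration must be between 0 and 15 minutes")
    # Duration is billed in GB-seconds: seconds multiplied by allocated memory in GB.
    gb_seconds = requests * duration_s * (memory_mb / 1024)
    return requests * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND


def idle_share(compute_s: float, wait_s: float) -> float:
    """Fraction of the billed duration spent waiting on an external service."""
    return wait_s / (compute_s + wait_s)


if __name__ == "__main__":
    # Hypothetical workload: 0.2 s of real work, 1.8 s waiting on a remote service.
    compute_s, wait_s = 0.2, 1.8
    total = lambda_cost(1_000_000, compute_s + wait_s, 1024)
    work_only = lambda_cost(1_000_000, compute_s, 1024)
    print(f"billed: ${total:.2f}, compute only: ${work_only:.2f}")
    print(f"idle share of billed time: {idle_share(compute_s, wait_s):.0%}")
```

Under these assumed numbers, most of the duration charge pays for time the function spends waiting rather than computing, which is the waste the article describes.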
This approach works well for functions that do all their processing on the Lambda instance, but it is wasteful if they spend a lot of time waiting for remote services to complete. Tom Lienard, Vercel software engineer, has posted about how the company found a solution, apparently by accident. Vercel is the home of Next.js, a React-based framework that is also recommended by the React team as the best implementation of React Server Components (RSC).