Remember 2016? Yes, that was long before the pandemic, and LLMs were not yet on the radar of many. What was on the radar instead were “serverless functions”, and those were presented, as Matt Dugan puts it, “as the undeniable future of infrastructure” [1] in the tech industry. And indeed, every major cloud provider came out with its own derivative of a function service: AWS Lambda, Google Cloud Functions, IBM OpenWhisk, Microsoft Azure Functions, and so on and so forth.
To recap what serverless computing is, let’s look at the Wikipedia definition: “Serverless computing is a cloud computing execution model in which the cloud provider allocates machine resources on demand, taking care of the servers on behalf of their customers” [2]. If, back in the day, I had taken that definition and matched it against the service catalog of IBM Cloud, I would have gotten two matches: IBM Cloud Functions and IBM Cloud Foundry. Viewed from higher up, the two had quite a few similarities: a developer pushed some code, that code somehow magically morphed into a container, and that container was executed. Looking closer, there were differences, of course: the scaling model, the pricing, the way the container came into existence. But essentially, someone wanted to run some code, as a container, in a serverless way. As a senior technical leader for IBM Cloud, I had to ask: why do we build and maintain more than one service for that?
Some time later, as AI training scenarios became more present in people’s minds, product management came around and asked for a “serverless batch job as a service” capability. And that requirement made sense, because neither the characteristics of functions (short-running, memory-constrained) nor those of applications (long-running, HTTP/S-serving) were really a good fit for batch jobs, which are potentially longer running, have high CPU and memory demands, and usually do not serve HTTP/S.