Asynchronous AI: Why Event Callbacks Are the Future of GenAI APIs

submitted by
Style Pass
2024-07-08 10:30:14

As an industry, we have a developer experience problem when it comes to generative AI APIs. The REST API model is designed for a largely synchronous world where responses return in milliseconds. However, GenAI APIs sit in front of technologies that can take tens of seconds in most cases and, in some scenarios, minutes to respond.

We're starting to see the limits of how far synchronous tooling can stretch. Workarounds such as extending timeouts beyond reasonable limits to preserve a transactional request/response paradigm can only get us so far.

So, rather than trying to shoehorn synchronous technologies and paradigms into fundamentally asynchronous products, it's time for a rethink.
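To make the contrast concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not any vendor's API: `slow_generate` is a hypothetical stand-in for a slow GenAI backend, and `call_sync` and `call_with_callback` are invented names. A thread pool plays the role that a webhook or event stream would play across a real network.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def slow_generate(prompt: str, delay: float = 0.3) -> str:
    """Hypothetical stand-in for a GenAI backend that is slow to respond."""
    time.sleep(delay)
    return f"completion for: {prompt}"


def call_sync(executor, prompt, timeout):
    """Request/response style: block until the result arrives or the timeout fires."""
    future = executor.submit(slow_generate, prompt)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        # The caller gives up, even though the work is still running.
        return None


def call_with_callback(executor, prompt, on_done):
    """Event style: return immediately and hand the result to on_done later."""
    future = executor.submit(slow_generate, prompt)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future


received = []
with ThreadPoolExecutor(max_workers=2) as executor:
    # A timeout tuned for millisecond APIs expires before the model answers.
    sync_result = call_sync(executor, "hello", timeout=0.05)

    # Submitting with a callback does not block the caller.
    started = time.monotonic()
    call_with_callback(executor, "hello", received.append)
    submit_seconds = time.monotonic() - started
# Leaving the with-block waits for outstanding jobs, so the callback has run.

print(sync_result)
print(received)
```

In the synchronous path the only available lever is a longer timeout. In the callback path the caller is free as soon as the job is submitted, and the result arrives whenever the backend finishes.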

With transactional APIs, we've come to expect sub-second response times. Even as far back as 2008, Amazon found that every extra 100ms of latency cost it 1% in sales. More recent research suggests that sites that load in one second have three times the conversion rate of sites that load in five seconds.

And that makes sense. A delay of 100ms is the limit at which people feel a UI is responding instantaneously. Break that barrier, and people will start to lose focus. Go from one second to three seconds of latency, and the probability of a user bouncing increases by 32%.
