There is a huge disconnect between the crazy AI hype threads on Twitter and the current state of the technology. “BabyAGI just built my IKEA table, made me a cappuccino and got my cat pregnant! 143 things it could do for you.”
Perhaps you have built your own LangChain-based agent for funsies, and have encountered some of the challenges I’m about to describe. I feel you. Earlier this year I was experimenting with a LangChain agent that would take a request (e.g. “give me a summary of crypto news from the past week”), search the internet, scrape pages and email me a summary. I could publish the code, but Bard already does it better:
There were many calls to GPT-3.5 involved: first, understand the request and make a plan. Then, run queries to find relevant sources. After that, parse the query output. Extract links. Scrape them. Summarize them. Keep track of the original links so I could double-check the sources (to make sure the model didn’t make any of it up). Because inference is slow and the OpenAI API isn’t always fast or reliable, the entire process took about 10-15 minutes.
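To make the shape of that pipeline concrete, here is a rough sketch. It is not the code I actually ran: `run_pipeline`, `SourcedSummary` and the four callables (`plan`, `search`, `scrape`, `summarize`) are hypothetical names made up for illustration, and the LLM and network calls are passed in as plain functions so the skeleton runs on its own. Parsing the search output and extracting links are folded into `search` here, and skipping duplicate URLs is my own addition. The one thing it tries to show is that each summary stays attached to the link it came from.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SourcedSummary:
    url: str      # original link, kept so the summary can be checked by hand
    summary: str


def run_pipeline(
    request: str,
    plan: Callable[[str], List[str]],    # hypothetical: request -> search queries (an LLM call)
    search: Callable[[str], List[str]],  # hypothetical: query -> result URLs
    scrape: Callable[[str], str],        # hypothetical: URL -> page text
    summarize: Callable[[str], str],     # hypothetical: page text -> summary (an LLM call)
) -> List[SourcedSummary]:
    results: List[SourcedSummary] = []
    seen = set()
    for query in plan(request):
        for url in search(query):
            if url in seen:  # don't scrape and summarize the same page twice
                continue
            seen.add(url)
            results.append(SourcedSummary(url=url, summary=summarize(scrape(url))))
    return results


if __name__ == "__main__":
    # Stand-in functions so the sketch runs without any API keys.
    demo = run_pipeline(
        "crypto news from the past week",
        plan=lambda req: [req],
        search=lambda q: ["https://example.com/a", "https://example.com/b"],
        scrape=lambda url: f"text of {url}",
        summarize=lambda text: text.upper(),
    )
    for item in demo:
        print(item.url, "->", item.summary)
```

In the real thing, every one of those callables is a slow call that can fail on its own.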
Every step can fail in any number of unexpected ways. Failures compound. Let’s say there are 10 steps involved, and each one has a success rate of 90%. If the steps fail independently, the entire process succeeds only about 35% of the time (0.9^10 ≈ 0.35), roughly a one-in-three chance. What complicates this is that a failure isn’t always obvious until you see the whole result. For example, sometimes the search would pick up a newly refreshed page with news from last year. The agent would keep researching it, and fifteen minutes later I would get a summary of crypto news from early 2022 (Bitcoin at 45k!).
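The compounding is easy to check yourself. This is a minimal sketch that assumes every step has the same success rate and that steps fail independently, which real pipelines won't exactly satisfy; `pipeline_success_rate` is just a name I made up for the calculation.

```python
def pipeline_success_rate(step_success_rate: float, num_steps: int) -> float:
    """Probability that every step succeeds, assuming independent failures."""
    return step_success_rate ** num_steps


for steps in (1, 5, 10, 20):
    rate = pipeline_success_rate(0.9, steps)
    print(f"{steps:>2} steps at 90% each: {rate:.1%}")

# Prints:
#  1 steps at 90% each: 90.0%
#  5 steps at 90% each: 59.0%
# 10 steps at 90% each: 34.9%
# 20 steps at 90% each: 12.2%
```

Doubling the number of steps from 10 to 20 takes you from roughly one success in three to roughly one in eight.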