Do Enormous LLM Context Windows Spell the End of RAG?

Submitted by
Style Pass
2024-05-11 01:00:04

Generative AI (GenAI) is iterating at a rapid pace. One outcome is that the context window — the number of tokens a large language model (LLM) can use at one time to generate a response — is also expanding quickly.

Google Gemini 1.5 Pro, released in February 2024, set a record for the longest context window to date: 1 million tokens, equivalent to 1 hour of video or 700,000 words. Gemini’s outstanding performance in handling long contexts led some people to proclaim that “retrieval augmented generation (RAG) is dead.” LLMs are already very powerful retrievers, they said, so why spend time building a weak retriever and dealing with RAG-related issues like chunking, embedding and indexing?
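To make the RAG-related issues mentioned above concrete, here is a minimal, self-contained sketch of the chunk → embed → index → retrieve pipeline. It is purely illustrative: the bag-of-words "embedding" and in-memory index are toy stand-ins for a real embedding model and vector database, and all function names are our own, not from any particular library.

```python
import math
from collections import Counter

def chunk(text, size=50, overlap=10):
    """Split text into overlapping word-based chunks (toy chunking step)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text):
    """Toy bag-of-words 'embedding'; real RAG uses a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(docs):
    """Index = list of (chunk, vector) pairs; real RAG uses a vector database."""
    return [(c, embed(c)) for d in docs for c in chunk(d)]

def retrieve(index, query, k=2):
    """Return the top-k chunks most similar to the query."""
    qv = embed(query)
    scored = sorted(index, key=lambda cv: cosine(qv, cv[1]), reverse=True)
    return [c for c, _ in scored[:k]]
```

In use, the retrieved chunks would be prepended to the LLM prompt, which is exactly the machinery a sufficiently large context window lets you consider skipping — you could pass the documents in whole instead:

```python
index = build_index([
    "RAG retrieves relevant chunks before generation.",
    "Context windows are measured in tokens.",
])
retrieve(index, "retrieves chunks", k=1)  # returns the RAG-related chunk
```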

The increased context window started a debate: With these improvements, is RAG still needed? Or might it soon become obsolete?
