From RAGs to riches: A practical guide to making your local AI chatbot smarter

Short for retrieval-augmented generation, the technology has been heralded by everyone from Nvidia's Jensen Huang to Intel's savior-in-chief Pat Gelsinger as the thing that's going to make AI models useful enough to warrant investment in relatively pricey GPUs and accelerators.

The idea behind RAG is simple: Instead of relying on a model that's been pre-trained on a finite amount of public information, you can take advantage of an LLM's ability to parse human language to interpret and transform information held within an external database.
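To make that flow concrete, here's a minimal sketch of the retrieve-then-generate loop. It's a toy: the "embedding" is just a bag-of-words vector, the documents are made up, and the LLM call is stubbed out as a prompt string. A real pipeline would swap in an embedding model and a chat endpoint (Ollama, llama.cpp, and so on), but the shape is the same.

```python
# Toy RAG loop: embed documents, retrieve the best match for a query,
# and stuff it into the prompt as context. The embedding and LLM call
# are stand-ins; real apps use an embedding model and a chat endpoint.
import math
from collections import Counter

# Hypothetical "external database" of private facts the model was never trained on.
documents = [
    "The office VPN gateway is vpn.example.internal, port 443.",
    "Expense reports are due on the last Friday of each month.",
    "The Llama3 inference server runs on the gpu-01 box.",
]

def embed(text: str) -> Counter:
    """Crude stand-in for an embedding model: a word-count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Assemble the augmented prompt; a real app sends this to the LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Where does the Llama3 server run?"))
```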

Critically, this database can be updated independently of the model, allowing you to improve or freshen up your LLM-based app without needing to retrain or fine-tune the model every time new information is added or old data is removed.
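Continuing the toy sketch above, that independence is the whole point: refreshing the app is a data operation, not a training run. Add or remove documents and retrieval picks up the change on the next query, while the model weights stay untouched.

```python
# Freshen the knowledge base without touching the model.
documents.append("As of June 2024, the VPN gateway moved to vpn2.example.internal.")
documents.remove("Expense reports are due on the last Friday of each month.")

# The very next retrieval sees the updated facts.
print(build_prompt("What is the current VPN gateway?"))
```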

But before we demo how RAG can be used to make pre-trained LLMs such as Llama3 or Mistral more useful and capable, let's talk a little more about how they work.