
Developing a RAG Knowledge Base with DuckDB

2024-05-06 15:00:07

This blog is the second in a series of three on search using DuckDB. It builds on knowledge from the first blog on AI-powered search, which shows how relevant textual information is retrieved using cosine similarity.

Different facets of our work and our lives are documented in different places, from note-taking apps to PDFs to text files, code blocks, and more. AI assistants that use large language models (LLMs) can help us navigate this mountain of information by answering contextual questions based on it. But how do AI assistants even get this knowledge?

Retrieval-Augmented Generation (RAG) is a technique for feeding an LLM information relevant to a question, drawn from stored knowledge. The term knowledge base commonly refers to the source of this stored knowledge: in simple terms, a database that contains information from all the documents we feed into our model.
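At its simplest, such a knowledge base can be pictured as a table of text chunks paired with their embeddings. A minimal sketch (the rows and the toy 3-dimensional embeddings below are illustrative; a real system would populate the table from documents and an embedding model):

```python
# A knowledge base as a list of rows, each holding a text chunk and its
# vector embedding. The embeddings here are hand-written toy values;
# in practice they would come from an embedding model.
knowledge_base = [
    {"chunk": "DuckDB is an in-process analytical database.",
     "embedding": [0.9, 0.1, 0.0]},
    {"chunk": "RAG retrieves stored text to ground LLM answers.",
     "embedding": [0.1, 0.8, 0.2]},
]
```

In a database such as DuckDB, this would correspond to a table with a text column and a fixed-size array column for the embedding.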

One common method of storing this data is to take documents and chunk the underlying text into smaller parts (e.g., groups of four sentences) so that these ‘chunks’ can be stored along with their vector embeddings. The chunks can later be retrieved based on their cosine similarity to a query. At its simplest, a RAG system retrieves relevant chunks as text and feeds them to an LLM, which in turn outputs an answer to the question. For example, given a question, we would retrieve the top three most relevant chunks from our knowledge base and pass them to an LLM to generate an answer. There is a large body of research in this field, from better ways to chunk information to new techniques for storing and retrieving it. That said, information retrieval in RAG is typically based on semantic similarity.
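The chunking and retrieval steps above can be sketched in plain Python. This is a toy sketch under stated assumptions: the sentence splitter, the four-sentence grouping, and the function names are illustrative, and a real system would use an embedding model and a vector-capable store such as DuckDB rather than in-memory lists:

```python
import math
import re

def chunk_text(text, sentences_per_chunk=4):
    """Split text into chunks of roughly four sentences each."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [" ".join(sentences[i:i + sentences_per_chunk])
            for i in range(0, len(sentences), sentences_per_chunk)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_embedding, rows, k=3):
    """Return the k stored chunks most similar to the query embedding."""
    scored = sorted(
        rows,
        key=lambda r: cosine_similarity(query_embedding, r["embedding"]),
        reverse=True,
    )
    return [r["chunk"] for r in scored[:k]]
```

The retrieved chunks would then be concatenated into the LLM prompt alongside the question. In DuckDB the same top-k step can be expressed as an `ORDER BY ... LIMIT` query over an array column, which is what later parts of this series build on.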
