What is Retrieval-Augmented Generation (RAG)?

RAG (Retrieval-Augmented Generation) is an AI technique that retrieves relevant information from a knowledge source and uses it to ground the model's response — producing answers backed by your own documents instead of relying only on what the model memorized during training.

Retrieval-Augmented Generation, or RAG, addresses one of the biggest weaknesses of large language models: they can confidently make things up. A vanilla LLM only knows what it learned during training, which may be outdated, incomplete, or simply wrong for your specific context. RAG mitigates this by pairing the model with an external knowledge source (your documents, your database, your company wiki) and instructing it to ground its answers in the retrieved content.

The mechanics: when a user asks a question, the system first searches an index of your content (usually a vector database that stores embeddings, which are numeric representations of the meaning of each chunk of text). It retrieves the most relevant passages, attaches them to the prompt, and sends both to the language model. The model then generates a response that synthesizes information from those retrieved passages, rather than from its own potentially stale memory.
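The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and the names `retrieve` and `build_prompt` are hypothetical. The final call to a language model is left out, since it depends on the provider you use.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Attach numbered passages to the question so the model can cite them."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the passages below and cite them by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical knowledge-base content, already split into chunks.
chunks = [
    "Refunds are issued within 14 days of purchase.",
    "Support is available by email on weekdays.",
    "The office is closed on public holidays.",
]
question = "When are refunds issued after purchase?"
passages = retrieve(question, chunks, k=2)
prompt = build_prompt(question, passages)
print(prompt)  # this prompt would then be sent to the language model
```

In a real deployment the chunk vectors would be computed once at indexing time and stored, so only the question is embedded per query.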

RAG underpins many enterprise AI deployments. It enables citations (so you can verify where an answer came from), substantially reduces hallucinations, lets you update the AI's knowledge by re-indexing your documents, and does not require training the model on your private data: your content informs answers at runtime rather than changing the model's weights.

How Definable uses RAG

Definable Knowledge Bases is a full RAG layer. You connect data sources (PDFs, SQL databases, URLs, Git repos, Google Drive, Notion), Definable chunks and embeds the content, and every Assistant reply can be grounded in your knowledge base with inline citations. Knowledge bases are tenant-isolated and never used to train models, and you can bring your own embedding model.
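To make "chunking" concrete, here is a generic sketch of one common approach: fixed-size word windows with some overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk. It is an assumption for illustration only and does not describe how Definable splits content; the function name `chunk_words` and the `size` and `overlap` parameters are hypothetical.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words, each sharing `overlap` words
    with the previous window."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


# Tiny demonstration with 12 placeholder words.
text = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(text, size=5, overlap=2)
for c in chunks:
    print(c)
```

Each chunk would then be passed to an embedding model and stored in the index alongside a pointer to its source document, which is what makes citations possible later.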

Frequently asked questions

How is RAG different from fine-tuning?

Fine-tuning updates the model's weights to bake in new knowledge, which is comparatively expensive and slow, and the fine-tuned model still cannot point to the sources behind its answers. RAG leaves the model's weights unchanged and supplies fresh information at query time, which is typically cheaper and faster to update, and lets the model cite the retrieved passages each answer draws on.

Does RAG eliminate AI hallucinations?

It reduces them substantially but does not eliminate them. A well-built RAG system grounds answers in retrieved sources, so the model has less reason to invent. But if retrieval returns irrelevant passages or the model ignores the retrieved context, hallucinations can still slip through. Good RAG systems therefore require citations and verify them.
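One simple form of verification is a structural check on citations: does the answer cite anything, and does every citation point to a passage that was actually retrieved? The sketch below assumes answers cite passages with markers like `[1]`; that convention and the function name `check_citations` are hypothetical, not a standard.

```python
import re


def check_citations(answer: str, num_passages: int) -> dict:
    """Check that an answer cites at least one passage and that every
    citation number refers to a retrieved passage (1..num_passages)."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    valid = {n for n in cited if 1 <= n <= num_passages}
    return {
        "has_citation": bool(cited),
        "invalid": sorted(cited - valid),
        "ok": bool(cited) and cited == valid,
    }


# Two passages were retrieved in each of these hypothetical cases.
good = check_citations("Refunds are issued within 14 days [1].", num_passages=2)
bad = check_citations("Refunds take 30 days [3].", num_passages=2)
uncited = check_citations("Refunds take 30 days.", num_passages=2)
```

A check like this only catches missing or dangling citations. Confirming that the cited passage actually supports the claim requires a further step, such as a second model pass or human review.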

What is an embedding model?

An embedding model converts text into a vector — a list of numbers — that represents the meaning of the text. Similar meanings produce similar vectors. RAG uses embeddings to find which chunks of your documents are most relevant to a user's question.
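"Similar vectors" is usually measured with cosine similarity, which compares the direction of two vectors and ignores their length. The example below uses hand-written three-dimensional vectors purely for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and the values here are made up.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


# Hypothetical embeddings: the first two texts are about the same topic.
refund_policy = [0.9, 0.1, 0.0]
money_back = [0.8, 0.2, 0.1]
office_hours = [0.0, 0.2, 0.9]

sim_related = cosine_similarity(refund_policy, money_back)
sim_unrelated = cosine_similarity(refund_policy, office_hours)
print(f"related: {sim_related:.2f}, unrelated: {sim_unrelated:.2f}")
```

The related pair scores close to 1 and the unrelated pair close to 0, which is the signal a RAG system uses to rank chunks against a question.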
