You can use foundation models in IBM watsonx.ai to generate factually accurate output grounded in information in a knowledge base by applying the retrieval-augmented generation pattern.
This video provides a visual method to learn the concepts and tasks in this documentation.
Video chapters
[ 0:08 ] Scenario description
[ 0:27 ] Overview of pattern
[ 1:03 ] Knowledge base
[ 1:22 ] Search component
[ 1:41 ] Prompt augmented with context
[ 2:13 ] Generating output
[ 2:31 ] Full solution
[ 2:55 ] Considerations for search
[ 3:58 ] Considerations for prompt text
[ 5:01 ] Considerations for explainability
Providing context in your prompt improves accuracy
Foundation models can generate output that is factually inaccurate for a variety of reasons. One way to improve the accuracy of generated output is to provide the needed facts as context in your prompt text.
Example
The following prompt includes context to establish some facts:
Aisha recently painted the kitchen yellow, which is her favorite color.
Aisha's favorite color is
Without the context at the beginning of the prompt, no foundation model could reliably generate the correct completion of the sentence at the end of the prompt, unless Aisha is a very famous person whose favorite color has been mentioned in many online articles that are included in common pre-training data sets.
If you prompt a model with text that includes fact-filled context, then the output the model generates is more likely to be accurate. For more details, see: Generating factually accurate output
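In code, providing context amounts to prepending the facts to the text that you want the model to complete. The following minimal Python sketch assembles the example prompt above (plain string handling; no watsonx.ai API involved):

```python
# Prepend the known fact so the model can complete the sentence from context
# rather than from whatever it memorized during pre-training.
context = "Aisha recently painted the kitchen yellow, which is her favorite color."
prompt = f"{context}\n\nAisha's favorite color is"
```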
The retrieval-augmented generation pattern
You can scale out the technique of including context in your prompts by leveraging information in a knowledge base.
The retrieval-augmented generation pattern involves three basic steps, sketched in code after this list:
- Search for relevant content in your knowledge base
- Pull the most relevant content into your prompt as context
- Send the combined prompt text to the model to generate output
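The following Python sketch outlines the three steps end to end. It is a minimal outline under stated assumptions, not a complete solution: `retriever.search` and `model.generate_text` are hypothetical interfaces that stand in for whatever search tool and foundation model client you choose.

```python
def answer(question, retriever, model):
    """Apply the retrieval-augmented generation pattern: search, augment, generate."""
    # Step 1: Search for relevant content in your knowledge base.
    # (retriever.search is a hypothetical interface; substitute your search tool.)
    passages = retriever.search(question, top_k=3)

    # Step 2: Pull the most relevant content into your prompt as context.
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

    # Step 3: Send the combined prompt text to the model to generate output.
    # (model.generate_text is a hypothetical interface; substitute your model client.)
    return model.generate_text(prompt)
```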
The origin of retrieval-augmented generation
The term retrieval-augmented generation (RAG) was introduced in this paper: Retrieval-augmented generation for knowledge-intensive NLP tasks
"We build RAG models where the parametric memory is a pre-trained seq2seq transformer, and the non-parametric memory is a dense vector index of Wikipedia, accessed with a pre-trained neural retriever."
In that paper, the term "RAG models" refers to a specific implementation of a retriever (a specific query encoder and vector-based document search index) and a generator (a specific pre-trained, generative language model). However, the basic search-and-generate approach can be generalized to use different retriever components and foundation models.
Knowledge base
The knowledge base could be any collection of information-containing artifacts, such as:
- Process information in internal company wiki pages
- Files in GitHub (in any format: Markdown, plain text, JSON, code)
- Messages in a collaboration tool
- Topics in product documentation
- Text passages in a database like Db2
- A collection of legal contracts in PDF files
- Customer support tickets in a content management system
Retriever
The retriever could be any combination of search and content tools that reliably returns relevant content from the knowledge base:
- Search tools like IBM Watson Discovery
- Search and content APIs (GitHub, for example, provides both)
- Vector databases (such as chromadb; see the sketch after this list)
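As an illustration of the vector-database option, the following minimal sketch uses chromadb's Python client. The collection name and documents are invented for the example; by default, chromadb embeds the text with its built-in embedding function unless you supply your own.

```python
import chromadb

client = chromadb.Client()  # in-memory client; use PersistentClient for disk storage
collection = client.create_collection(name="knowledge_base")

# Index a few documents; chromadb computes embeddings with its default
# embedding function unless you pass one explicitly.
collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "To reset your password, open Settings > Security and click Reset.",
        "Invoices are generated on the first business day of each month.",
    ],
)

# Retrieve the passage most semantically similar to the user's question.
results = collection.query(query_texts=["How do I reset my password?"], n_results=1)
print(results["documents"][0])
```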
Generator
The generator component could use any model in watsonx.ai that suits your use case, prompt format, and the content that you pull in for context.
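As one illustration, the ibm-watsonx-ai Python SDK exposes foundation models through a ModelInference class. The sketch below is an outline under stated assumptions: the endpoint URL, model ID, API key, and project ID are placeholders, and you should confirm the class and parameter names against the current SDK documentation.

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",  # your watsonx.ai endpoint
    api_key="YOUR_API_KEY",
)

model = ModelInference(
    model_id="ibm/granite-13b-instruct-v2",  # placeholder; pick a model that suits your use case
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
)

# Send the context-augmented prompt (built in the search and augment steps) to the model.
augmented_prompt = "Context: ...\n\nQuestion: ...\nAnswer:"
print(model.generate_text(prompt=augmented_prompt))
```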
Examples
The following examples demonstrate applying the retrieval-augmented generation pattern.
| Example | Description | Link |
|---|---|---|
| Simple introduction | This sample notebook uses a small knowledge base and a simple search component to demonstrate the basic pattern. | Introduction to retrieval-augmented generation |
| Real world example | The watsonx.ai documentation has a search-and-answer feature that can answer basic what-is questions, using the topics in the documentation as a knowledge base. | Answering watsonx.ai questions using a foundation model |
Parent topic: Foundation models