The retrieval precision metric measures the proportion of relevant contexts out of the total number of contexts that are retrieved.
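In formula form, this is the standard precision ratio. The following is a sketch based on the definition above, not the exact prompt-based computation that the judge model performs:

$$
\text{retrieval precision} = \frac{\text{number of relevant retrieved contexts}}{\text{total number of retrieved contexts}}
$$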
Metric details
Retrieval precision is a retrieval quality metric for generative AI quality evaluations that measures how well a retrieval system ranks relevant contexts. Retrieval quality metrics are calculated with LLM-as-a-judge models.
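The sketch below shows one way such a score can be computed from per-context relevance judgments. It is a minimal illustration, assuming a hypothetical `judge_relevance` callable that stands in for the LLM-as-a-judge call; it is not the product's actual implementation.

```python
# Minimal sketch: compute retrieval precision from per-context relevance
# judgments. `judge_relevance` is a hypothetical stand-in for an
# LLM-as-a-judge call, not part of any specific product API.
from typing import Callable, List


def retrieval_precision(
    question: str,
    contexts: List[str],
    judge_relevance: Callable[[str, str], bool],
) -> float:
    """Return the fraction of retrieved contexts judged relevant to the question."""
    if not contexts:
        return 0.0  # convention chosen for this sketch: no contexts, no credit
    relevant = sum(1 for ctx in contexts if judge_relevance(question, ctx))
    return relevant / len(contexts)
```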
Scope
The retrieval precision metric evaluates generative AI assets only.
- Types of AI assets: Prompt templates
- Generative AI tasks: Retrieval Augmented Generation (RAG)
- Supported languages: English
Scores and values
The retrieval precision metric score indicates the extent to which the retrieved contexts are relevant to the question. Higher scores indicate that more of the retrieved contexts are relevant to the question; lower scores indicate that fewer of them are. A worked example follows the list below.
- Range of values: 0.0-1.0
- Best possible score: 1.0
- Ratios:
- At 0: None of the retrieved contexts are relevant.
- At 1: All of the retrieved contexts are relevant.
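For example, if four contexts are retrieved and the judge marks three of them relevant, the score is 3 ÷ 4 = 0.75. Using the hypothetical `retrieval_precision` sketch from Metric details:

```python
# Hypothetical usage of the retrieval_precision sketch above:
# 3 of the 4 retrieved contexts are judged relevant, so the score is 0.75.
contexts = ["ctx A", "ctx B", "ctx C", "ctx D"]
judgments = {"ctx A": True, "ctx B": True, "ctx C": False, "ctx D": True}

score = retrieval_precision("example question", contexts, lambda q, ctx: judgments[ctx])
print(score)  # 0.75
```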
Settings
- Thresholds:
- Lower bound: 0
- Upper bound: 1
Parent topic: Evaluation metrics