Average precision metric
Last updated: Feb 03, 2025

The average precision metric evaluates how highly the relevant contexts are ranked by calculating the mean of the precision scores at the rank of each relevant context.
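For illustration, the following minimal sketch shows how average precision is typically computed from a ranked list of retrieved contexts, assuming binary relevance judgments for each context (in an evaluation, those judgments come from the LLM-as-a-judge model). It is an example of the calculation, not the product implementation.

```python
def average_precision(relevance):
    """Average precision for one ranked retrieval result.

    relevance: list of 0/1 judgments ordered by the retriever's ranking,
    with the first element being the top-ranked context.
    """
    relevant_found = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            relevant_found += 1
            precision_sum += relevant_found / rank  # precision at this relevant rank
    # The score is 0.0 when none of the retrieved contexts are relevant.
    return precision_sum / relevant_found if relevant_found else 0.0


# Relevant contexts at ranks 1 and 3: (1/1 + 2/3) / 2 ≈ 0.83
print(average_precision([1, 0, 1]))
# All relevant contexts ranked at the top: (1/1 + 2/2) / 2 = 1.0
print(average_precision([1, 1, 0]))
```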

Metric details

Average precision is a retrieval quality metric for generative AI quality evaluations that measures how well a retrieval system ranks the relevant contexts. Retrieval quality metrics are calculated with LLM-as-a-judge models.

Scope

The average precision metric evaluates generative AI assets only.

  • Types of AI assets: Prompt templates
  • Generative AI tasks: Retrieval Augmented Generation (RAG)
  • Supported languages: English

Scores and values

The average precision metric score indicates how well relevant contexts are ranked. Higher scores indicate that the relevant contexts are ranked near the top of the retrieved results. Lower scores indicate that the relevant contexts are ranked lower.

  • Range of values: 0.0-1.0
  • Best possible score: 1.0
  • Ratios:
    • At 0: None of the retrieved contexts are relevant.
    • At 1: All of the relevant contexts are ranked at the top of the retrieved results.
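As a worked example, if two of four retrieved contexts are relevant and they appear at ranks 1 and 2, the precision values at those ranks are 1/1 and 2/2, so the score is 1.0. If the same two contexts appear at ranks 2 and 4 instead, the precision values are 1/2 and 2/4, and the score drops to 0.5.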

Settings

  • Thresholds:
    • Lower bound: 0
    • Upper bound: 1

Parent topic: Evaluation metrics