Answer similarity evaluation metric
Last updated: Feb 03, 2025

The answer similarity metric measures how closely generated text matches reference answers or expected outputs, which indicates the quality of your model's performance.

Metric details

Answer similarity is an answer quality metric for generative AI quality evaluations that measures how well generative AI model answers match reference answers. Answer quality metrics are calculated with LLM-as-a-judge models.
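
The following is a minimal, illustrative sketch of how an LLM-as-a-judge similarity score might be computed. The prompt wording and the call_judge callable are assumptions for illustration only, not a specific product API.

  # Illustrative sketch of LLM-as-a-judge answer similarity scoring.
  # `call_judge` is a hypothetical callable that sends a prompt to a judge model
  # and returns its raw text response.

  JUDGE_PROMPT = (
      "You are grading how similar a generated answer is to a reference answer.\n"
      "Reference answer: {reference}\n"
      "Generated answer: {generated}\n"
      "Respond with only a number between 0.0 and 1.0, where 1.0 means the "
      "generated answer fully matches the reference answer."
  )

  def score_answer_similarity(generated: str, reference: str, call_judge) -> float:
      """Ask a judge model to rate similarity and clamp the result to [0.0, 1.0]."""
      prompt = JUDGE_PROMPT.format(reference=reference, generated=generated)
      raw_response = call_judge(prompt)  # hypothetical judge-model call
      score = float(raw_response.strip())
      return max(0.0, min(1.0, score))

  # Example with a stubbed judge so the sketch runs without a real model.
  stub_judge = lambda prompt: "0.85"
  print(score_answer_similarity(
      "Paris is the capital of France.",
      "The capital of France is Paris.",
      stub_judge,
  ))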

Scope

The answer similarity metric evaluates generative AI assets only.

  • Types of AI assets: Prompt templates
  • Generative AI tasks: Retrieval Augmented Generation (RAG)
  • Supported languages: English

Scores and values

The answer similarity metric score indicates the similarity between the generated answer and the reference answer. Higher scores indicate that the generated answer is more similar to the reference answer.

  • Range of values: 0.0-1.0
  • Best possible score: 1.0

Settings

  • Thresholds:
    • Lower bound: 0
    • Upper bound: 1
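
As an illustration only (not a product API), a similarity score can be checked against the configured threshold bounds as in the sketch below; the function name and the non-default lower bound are assumptions.

  # Illustrative threshold check for an answer similarity score.
  # The default bounds mirror the documented settings (lower bound 0, upper bound 1).

  def violates_thresholds(score: float, lower_bound: float = 0.0, upper_bound: float = 1.0) -> bool:
      """Return True when the score falls outside the configured threshold range."""
      return score < lower_bound or score > upper_bound

  print(violates_thresholds(0.85, lower_bound=0.7))  # False: 0.85 meets a 0.7 lower bound
  print(violates_thresholds(0.55, lower_bound=0.7))  # True: 0.55 falls below the lower bound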

Parent topic: Evaluation metrics