Input token count evaluation metric
Last updated: Mar 05, 2025

The input token count metric calculates the total, average, minimum, maximum, and median input token count across multiple scoring requests during evaluations.
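For illustration, these summary statistics are the standard aggregations over the per-request input_token_count values. The following minimal sketch shows the calculation in Python; the list of counts is taken from the example scoring responses later in this topic:

from statistics import median

# Per-request input token counts, taken from the example scoring responses below
input_token_counts = [73, 62]

summary = {
    "total": sum(input_token_counts),
    "average": sum(input_token_counts) / len(input_token_counts),
    "minimum": min(input_token_counts),
    "maximum": max(input_token_counts),
    "median": median(input_token_counts),
}
print(summary)  # {'total': 135, 'average': 67.5, 'minimum': 62, 'maximum': 73, 'median': 67.5}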

Metric details

Input token count is a token count metric for model health monitor evaluations. It calculates the number of input tokens that are processed across scoring requests.

Scope

The input token count metric evaluates generative AI assets only.

  • Generative AI tasks:
    • Text summarization
    • Text classification
    • Content generation
    • Entity extraction
    • Question answering
    • Retrieval Augmented Generation (RAG)
  • Supported languages: English

Evaluation process

To calculate the input token count metric, include the input_token_count field in the scoring requests that you send with the Python SDK. The following example shows a scoring request and response that supply the input and output token counts:

request = {
    "fields": [
        "comment"
    ],
    "values": [
        [
            "Customer service was friendly and helpful."
        ]
    ]
}

response = {
    "fields": [
        "generated_text",
        "generated_token_count",
        "input_token_count",
        "stop_reason",
        "scoring_id",
        "response_time"
    ],
    "values": [
        [
            "1",
            2,
            73,
            "eos_token",
            "MRM_7610fb52-b11d-4e20-b1fe-f2b971cae4af-50",
            3558
        ],
        [
            "0",
            3,
            62,
            "eos_token",
            "MRM_7610fb52-b11d-4e20-b1fe-f2b971cae4af-51",
            3778
        ]
    ]
}
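After you construct the request and response, store them as a payload record with the Python SDK, as shown in the following example: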

from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord

client.data_sets.store_records(
    data_set_id=payload_data_set_id,
    request_body=[
        PayloadRecord(
            scoring_id=<uuid>,              # value to be supplied by user
            request=request,
            response=response,
            response_time=<response_time>,  # value to be supplied by user
            user_id=<user_id>               # value to be supplied by user
        )
    ]
)
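In this example, the stored input_token_count values are 73 and 62. The model health monitor aggregates these values across all stored records to produce the total, average, minimum, maximum, and median statistics.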

Parent topic: Evaluation metrics