Improving a prompt-tuning experiment
Last updated: Sep 23, 2024

Make changes to your prompt-tuning experiment to improve results.

A sample Python notebook named Use watsonx.ai to tune IBM granite-13b-instruct-v2 model with Car Rental Company customer satisfaction document is available that contains code for prompt tuning foundation models in watsonx.ai. The sample notebook has sections for optimizing the experiment parameters and for inferencing the tuned model. For more information, see Tuning a foundation model by using a Python notebook.
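For example, a tuning experiment can also be started programmatically. The following is a minimal sketch that assumes the ibm-watsonx-ai Python SDK; the credentials, project ID, data asset ID, and parameter values are placeholders, not recommended settings.

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.experiment import TuneExperiment
from ibm_watsonx_ai.helpers import DataConnection

# Placeholder credentials and project; replace with your own values.
experiment = TuneExperiment(
    credentials=Credentials(
        url="https://us-south.ml.cloud.ibm.com",
        api_key="YOUR_API_KEY",
    ),
    project_id="YOUR_PROJECT_ID",
)

# Define a prompt-tuning experiment for a classification task.
prompt_tuner = experiment.prompt_tuner(
    name="car-rental-satisfaction-tuning",
    task_id="classification",
    base_model="ibm/granite-13b-instruct-v2",
    num_epochs=20,
    learning_rate=0.3,
    batch_size=16,
)

# Run the experiment against training data that is stored as a data asset,
# and wait for it to finish.
tuning_details = prompt_tuner.run(
    training_data_references=[DataConnection(data_asset_id="YOUR_DATA_ASSET_ID")],
    background_mode=False,
)
```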

Adjusting tuning parameters

When a tuning experiment run is finished, a loss function graph is displayed. A loss function measures the difference between predicted and actual results with each training run. A successful tuning experiment results in a loss function that has a downward-sloping curve.

The point at which the loss stops dropping and levels off is called convergence. You want the curve to drop, or converge, and the tail of the curve to get as close to 0 as possible, because that means the predicted results are as similar as possible to the results from the training data.
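As an illustration, you can plot the recorded loss values and check whether the tail has leveled off near zero. The following sketch assumes that you already extracted the per-epoch loss values from the experiment results; the example values and tolerances are arbitrary.

```python
import matplotlib.pyplot as plt

# Example per-epoch loss values from a finished tuning experiment.
losses = [2.8, 1.9, 1.2, 0.7, 0.45, 0.31, 0.24, 0.21, 0.20, 0.20]

plt.plot(range(1, len(losses) + 1), losses, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("Tuning experiment loss function")
plt.show()

# The curve has converged when the last few values barely change;
# ideally the tail is also close to zero.
tail = losses[-3:]
converged = max(tail) - min(tail) < 0.05  # arbitrary flatness tolerance
near_zero = tail[-1] < 0.3                # arbitrary closeness threshold
print(f"converged: {converged}, near zero: {near_zero}")
```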

Adjust your tuning parameters if the loss curve resembles a mountain range with multiple peaks, if the loss never converges, or if the loss converges but settles at a value much higher than zero.

You can configure the parameter values in the Tuning Studio or use the sample notebook. The sample notebook has steps that help you find the best values to use for your tuning parameters, which is sometimes called hyperparameter optimization. For more information, see Using the notebook to optimize tuning parameter values.
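Conceptually, the optimization step searches over candidate parameter values and keeps the combination with the lowest final loss. The following simplified sketch illustrates that idea with a stand-in run_tuning_experiment function that only simulates a loss value; it is not the notebook's actual code.

```python
from itertools import product

# Candidate values to try; adjust the ranges for your own experiment.
learning_rates = [0.003, 0.03, 0.3]
epoch_counts = [10, 20, 50]

def run_tuning_experiment(learning_rate, num_epochs):
    """Stand-in for a real experiment run. Replace the body with a call
    to your tuning service; here it only simulates a loss surface."""
    return abs(learning_rate - 0.03) * 10 + 5.0 / num_epochs

best = None
for lr, epochs in product(learning_rates, epoch_counts):
    final_loss = run_tuning_experiment(learning_rate=lr, num_epochs=epochs)
    if best is None or final_loss < best[0]:
        best = (final_loss, lr, epochs)

print(f"Best loss {best[0]:.3f} with learning_rate={best[1]}, num_epochs={best[2]}")
```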

The following table describes common tuning experiment outcomes and lists actions that might improve them.

Table 1: Actions for addressing common tuning experiment flaws

Loss function graph: Loss curve is flat and never drops.
Cause: Tuning is not improving the results by much.
Actions to try:
• Increase the learning rate (by 10x) so that the experiment makes more drastic adjustments to the prompt vector.

Loss function graph: Loss curve drops, but the tail settles at too high a number.
Cause: Tuning is not improving the results by as much as it could.
Actions to try:
• Increase the learning rate (by 5x) so that the experiment makes bigger adjustments to the prompt vector.

Loss function graph: Loss curve drops and then decreases steadily, but never levels off.
Cause: Training ended before the model was fully tuned.
Actions to try:
• Increase the number of epochs to give the model more time to learn.

Loss function graph: Loss curve goes up and then drops, but never gets low enough.
Cause: Training is unstable because a high learning rate causes the prompt vector to change too much.
Actions to try:
• Decrease the learning rate (by 10x) to make smaller adjustments to the prompt vector.
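To make this guidance concrete, the rules in the table can be expressed as a small diagnostic function. This is an illustrative sketch only; the thresholds are arbitrary assumptions, not values that the Tuning Studio uses.

```python
def diagnose_loss_curve(losses, flat_tol=0.05, high_tail=0.5):
    """Map per-epoch loss values to a suggested parameter change.

    The flat_tol and high_tail thresholds are illustrative assumptions.
    """
    drop = losses[0] - min(losses)
    tail = losses[-3:]
    tail_is_flat = max(tail) - min(tail) < flat_tol

    if drop < flat_tol:
        return "Flat curve: increase the learning rate (by 10x)."
    if max(losses) > losses[0]:
        return "Loss rose before dropping: decrease the learning rate (by 10x)."
    if not tail_is_flat:
        return "Still decreasing at the end: increase the number of epochs."
    if tail[-1] > high_tail:
        return "Converged too high: increase the learning rate (by 5x)."
    return "Loss converged near zero: tuning looks successful."

print(diagnose_loss_curve([2.8, 1.9, 1.2, 0.7, 0.45, 0.31, 0.24, 0.21, 0.20, 0.20]))
```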

For more information about how to change tuning parameters and rerun a tuning experiment, see Tuning a foundation model.

Addressing data quality problems in tuned model output

You know that you're done tuning a model when you can submit zero-shot prompts to the tuned model and get back the outputs that you expect.
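For example, after the tuned model is deployed, you can send it a zero-shot prompt and compare the response to the outputs in your training data. The following minimal sketch assumes the ibm-watsonx-ai Python SDK and a placeholder deployment ID for the tuned model.

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Placeholder credentials and deployment ID; replace with your own values.
model = ModelInference(
    deployment_id="YOUR_TUNED_MODEL_DEPLOYMENT_ID",
    credentials=Credentials(
        url="https://us-south.ml.cloud.ibm.com",
        api_key="YOUR_API_KEY",
    ),
    project_id="YOUR_PROJECT_ID",
)

# A zero-shot prompt contains only the instruction and the input,
# with no examples.
prompt = (
    "Classify the satisfaction expressed in this comment as "
    "Satisfied or Unsatisfied.\n"
    "Comment: The agent was friendly, but the car was dirty.\n"
    "Satisfaction: "
)
print(model.generate_text(prompt=prompt))
```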

The following table describes some common training data quality issues and lists actions that you can take to address them.

Table 2: Actions for addressing training data flaws

Outcome: Tuned model outputs don't match the content and format of the output examples in the training data.
Cause: Not enough training data examples.
Actions to try:
• Increase the training data size.

Outcome: Tuned model outputs are incomplete.
Cause: The tuning process isn't using the examples that you think it's using.
Actions to try:
• Check the length of your training data input and output examples. The maximum allowed length is 256 input tokens and 128 output tokens; examples that are longer than the maximum are truncated. A way to check your data is shown in the sketch after this table.

Outcome: Missing classification labels in a classification task.
Cause: Not enough examples of each class type.
Actions to try:
• Add more examples of each class type that you want the model to recognize.

Outcome: Missing text extractions in an extraction task.
Cause: Not enough examples of each entity type.
Actions to try:
• Add more examples of each entity type that you want the model to recognize.

Outcome: Inaccurate class labels or entity type text extractions.
Cause: Insufficient context to choose the correct class or entity type.
Actions to try:
• Add an equal number of examples for each type.
• Review the classes or entities that you want the model to identify or extract to make sure that they are distinct from one another.
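Because overly long examples are truncated, it can help to check your training data against the token limits before you run an experiment. The sketch below assumes the JSONL training format with one {"input": ..., "output": ...} object per line and a placeholder file name, and it approximates token counts with a whitespace split; a real check would use the model's own tokenizer.

```python
import json

MAX_INPUT_TOKENS = 256
MAX_OUTPUT_TOKENS = 128

def approx_tokens(text):
    """Rough token estimate; use the model's tokenizer for an exact count."""
    return len(text.split())

# "train.jsonl" is a placeholder path for your training data file.
with open("train.jsonl", encoding="utf-8") as f:
    for line_no, line in enumerate(f, start=1):
        example = json.loads(line)
        if approx_tokens(example["input"]) > MAX_INPUT_TOKENS:
            print(f"line {line_no}: input may be truncated")
        if approx_tokens(example["output"]) > MAX_OUTPUT_TOKENS:
            print(f"line {line_no}: output may be truncated")
```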

Parent topic: Evaluating tuning experiment results
