Last updated: Oct 09, 2024
To determine which models might work well for your project, consider model attributes such as license, pretraining data, model size, and how the model was fine-tuned. After you have a short list of models that best fit your use case, systematically test them to see which ones consistently return the desired results.
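Systematic testing can be as simple as sending the same set of prompts to each candidate model and recording the outputs side by side for review. The sketch below uses a hypothetical `generate(model_id, prompt)` function as a stand-in for whatever inference API you call; it is not a watsonx.ai API, and the model IDs are placeholders.

```python
# Minimal sketch of a side-by-side model comparison.
# `generate` is a hypothetical placeholder for your real inference call
# (for example, a REST request to your model-hosting service).

def generate(model_id: str, prompt: str) -> str:
    # Placeholder: replace with a real inference call.
    return f"[{model_id}] response to: {prompt}"

def compare_models(model_ids, prompts):
    """Return {prompt: {model_id: output}} for manual review or scoring."""
    results = {}
    for prompt in prompts:
        results[prompt] = {m: generate(m, prompt) for m in model_ids}
    return results

if __name__ == "__main__":
    table = compare_models(
        ["model-a", "model-b"],  # placeholder model IDs
        ["Summarize: The quick brown fox jumps over the lazy dog."],
    )
    for prompt, outputs in table.items():
        for model_id, text in outputs.items():
            print(f"{model_id}: {text}")
```

Collecting outputs for identical prompts makes it easier to spot which model is consistent, not just which one produces a single good answer.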
Model attribute | Considerations |
---|---|
License | In general, each foundation model comes with a different license that limits how the model can be used. Review model licenses to make sure you'll be able to use a given model for your planned solution. All open-source models available in watsonx.ai have the Apache 2.0 license. |
Supported programming languages | Not all foundation models work well for programming use cases. If you are planning to create a solution that summarizes, generates, or otherwise processes code, review which programming languages were included in a model's pretraining data sets and fine-tuning activities to determine if that model is a fit for your use case. |
Supported natural languages | Many foundation models work well in English only. But some model creators have taken care to include multiple languages in their model's pretraining data sets, to fine-tune their model on tasks in different languages, and to test their model's performance in multiple languages. If you plan to build a solution for a global audience or a solution that performs translation tasks, look for models that were created with multilanguage support in mind. |
Model size | A larger foundation model will generally produce more successful results than a smaller model, other factors being equal. However, for a given task, a model that has been fine-tuned for that task might outperform a model of the same size or larger that has not been fine-tuned for that task. |
Fine-tuning | After being pretrained, many foundation models are fine-tuned for specific tasks, such as classification, information extraction, summarization, responding to instructions, answering questions, or participating in a back-and-forth dialog chat. A model that has been fine-tuned on tasks similar to your planned use will perform better with zero-shot prompts than a model that has not been fine-tuned in a way that fits your use case. One way to improve results for a fine-tuned model is to structure your prompt in the same format as the prompts in the data sets that were used to fine-tune that model. |
Cost | Depending on your use case and the results you need, a model that is less expensive to use might get the job done as well as a more expensive model, if you spend some time experimenting with your prompt. |
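To illustrate the point about matching the fine-tuning prompt format: many instruction-tuned models were trained on prompts that follow a fixed template, so wrapping your input in that template tends to improve zero-shot results. The template below is a generic illustration only, not the format of any specific model; check the model card or documentation for the exact template a given model expects.

```python
# Generic illustration of wrapping a request in an instruction-style
# template similar to those used in instruction fine-tuning data sets.
# The exact template varies per model; consult the model card.

INSTRUCTION_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input.\n"
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format a request the way an instruction-tuned model might expect."""
    return INSTRUCTION_TEMPLATE.format(instruction=instruction, input=input_text)

if __name__ == "__main__":
    prompt = build_prompt(
        "Classify the sentiment of the review as positive or negative.",
        "The battery life on this laptop is outstanding.",
    )
    print(prompt)
```

Keeping the template in one place also makes it easy to swap in a different model's expected format without rewriting your application prompts.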
Parent topic: Models available with watsonx.ai