Deploying custom foundation models
You can upload and deploy a custom foundation model for use with watsonx.ai inferencing capabilities.
In addition to working with foundation models that are curated by IBM, you can now deploy your own foundation models. After the models are deployed, create prompts that inference the custom models from the Prompt Lab.
Deploying a custom foundation model provides the flexibility for you to implement the AI solutions that are right for your use case.
If you are using a model from a third-party provider, it is best to get the model directly from the model builder. One place to find new models is Hugging Face, a repository for open source foundation models used by many model builders.
Importing custom foundation models to a deployment space
The process for deploying a foundation model and making it available for inferencing includes tasks that are performed by a ModelOps engineer and a Prompt engineer.
The ModelOps engineer must first upload the model to cloud storage (either internal or external). To deploy the custom foundation model, the ModelOps engineer then creates a foundation model asset in a project or deployment space, or promotes an existing asset to the deployment space.
After the model is deployed to production, the Prompt engineer can prompt the custom foundation model from the Prompt Lab or by using the watsonx.ai API.
The following graphic represents a flow of tasks that are typically performed by a ModelOps engineer and a Prompt engineer:
Preparing the model
To prepare the model, the ModelOps engineer obtains the model files, for example from the model builder or from Hugging Face, and uploads them to cloud storage (either internal or external).
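How you upload the files depends on your storage setup. As one illustration, the following sketch copies a local model directory to an S3-compatible bucket with boto3 and HMAC credentials (IBM Cloud Object Storage exposes an S3-compatible API). The endpoint URL, bucket name, directory name, and credential variables are placeholder assumptions, not values from this documentation.

```python
# Sketch: upload a local model directory to an S3-compatible bucket
# (for example, IBM Cloud Object Storage with HMAC credentials).
# Endpoint, credentials, bucket, and directory names are placeholders.
import os
import boto3

cos = boto3.client(
    "s3",
    endpoint_url="https://s3.us-south.cloud-object-storage.appdomain.cloud",  # placeholder endpoint
    aws_access_key_id=os.environ["COS_HMAC_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["COS_HMAC_SECRET_ACCESS_KEY"],
)

model_dir = "my-custom-model"   # local directory that holds the model files
bucket = "my-custom-models"     # placeholder bucket name

# Upload every file in the model directory, preserving relative paths so that
# the deployment can later point at a single folder in the bucket.
for root, _, files in os.walk(model_dir):
    for name in files:
        local_path = os.path.join(root, name)
        object_key = os.path.relpath(local_path, start=os.path.dirname(model_dir))
        cos.upload_file(local_path, bucket, object_key)
        print(f"Uploaded {local_path} -> s3://{bucket}/{object_key}")
```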
Deploying a custom foundation model
After preparing the model, the ModelOps engineer creates or promotes the foundation model asset in a deployment space and then deploys the model.
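These steps can also be done programmatically. The following sketch calls the watsonx.ai REST API with the requests library to register a custom foundation model asset and create an online deployment. The endpoint paths, API version, payload field names, asset type, software specification, and hardware specification shown here are assumptions for illustration only; verify the exact contract in the current watsonx.ai API reference.

```python
# Sketch: register a custom foundation model asset and deploy it online.
# Host, API version, IDs, type, and spec values are assumptions -- check the
# watsonx.ai API reference before using them.
import os
import requests

WML_URL = "https://us-south.ml.cloud.ibm.com"        # placeholder region endpoint
TOKEN = os.environ["WATSONX_IAM_TOKEN"]               # bearer token from IAM
SPACE_ID = os.environ["WATSONX_SPACE_ID"]
HEADERS = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}
PARAMS = {"version": "2024-05-01", "space_id": SPACE_ID}  # placeholder API version

# 1. Create the foundation model asset that points at the files in cloud storage.
model_payload = {
    "name": "my-custom-model",
    "type": "custom_foundation_model_1.0",                # assumed asset type
    "software_spec": {"name": "watsonx-cfm-caikit-1.0"},  # assumed software spec
    "model_location": {                                   # assumed field names
        "connection_id": os.environ["COS_CONNECTION_ID"],
        "bucket": "my-custom-models",
        "file_path": "my-custom-model",
    },
}
model = requests.post(f"{WML_URL}/ml/v4/models", headers=HEADERS,
                      params=PARAMS, json=model_payload)
model.raise_for_status()
model_id = model.json()["metadata"]["id"]

# 2. Deploy the asset as an online deployment on a GPU hardware specification.
deploy_payload = {
    "name": "my-custom-model-deployment",
    "asset": {"id": model_id},
    "online": {},
    "hardware_spec": {"name": "WX-S"},                    # assumed hardware spec name
}
deployment = requests.post(f"{WML_URL}/ml/v4/deployments", headers=HEADERS,
                           params=PARAMS, json=deploy_payload)
deployment.raise_for_status()
print("Deployment ID:", deployment.json()["metadata"]["id"])
```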
Prompting the custom foundation model
After the model is deployed, the Prompt engineer can start prompting the custom foundation model from the Prompt Lab or by using the watsonx.ai API. See Using the custom foundation model for generating prompt output.
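As a programmatic illustration, the following sketch sends a prompt to the deployment's text-generation REST endpoint with the requests library. The endpoint path, API version, and parameter names are assumptions; confirm them in the watsonx.ai API reference.

```python
# Sketch: send a prompt to the deployed custom foundation model.
# The endpoint path, API version, and parameter names are assumptions.
import os
import requests

WML_URL = "https://us-south.ml.cloud.ibm.com"     # placeholder region endpoint
TOKEN = os.environ["WATSONX_IAM_TOKEN"]
DEPLOYMENT_ID = os.environ["WATSONX_DEPLOYMENT_ID"]

response = requests.post(
    f"{WML_URL}/ml/v1/deployments/{DEPLOYMENT_ID}/text/generation",
    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
    params={"version": "2024-05-01"},             # placeholder API version
    json={
        "input": "Summarize the key benefits of deploying a custom foundation model.",
        "parameters": {"max_new_tokens": 200, "temperature": 0.7},
    },
)
response.raise_for_status()
print(response.json()["results"][0]["generated_text"])
```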
Next steps
Learn more
- Developing generative AI solutions with foundation models (watsonx.ai)
- Billing rates for custom foundation models
Parent topic: Deploying foundation model assets