Scaling a deployment

When you create an online deployment for a model, function, or Shiny app, either from a deployment space or programmatically, a single copy of the asset is deployed by default. To improve scalability and availability, you can increase the number of copies (replicas) by editing the deployment configuration. Additional replicas allow the deployment to handle a larger volume of scoring requests.

Deployments can be scaled in the following ways:

  • From a deployment space, by updating the configuration of the deployment.
  • Programmatically, by using the Watson Machine Learning Python client library or the Watson Machine Learning REST APIs.

Increase the number of replicas of an online deployment from a space

  1. Open the Deployments tab of your deployment space.
  2. Click Edit configuration on the action menu for the deployment name.
  3. Change the number of replicas and save your change.

Tip: You can also update the number of replicas from the information sheet for the deployment.

  1. Click the deployment name to view the details.
  2. Click the Edit icon in the information sheet to edit the number of replicas.



Increase the number of replicas of a deployment programmatically

For a working sample that scales a deployment programmatically, see the notebook example.

Python example

This example uses the Python client to set the number of replicas to 3.

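# Assumes `client` is an authenticated Watson Machine Learning Python client.
deployment_id = "<deployment_id>"  # replace with the ID of your deployment

change_meta = {
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {
        "name": "S",     # required by the API, disregarded for online deployments
        "num_nodes": 3,  # number of replicas
    }
}

client.deployments.update(deployment_id, change_meta)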

Note that the HARDWARE_SPEC value includes a name because the API requires either a name or an id. For online deployments, however, this value is disregarded; only num_nodes affects scaling.
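
The update call above can be wrapped in a small helper that validates the replica count before sending it. This is a minimal sketch, not part of the client library: the function name scale_deployment is hypothetical, and the stand-in client (including the assumption that HARDWARE_SPEC resolves to the string "hardware_spec") exists only so that the sketch can be run without a live service.

```python
from types import SimpleNamespace


def scale_deployment(client, deployment_id, num_replicas, spec_name="S"):
    """Hypothetical helper: set the number of replicas of an online deployment."""
    if not isinstance(num_replicas, int) or num_replicas < 1:
        raise ValueError("num_replicas must be a positive integer")
    change_meta = {
        client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {
            "name": spec_name,          # required by the API, disregarded for online deployments
            "num_nodes": num_replicas,  # the value that actually controls scaling
        }
    }
    return client.deployments.update(deployment_id, change_meta)


# Stand-in client for a dry run; replace it with your authenticated client.
# The "hardware_spec" key is an assumption made for this stub only.
calls = []
stub_client = SimpleNamespace(
    deployments=SimpleNamespace(
        ConfigurationMetaNames=SimpleNamespace(HARDWARE_SPEC="hardware_spec"),
        update=lambda dep_id, meta: calls.append((dep_id, meta)) or meta,
    )
)

result = scale_deployment(stub_client, "my-deployment-id", 3)
print(result)
```

With a real client, pass the authenticated client object and the actual deployment ID instead of the stub values.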

REST API example

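This example uses the REST API to set the number of replicas to 2.

curl -k -X PATCH \
  -H "Content-Type: application/json" \
  -d '[ { "op": "replace", "path": "/hardware_spec", "value": { "name": "S", "num_nodes": 2 } } ]' \
  "<Deployment endpoint URL>"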

You must specify a name for the hardware_spec value, but the argument is not applied for scaling.
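
The same PATCH request can be prepared from Python with only the standard library. This is a sketch under stated assumptions: the helper name build_scale_request, the example URL, and the bearer-token Authorization header are illustrative and not taken from the API documentation, so adjust them to the endpoint and authentication scheme of your environment. The request is built but not sent.

```python
import json
import urllib.request


def build_scale_request(endpoint_url, num_replicas, token, spec_name="S"):
    """Hypothetical helper: build (but do not send) the PATCH request that scales a deployment."""
    patch = [
        {
            "op": "replace",
            "path": "/hardware_spec",
            # A name is required, but it is not applied for scaling.
            "value": {"name": spec_name, "num_nodes": num_replicas},
        }
    ]
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(patch).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed authentication scheme
        },
        method="PATCH",
    )


# Placeholder URL and token; substitute your deployment endpoint URL and credentials.
request = build_scale_request("https://example.com/deployments/my-deployment-id", 2, "my-token")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To send the request, pass it to urllib.request.urlopen once the URL and credentials point at a real deployment.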