Managing the watsonx.ai Runtime service endpoint

Last updated: Mar 27, 2025

You can use IBM Cloud connectivity options to access cloud services securely through service endpoints. When you provision a watsonx.ai Runtime service instance, you can choose whether to access your service through the public internet, which is the default setting, or over the IBM Cloud private network.

How you access service endpoints depends on the Cloud platform you are using.

Accessing endpoints on IBM Cloud

You can use the Service provisioning page to choose a default endpoint. The options, public network and private network, are described in the following sections.

For more information, refer to IBM Cloud service endpoints.

Public network

You can use public network endpoints to connect to your watsonx.ai Runtime service instance on the public network. Your environment must have internet access to connect.

Private network

You can use private network endpoints to connect to your IBM watsonx.ai Runtime service instance over the IBM Cloud private network. After you configure your watsonx.ai Runtime service to use private endpoints, the service is not accessible from the public internet.

Private URLs for watsonx.ai Runtime

Private URLs for watsonx.ai Runtime for each region are as follows:

Using IBM Cloud service to enable private endpoints

Follow these steps to enable private network endpoints on your clusters:

  1. Use IBM Cloud CLI to enable your account to use IBM Cloud service endpoints.
  2. Provision a watsonx.ai Runtime service instance with private endpoints.

Provisioning with service endpoints (Dallas, Frankfurt, Tokyo, London)

You can provision a watsonx.ai Runtime service instance with a service endpoint by using the IBM Cloud UI or the IBM Cloud CLI.

Provisioning a service endpoint with IBM Cloud UI

To configure the endpoints of your IBM watsonx.ai Runtime service instance, use the Endpoints field on the IBM Cloud catalog page. You can configure a public, private, or mixed network.

Configure endpoint from the service catalog

Provisioning a service endpoint with IBM Cloud CLI

If you provision an IBM watsonx.ai Runtime service instance by using the IBM Cloud CLI, use the --service-endpoints command-line option to configure the watsonx.ai Runtime endpoints. You can specify the value public (the default value), private, or public-and-private:

ibmcloud resource service-instance-create <service_instance_name> pm-20 <plan_name> <region> --service-endpoints <public|private|public-and-private>

For example:

ibmcloud resource service-instance-create wml-instance pm-20 standard us-south --service-endpoints private

or

ibmcloud resource service-instance-create wml-instance pm-20 standard us-south --service-endpoints public-and-private
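
When provisioning is scripted, it can help to validate the endpoint type before the CLI is called. The following is a minimal sketch that only assembles the command shown above; the helper name build_create_command is hypothetical, and the sketch does not run the CLI itself.

```python
import shlex

# Endpoint types named in the text above.
VALID_ENDPOINTS = ("public", "private", "public-and-private")


def build_create_command(name, plan, region, endpoints="public"):
    """Return the argument list for provisioning a pm-20 service instance.

    Hypothetical helper: it only builds the command, it does not execute it.
    """
    if endpoints not in VALID_ENDPOINTS:
        raise ValueError(f"unsupported service endpoint type: {endpoints!r}")
    return [
        "ibmcloud", "resource", "service-instance-create",
        name, "pm-20", plan, region,
        "--service-endpoints", endpoints,
    ]


cmd = build_create_command("wml-instance", "standard", "us-south", "private")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a machine where the IBM Cloud CLI is installed and logged in.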

Provisioning a service endpoint (Sydney and Toronto)

To provision a service endpoint for a watsonx.ai Runtime instance in either the Sydney or Toronto region, you must request access to a Private Catalog. After the request is approved, you can share the endpoint as a Virtual Private Endpoint.

Requesting access to a private catalog

To request access to a Private Catalog, follow these steps:

  1. Use IBM Cloud CLI to enable your account to use IBM Cloud service endpoints.
  2. Contact IBM Support and submit a request, asking the watsonx.ai Runtime team to provide you with access to a Private Catalog. You must supply your IBM Cloud account ID with the request.
  3. When the watsonx.ai Runtime team provides access to the Private Catalog for your account ID, you can view the completed request and catalog details from Manage > Catalogs > Share requests in the IBM Cloud console. You can then create a virtual private endpoint gateway.
  4. Select Virtual Private Endpoint as the catalog type. For example: Sharing a request from the IBM Cloud console
  5. Follow the steps to create a virtual private endpoint gateway for VPC. Use the following Private Catalog display names for the Sydney and Toronto data centers:
    • Sydney: mcsp-wml-sydprod
    • Toronto: mcsp-wml-torprod

Reviewing a share request

To review the share request from the IBM Cloud CLI, use the following command:

ibmcloud catalog account get-approval-list-source --object-kind vpe --approval-state pending

{
  "first": "/api/v1-beta/shareapproval/vpe/access/source/pending?limit=100",
  "limit": 100,
  "resource_count": 1,
  "resources": [
    {
      "_id": "-acct-fc3acf288b1b451e8cb981b2b9423b14:apr-acct:ba083c5877a64197a36b55d259812dfa:vpe:account",
      "_rev": "1-6703f335f8ca2330aa22a7e542700d58",
      "account": "fc3acf288b1b451e8cb981b2b9423b14",
      "account_type": 3,
      "approval_state": "pending",
      "created": "2025-02-26T01:15:21.513749288-05:00",
      "id": "-acct-fc3acf288b1b451e8cb981b2b9423b14",
      "target_account": "ba083c5877a64197a36b55d259812dfa",
      "target_kind": "vpe"
    }
  ]
}
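
If there are many share requests, the JSON output can be summarized with a short script. The following sketch assumes that the command output has been captured as a JSON string with the structure shown in the sample; the function name list_share_requests is hypothetical.

```python
import json


def list_share_requests(raw_output):
    """Return one (account, target_account, approval_state) tuple per resource."""
    data = json.loads(raw_output)
    return [
        (item.get("account"), item.get("target_account"), item.get("approval_state"))
        for item in data.get("resources", [])
    ]


# Trimmed copy of the sample output from this section.
sample_output = """
{
  "limit": 100,
  "resource_count": 1,
  "resources": [
    {
      "account": "fc3acf288b1b451e8cb981b2b9423b14",
      "approval_state": "pending",
      "target_account": "ba083c5877a64197a36b55d259812dfa",
      "target_kind": "vpe"
    }
  ]
}
"""

for account, target, state in list_share_requests(sample_output):
    print(f"{account} -> {target}: {state}")
```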

Approving a share request

You can approve share requests from the IBM Cloud UI or the IBM Cloud CLI. If the option to approve share requests is not available in the UI, use the IBM Cloud CLI to approve the request.

To approve a share request from the IBM Cloud CLI, use the following command:

ibmcloud catalog account set-approval-state-source --object-kind vpe --approval-state approved --account-ids "<account ID>"

Verifying approval

To verify that the share request was approved, use the following command:

ibmcloud catalog account get-approval-list-source --object-kind vpe --approval-state approved

{
  "first": "/api/v1-beta/shareapproval/vpe/access/source/approved?limit=100",
  "limit": 100,
  "resource_count": 1,
  "resources": [
    {
      "_id": "<account id>",
      "_rev": "2-93907d1b7d449c1a82914dfde604f316",
      "account": "fc3acf288b1b451e8cb981b2b9423b14",
      "account_type": 3,
      "created": "2025-02-26T01:15:21.513749288-05:00",
      "id": "<account id>",
      "target_account": "ba083c5877a64197a36b55d259812dfa",
      "target_kind": "vpe"
    }
  ]
}

This command returns a list of approved requests, including the account ID and target account ID.
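
The verification can also be automated. The following sketch checks whether a given account ID appears in the approved list. It assumes that the output has been captured as a JSON string and that the "account" field of each resource identifies the account whose request was approved, which is inferred from the sample output rather than confirmed; the function name is_account_approved is hypothetical.

```python
import json


def is_account_approved(raw_output, account_id):
    """Return True if account_id matches the "account" field of any resource."""
    data = json.loads(raw_output)
    return any(
        item.get("account") == account_id
        for item in data.get("resources", [])
    )


# Trimmed copy of the sample output from this section.
approved_output = """
{
  "resource_count": 1,
  "resources": [
    {
      "account": "fc3acf288b1b451e8cb981b2b9423b14",
      "target_account": "ba083c5877a64197a36b55d259812dfa",
      "target_kind": "vpe"
    }
  ]
}
"""

print(is_account_approved(approved_output, "fc3acf288b1b451e8cb981b2b9423b14"))
```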

Accessing private endpoints from AWS

To access private endpoints from AWS, you use a Service Distributed Network Load Balancer (SDNLB) to set up and manage a Virtual Private Endpoint (VPE). After creating a service ID and specifying an allow list, you create a catalog on the IBM Cloud console for listing and sharing private endpoints.

  1. Follow the steps in Creating and working with service IDs to create a service ID for your SDNLB.

  2. Create an allow list for the service ID by providing the following information to the ibmcloud-service-dnlb channel:

    Service name: MCSP-Production
    Service contact name and email: (Name, email@*.ibm.com)
    Service account name: CloudRock Production's Account (the account where this service is needed)
    Service account ID: xxx (Account ID)
    Service ID: sdnlb-mcsp-prod (ServiceId-xxxx)
    Env: production (cloud.ibm.com) or (test.cloud.ibm.com)
    
  3. Create an API key for the service ID and grant access, as follows:

        Service                           Resources                   Roles
        VPC Infrastructure Services       All                         Editor
        Resource group only               Default resource group      Viewer
        All Account Management services   All                         Viewer
        Resource group only               CloudRock resource group    Viewer
        Catalog Management                All                         Viewer, Editor, Publisher
    
  4. Create a private catalog object, as follows:

    {
     "dns_domain": "private.au-syd.ml.cloud.ibm.com",
     "endpoint_type": "vpe",
     "fully_qualified_domain_names": [
         "private.au-syd.ml.cloud.ibm.com"
     ],
     "service_crn": "crn:v1:bluemix:public:mcsp-wml-sydprod:au-syd:::endpoint:private.au-syd.ml.cloud.ibm.com"
    }
    
  5. Create the cluster secret for the SDNLB. The secret must be named sdnlb-config, reside in the kube-system namespace, and be created from a file named sdnlb.toml. Create ./sdnlb.toml so that it contains the following information:

    account_id = "user@xx.xxx.com"
    service_id = "service-account-ID"
    service_apikey = "service-account-apiKey"

    Then create the secret from the file:

    kubectl create secret generic -n kube-system sdnlb-config --from-file ./sdnlb.toml
    
  6. Create the load balancer for each specific deployment, following the steps in Example 1: Single port SDNLB. For example, this configuration file shows a load balancer set up to access a server in a Sydney data center.

    # Create the SDNLB service
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: router
        type: sdnlb
      name: private-ingress-sdnlb-mcspsydprod
      namespace: openshift-ingress
      annotations:
        service.kubernetes.io/ibm-load-balancer-cloud-provider-enable-features: "service-dnlb"
        service.kubernetes.io/ibm-load-balancer-cloud-provider-ip-type: "private"
        service.kubernetes.io/ibm-load-balancer-cloud-provider-vpc-service-crn: "crn:v1:bluemix:public:mcsp-wml-sydprod:au-syd:::endpoint:private.au-syd.ml.cloud.ibm.com"
    spec:
      type: LoadBalancer
      selector:
        ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
      ports:
        - name: http
          port: 80
        - name: https
          port: 443
    
  7. Use this command to check if the NodePort for the service that you created in the previous step is available.

    kubectl describe service private-ingress-sdnlb-mcspsydprod -n openshift-ingress
    

    If the NodePort is not available, use this command to list the service with its port mappings:

    kubectl get service private-ingress-sdnlb-mcspsydprod -n openshift-ingress -o wide
    
  8. Create the service for a private endpoint. For example:

    apiVersion: v1
    kind: Service
    metadata:
      name: wmlproxyserviceprivateonly
      namespace: swml-prod-default
    spec:
      ports:
        - port: 443
          targetPort: 8443
      selector:
        app: wmlproxyprod
      type: ClusterIP
    
  9. Create a route based on the wmlproxyserviceprivateonly service for the private endpoint. For example:

    oc create route passthrough --service wmlproxyserviceprivateonly --hostname private.au-syd.ml.cloud.ibm.com -n swml-prod-default
    
    
  10. Create a VPE Gateway in the watsonx.ai Runtime cluster resource group. For example:

    VPE Gateway example

  11. After you create the VPE, confirm that the private endpoints are working from the cluster. Open a pod and access the private endpoint, as follows:

    [wmluser@wmlproxyprod-85bd985475-g564h /]$ curl -k https://private.au-syd.ml.cloud.ibm.com/heartbeat
    {"message": "Proxy is running!"}
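
The heartbeat check in the last step can also be scripted. The following is a minimal sketch, assuming it runs from a pod that can reach the private endpoint and that the response body matches the sample shown in that step. The function names are hypothetical, and certificate verification is turned off only to mirror the -k option of the curl command.

```python
import json
import ssl
import urllib.request

# URL taken from the curl example in the last step.
HEARTBEAT_URL = "https://private.au-syd.ml.cloud.ibm.com/heartbeat"


def proxy_is_running(body):
    """Return True if the heartbeat body reports a running proxy."""
    try:
        message = json.loads(body).get("message", "")
    except (ValueError, TypeError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False
    return message == "Proxy is running!"


def fetch_heartbeat(url=HEARTBEAT_URL, timeout=10):
    """Fetch the heartbeat body without certificate checks (like curl -k)."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
        return resp.read().decode("utf-8")


# Usage from a pod inside the cluster:
#     print(proxy_is_running(fetch_heartbeat()))
```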
    
