High availability and disaster recovery
IBM Watson OpenScale is highly available within multiple IBM Cloud locations, such as Dallas and Washington, DC. However, recovering from potential disasters that affect an entire location requires planning and preparation.
You are responsible for understanding your configuration, customization, and usage of the service. You are also responsible for being ready to re-create an instance of the service in a new location and to restore your data in any location. See How do I ensure zero downtime? for more information.
High Availability
Watson OpenScale is deployed in the us-south multizone region (MZR) across three availability zones. If one zone becomes unavailable, the service remains available in the other zones. The global load balancer and DNS server route traffic to the available zones without interrupting users.
Data that is stored in PostgreSQL databases is also highly available and exists in multiple availability zones. However, it is the customer’s responsibility to back up data in support of a disaster recovery plan so that services can be re-created.
Model evaluation traffic is balanced across multiple zones in a region. Each zone is a data center in the same region.
Compose databases, such as PostgreSQL and etcd (distributed etc directory) databases, are backed up periodically to ensure high availability. If disaster strikes, the operations team can recover service within the Recovery Point Objective (RPO).
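As a rough illustration of what an RPO means in practice, the following Python sketch checks whether the most recent backup is recent enough to satisfy a target RPO. The function name, timestamps, and RPO values are hypothetical and chosen only for illustration; they are not published service figures.

```python
from datetime import datetime, timedelta, timezone


def within_rpo(last_backup: datetime, rpo: timedelta, now: datetime) -> bool:
    """Return True if a failure at `now` would lose no more than `rpo` of data."""
    return now - last_backup <= rpo


# Hypothetical values for illustration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_backup = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)

print(within_rpo(last_backup, timedelta(hours=24), now))  # True
print(within_rpo(last_backup, timedelta(hours=4), now))   # False
```

A check like this can run on a schedule so that a missed backup is noticed before it matters.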
IBM Cloud offers in-region data redundancy that enables high availability protection. IBM provides automatic data replication for client databases that contain training or custom model data at no additional cost. Data is replicated across in-region availability zones within IBM Cloud data centers.
Backup and Restore
Clients are responsible for backing up and restoring their own data, including training or custom model data and any client-generated custom models. For client backup and restore instructions, see the IBM Cloud documentation.
Disaster Recovery
In-region business continuity is provided by automatic replication across in-region availability zones within IBM Cloud data centers. Clients are responsible for multi-region disaster recovery. This responsibility includes backing up, restoring, and syncing their own security policies, training and custom model data, and any client-generated custom models. In addition, the client is responsible for routing and balancing traffic across the regions. For client backup and restore instructions, see the IBM Cloud documentation.
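Client-side routing across regions can be approached in many ways. The following Python sketch shows one simple pattern, assuming a priority-ordered list of regions and a caller-supplied health check. The `pick_region` helper, the region list, and the stubbed health status are hypothetical and are not part of any Watson OpenScale API.

```python
from typing import Callable, Optional, Sequence


def pick_region(
    regions: Sequence[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first region, in priority order, whose health check passes."""
    for region in regions:
        if is_healthy(region):
            return region
    return None  # No region is currently healthy.


# Hypothetical priority list and stubbed health status; a real check would
# probe the service instance that the client re-created in each region.
REGIONS = ["us-south", "us-east"]
status = {"us-south": False, "us-east": True}

print(pick_region(REGIONS, lambda r: status[r]))  # us-east
```

Keeping the health check as a parameter lets the same selection logic be tested with stubs and reused with whatever probe fits the client's deployment.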