High availability and disaster recovery in Watson Query
Watson Query plans are backed by a 99.99% availability service level agreement (SLA).
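To put the 99.99% figure in perspective, the following minimal sketch converts an availability percentage into a downtime budget. The function name `downtime_budget_minutes` and the 30-day and 365-day periods are assumptions chosen for illustration; the measurement period and exclusions that actually apply are defined by the SLA itself.

```python
from fractions import Fraction


def downtime_budget_minutes(availability_percent, period_days):
    """Return the minutes of downtime allowed in a period of period_days.

    availability_percent is passed as a string (for example "99.99") so that
    it can be converted exactly, without binary floating-point error.
    """
    unavailable_fraction = 1 - Fraction(availability_percent) / 100
    minutes_in_period = period_days * 24 * 60
    return float(unavailable_fraction * minutes_in_period)


# Illustrative periods only; they are not taken from the SLA text.
print(downtime_budget_minutes("99.99", 30))   # 4.32
print(downtime_budget_minutes("99.99", 365))  # 52.56
```

Under these assumptions, 99.99% availability corresponds to roughly 4 minutes of downtime in a 30-day month, or just under an hour in a year.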
On the Watson Query Enterprise plan, high availability and disaster recovery (HADR) is provided through HADR replication. For further redundancy, backups of the service are taken every 24 hours, replicated across regions by default, and kept for a minimum of 14 days.
- Each HADR system consists of three nodes that are located in different independent availability zones within the same region.
- Watson Query supports the Dallas, Frankfurt, London, and Tokyo regions.
- The primary node processes read and write transactions. One of the standby nodes is replicated synchronously, so each transaction is committed on at least two nodes before it is reported as successful, which provides a recovery point objective (RPO) of 0. This standby node is ready to take over write processing if any failure or maintenance event occurs. The other standby node is replicated asynchronously and assumes the role of the synchronous node during a failure or maintenance event. Even if you experience an entire data center failure or maintenance event, you still have an HA system that is replicated between the surviving data centers.
- During failover events, you can expect a window of 10-20 seconds during which transactions are restricted. Your client can fail over seamlessly by using automatic client reroute (ACR) together with appropriate retry logic for any failed transactions. It can take up to 5 minutes for all connections to be successfully reestablished and processing.
- The failover is managed for you by IBM®. IBM monitors the health of your server and fails over and fails back as needed, including for rolling updates and scaling, to keep uptime as high as possible.
- Backups are used by the service for a restore only if a complete regional or service instance loss occurs without possibility of recovery. Backups are managed and restored by IBM personnel if needed. When you experience a complete loss of a service instance, you might be asked to provision a new service instance before the data is restored.
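The retry logic that is recommended alongside ACR for failover events might look like the following minimal sketch. It is not part of any Watson Query or driver API: the helper `run_with_retry`, its parameters, and the use of `ConnectionError` as the retryable error are all assumptions for illustration. In a real client, you would list the exception types that your database driver raises when a connection is lost, and ACR itself is configured in the driver rather than in this code.

```python
import time


def run_with_retry(operation, max_attempts=6, base_delay=2.0, max_delay=30.0,
                   retryable=(ConnectionError,), sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Only exceptions listed in `retryable` trigger a retry; anything else
    propagates immediately. The delay doubles after each failed attempt
    and is capped at max_delay.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Hypothetical stand-in for a transaction that fails twice during a failover.
calls = {"count": 0}


def flaky_transaction():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("connection lost during failover")
    return "committed"


# Record the delays instead of sleeping so the example runs instantly.
waits = []
result = run_with_retry(flaky_transaction, sleep=waits.append)
print(result, waits)  # committed [2.0, 4.0]
```

With the default values in this sketch, the waits between the six attempts add up to 60 seconds, which is longer than the 10-20 second window described above. Because connections can take up to 5 minutes to be fully reestablished, you might choose a larger `max_attempts` or `max_delay` for long-running clients. The operation passed in should be a whole transaction that is safe to run again from the start.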