High availability deployment
- Updated: 2022/08/10
To support Automation 360 in your data center, configure a high availability (HA) cluster. Follow your company's methods and procedures for implementing your data center cluster.
- Why only an odd number of nodes is supported
- A cluster comprising an even number of nodes can result in a split-brain condition, where the cluster has no majority and cannot resolve transactions, which might lead to data inconsistencies. For example, a network partition can split a four-node cluster into two halves of two nodes each, leaving neither half with a majority. The split-brain condition is a known limitation of clustering systems and can be caused by network issues, including latency.
Deployment configurations with an odd number of nodes help avoid split-brain issues and are recommended for Automation 360 deployments.
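The following minimal sketch (illustrative Python, not part of Automation 360) shows the arithmetic behind the four-node example: a 2/2 partition leaves neither side with a strict majority, so neither side can confirm transactions.

```python
# Sketch: why an even number of nodes risks split-brain. A network partition
# can split a four-node cluster into two halves of two nodes each; a strict
# majority requires 3 votes, so neither half can confirm transactions.

cluster_size = 4
majority = cluster_size // 2 + 1              # 3 votes needed
side_a, side_b = 2, 2                         # nodes on each side of the partition

print(f"majority needed: {majority}")               # 3
print(f"side A can proceed: {side_a >= majority}")  # False
print(f"side B can proceed: {side_b >= majority}")  # False
```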
- Quorum
- The nodes vote on each transaction to determine whether it can be processed. The number of votes constituting a majority of the nodes in the cluster is referred to as the quorum and determines how many nodes must vote for, or confirm, a transaction before it can be processed.
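As an illustration (not part of the product), the quorum for a cluster of n nodes is the smallest strict majority, floor(n/2) + 1:

```python
# Sketch: the quorum (smallest strict majority) for a cluster of n nodes.

def quorum(n_nodes: int) -> int:
    """Smallest number of votes constituting a strict majority."""
    return n_nodes // 2 + 1

for n in (3, 5, 7):
    print(f"{n}-node cluster: quorum = {quorum(n)}")
# 3-node cluster: quorum = 2
# 5-node cluster: quorum = 3
# 7-node cluster: quorum = 4
```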
- Fault Tolerance
- Fault tolerance, in terms of node failure, is determined by how many nodes can fail before a quorum (majority) of nodes is no longer available to vote on the validity of a transaction. Fault tolerance is optimized with an odd number of nodes in the cluster: adding a node to make the count even raises the quorum without increasing the number of failures the cluster can survive. For example, both a three-node and a four-node cluster tolerate only one node failure.
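To make this concrete, the following sketch (illustrative only) computes the quorum and the number of tolerated failures for cluster sizes three through seven:

```python
# Sketch: fault tolerance is the number of nodes that can fail while a
# strict majority of the original cluster is still available to vote.

def quorum(n_nodes: int) -> int:
    return n_nodes // 2 + 1

def fault_tolerance(n_nodes: int) -> int:
    return n_nodes - quorum(n_nodes)

for n in range(3, 8):
    print(f"{n} nodes: quorum = {quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
# 3 nodes: quorum = 2, tolerates 1 failure(s)
# 4 nodes: quorum = 3, tolerates 1 failure(s)  <- the extra node adds no tolerance
# 5 nodes: quorum = 3, tolerates 2 failure(s)
# 6 nodes: quorum = 4, tolerates 2 failure(s)
# 7 nodes: quorum = 4, tolerates 3 failure(s)
```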
- Supported Configurations
- A cluster with three or a higher odd number of nodes prevents the split-brain condition and inconsistencies caused by network issues, while providing higher scale and availability.
| Number of nodes in cluster | Majority (quorum) | Fault tolerance (node failures) | Support |
| --- | --- | --- | --- |
| 3 | 2 | 1 | Certified |
| 5 | 3 | 2 | Contact Automation Anywhere support |
| 7 and so on | 4 and so on | 3 and so on | Contact Automation Anywhere support |
- Multi-availability Zone/Multi-datacenter Configurations
- To further enhance availability with a multi-zone deployment, for example a three-node deployment, we recommend that you place each Control Room node in a separate availability zone. For deployments with more than three nodes, spread the nodes across at least three availability zones (see the placement sketch below). A key concern in these setups is the latency between the zones or providers. The nodes in a high availability cluster must be deployed in the same region.
In terms of cloud providers, three major providers are currently supported: Amazon Web Services, Google Cloud Platform, and Microsoft Azure.
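As an illustration of the placement guidance above, the following sketch (illustrative Python with hypothetical zone names) checks whether a node-to-zone assignment leaves a quorum available after the loss of any single zone:

```python
# Sketch: verify that a node-to-zone placement keeps a quorum alive when
# any single availability zone fails. Zone names are hypothetical examples.
from collections import Counter

def survives_zone_loss(placement: list[str]) -> bool:
    """True if losing the nodes in any one zone still leaves a strict majority."""
    total = len(placement)
    quorum = total // 2 + 1
    zone_counts = Counter(placement)
    return all(total - lost >= quorum for lost in zone_counts.values())

# Three nodes, one per zone: losing any zone leaves 2 of 3 (a quorum).
print(survives_zone_loss(["zone-a", "zone-b", "zone-c"]))  # True
# Three nodes in only two zones: losing zone-a leaves 1 of 3 (no quorum).
print(survives_zone_loss(["zone-a", "zone-a", "zone-b"]))  # False
```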
Note: In a multi-node environment, if a node goes down, operations on that node, such as bot deployments, schedules, triggers, and work items in queues, are adversely affected.
Tip: For information about how to back up and restore files to recover a Control Room High Availability cluster in case of failure, see Backing up and restoring a Control Room High Availability Cluster (A-People login required).