High availability cluster configuration
To support Automation Anywhere in your data center, configure a high availability (HA) cluster. Follow your company's methods and procedures for implementing the data center cluster.
High availability deployments with clustering
Automation 360 can be deployed in high availability mode where multiple Control Room instances are clustered together using a distributed load balancer. The cluster configuration provides operational fault tolerance because the clustered Control Room can continue to function even when one or more nodes in the cluster experience a failure.
The cluster functions by coordinating transactions between all participating Control Room nodes. The nodes determine which transactions can be processed by voting on each transaction. The number of votes that constitutes a majority of the nodes in the cluster is referred to as a quorum. The quorum determines how many nodes have to vote for, or confirm, a transaction before it can be processed.
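The voting rule described above can be illustrated with a minimal sketch. It is not Automation 360's actual implementation: the function name `can_commit`, the node names, and the representation of votes as a dictionary are all assumptions made for illustration. Nodes that are unreachable are simply absent from the vote map, so they never count toward the majority.

```python
def can_commit(votes: dict[str, bool], cluster_size: int) -> bool:
    """Return True when confirming votes reach a strict majority of the cluster.

    `votes` maps a (hypothetical) node name to whether that node confirmed
    the transaction. Nodes that did not respond are not in the mapping.
    """
    if cluster_size < 1:
        raise ValueError("a cluster needs at least one node")
    quorum = cluster_size // 2 + 1  # strict majority of all nodes, not of responders
    confirmations = sum(1 for confirmed in votes.values() if confirmed)
    return confirmations >= quorum


if __name__ == "__main__":
    # Three-node cluster, one node unreachable: two confirmations still form a quorum.
    print(can_commit({"node-a": True, "node-b": True}, cluster_size=3))
    # Only one node confirms: no quorum, so the transaction is not processed.
    print(can_commit({"node-a": True, "node-b": False}, cluster_size=3))
```

The quorum is computed from the configured cluster size rather than from the number of nodes that responded, which is what prevents a minority partition from processing transactions on its own.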
Fault tolerance in terms of node failure is determined by how many nodes can fail before a quorum or majority of nodes is not available to vote on the validity of any transaction. Fault tolerance is optimized with an odd number of nodes in the cluster because a majority in odd-numbered clusters is a lower number than in even-numbered clusters, as shown in the following table:
|Number of nodes in cluster|Majority (quorum)|Fault tolerance (node failures)|
|1|1|0|
|2|2|0|
|3|2|1|
|4|3|1|
|5|3|2|
|6|4|2|
|7|4|3|
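The relationship between cluster size, quorum, and fault tolerance follows from the definition of a majority, so it can be computed for any cluster size. The sketch below assumes a simple strict-majority rule; the function names `quorum` and `fault_tolerance` are illustrative and are not part of any Automation 360 API.

```python
def quorum(nodes: int) -> int:
    """Smallest number of votes that is a strict majority of `nodes`."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def fault_tolerance(nodes: int) -> int:
    """Number of nodes that can fail while a quorum can still be formed."""
    return nodes - quorum(nodes)


if __name__ == "__main__":
    # Print one line per cluster size, from 1 to 7 nodes.
    for n in range(1, 8):
        print(f"{n} nodes: quorum={quorum(n)}, tolerated failures={fault_tolerance(n)}")
```

Under this rule, adding a fourth node to a three-node cluster raises the quorum from two to three without increasing the number of failures the cluster can tolerate, which is why odd-numbered clusters are preferred.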
A cluster with three or more nodes is much less susceptible to a split-brain condition or to inconsistencies caused by network issues, because the quorum configuration is enforced by default.
High availability components
To support HA and disaster recovery (DR) for Automation Anywhere, configure the selected components in your data center for HA.
Cluster components - A cluster is a set of servers (nodes) that are connected by physical cables and software. In an HA environment, these clustered servers can reside in the same physical data center.
- Cluster group (role) - Group of clustered services that fail over together and are dependent on each other.
- Host - The cluster machine that is hosting the services.
- Node - A generic term for a machine in a cluster.
- Primary node - The active node in the cluster. The machine where the production activities run.
- Secondary node - The machine that is designated as the target in the event of a failover. The secondary node is a passive duplicate of the primary node.
- Server - The machine in the cluster installed with the server operating system.
Clusters are designed to handle the following types of failure:
- Application and service failures - affecting application software and essential services.
- Site failures in multisite organizations - caused by natural disasters, power outages, or connectivity outages.
- System and hardware failures - affecting hardware components such as CPUs, drives, memory, network adapters, and power supplies.
This ability to handle failure allows clusters to meet two requirements that are typical in most data center environments:
- High availability - the ability to provide end users with access to a service for a high percentage of the time while reducing unscheduled outages.
- High reliability - the ability to reduce the frequency of system failure.
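To make "a high percentage of the time" concrete, availability is commonly expressed as uptime divided by total time. The sketch below is illustrative arithmetic only: the function names and the 99.9 percent target are assumptions, not figures stated for Automation 360.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability_percent(uptime_hours: float, downtime_hours: float) -> float:
    """Share of total time during which the service was available, in percent."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return 100.0 * uptime_hours / total


def annual_downtime_hours(target_percent: float) -> float:
    """Downtime budget per year implied by an availability target."""
    if not 0.0 <= target_percent <= 100.0:
        raise ValueError("target must be between 0 and 100")
    return HOURS_PER_YEAR * (100.0 - target_percent) / 100.0


if __name__ == "__main__":
    # Observed availability over a 100-hour window with one hour of outage.
    print(availability_percent(uptime_hours=99.0, downtime_hours=1.0))
    # Yearly downtime budget for a hypothetical 99.9 percent target.
    print(round(annual_downtime_hours(99.9), 2))
```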