
How to set up a Pacemaker cluster for high availability Linux

Learn how to leverage Red Hat Enterprise Linux Pacemaker for High Availability

As a sysadmin, it is imperative to facilitate high availability at all possible levels in a system's architecture and design, and the SAP environment is no different. In this article, I discuss how to leverage Red Hat Enterprise Linux (RHEL) Pacemaker for High Availability (HA) of SAP NetWeaver Advanced Business Application Programming (ABAP) SAP Central Service (ASCS)/Enqueue Replication Server (ERS).

[ You might also like: Quick start guide to Ansible for Linux sysadmins ]

SAP has a three-layer architecture:

  • Presentation layer—Presents a GUI for interaction with the SAP application
  • Application layer—Contains one or more application servers and a message server 
  • Database layer—Contains the database with all SAP-related data (for example, Oracle)

In this article, the main focus is on the application layer. Application server instances provide the actual data processing functions of an SAP system. Based on the system requirement, multiple application servers are created to handle the load on the SAP system. Another main component in the application layer is the ABAP SAP Central Service (ASCS). The central services comprise two main components—Message Server (MS) and Enqueue Server (ES). The Message Server acts as a communication channel between all the application servers and handles the distribution of the load. The Enqueue Server controls the lock mechanism. 

[Image: Application and Database Layers]

High Availability in Application and Database Layers

You can implement high availability for application servers by using a load balancer and having multiple application servers handle the requests from users. If an application server crashes, only the users connected to that server are impacted. Isolate the crash by removing the application server from the load balancer. For high availability of ASCS, use Enqueue Replication Server (ERS) to replicate the lock table entries. In the database layer, you can set up native database replication between primary and secondary databases to ensure high availability.

Introduction to RHEL High Availability with Pacemaker

RHEL High Availability enables services to fail over seamlessly from one node to another within a cluster without causing any interruption in the service. ASCS and ERS can be integrated into a RHEL Pacemaker cluster. In the event of an ASCS node failure, the cluster shifts the ASCS instance to the ERS node, where the MS and ES instances continue to run without bringing the system to a halt. In the event of an ERS node failure, the system is not impacted, as MS and ES continue to run on the ASCS node. In this case, the ERS instance will not be started on the ASCS node, because ES and ERS should not run on the same node.

[Image: RHEL Pacemaker Cluster (nwha)]

RHEL Pacemaker configuration

There are two ways to configure ASCS and ERS nodes in the RHEL Pacemaker cluster—Primary/Secondary and Standalone. The Primary/Secondary approach is supported in all RHEL 7 minor releases. The Standalone approach is supported in RHEL 7.5 and newer. Red Hat recommends the Standalone approach for all new deployments.

Cluster configuration

The broad steps for the cluster configuration include:

  1. Install Pacemaker packages on both nodes of the cluster.
    # yum -y install pcs pacemaker
  2. Set a password for the hacluster user ID.
    # passwd hacluster
    In order to use pcs to configure the cluster and communicate among the nodes, you must set a password on each node for the user ID hacluster, which is the pcs administration account. It is recommended that the password for user hacluster be the same on each node. 
  3. Enable and start the pcsd service.
    # systemctl enable pcsd.service; systemctl start pcsd.service
  4. Authenticate pcs with the hacluster user.
    Authenticate the pcs user hacluster for each node in the cluster. The following command authenticates user hacluster on node1 for both of the nodes in a two-node cluster (node1.example.com and node2.example.com).
    # pcs cluster auth node1.example.com node2.example.com
    Username: hacluster
    Password:
    node1.example.com: Authorized
    node2.example.com: Authorized
  5. Create the cluster.
    Cluster nwha is created using node1 and node2:
    # pcs cluster setup --name nwha node1 node2
  6. Start the cluster.
    # pcs cluster start --all
  7. Enable the cluster to auto-start after reboot.
    # pcs cluster enable --all
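
At this point, you can confirm that both nodes have joined the cluster before adding any resources, for example:

# pcs cluster status
# pcs status corosync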

Creating resources for ASCS and ERS instances

Now that the cluster is set up, you need to add the resources for ASCS and ERS nodes. The broad steps include:

  1. Install resource-agents-sap on all cluster nodes.
    # yum install resource-agents-sap
  2. Configure shared filesystem as resources managed by the cluster.
    Shared filesystems such as /sapmnt, /usr/sap/trans, and /usr/sap/SYS are added as resources that the cluster mounts automatically, using the command:
    # pcs resource create <resource-name> Filesystem device='<path-of-filesystem>' directory='<directory-name>' fstype='<type-of-fs>'

    Example:
    # pcs resource create sid_fs_sapmnt Filesystem device='nfs_server:/export/sapmnt' directory='/sapmnt' fstype='nfs'
  3. Configure resource group for ASCS.
    For the ASCS node, the resource group contains the following three resources (assuming the ASCS instance ID is 00; a sketch of the pcs commands for steps 3 through 5 follows this list):
    • Virtual IP address for the ASCS
    • ASCS filesystem (for example, /usr/sap/<SID>/ASCS00)
    • ASCS instance, defined by its instance profile (for example, /sapmnt/<SID>/profile/<SID>_ASCS00_<hostname>)
  4. Configure resource group for ERS.
    For the ERS node, the resource group contains the following three resources (assuming the ERS instance ID is 30):
    • Virtual IP address for the ERS
    • ERS filesystem (for example, /usr/sap/<SID>/ERS30)
    • ERS instance, defined by its instance profile (for example, /sapmnt/<SID>/profile/<SID>_ERS30_<hostname>)
  5. Create the constraints.
    Set the ASCS and ERS resource group constraints for the following:
    • Prevent both resource groups from running on the same node
    • Make ASCS run on the node where ERS was running in the event of a failover
    • Maintain the start/stop order sequence
    • Ensure the SAP instances are started only after the required filesystems are mounted
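
The exact pcs commands for steps 3 through 5 depend on your SID, instance numbers, virtual hostnames, and storage layout, so treat the following only as a rough sketch. All resource names, IP addresses, and NFS paths are placeholders, and the scores mirror the values Red Hat uses in its published SAP NetWeaver examples:

# pcs resource create sid_vip_ascs00 IPaddr2 ip='<ascs-virtual-ip>' --group sid_ASCS00_group
# pcs resource create sid_fs_ascs00 Filesystem device='nfs_server:/export/usrsap<SID>ASCS00' directory='/usr/sap/<SID>/ASCS00' fstype='nfs' --group sid_ASCS00_group
# pcs resource create sid_ascs00 SAPInstance InstanceName='<SID>_ASCS00_<hostname>' START_PROFILE='/sapmnt/<SID>/profile/<SID>_ASCS00_<hostname>' AUTOMATIC_RECOVER=false meta resource-stickiness=5000 migration-threshold=1 --group sid_ASCS00_group

# pcs resource create sid_vip_ers30 IPaddr2 ip='<ers-virtual-ip>' --group sid_ERS30_group
# pcs resource create sid_fs_ers30 Filesystem device='nfs_server:/export/usrsap<SID>ERS30' directory='/usr/sap/<SID>/ERS30' fstype='nfs' --group sid_ERS30_group
# pcs resource create sid_ers30 SAPInstance InstanceName='<SID>_ERS30_<hostname>' START_PROFILE='/sapmnt/<SID>/profile/<SID>_ERS30_<hostname>' AUTOMATIC_RECOVER=false IS_ERS=true --group sid_ERS30_group

# pcs constraint colocation add sid_ERS30_group with sid_ASCS00_group -5000
# pcs constraint order start sid_ASCS00_group then stop sid_ERS30_group symmetrical=false kind=Optional
# pcs constraint order sid_fs_sapmnt then sid_ASCS00_group
# pcs constraint order sid_fs_sapmnt then sid_ERS30_group

The negative colocation score keeps ERS off the node where ASCS is running whenever possible, while still allowing ASCS to take over the ERS node during a failover.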

Cluster Failover Testing

Assume that ASCS initially runs on node1 and ERS runs on node2. If node1 goes down, ASCS shifts to node2. Due to the colocation constraint defined earlier, ERS will not run on node2 while ASCS is active there.

[Image: RHEL Pacemaker Failover to node2]

When node1 comes back up, ERS will shift to node1 while ASCS remains on node2. Use the command # pcs status to check the status of the cluster.
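
One simple way to exercise this failover behavior without powering off a node is to put node1 into standby mode and then bring it back (RHEL 7 pcs syntax, using the node names from this cluster):

# pcs cluster standby node1
# pcs status
# pcs cluster unstandby node1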

[Image: RHEL Pacemaker Failover ERS back to node1, ASCS stays on node2]

[ A free course for you: Virtualization and Infrastructure Migration Technical Overview. ] 

Wrap up

RHEL Pacemaker is a great utility for building a highly available cluster for SAP. You can also configure fencing with STONITH to ensure data integrity and prevent a faulty node in the cluster from accessing shared resources.
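
Fencing configuration is entirely hardware-dependent, but as a rough illustration, an IPMI-based STONITH device for node1 might be created along these lines (the fence agent, address, and credentials are placeholders for whatever your servers actually provide):

# pcs stonith create fence_node1 fence_ipmilan pcmk_host_list='node1.example.com' ipaddr='<ipmi-address-of-node1>' login='<ipmi-user>' passwd='<ipmi-password>'

A matching device would be created for node2.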

For all the automation enthusiasts, you can make use of Ansible to control your Pacemaker cluster by using the Ansible module pacemaker_cluster (a quick ad-hoc example is shown below). As much as you protect your systems, take care of yourself and stay safe.
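
As an illustration only, assuming an inventory group named cluster_nodes (a placeholder), a single ad-hoc run can make sure the cluster is online on every node:

# ansible cluster_nodes -b -m pacemaker_cluster -a "state=online"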

 
