High Availability on Linux: The service IP address, part 2

In this second part of the article we configure the service IP address and explain the use of the virtual network interface “dummy0”.

Configuring the service IP address

As mentioned in the first part, we will use “keepalived”, which implements the VRRP protocol, to assign an IP address to one node initially; if that node fails, the IP address is automatically reassigned to the backup node.

The configuration file for “keepalived” is “/etc/keepalived/keepalived.conf”. Each server has its own file, very similar to that of its peer.

For the first node we use the following:

For the second node, we use:

The highlighted lines are those that differ depending on the node.

Consider this file generically:
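A minimal sketch of what such a file could look like on the first node follows, laid out so that the line numbers used in the explanation match. Only “router_id”, the two interfaces, the service IP address and the overall block structure come from the text; the instance name, “virtual_router_id”, “priority”, “advert_int”, the authentication type and secret, and the peer address 192.0.2.12 are assumptions made for illustration. On the second node the state would presumably be BACKUP, with a lower priority, its own “router_id” and the first node's address as peer.

```
global_defs {
    router_id server-1
}

vrrp_instance VI_1 {                 # instance name is an assumption
    state MASTER
    interface eth0
    virtual_router_id 51             # assumed value, must be the same on both nodes
    priority 101                     # assumed value, higher than on the backup node
    advert_int 1                     # assumed advertisement interval, in seconds

    authentication {
        auth_type PASS               # assumed type; the original may differ
        auth_pass changeme           # placeholder secret, identical on both nodes
    }

    virtual_ipaddress {
        192.0.2.100/24 dev dummy0
    }

    unicast_peer {
        192.0.2.12                   # hypothetical address of the other node
    }

    track_interface {
        eth0
        dummy0
    }
}
```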

Lines 1 to 3 are the global definitions block. In this case we only use “router_id” to identify the node when it exchanges VRRP information. Other functionality, not shown here, includes e-mail notification when a status change occurs.

Lines 5 to 29 define a “VRRP instance”. An instance is a group of nodes that exchange status information with each other according to rules we specify.

Inside the VRRP instance we define the initial state of each server (line 6): MASTER, which holds the service IP address by default; or BACKUP, which takes it over if the MASTER node doesn’t answer or reports that some rule is not being met.

When a node in the BACKUP state can’t communicate with the MASTER node, or the MASTER node reports that the rules of the VRRP instance are not being met, the BACKUP node takes over the MASTER state and the former MASTER node moves to the BACKUP state.

Status information is exchanged over the “eth0” network interface (line 7) and is authenticated with a shared secret (lines 12 to 15), so that a malicious server can’t alter the exchange.

VRRP exchanges status information using multicast network packets. As we only have two nodes, we set up direct (unicast) communication between them (lines 21 to 23) and avoid multicast traffic on the network. Another reason to do this is that the switch the nodes are connected to may filter multicast traffic.

In lines 17 to 19 we indicate that this VRRP instance must assign the service IP address “192.0.2.100/24” to the “dummy0” network interface. This block defines the IP address (or addresses) assigned to the node while it is in the MASTER state.

In lines 25 to 28 we define a rule for this VRRP instance: monitor the “eth0” and “dummy0” network interfaces. If either interface becomes unavailable for whatever reason, a notification is sent and the BACKUP node assigns the service IP address to itself.

Finally we’ll restart the “keepalived” service on each node:

We can verify that “keepalived” started correctly with the configuration shown above using the “journalctl” command:
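The restart and the log check can be sketched as one small helper. This assumes a systemd-based system where the unit is named “keepalived” and that the commands are run as root; the function name “restart_and_show_log” is made up for this example.

```shell
#!/bin/sh
# Restart keepalived, then print its most recent journal entries.
# Assumptions: systemd is in use and the unit is called "keepalived".
# Optional argument: number of log lines to show (defaults to 20).
restart_and_show_log() {
    systemctl restart keepalived || return 1
    journalctl -u keepalived -n "${1:-20}" --no-pager
}
```

Calling “restart_and_show_log” on each node right after editing the file shows the state transitions logged at start-up.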

For the first node we’ll see something like this:

In the last two lines we can see how a new master election is forced and how this node, “server-1”, is elected (MASTER STATE).

We can also see that the “dummy0” network interface has the IP address “192.0.2.100” assigned, as this node is the MASTER:
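One way to check this from a shell is sketched below. The helper name “holds_address” is made up for this example, and it assumes the one-line output format of “ip -o addr show” from iproute2.

```shell
#!/bin/sh
# Succeeds if interface $1 currently holds IPv4 address $2
# (given without the prefix length), fails otherwise.
holds_address() {
    ip -4 -o addr show dev "$1" 2>/dev/null | grep -qF "inet $2/"
}
```

For instance, “holds_address dummy0 192.0.2.100 && echo "service IP present"” should print the message on the node in the MASTER state and nothing on the backup node.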

On the second node we see:

In the last two lines we see that this node receives a “higher priority” advertisement (from node “server-1”) and then enters the BACKUP state.

On the second node, no IP address is assigned to the “dummy0” network interface, because that node is the backup:

Verifying functionality

With “keepalived” running and the nodes defined as master and backup, it’s time to run the tests needed to verify that the behavior is as expected.

We’ll use the “ping” command from a system on the same 192.0.2.0/24 subnet to check the availability of the service IP address. It can be done from the backup node, but it is preferable to use an external system rather than one of the nodes being configured:

The first real test verifies what happens when the master node becomes unavailable due to a reboot or power failure; the second simulates a network failure.

In case of active node reboot

With a session open on both nodes, we reboot the master and watch what happens on the second node:

The backup node becomes the master and the service IP address is assigned to it:

The service IP address remains accessible from other systems on the network. The following output shows what a client perceived during the process, from the moment the first node failed until the second took control and started providing the service:

As can be observed, the virtual IP address (the service IP address) was unavailable for a period of 25 seconds.
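To put a number on an outage like this without counting lines by hand, the “ping” output can be post-processed. The sketch below assumes the usual Linux “ping” reply format (lines containing “bytes from” and “icmp_seq=”) and the default interval of one packet per second, so a gap in sequence numbers approximates seconds without service. The function name “longest_gap” is made up for this example.

```shell
#!/bin/sh
# Reads ping output on stdin and prints the largest number of consecutive
# missing replies. Only successful replies ("bytes from") are considered,
# so "Destination Host Unreachable" lines do not hide the gap.
longest_gap() {
    awk '
        /bytes from/ && match($0, /icmp_seq=[0-9]+/) {
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (seen && seq - prev - 1 > max) max = seq - prev - 1
            prev = seq
            seen = 1
        }
        END { print max + 0 }
    '
}
```

A possible use is “ping -c 120 192.0.2.100 | longest_gap”, started on the external system shortly before the failover is triggered.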

In case of physical network failure

It is possible to simulate the loss of a network interface with the following command:

We can check that the backup node detects the failure on the master and assigns the service IP address to its own “dummy0” interface:

As in the previous case, the service IP address is reassigned:

The current state of the service is shown in the following image:


If we also disabled the network interface on the second node with:

With no interface enabled on either node, the service would no longer be accessible.

Using the “dummy0” virtual interface

Instead of using the physical network adapter “eth0”, we decided to use “dummy0”, a virtual interface.

One advantage of using a virtual interface is that it can be disabled without losing our administrative connection to the server.

We can also disable the “dummy0” network interface whenever administrative tasks require it, such as updating the server software. Once it is disabled, traffic flows automatically to the backup node, so these tasks can be completed without affecting users.
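A maintenance window along these lines could be sketched with two small helpers. They assume “dummy0” is one of the interfaces tracked by “keepalived”, as in the configuration described earlier, and that the commands are run as root; the function names are made up for this example.

```shell
#!/bin/sh
# Take this node out of service: with dummy0 down, the tracked-interface
# rule no longer holds and the other node should take over the service IP.
service_ip_release() {
    ip link set dev dummy0 down
}

# Make this node eligible to hold the service IP again.
service_ip_allow() {
    ip link set dev dummy0 up
}
```

The idea is to call “service_ip_release” before the maintenance work and “service_ip_allow” once it is finished. Whether the address then moves back to this node depends on the priorities and preemption settings in use.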

Final remarks

For simplicity, throughout these two articles on the service IP address we have used a single physical network interface for both managing the server and providing the service.

It is advisable to use multiple physical network interfaces, one per task. Ideally, the physical network interface “eth0” is used to provide the service, while another interface, “eth1”, is used to administer the server and for “keepalived” to exchange the status of both nodes.

Similarly, for simplicity, we used only one connection for each task. Environments where high availability is essential should have redundant network links using “bonding” or “teaming” with separate network cards; I will post about this in the future.

We used the virtual network interface “dummy0” for flexibility, so we can disable it and keep working on the server during maintenance. It also lets us force the backup node to take control by disabling that interface on the master.
