High Availability on FreeBSD: The service IP address, part 2

In this second part of the article, we configure the service IP address.

Configuring the service IP address

As mentioned in the first part, we will use the “carp” kernel module, which implements the protocol of the same name, to share one IP address between two nodes, so that if one of them fails the other keeps answering requests.

The configuration consists of an extra line in the /etc/rc.conf file of each node.

For the first node, we will run the following:

For the second, we will run:

Let’s examine this in a generic way:

The first line indicates the use of an IP alias. This means that the “em0” NIC will have an IP address in addition to the one already configured as its primary address (192.0.2.5/24 for node 1 and 192.0.2.6/24 for node 2 in our case). Everything between the quotes is a parameter of this alias.

The second line states that we use the “inet” family (IPv4), which is the default when no family is specified. The other option is “inet6”, for IPv6.

In the third line, the virtual host identifier (vhid) is set. A given service IP address must use the same value on every node where it is configured, and each additional service address needs a different value. Allowed values range from 1 to 255.

The fourth line sets the password with which the nodes taking part in a “vhid” identify each other. This password does not provide any encryption.

In the fifth line, the IP address that we want as the service IP address is established.

Finally, and only for the secondary node, the value of “advskew” is set to 100. This value introduces a delay in the node’s CARP announcements, which lowers its order of precedence; it is useful when we want a particular node to become primary automatically, or when there are multiple secondary nodes.
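
As an illustration of the parts described above, the resulting /etc/rc.conf entries could look like the following sketch. It follows the form documented in the FreeBSD Handbook; the password “examplepass”, the vhid value 1 and the /32 prefix are assumptions, not values taken from this article.

```shell
# /etc/rc.conf on server-1 (primary node).
# "examplepass", vhid 1 and the /32 prefix are placeholder assumptions.
ifconfig_em0_alias0="inet vhid 1 pass examplepass alias 192.0.2.100/32"

# /etc/rc.conf on server-2 (secondary node): same entry plus "advskew 100".
ifconfig_em0_alias0="inet vhid 1 advskew 100 pass examplepass alias 192.0.2.100/32"
```

Each entry belongs in the rc.conf file of its own node; they are shown together here only for comparison.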

At this point the change must be applied on each node.

For the first node we will execute:

And, in the second:

We can verify the correct functioning in several ways.

Using the “ifconfig em0” command in each node:

If we look at the last line, the first server shows “CARP: MASTER” and the second shows “CARP: BACKUP”.
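
To check the state from a script rather than by eye, the relevant line can be filtered out of the “ifconfig” output. This is a minimal sketch: the function name carp_state is hypothetical, and it assumes that the state is the second word of a line whose first word is “carp:”, matched case-insensitively.

```shell
# Print the CARP state (for example MASTER or BACKUP) found in ifconfig
# output read from stdin.
# Hypothetical helper: assumes a line of the form "carp: STATE vhid N ...".
carp_state() {
    awk 'tolower($1) == "carp:" { print $2 }'
}

# Usage on a node: ifconfig em0 | carp_state
```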

Another option, which also shows more information such as the election of the “MASTER” node, state transitions, etc., is the /var/log/messages file of each node:

Verifying functionality

With the configuration done and active, and the nodes acting as master and slave, it is time to run the tests needed to verify that the behavior is the expected one.

We’ll use the “ping” command from some system on the same 192.0.2.0/24 subnet to check the availability of the service IP address (192.0.2.100):

The first real test verifies what happens when the master node becomes unavailable due to a reboot or power failure; the second simulates a network failure.

In case of active node reboot

Connected to both nodes, we reboot the master and check what happens on the second:

The backup node becomes the master and the service IP address is assigned to it:

The service IP address remains accessible from other systems on the network. The following output shows what a client perceived during the process, from the moment the first node failed until the secondary took control and started serving:

As can be seen, the virtual IP address was unavailable for about 3 seconds.
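
A rough way to put a number on that interruption is to count the echo replies missing from a saved “ping” output. The sketch below assumes the default interval of one packet per second, so each missing sequence number is roughly one second of unavailability; the function name count_lost_pings and the file name ping.log are hypothetical.

```shell
# Count echo replies missing between the first and the last reply in ping
# output read from stdin. Only "... bytes from ..." reply lines are considered.
count_lost_pings() {
    awk '
        /bytes from/ && match($0, /icmp_seq=[0-9]+/) {
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (seen && seq > last + 1) lost += seq - last - 1
            last = seq
            seen = 1
        }
        END { print lost + 0 }
    '
}

# Usage: ping 192.0.2.100 | tee ping.log   (stop with Ctrl-C after the test)
#        count_lost_pings < ping.log
```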

In case of physical network failure

It is possible to simulate the loss of a network interface with the following command:

We can verify that the slave node detects this fault in the master:

As in the previous case, the second server becomes MASTER:

The current service status is shown in the following image:

If we also disable the network interface on the second node:

With no interface enabled on either node, the service is no longer accessible.

Forcing a node as primary

We may want a particular node to always be the primary node.

For this, we can use a persistent setting that consists of adding a line to the /etc/sysctl.conf file of the node that we want to be primary:

If we do not want to restart the node at this time, we will activate the change in the following way:

We can also change the primary node temporarily by running the following command on the current MASTER node:
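
The steps of this section can be sketched as follows. This is an illustration under assumptions, not necessarily the exact commands used in this article: “net.inet.carp.preempt” and “ifconfig ... state backup” are the mechanisms documented for CARP on FreeBSD, while the function names, the “em0” interface and the vhid value 1 in the usage line are hypothetical.

```shell
# Line for /etc/sysctl.conf on the preferred node (persists across reboots):
#   net.inet.carp.preempt=1

# Apply the same setting immediately, without rebooting (run as root).
enable_carp_preempt() {
    sysctl net.inet.carp.preempt=1
}

# Demote the current MASTER for one vhid so that the peer takes over
# (run as root on the MASTER). $1 = interface, $2 = vhid.
demote_carp_master() {
    ifconfig "$1" vhid "$2" state backup
}

# Usage on the current MASTER: demote_carp_master em0 1
```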

Final remarks

For simplicity, throughout these two articles on the service IP address we used a single physical network interface for both managing the server and providing the service.

It is advisable to use several physical network interfaces, one per task. Ideally, the “em0” interface provides the service, while another interface, “em1”, is used to administer the server and to carry the CARP status exchange between both nodes.

Similarly, for simplicity, we used only one connection for each task. Production environments where high availability is essential should have redundant network links using “link aggregation” over separate network cards; I will post about it in the future.

High Availability on FreeBSD: The service IP address, part 1

NOTE: This entry is an adaptation of the Linux version.

In a previous post, we saw what high availability is and what we aim for when a consumer tries to access a service: an availability time as close to 100% as possible.

This entry describes what the service IP address is (the entry point to a service) and how to set it up using two servers.

To follow the steps described below, you need two physical or virtual servers with FreeBSD installed.

To avoid an excessively long entry, I split this one into two parts. This first part covers the introduction and the preparation of the systems, while the second shows how to set up the service IP address itself.

The service IP address

When we access a service, the connection is made to an IP address, either directly (192.0.2.100) or through a hostname (www.example.com).

Suppose we want to access a web page (http://www.example.com) and that its associated IP address is “192.0.2.100”. This IP address, through which the page is accessible, is called the “service IP address”.

The objective is to have this IP address always available. For consumers the perception should be like the following image:

To achieve this with a minimum of two servers, one has the service IP address assigned and the other waits, ready to take it over if the first one fails.

If “server-1” were not functional, “server-2” would take the service IP address. The service would be degraded (one component in a failed state) but still operational and accessible to the consumer.

If no server is available, the service cannot be provided, because the service IP address cannot be assigned to any of them.

The consumer can’t access the service. Their perception will vary:

The method that allows a server to take over a service IP address previously assigned to another server that has become unavailable is called “IP failover”.

Several standard protocols implement IP failover. For this post, we will use CARP.

Required setup

Before we start we must have the necessary material prepared.

We will use two servers called “server-1” and “server-2”. I have used FreeBSD 11.2.

Both servers are connected through the “em0” network interface to a switch and belong to the 192.0.2.0/24 subnet. The switch is connected to a router whose IP address is 192.0.2.1 and which acts as the gateway.

The following diagram shows the equipment interconnection:

We will note down the information of both servers for later reference:

System preparation

In FreeBSD all you have to do is load the kernel module.

We instruct the system to load it at every boot by adding an entry to the /boot/loader.conf file.

And then we load the module to use it without a reboot:

At this point we will verify the following:

– The “carp” module is loaded:
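
The preparation and the check above can be sketched as follows. The function names are hypothetical, and the check assumes that CARP is loaded as a module (carp.ko) rather than compiled into the kernel, in which case it would not appear in the “kldstat” list.

```shell
# Make sure the loader configuration loads the carp module at boot.
# $1 = file to edit (defaults to /boot/loader.conf); the entry is added once.
enable_carp_at_boot() {
    loader_conf="${1:-/boot/loader.conf}"
    grep -q '^carp_load=' "$loader_conf" 2>/dev/null ||
        echo 'carp_load="YES"' >> "$loader_conf"
}

# Load the module now, without rebooting (run as root).
load_carp_now() {
    kldload carp
}

# Succeed only if carp.ko appears in the list of loaded kernel modules.
carp_is_loaded() {
    kldstat | grep -q 'carp\.ko'
}

# Usage: enable_carp_at_boot && load_carp_now && carp_is_loaded && echo "carp ready"
```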

Once everything has been checked, we can start configuring the service IP address, which is covered in the next entry of this series.

How I became a paying Buffer subscriber

I wanted to share the story of why I became a paying subscriber of a company I think we can learn a lot from.

For a couple of days now, Buffer has had a new subscriber on its “awesome” plan. There is nothing extraordinary about a user paying for a service he is happy with, but I want to share how I reached the point of becoming a paying user.

A few years ago, when I started to think about LoadFront, I wanted our customers and potential customers to feel confident about us. We wanted to be as transparent as possible and to create a corporate culture that attracts people through its philosophy and way of doing things.

Seeking inspiration for this, I came across Buffer. I do not remember how, possibly through the early SlideShare versions of their “Buffer culture”.

I began to follow them, without registering, by reading about their employee benefits, the transparency portal, the income panel and their blog. To a great extent, it is the same vision I have of what I want LoadFront to be.

Then I started to follow its CEO and co-founder, Joel Gascoigne, on Twitter (@joelgascoigne), and shortly after I followed Buffer too (@buffer), where they publish a lot of content about social media and about gaining reach and visibility.

After some time I finally signed up, around a year ago, and there I left it. I did not use it until a few months ago, when I wanted to see how it worked and tried it with one account per social network, as the free plan only lets you have one active at a time.

I did not like that limitation, although it is understandable given the business model. I did not see what I gained by writing a tweet or post there and sending it immediately to my profiles, since I could do the same from the applications. I used it a bit and forgot about it.

Recently I logged in to update my password and saw an option to try the “awesome plan”. I activated it and liked it, I liked it a lot.

I did not use the scheduling feature for tweets, but it was very helpful for publishing the first technical articles of this blog on multiple social networks at once: Facebook, LinkedIn (on my profile and in several groups) and Twitter at the same time, very convenient. I also started to see the reach of what I had published and how many people were interested in that content, through clicks and links (this is also in the free plan!).

Months passed in which I stopped using it again, and now I am a subscriber of the awesome plan.

Why? It is not only the value it gives, but how it is given: a transparent company with good service and a great culture behind it is what really convinced me to become a paying user.

I’ll try it for a few months before jumping to an annual subscription. Over this time I want to see whether my publications get more reach using their analytics tools, and even develop my personal brand thanks to it. This time I’m sure I’ll be using the scheduled posts feature.

Have I made many English mistakes in this post? Help me improve my English by correcting me!

High Availability on Linux: The service IP address, part 2

In this second part of the article, we configure the service IP address and explain the use of the “dummy0” virtual network interface.

Configuring the service IP address

As mentioned in the first part, using “keepalived”, which implements the VRRP protocol, we will initially assign an IP address to one node; if that node fails, the address is automatically assigned to the slave node.

The configuration file for “keepalived” is “/etc/keepalived/keepalived.conf”. Each server has its own file, very similar to its peer’s.

For the first node we execute the following:

For the second node, we execute:

The highlighted lines are those that differ depending on the node.

Let’s look at this file in general terms:

Lines 1 to 3 are a global definitions block. In this case we only use “router_id” to identify the node when it exchanges VRRP information. Other functionality not shown here includes email notification when a status change occurs.

Lines 5 to 29 define a “VRRP instance”. An instance is a group of nodes that communicate with each other, exchanging status information according to the rules we specify.

Inside the VRRP instance we define the initial state (line 6) of each server: MASTER, which holds the service IP address by default; or BACKUP, which takes it over if the MASTER node doesn’t answer or reports that some rule is not being met.

When a node in BACKUP state can’t communicate with the MASTER node, or the MASTER node reports that the rules of the VRRP instance are not being met, the BACKUP node takes the MASTER state and the former MASTER node takes the BACKUP state.

The information is exchanged over the “eth0” network interface (line 7) and authenticated with a shared key (lines 12 to 15), so that a malicious server cannot alter the exchange of status information.

VRRP is a protocol that exchanges status information using multicast network packets. As we only have two nodes, we set up direct (unicast) communication between them (lines 21 to 23) and do not use multicast traffic. Another reason to do this is that the switch the nodes are connected to may filter multicast traffic.

In lines 17 to 19 we tell this VRRP instance to assign the service IP address “192.0.2.100/24” to the “dummy0” network interface. This block defines the IP address (or addresses) that the node holds while it is in MASTER state.

In lines 25 to 28 we define a VRRP rule for this instance: track the “eth0” and “dummy0” network interfaces. If either interface becomes unavailable for whatever reason, a notification is sent and the BACKUP node assigns the service IP address to itself.
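
To make the description above concrete, here is a sketch of a file for “server-1” whose line numbers match the ranges mentioned. It is a reconstruction under assumptions, not the original listing: the instance name VI_1, the virtual_router_id, priority and advert_int values, the PASS authentication type, the “example1” password and the peer address 192.0.2.6 are all placeholders.

```shell
# Write the sketch for server-1 to the current directory. After reviewing it,
# install it as /etc/keepalived/keepalived.conf on that node.
# Assumed values: VI_1, virtual_router_id 51, priority 101, advert_int 1,
# auth_type PASS, the "example1" password and the peer address 192.0.2.6.
cat > keepalived.conf.server-1 <<'EOF'
global_defs {
    router_id server-1
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass example1
    }

    virtual_ipaddress {
        192.0.2.100/24 dev dummy0
    }

    unicast_peer {
        192.0.2.6
    }

    track_interface {
        eth0
        dummy0
    }
}
EOF
```

The file for “server-2” would presumably differ in “router_id”, in “state” (BACKUP), in a lower “priority” and in the “unicast_peer” address, which would point back to the first node.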

Finally we’ll restart the “keepalived” service on each node:

We can verify that “keepalived” started correctly with this configuration using the “journalctl” command:

For the first node we’ll see something like this:

In the last two lines we can see how a new master election is forced and how this node, “server-1”, is elected (MASTER STATE).

We can also see that the “dummy0” network interface has the IP address “192.0.2.100” assigned, as this node is the MASTER:

On the second node we see:

In the last two lines we see that this node receives a “higher priority” advertisement (from node “server-1”) and then enters the BACKUP state.

On the second node there is no IP address assigned to the “dummy0” network interface, as that node is the backup:

Verifying functionality

With “keepalived” running and the nodes acting as master and slave, it’s time to run the tests needed to verify that the behavior is the expected one.

We’ll use the “ping” command from some system on the same 192.0.2.0/24 subnet to check the availability of the service IP address (192.0.2.100). It can be done from the backup node, but it is preferable to use an external system rather than the nodes being configured:

The first real test verifies what happens when the master node becomes unavailable due to a reboot or power failure; the second simulates a network failure.

In case of active node reboot

Connected to both nodes, on the master we’ll reboot the server and on the second we check what happens:

The backup node becomes the master and the service IP address is assigned to it:

The service IP address remains accessible from other systems on the network. The following output shows what a client perceived during the process, from the moment the first node failed until the secondary took control and started serving:

As can be observed, the virtual IP address (the service IP address) was unavailable for a period of 25 seconds.

In case of physical network failure

It is possible to simulate the loss of a network interface with the following command:

We can check that the slave node detects this failure on the master and assigns the service IP address to its “dummy0” interface:

As in the previous case, the service IP address is reassigned:

The current service status is shown in the following image:


If we also disable the network interface on the second node:

With no interface enabled on either node, the service is no longer accessible.

Using the “dummy0” virtual interface

Instead of using the physical network adapter “eth0”, we decided to use “dummy0”, a virtual interface.

One advantage of a virtual interface is that it can be disabled without losing the connection to the server while administering it.

We can also disable the network interface “dummy0” as appropriate for administrative tasks, such as updating server software. Once disabled, traffic will flow automatically to the slave node while these tasks are completed without affecting users.
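
As a sketch of that maintenance workflow, the interface can be toggled with the “ip” command. The function names are hypothetical, and the takeover relies on “dummy0” being one of the interfaces that the VRRP instance checks, as described earlier.

```shell
# Hypothetical helpers; run as root on the node that holds the service IP.
# Taking dummy0 down makes the interface check fail, so the peer takes over.
start_maintenance() {
    ip link set dummy0 down
}

# Bring dummy0 back up once the maintenance work is finished.
end_maintenance() {
    ip link set dummy0 up
}
```

The SSH session used for administration is unaffected, because it runs over the physical interface and not over “dummy0”.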

Final remarks

For simplicity, throughout these two articles on the service IP address we used a single physical network interface for both managing the server and providing the service.

It is advisable to use several physical network interfaces, one per task. Ideally, the “eth0” interface provides the service, while another interface, “eth1”, is used to administer the server and to carry the “keepalived” status exchange between both nodes.

Similarly, for simplicity, we used only one connection for each task. Environments where high availability is essential should have redundant network links using “bonding” or “teaming” over separate network cards; I will post about it in the future.

We used the virtual network interface “dummy0” for flexibility, so that we can disable it and keep working on the server during maintenance. It also allows us to force the slave node to take control by disabling that interface on the master.

High Availability on Linux: The service IP address, part 1

In the previous post we saw what high availability is and what we aim for when a consumer tries to access a service: an availability time as close to 100% as possible.

This entry describes what the service IP address is (the entry point to a service) and how to set it up using two servers.

To follow the steps described below, you need two physical or virtual servers with a Linux distribution installed.

To avoid an excessively long entry, I split this one into two parts. This first part covers the introduction and the preparation of the systems, while the second shows how to set up the service IP address itself.

The service IP address

When we access a service, the connection is made to an IP address, either directly (192.0.2.100) or through a hostname (www.example.com).

Suppose we want to access a web page (http://www.example.com) and that its associated IP address is “192.0.2.100”. This IP address, through which the page is accessible, is called the “service IP address”.

The objective is to have this IP address always available. For consumers the perception should be like the following image:

To achieve this with a minimum of two servers, one has the service IP address assigned and the other waits, ready to take it over if the first one fails.

If “server-1” were not functional, “server-2” would take the service IP address. The service would be degraded (one component in a failed state) but still operational and accessible to the consumer.

If no server is available, the service cannot be provided, because the service IP address cannot be assigned to any of them.

The consumer can’t access the service. Their perception will vary:

The method that allows a server to take over a service IP address previously assigned to another server that has become unavailable is called “IP failover”.

Several standard protocols implement IP failover; VRRP is the best known of them and the one we will use.

Required setup

Before we start we must have the necessary material prepared.

We will use two servers called “server-1” and “server-2”. I have used Debian 8, but this configuration is practically identical on other distributions such as CentOS, except for the network configuration file paths and the software installation command.

Both servers are connected through the “eth0” network interface to a switch and belong to the 192.0.2.0/24 subnet. The switch is connected to a router whose IP address is 192.0.2.1 and which acts as the gateway.

The following diagram shows the equipment interconnection:

We will note down the information of both servers for later reference:

System preparation

On Linux there are several implementations of the VRRP protocol. In these articles we use “keepalived” for its extra functionality, but “vrrpd” would be another valid option.

The following steps must be done on both servers.

Depending on the chosen distribution, we install “keepalived” with “apt” or “yum”:

We also need to load and configure the “dummy” kernel module on the server. It provides a virtual network interface where the service IP address will be assigned.

To load it and keep it loaded across server reboots, we need to run:

To configure it and make it permanent across server reboots we need to run:

Once loaded, we verify that the “dummy0” interface is available:

We’ll leave the interface ready to use:
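
For the running system, the steps above can be sketched as follows. The function names are hypothetical and “numdummies=1” is an assumption (a single virtual interface named “dummy0”); making the module and the interface permanent across reboots depends on the distribution and is not covered by this sketch.

```shell
# Load the dummy module with a single interface (run as root).
# numdummies=1 is an assumption: one virtual interface, named dummy0.
load_dummy_module() {
    modprobe dummy numdummies=1
}

# Succeed only if the dummy0 interface exists.
dummy0_exists() {
    ip link show dummy0 > /dev/null 2>&1
}

# Bring dummy0 up so that addresses can be assigned to it (run as root).
bring_dummy0_up() {
    ip link set dummy0 up
}

# Usage: load_dummy_module && dummy0_exists && bring_dummy0_up
```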

We need one last change to the operating system: allow processes (programs running in the background) to listen for network requests on an IP address that is not assigned to the server.

As the service IP address is assigned to a single server at a time, the other one doesn’t have it and won’t allow programs to listen on it. This means that, whenever the service IP address moves to another server, someone would have to log in to the server that now holds the address and start the programs that can, only then, use it successfully.

This would limit us too much and we would not meet the “high availability” premise (the service IP address would move from one server to another, but the programs would not run without manual, human intervention), so we configure the system to allow processes to listen on any IP address, even one that is not assigned to the server.
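
The kernel setting involved is “net.ipv4.ip_nonlocal_bind”. A minimal sketch of persisting, applying and checking it follows; the function names are hypothetical and /etc/sysctl.conf is assumed as the persistent location.

```shell
# Persist the setting. $1 = file to edit (defaults to /etc/sysctl.conf);
# the line is added only if the variable is not already present.
persist_nonlocal_bind() {
    sysctl_conf="${1:-/etc/sysctl.conf}"
    grep -q '^net\.ipv4\.ip_nonlocal_bind' "$sysctl_conf" 2>/dev/null ||
        echo 'net.ipv4.ip_nonlocal_bind = 1' >> "$sysctl_conf"
}

# Apply it to the running system (run as root).
apply_nonlocal_bind() {
    sysctl -w net.ipv4.ip_nonlocal_bind=1
}

# Succeed only if the running kernel has the setting enabled.
nonlocal_bind_enabled() {
    [ "$(sysctl -n net.ipv4.ip_nonlocal_bind)" = "1" ]
}
```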

At this point we will verify the following:

– Keepalived software is installed:

– The “dummy” module is loaded and configured:

– The “dummy0” interface exists and is ready to listen:

– The system has the net.ipv4.ip_nonlocal_bind variable set to 1:

Once everything has been checked, we can start configuring the service IP address, which is covered in the next entry of this series.

High availability: Introduction

This entry is the first of several I will write about high availability systems. It introduces the basic concepts needed to understand what high availability is and what it is for. The articles focus on software high availability.

This entry is meant both for newcomers and non-technical people who manage computer services, and for those already familiar with the subject who want to refresh their concepts or correct me where they see fit.

As the entries progress, they’ll become more technical and require deeper knowledge. Using a terminal on a Linux distribution, installing software and editing files will become essential requirements.

Throughout them, some services such as web servers or database servers will be configured for high availability, so that the concepts are turned into something tangible that can be implemented alongside the theory.

After this introduction, let’s start.

A computer services world

Today’s computing cannot be conceived without so-called “services”. Services are a mix of software and hardware running 100% of the time, permanently connected to a computer network, whose mission is to transform and transmit information.

If any of the software or hardware isn’t working, the service cannot be used and we cannot get or transform the information we want.

A service can be very simple, such as a single program on a single server (or home PC), or very complex, made up of multiple programs and hardware. Very popular sites like Google or Facebook run many programs on thousands of servers and other equipment to provide their search, maps, photography, social network and other services.

When we talk about a service, we refer to everything it is made of (software, hardware, etc.) and needs in order to be used.

Examples of software used to provide services are “Apache HTTPD Server”, which serves web pages with a single program, and “Postfix”, which serves email and uses several programs to do so. Both can run on a single server or home PC.

What is availability?

Before we can answer what high availability is, we must know what availability is.

In computing, this term means that an existing service is accessible. A typical service is a web server, where pages are stored and accessed from web browsers, like this blog.

When a user tries to visit an existing website through a web browser (having typed its address correctly) and it doesn’t load, we say it is “unavailable” and we cannot view it.

Therefore, “availability” is what allows an existing service to be accessed and used without inconvenience.

Availability time

Availability can be measured as a yearly percentage, counted in minutes: from 0% (0 minutes) to 100% (approximately 525600 minutes). 1% corresponds to 5256 minutes, or 87.6 hours, or 3.65 days.
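
The same arithmetic can be turned around to see how much downtime a given availability percentage allows in a 365-day year (525600 minutes). A minimal sketch follows; the function name downtime_minutes is hypothetical.

```shell
# Print the yearly downtime, in minutes, allowed by an availability
# percentage. $1 = availability in percent (for example 99.9).
downtime_minutes() {
    awk -v p="$1" 'BEGIN { printf "%.1f\n", (100 - p) / 100 * 525600 }'
}

# Usage: downtime_minutes 99     (one percent of the year: 5256.0 minutes)
#        downtime_minutes 99.9   (525.6 minutes, a little under 9 hours)
```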

This percentage is called “availability time” or “(service) uptime”, and the time during which the service has not been available is called “downtime”.

For some services it is calculated from the first day of the year, and for others in annual periods starting on the activation day.

What is high availability?

We can define high availability as the technique used to provide uninterrupted service, allowing one or more elements of the service (software, servers, network equipment, etc.) to be in a failed state with no impact on operation, or with an impact that is not noticeable to the end user.

High availability is achieved through redundancy of the elements of the service: network equipment (switches, routers), servers and server components (two or more hard disks, two or more ECC RAM modules, two or more processors, two or more power supplies connected to two or more electrical sockets on two or more power strips, each fed by an independent electrical supplier), etc. Most importantly, the programs that provide the service must be ready to operate on all of that redundant infrastructure.

There is also an article dedicated to high availability in Wikipedia.

What is high availability for?

From a user’s point of view, thanks to high availability we can access a service at any time and put it to use.

For a professional or a business, high availability translates into a better professional image. What happens if our users cannot access our website? They cannot see it. And what happens if we are a hosting provider and leave many users with their websites unavailable? Many users who have placed their trust in our business will see their websites inaccessible for a period of time, with the harm that causes them, and that causes the business if it must compensate them financially for the unavailability.

It can be worse: let’s imagine that a bank service responsible for moving money between entities stops working; or the service that issues employee payroll; or the service that exchanges information between medical practices.

High availability allows services to be always available. It lets users enjoy them, and it lets professionals and businesses ensure that their services can be used for as much of the time as possible, as close to 100% as they can get.

I have a blog! (one more time)

A lot of water has passed under the bridge since the 26th of September, 2007. That was the last day I wrote on my old blog, with one exception in March 2009.

I have been thinking about resuming my blog since shortly after. Or about starting over, since of my several attempts the closest I came was in 2007. I used Blogger, bitacoras.com and my own domain (blog.daijo.org) with Drupal and WordPress, and in the end, for one reason or another, none of them.

Eight years later is a good moment for it. I’m now over thirty and I figure I should do what I want to do and have not done, for no better reason than my own struggle with myself over whether to get on with it or not.

Today, 19th of July, 2016, with a pleasantly cool 40ºC in Ciudad Real, I resume my blog. I have not set any publication schedule; each entry will appear when I consider it ready to be published and discussed.

In the old days, the blog had the title (or subtitle, depending on the period) “The cauldron of topics”. As then, I will mix various topics: computer engineering, views and personal opinions, team management, how I think computer engineering companies should be today, etc. I am a classic at mixing apples with oranges.

This time I’ll try to publish the entries in both Spanish and English, as time allows, for which I’ll use the WPML plugin. One of my goals is to improve my written English and my use of expressions, and I will be grateful to everyone who points out errors in expression or writing.

I was tempted to reinsert the old blog entries I had on blog.daijo.org; however, they are archived thanks to the Wayback Machine from archive.org. The links are as follows:

– Between December, 2005 and May, 2006 (using Drupal).

– Between December, 2006 and October, 2007 (using WordPress)

I should not forget that, to comply with Spanish legislation, I adapted the “Legal Notice and Privacy Policy” template that the “Ciudadano 2.0” website offers its readers in its free downloads area. Thanks to its author, Raquel Rubín, for sharing.