
Using Software Load Balancing in High Availability (HA) for OpenStack Cloud API Services

In two previous posts, my colleagues Oleg Gelbukh and Piotr Siwczak covered most of the ground on making OpenStack highly available (HA). If you haven’t read those posts yet, it’s a good time to do so:

In this post, I’ll offer some direct practical advice on completing the puzzle: enabling HA load balancing for OpenStack API services.

All of these services are stateless, so putting a highly available load balancer on top of a few instances of the services is enough for most purposes. Here I’ll consider one option: HAProxy + Keepalived; however, other options are possible (e.g., HAProxy + Pacemaker + Corosync, or a hardware load balancer).

Which services?

As discussed in previous posts, we’re talking about OpenStack REST API services: nova-api, keystone-api, and glance-api.

Types of failures

Several kinds of outages can happen in a distributed system like OpenStack. Here are the major ones:

  • Service instance failure: A particular instance of a service crashes, but the other processes on the same machine function normally.
  • Machine failure: A whole machine becomes unusable, perhaps due to a power outage or network failure.
  • Network partition: Several network segments become unable to talk to each other (e.g., a higher-level switch malfunctions) but can talk to everybody within their segment. This poses a huge problem for any stateful services and is the root cause of the sheer complexity of different consistency models for distributed data stores (how to synchronize the changes made on different partitions when connectivity is restored). However, for stateless services, it’s simply equivalent to machines on different ends of the partition becoming unavailable to each other.

There are also more complex kinds of failures, such as when a service hangs or starts giving the wrong answers due to hardware failure or other problems.

Some of these failures can be mitigated by monitoring at the application level; for example, by checking that the service gives an expected answer to a sample request, and restarting the service otherwise. Some failures, alas, cannot.

NOTE: In this post, I’m only covering service/machine crashes.

Surviving service failures

Suppose we have an external load balancer that we assume to be always available. Then we spawn several instances of the necessary services and balance them.

Whenever a request arrives, the balancer will attempt to proxy the request for us and connect to one of the backend servers. If a connection cannot be established, the load balancer will transparently try another instance.

Note that this provides no protection against failure of a service in the middle of executing a request. More on this later.

Surviving load balancer failures

So what if the load balancer itself fails? As a load balancer is almost stateless (except for stickiness, which we can ignore in OpenStack), we just need to put a virtual IP address on top of a bunch of load balancers (two is often enough). This can be done using Keepalived or other similar software.

This construction makes the virtual IP address refer to “whichever balancer is available.” The details, in the case of Keepalived, are implemented using the VRRP protocol. When a node owning the virtual IP dies, the other Keepalived will notice and assign the IP address to itself.

What if the load balancer software crashes but the node survives (very unlikely, but possible)? For that, Keepalived has “check scripts”: just configure Keepalived to use a script that checks whether the load balancer is running; whenever it isn’t, Keepalived considers the node unusable and moves the virtual IP to a usable node.

What if Keepalived crashes? The other Keepalived will think the whole node died and claim the virtual IP.

NOTE: VRRP, and thus Keepalived, leaves a short window of unavailability during failover.

Transitional failure effects

There are at least three levels of fault tolerance with very different guarantees and different implementation complexity:

  • Level 1: Failure of a component does not lead to permanent disruption of service.
  • Level 2: Failure of a component does not lead to failure of new requests.
  • Level 3: Failure of a component does not lead to failure of any requests (new or currently executing).

At level 1, there may be a window of unavailability (the shorter, the better), e.g., until we detect that a particular server became unusable and tell the client to use another one instead.

At level 2, no new requests are denied, though currently executing requests may fail. This is harder: we must be able to direct any request to a currently available instance, which requires the infrastructure to proxy the connections, not just redirect them. Here we assume that if a connection can be established, the server will not die while serving the request; this is equivalent to assuming that requests take zero time, and without that assumption level 2 collapses into level 3.

At level 3, the failure recovery happens transparently even to someone who’s executing a long request with a server that now failed.

Level 3 is usually impossible to implement fully at an infrastructure level because it requires 1) buffering requests and responses and 2) understanding how to safely retry each type of request.

For example: What if a large file upload or download fails? Should the infrastructure buffer the whole file and re-upload/re-serve it? What if a call with side effects failed, having perhaps performed half of them—is it safe to retry it?

Implementing this level of fault tolerance requires a layer of application-specific retry logic on the client and special support for avoiding duplicate side effects on the server.

The setup I’m discussing in this post gives level 2 protection against service failures and level 1 for balancer failures.

NOTE: The previous posts on MySQL and RabbitMQ HA introduced level 3 tolerance of their failures, since retry logic is in place (at least if you use the proper patches mentioned in those posts).

Software topology

As mentioned, we’ll use the following set of software:

  • the services themselves;
  • HAProxy for making the services HA; and
  • Keepalived for making HAProxy HA.

We’ll have two types of nodes: a service node and an endpoint node. A service node hosts services, while an endpoint node hosts HAProxy and Keepalived. A node can also play both roles at the same time.

Wiring of services to each other

Services must address each other by the virtual IP in order to take advantage of each other’s high availability.

Also, if fault tolerance higher than level 1 or 2 is needed, application-specific retry logic should be introduced. As far as I know, nobody currently retries internal calls in OpenStack (only calls to Keystone), and it seems that in most cases retrying external calls is enough.

Getting hands-on

Enough theory, let’s build the thing.

Suppose we have two machines from which we want to build an HA OpenStack controller pair, installing both the API services and Keepalived + HAProxy on each.

Suppose machine 1 has address 192.168.56.200, machine 2 has address 192.168.56.201, and we want the services to be accessible through virtual IP 192.168.56.210. And suppose all these IPs are on eth1.

This is how everything is wired (it’s similar for other services; HAProxy and Keepalived are shared, of course).

Installing necessary packages

I’m assuming that you’re on Ubuntu, in which case you should type:
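
Something along these lines should do it, using the stock Ubuntu packages:

    sudo apt-get install haproxy keepalived

    # the Ubuntu haproxy package ships with its init script disabled; enable it
    sudo sed -i 's/^ENABLED=0/ENABLED=1/' /etc/default/haproxy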

Configuration of HAProxy

This configuration is identical on both nodes and resides in /etc/haproxy/haproxy.cfg.
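
A minimal sketch of what such a haproxy.cfg can look like with the addresses used in this post (section names, timeouts, and maxconn are illustrative; adjust to taste):

    # both nodes must be able to bind the virtual IP even when they don't own it;
    # set net.ipv4.ip_nonlocal_bind=1 via sysctl on both nodes for that
    global
        daemon
        maxconn 4096

    defaults
        mode http
        balance roundrobin
        option redispatch          # if a backend connection fails, try another server
        timeout connect 5s
        timeout client  60s
        timeout server  60s

    listen nova-api-ec2
        bind 192.168.56.210:8773
        server controller-1 192.168.56.200:8773 check
        server controller-2 192.168.56.201:8773 check

    listen nova-api-compute
        bind 192.168.56.210:8774
        server controller-1 192.168.56.200:8774 check
        server controller-2 192.168.56.201:8774 check

    listen nova-api-metadata
        bind 192.168.56.210:8775
        server controller-1 192.168.56.200:8775 check
        server controller-2 192.168.56.201:8775 check

    listen nova-api-volume
        bind 192.168.56.210:8776
        server controller-1 192.168.56.200:8776 check
        server controller-2 192.168.56.201:8776 check

    listen glance-api
        bind 192.168.56.210:9292
        server controller-1 192.168.56.200:9292 check
        server controller-2 192.168.56.201:9292 check

    listen keystone-api
        bind 192.168.56.210:5000
        server controller-1 192.168.56.200:5000 check
        server controller-2 192.168.56.201:5000 check

    listen keystone-admin-api
        bind 192.168.56.210:35357
        server controller-1 192.168.56.200:35357 check
        server controller-2 192.168.56.201:35357 check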

This configuration encompasses the four nova-api services (EC2, volume, compute, metadata), glance-api, and the two keystone-api services (regular and admin API). If you have something else running (e.g., swift proxy), you know what to do.

For more information, you can read the HAProxy manual.

Now restart HAProxy on both nodes:
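
With the stock init script (and ENABLED=1 set as above), that's:

    sudo service haproxy restart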

Configuration of Keepalived

This configuration is almost, but not quite, identical on both nodes as well, and resides in /etc/keepalived/keepalived.conf.
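
A minimal sketch of the kind of keepalived.conf meant here, including the check script from the previous section (the interface name, router ID, and check interval are illustrative):

    vrrp_script check_haproxy {
        script "killall -0 haproxy"    # exits non-zero when no haproxy process is running
        interval 2
    }

    vrrp_instance openstack_api {
        interface eth1
        state MASTER                   # initial state only; priority decides who actually wins
        virtual_router_id 51           # any value 1-255, the same on both nodes
        priority 101                   # 100 on the other node
        virtual_ipaddress {
            192.168.56.210
        }
        track_script {
            check_haproxy
        }
    }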

The difference is that one node has its priority defined as 101, and the other as 100. Whichever of the available nodes has the highest priority at any given moment wins (that is, claims the virtual IP).

For more information, you can read the Keepalived manual.

And now let us check that Keepalived + HAProxy work by poking glance:
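
For example, by hitting the Glance API through the virtual IP with curl; any HTTP response at all means the request went through HAProxy to one of the backends:

    curl -i http://192.168.56.210:9292/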

Also, we can see that just one of the controllers—the one with higher “priority”—claimed the virtual IP:
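
On the nodes themselves, the address shows up only on the current master:

    ip addr show eth1 | grep 192.168.56.210
    # prints an inet line with 192.168.56.210 on the master and nothing on the backup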

Configuration of OpenStack services

Now for the service wiring. We need two things:

1) to listen on the proper local IP address, and
2) to address the other services by the virtual IP address.

Nova
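
In nova.conf, bind the API services to the node's own address and reach the other services through the virtual IP. A sketch for the first node (use 192.168.56.201 on the other); the option names are the ones from that era, and depending on your release they live either in an ini-style [DEFAULT] section or as --flag=value lines, so double-check against your version:

    # /etc/nova/nova.conf (excerpt)
    ec2_listen = 192.168.56.200
    osapi_compute_listen = 192.168.56.200
    metadata_listen = 192.168.56.200
    osapi_volume_listen = 192.168.56.200
    # reach glance through the virtual IP
    glance_api_servers = 192.168.56.210:9292

The auth_token middleware settings (auth_host in nova's api-paste.ini) should also point at 192.168.56.210 so that Keystone is reached through the balancer.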

Keystone
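
For keystone-api it's mostly a matter of binding to the local address; a sketch of /etc/keystone/keystone.conf (option names as in the releases of that era):

    [DEFAULT]
    bind_host = 192.168.56.200    # 192.168.56.201 on the other node
    public_port = 5000
    admin_port = 35357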

Glance
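
Similarly for glance-api, a sketch of /etc/glance/glance-api.conf; the registry_host line is only relevant if you put glance-registry behind the balancer as well:

    [DEFAULT]
    bind_host = 192.168.56.200    # 192.168.56.201 on the other node
    bind_port = 9292
    registry_host = 192.168.56.210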

Openrc file
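
The client-side environment should go through the virtual IP too; a sketch (credentials are placeholders):

    export OS_AUTH_URL=http://192.168.56.210:5000/v2.0/
    export OS_USERNAME=admin
    export OS_TENANT_NAME=admin
    export OS_PASSWORD=secret                                   # placeholder
    export EC2_URL=http://192.168.56.210:8773/services/Cloud    # if you use the EC2 API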

Now, assuming you have OpenStack running, you can try doing something with your HA setup.
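
For example, make a request through the virtual IP, kill one of the backends, and make the same request again (service names assume the Ubuntu packages and the classic nova client):

    source openrc
    nova list                    # goes to 192.168.56.210 and is proxied to one of the nodes

    # simulate a service failure on one of the controllers
    sudo service nova-api stop

    nova list                    # still works; haproxy notices the dead backend and
                                 # sends the request to the other node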

Conclusion

So, we can deploy an almost[1] fully HA OpenStack by combining the contents of this post and the two previous posts (OpenStack HA in general and MySQL and RabbitMQ HA).

That was easy! You can thank the modular design of OpenStack, but perhaps the most credit should be given to the fact that components are wired via asynchronous messaging (RabbitMQ), whose main purpose is helping to build fault-tolerant systems.

[1] Why almost? Because we have a short unavailability window during failover of a load balancer, and because currently executing requests will break during failover. This can be mitigated by retrying requests (client-side and by developing a patch for Nova to retry internal Keystone requests), but it gets a lot more difficult if the requests can have side effects.


Comments

  1. Jeroen van Bemmel

    Have you considered to use keepalived also for load balancing, instead of HAProxy? See http://gcharriere.com/blog/?p=339

    September 4, 2012 22:29
    • Eugene Kirpichev

      We have considered that, and I think I even read this particular post, but we decided to use haproxy after all, because it was much easier to configure (keepalived failed to work “out of the box” in the LB configuration).

      September 7, 2012 17:23
  2. Baktha

    Hi Eugene Kirpichev,

    Great Post, Been looking for HA and your post is really helpful.
    Just curious, Could you please tell me how to setup HA proxy with Pacemaker?

    January 22, 2013 06:26
  3. fifi

    What happens if haproxy process goes down, in this config. keepalived will be clueless, possibly leaving the virtual ip in the server where the unavailable haproxy resides.

    September 12, 2013 03:12

