Improving DHCP Performance In OpenStack
Have you ever seen a VM in OpenStack lose its IP address? If you have, you know how painful it can be -- especially if you have a large number of nodes and VMs. Your clients get frustrated as they lose connectivity with their VMs for no obvious reason. Even the cloud support team gets frustrated, because everything appears to be working and the log files give no hint as to what might be wrong. Sound familiar?
In this blog post, I would like to share my experience with OpenStack networking, and specifically with the DHCP subcomponent that is responsible for allocating an IP address to a VM. Why blame the DHCP component? Because this particular issue is commonly caused by this small, seemingly trivial part of OpenStack.
DHCP agent and dnsmasq
In OpenStack, neutron-dhcp-agent provides instances with IP addresses. Theoretically, neutron-dhcp-agent could support different types of backends, but for now it supports only dnsmasq. When an instance is spawned, allocating and assigning its address involves storing the IP address in the dnsmasq config, then starting or reloading dnsmasq. Usually OpenStack has only one neutron-dhcp-agent, which spawns one dnsmasq per network, so one big network (including all the subnets in it) is served by a single dnsmasq service. Theoretically -- and according to practical lab testing -- dnsmasq should be able to serve up to 1000 DHCP requests per second, but here are some facts:
The lease time, by default, is 120 seconds. As you probably know, a DHCP client tries to renew its lease halfway through the lease time. That means that each and every VM sends a renewal request for its IP address once a minute.
It takes almost four minutes (3 minutes 43 seconds) to start one dnsmasq instance with 65535 static leases. Usually this happens when Neutron allocates a new IP for a new VM, then forces dnsmasq to reload. During this time, no DHCP service is provided for the corresponding private Neutron network.
If you’re not using the no-ping option in the dnsmasq configuration -- and by default OpenStack does not, for safety reasons -- you’ll suffer from very slow service, because dnsmasq uses a separate pinger process to check that the offered IP address isn't already in use. With the “no-ping” option, dnsmasq was able to serve about 160 requests per second for 10 minutes without losing any of them, though this performance depends on CPU speed.
By default, Ubuntu and CentOS limit the MAC address table (the kernel neighbour table) to 128/512/1024 records (net.ipv4.neigh.default.gc_thresh1/2/3). Because of this, entries for IP addresses that are not used frequently age out abnormally fast. That hurts networking performance and slows the system's ability to work out which MAC address to send traffic to on the node where the DHCP agent resides.
Attempting to work around these performance problems by significantly increasing the IP lease time will cause a huge problem with the release of IP addresses by neutron if your cloud workload changes dynamically. By default, neutron allocates an IP address to a VM for 24 hours, independent of the actual lease time. Also, by default, neutron will not release an IP address until 24 hours after an instance has been terminated.
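To get a feel for these numbers, here is a minimal Python sketch of the renewal arithmetic behind the lease-time point above. It assumes every client renews exactly halfway through its lease and that renewals are spread evenly over time, which a real cloud won't match exactly. The function names are my own and are not part of Neutron or dnsmasq.

```python
def renewals_per_second(vm_count: int, lease_seconds: float) -> float:
    """Average DHCP renewals per second, assuming each VM renews at lease/2."""
    renew_interval = lease_seconds / 2.0
    return vm_count / renew_interval


def max_vms(requests_per_second: float, lease_seconds: float) -> float:
    """Largest VM count whose average renewal rate stays within a given rate."""
    return requests_per_second * (lease_seconds / 2.0)


if __name__ == "__main__":
    lease = 120  # default lease time quoted in this post, in seconds
    for vms in (254, 1000, 10000, 65535):
        rate = renewals_per_second(vms, lease)
        print(f"{vms:>6} VMs -> {rate:8.2f} renewals/s")
    # 160 requests/s is the no-ping figure measured in this post.
    print(f"160 req/s covers about {max_vms(160, lease):.0f} VMs")
```

With the 120-second lease, a full 65535-address network averages roughly 1092 renewals per second, while the 160 requests per second measured with no-ping corresponds to about 9600 VMs. Doubling the lease time halves the renewal rate, which is the reasoning behind the lease-time advice in the next section.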
Actions you can take
Fortunately, there are some things you can do. If you’re using OpenStack with private networks larger than a /24 (an address space of more than 255 addresses), then you should consider tuning the default parameters for dnsmasq and for the network node itself.
Increase the IP lease time to decrease the number of requests per second coming from VMs trying to renew their IP addresses. Calculate the new lease time using common sense, keeping in mind the average VM lifetime. Be aware that, because of a bug, setting the lease time to a very large value forces OpenStack to keep the IP in the database as "used": neutron will not release the IP because of its own lease time in the database, even if the VM is deleted.
Increase the size of the MAC address table so that it can serve at least 1k hosts. To do that, you typically need to set the sysctl variables (usually in /etc/sysctl.conf) on the host where dhcp_agent is located. Optionally, you can do this on all networking-related nodes. The variables and their settings are:
net.ipv4.neigh.default.gc_thresh1 = 1024
net.ipv4.neigh.default.gc_thresh2 = 4096
net.ipv4.neigh.default.gc_thresh3 = 8192
Add the no-ping option to the default parameters for dnsmasq. This change will enable it to serve more than 10-20 requests per second, because dnsmasq won't try to ping these IPs before actually allocating them. Note that you should be very careful with this option if you’re using OpenStack as part of your infrastructure. For example, if you’re using provider networks and your VMs are part of a single L2 domain with other physical servers and equipment, IP conflicts are possible and can wreak havoc.
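To check whether a node already meets the neighbour table thresholds listed above, something like the following Python sketch can help. It parses sysctl-style "key = value" text and reports which of the three gc_thresh settings are missing or too small. The helper names are hypothetical, and reading /etc/sysctl.conf is an assumption: that file shows what is configured, not necessarily what the running kernel is using.

```python
# Target values taken from the recommendations in this post.
RECOMMENDED = {
    "net.ipv4.neigh.default.gc_thresh1": 1024,
    "net.ipv4.neigh.default.gc_thresh2": 4096,
    "net.ipv4.neigh.default.gc_thresh3": 8192,
}


def parse_sysctl(text):
    """Parse 'key = value' lines, keeping only entries with integer values."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key/value line
        try:
            values[key.strip()] = int(value.strip())
        except ValueError:
            continue  # non-numeric setting, not relevant here
    return values


def undersized(current, recommended=RECOMMENDED):
    """Return {key: (current_value_or_None, wanted)} for settings below target."""
    return {
        key: (current.get(key), wanted)
        for key, wanted in recommended.items()
        if current.get(key) is None or current[key] < wanted
    }


if __name__ == "__main__":
    try:
        with open("/etc/sysctl.conf") as handle:
            configured = parse_sysctl(handle.read())
    except OSError:
        configured = {}
    for key, (have, want) in undersized(configured).items():
        print(f"{key}: configured={have}, recommended>={want}")
```

The same parser also accepts the output of "sysctl -a" saved to a file, since that uses the same "key = value" layout.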
Changes the Neutron community should think about
Unfortunately, there is no way for a user to solve the problem of 24-hour IP allocation in neutron. Instead, it must be solved by changes to neutron. The simple solution would be to have a configurable parameter in neutron or dhcp-agent for the lease time, and use it as the allocation period for the neutron database. This looks perfect on the surface, but on closer inspection you realize it will significantly increase the load on neutron-api/neutron-db. So this is not the correct way to solve the problem.
Instead, neutron should simply remove IPs from the database on instance termination. This will solve all the problems with dynamic workload on a cloud and allow the flawless reuse of IP addresses. [UPDATE: In fact, that is exactly the situation as of the OpenStack Icehouse release, where this problem has now been mitigated somewhat.]
As I promised, I covered only one small subsystem of OpenStack networking, the DHCP service. As you can see, it can cause a lot of pain if it is configured incorrectly, and especially if you use the default values for dnsmasq options. The recommendations above should help you understand how to select specific dnsmasq options and how to tune them if necessary.