
OpenStack Networking Tutorial: Single-host FlatDHCPManager

In a previous post, my colleague Piotr Siwczak explained how FlatManager and FlatDHCPManager work in a multi-host network setup.

Here, I will explain how FlatDHCPManager works with single-host networking. It is perhaps easier to understand, and it also happens to be the default mode you get when installing OpenStack one of the easy ways (for example, with the Puppet recipes). (I will not cover FlatManager, as it is not widely used.)


General idea

With single-host FlatDHCP, there is just one instance of nova-network (and of the dnsmasq DHCP server it manages), typically running on the controller node and shared by all the VMs.

Contrast that to multi-host FlatDHCP networking, where each compute node also hosts its own instance of the nova-network service, which provides the DHCP server (dnsmasq) and default gateway for the VMs on that node.

In this setup, the br100 interface and an associated physical interface eth2 on the compute nodes don’t have an assigned IP address at all; they merely serve as an L2 interconnect that allows the VMs to reach nova-network and each other. Nova-network essentially functions as an L2 switch.

VM virtual interfaces are attached to br100 as well. The VMs have their default gateway set (in the guest OS configuration) to 10.0.0.1, which means that all external traffic from VMs is routed through the controller node. Traffic within 10.0.0.0/24 is not routed through the controller, however.

Network Configuration

Let us consider an actual example:

  • 1 controller node
  • 2 compute nodes
  • eth1 hosting the management network (the one through which compute nodes can communicate with the controller and nova services)
  • eth2 hosting the VM network (the one to which VMs will be attached)

We’ll start with a look at all aspects of the network configuration on the controller and one of the compute nodes: before and after starting a VM.

Controller node, no VMs

The controller’s network configuration looks like this (it changes very little when VMs are spawned):

Interfaces:

NOTE:
eth2 is configured to use promiscuous mode! This is extremely important.
It is configured in the same way on the compute nodes. Promiscuous mode allows the interface to receive packets not targeted to this interface’s MAC address. Packets for VMs will be traveling through eth2, but their target MAC will be that of the VMs, not of eth2, so to let them in, we must use promiscuous mode.
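The original interface listing is not shown here, but promiscuous mode can be checked and enabled roughly like this (an illustration; in practice it is set persistently in /etc/network/interfaces rather than by hand):

    ip link show eth2                  # look for the PROMISC flag
    sudo ip link set eth2 promisc on   # enable promiscuous mode on the VM network interface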

Bridges:

Routes:

Dnsmasq running:

Nova configuration file:

Dnsmasq configuration file:

eth1 is the management network interface (controlled by --public_interface). The controller has address 192.168.56.200, and we have a default gateway on 192.168.56.101.

eth2 is the VM network interface (controlled by --flat_interface). As said, it functions basically as an L2 switch; it doesn’t even have an IP address assigned. It is bridged with br100 (controlled by --flat_network_bridge).

br100 usually doesn’t have an IP address assigned either, but on the controller node it carries 10.0.0.1, the address dnsmasq listens on (dnsmasq is the DHCP server spawned by nova, from which VMs get their IP addresses); 10.0.0.1 is the first address of the flat network range (--fixed_range).
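Putting the flags mentioned above together, the networking-related part of /etc/nova/nova.conf for this setup would look roughly like this (a sketch reconstructed from the flags discussed in this post, not the original file):

    --network_manager=nova.network.manager.FlatDHCPManager
    --public_interface=eth1
    --flat_interface=eth2
    --flat_network_bridge=br100
    --fixed_range=10.0.0.0/24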

The dnsmasq config (/var/lib/nova/networks/nova-br100.conf) is empty so far, because there are no VMs. Do not fear the two dnsmasq processes – they’re a parent and a child, and only the child is doing actual work.

The interfaces eth1 and eth2 existed and were configured in this way before we installed OpenStack. OpenStack didn’t take part in their configuration (though if eth2 had an assigned IP address, it would be moved to br100 – I’m not sure why that is needed).

However, interface br100 was created by nova-network on startup (the code is in /usr/lib/python2.7/dist-packages/nova/network/linux_net.py, method ensure_bridge; it is called from the initialization code of nova/network/l3.py – the L3 network driver; look for the words L3 and bridge in /var/log/nova/nova-network.log).
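In shell terms, the effect of ensure_bridge amounts roughly to the following (a sketch of what it does, not the actual Python code):

    sudo brctl addbr br100        # create the bridge if it doesn’t exist yet
    sudo brctl addif br100 eth2   # attach the flat interface to it
    sudo ip link set eth2 up
    sudo ip link set br100 up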

NOTE:
In fact, I think that on the controller node we could do just as well without br100, directly attaching dnsmasq to eth2. However, on compute nodes br100 also bridges with VM virtual interfaces vnetX, so probably the controller is configured similarly for the sake of uniformity.

Let us also look at iptables on the controller (nova only ever touches the filter and nat tables, so we’re not showing raw):

Basically, this means that incoming DHCP traffic on br100 is accepted, and forwarded traffic to/from br100 is accepted. Also, traffic to the nova API endpoint is accepted too. Other chains are empty.
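The rules in question look approximately like this (an approximation – exact chain names and ordering differ between releases and installations):

    -A nova-network-INPUT -i br100 -p udp -m udp --dport 67 -j ACCEPT
    -A nova-network-FORWARD -i br100 -j ACCEPT
    -A nova-network-FORWARD -o br100 -j ACCEPT
    -A nova-api-INPUT -d 192.168.56.200/32 -p tcp -m tcp --dport 8775 -j ACCEPT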

There are also some rules in the nat table:

These rules will become more important in the coming posts on floating IPs and granting VMs access to the outside world (they are responsible for masquerading traffic from the VMs as if it originated on the controller, etc.), but for now the only important rule is this one:

    -A nova-network-PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.56.200:8775

It makes the nova metadata service “listen” on the link-local address 169.254.169.254 by doing DNAT from that address to its actual bind address on the controller, 192.168.56.200:8775.
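As a quick sanity check (an illustration, not part of the original output), you can query the metadata service from inside a VM through the link-local address; the path below is the standard EC2-compatible endpoint:

    curl http://169.254.169.254/latest/meta-data/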

Compute node, no VMs

Interfaces:

Note that eth2 is configured to use promiscuous mode, just as on the controller!

Uninteresting stuff:

Routes:

We see that the compute node, just as the controller node, has two interfaces: eth1 for the management network (192.168.56.202, routed through the external DHCP server 192.168.56.101) and eth2 for the VM network (no IP address). It doesn’t have a bridge interface yet, because nova-network is not running here and no VMs have been started, so the “L3 driver”, mentioned before, has not been initialized yet.

All of this configuration was also done before installing OpenStack.

The peculiar thing about the compute node is that there’s an entry for 169.254.0.0/16 – that’s for the nova metadata service (which is part of nova-api and is running on the controller node, “listening”, with the help of an iptables rule, on 169.254.169.254, while actually listening on 192.168.56.200:8775). The 169.254.x.x subnet is reserved in the IPv4 protocol for link-local addresses. This entry is present here to avoid routing traffic to the metadata service through the default gateway (as link-local traffic must not be routed at all, only switched).
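In route -n output the entry looks roughly like this (an illustration; the device column depends on how the route was added on your distribution):

    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    169.254.0.0     0.0.0.0         255.255.0.0     U     1000   0        0 eth1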

Starting a VM

Now let’s fire up a VM!

So, the VM was allocated IP address 10.0.0.2 (the next section will explain how it happened), booted and is pingable from the controller (interestingly, it is not supposed to be pingable from the compute node in this network mode).

NOTE:
(from Piotr Siwczak): From the controller node running in single-host mode you will always be able to ping all instances, as it acts as the default gateway for them (br100 on the controller has address 10.0.0.1). And by default all traffic to VMs from within the same network is allowed (by iptables) unless you set --allow_same_net_traffic=false in /etc/nova/nova.conf. In this case only traffic from 10.0.0.1 will be allowed.

Now let us see how the configuration of the controller and the compute node has changed.

Controller node, VM created

When nova-network was creating this instance, it chose an IP address for it from the pool of free fixed IP addresses (network configuration of an instance is done in nova/network/manager.py, method allocate_for_instance). The first available IP turned out to be 10.0.0.2 (availability of fixed and floating IPs is stored in the Nova database). Then dnsmasq was instructed to hand out the IP address 10.0.0.2 to the VM’s MAC address.
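Concretely, the dnsmasq hosts file /var/lib/nova/networks/nova-br100.conf gains one line per VM in dnsmasq’s MAC,hostname,IP format; the MAC and hostname below are made up for illustration:

    fa:16:3e:17:fb:28,instance-00000009.novalocal,10.0.0.2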

While booting, the VM got an IP address from dnsmasq via DHCP, as is reflected in syslog:
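The original listing is not reproduced here, but a dnsmasq DHCP exchange in /var/log/syslog on the controller typically looks roughly like this (an illustration; the PID and MAC are made up):

    dnsmasq-dhcp[2438]: DHCPDISCOVER(br100) fa:16:3e:17:fb:28
    dnsmasq-dhcp[2438]: DHCPOFFER(br100) 10.0.0.2 fa:16:3e:17:fb:28
    dnsmasq-dhcp[2438]: DHCPREQUEST(br100) 10.0.0.2 fa:16:3e:17:fb:28
    dnsmasq-dhcp[2438]: DHCPACK(br100) 10.0.0.2 fa:16:3e:17:fb:28 instance-00000009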

Nothing else changed (iptables, routes, etc.).

Thus: creating an instance only affects the controller’s dnsmasq configuration.

Compute node, VM created

The compute node configuration changed more substantially when the VM was created.

The vnet0 interface appeared. This is the virtual network interface for the VM. Its MAC address was initialized from /var/lib/nova/instances/instance-XXXXXXXX/libvirt.xml. You can take a look at /var/lib/nova/instances/instance-XXXXXXXX/console.log to see how the VM is behaving. If you see some network errors, that’s a bad sign. In our case, all is fine:

So, the instance also thinks it has gotten the IP address 10.0.0.2 and default gateway 10.0.0.1 (which it also uses as its DNS server). It also attempted to download a “user data” script from the metadata service at 169.254.169.254, succeeded, then tried to download the public key for the instance, but we didn’t assign any (thus HTTP 404), so a new keypair was generated.

Some more changes happened to iptables:

As the network driver was initialized, we got a bunch of basic compute node rules (all except for those referencing nova-compute-inst-9).

When the instance’s network was initialized, we got rules saying that traffic directed to the VM at 10.0.0.2 is processed through the chain nova-compute-inst-9: accept incoming DHCP traffic and all incoming traffic from the VM subnet, drop everything else. Such a chain is created for every instance (VM).
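The per-instance chain looks approximately like the following (an approximation based on the description above – the exact rules vary between releases):

    -A nova-compute-inst-9 -m state --state INVALID -j DROP
    -A nova-compute-inst-9 -m state --state ESTABLISHED,RELATED -j ACCEPT
    -A nova-compute-inst-9 -s 10.0.0.1/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT
    -A nova-compute-inst-9 -s 10.0.0.0/24 -j ACCEPT
    -A nova-compute-inst-9 -j nova-compute-sg-fallback
    -A nova-compute-sg-fallback -j DROP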

NOTE:
In this case the separate rule for DHCP traffic is not really needed – it would be accepted anyway by the rule allowing incoming traffic from 10.0.0.0/24. However, this would not be the case if allow_same_net_traffic were false, so this rule is needed to make sure DHCP traffic is allowed no matter what.

Also, some network filtering is being done by libvirt itself, e.g. protection against ARP spoofing etc. We won’t focus on these filters in this document (mostly because so far I’ve never needed them to resolve a problem), but in case you’re interested, look for filterref in the instance’s libvirt.xml file (/var/lib/nova/instances/instance-XXXXXXXX/libvirt.xml) and use the commands sudo virsh nwfilter-list, sudo virsh nwfilter-dumpxml to view the contents of the filters. The filters are established by code in nova/virt/libvirt/connection.py and firewall.py. Their configuration resides in /etc/libvirt/nwfilter.

VM guest OS network configuration

Now let us see how the network configuration looks on the VM side.

We see that the VM has its eth0 interface assigned IP address 10.0.0.2, and that it uses 10.0.0.1 as the default gateway for everything except 10.0.0.0/24.
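Concretely, route -n inside the guest shows roughly the following (an illustration matching the description above):

    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    0.0.0.0         10.0.0.1        0.0.0.0         UG    0      0        0 eth0
    10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0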

That is all for the network configuration, congratulations if you read all the way to here! Now let us consider the packet flow.

Understanding Packet Flow

In this section we’ll examine how (and why) packets flow to and from VMs in this example setup. We’ll consider several scenarios: how the VM gets an IP address, ping VM from controller, ping VM from compute node, ping VM from another VM.

How packets flow at L2 level

We first need to consider how packets travel in our network at the lowest level – as ethernet packets, addressed by device MAC address – because we’ll need this to understand how they flow at a higher level.

Thankfully, this is simple.

All machines (all compute nodes and the controller) are connected via the physical network fabric attached to their eth2 interface (remember that we have two physical networks in this setup: eth1 for the management network and eth2 for the VM network, and for security we keep them physically separate). They all have the br100 bridge connected to eth2. A bridge is essentially a virtual L2 switch.

Compute nodes also have the VM virtual interfaces vnetX bridged to br100.
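On a compute node with a running VM, brctl show reflects this bridging (illustrative output; the bridge id here is made up):

    bridge name     bridge id               STP enabled     interfaces
    br100           8000.080027a3c4d1       no              eth2
                                                            vnet0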

So, ethernet broadcast packets reach all the machines’ eth2 and br100 interfaces, as well as all the VMs’ vnetX interfaces (and consequently all the guest OS interfaces).

Ethernet unicast packets flow in a similar fashion through physical switches forming this network and through the virtual switches implemented by br100 (read How LAN switches work and Linux bridge docs; further details aren’t important in this context).

How packets flow at L3 level

While L2 ethernet packets address devices by their MAC address, the L3 level is all about IP packets, whose endpoints are IP addresses.

To send an IP packet to address X, one finds the MAC address corresponding to X via ARP (Address Resolution Protocol) and sends an L2 packet to this MAC address.

ARP works like this: we send an L2 broadcast packet “who has IP address X?” and whoever has it responds to our MAC address via L2 unicast: “Hey, that’s me, my MAC is Y”. This information is cached in the OS “ARP cache” for a while to avoid doing a costly ARP lookup for each and every IP packet (you can always view the cache by typing arp -n).

When we instruct the OS to send a packet to a particular IP address, the OS also needs to determine:

  • Through which device to send it – this is done by consulting the routing table (type route -n). E.g. if there’s an entry 10.0.0.0 / 0.0.0.0 / 255.255.255.0 / br100, then a packet to 10.0.0.1 will go through br100.
  • What source IP address to specify. This is usually the default IP address assigned to the device through which our packet is being routed. If this device doesn’t have an IP assigned, the OS will take an IP from one of the other devices. For more details, see Source address selection in the Linux IP networking guide.
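To make the first point concrete, here is how such a routing table entry reads in route -n layout (values taken from this article’s setup):

    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    10.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 br100

A packet to 10.0.0.2 therefore leaves via br100; on the controller its source address will be 10.0.0.1, the address assigned to br100 there.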

It is very important here that eth2 (the VM network) interface is in promiscuous mode, as described previously. This allows it to receive ethernet packets and forward them to the VM interfaces even though the target address of the packets is not the eth2 MAC address.

Now we’re ready to understand the higher-level packet flows.

How the VM gets an IP address

Let us look in detail at what happened when the VM was booting and getting an IP address from dnsmasq via DHCP.

DHCP works like this:

  • You send a DHCPDISCOVER packet to find a DHCP server on your local network;
  • A server replies with DHCPOFFER, giving you its own IP address and suggesting an IP address for you;
  • If you like the address, you send DHCPREQUEST;
  • The server replies with DHCPACK, confirming your right to assign yourself this IP address;
  • Your OS receives the DHCPACK and assigns this IP address to the interface.

So, when the VM is booting, it sends a DHCPDISCOVER UDP broadcast packet via the guest OS’s eth0, which is connected by libvirt to the host machine’s vnet0. This packet reaches the controller node and consequently our DHCP server dnsmasq which listens on br100, etc.

I won’t show the tcpdump here; there’s an example in the next section.

Ping VM from controller

Let us look in detail at how it happens that ping 10.0.0.2 succeeded when run on the controller (remember that 10.0.0.2 is the IP address assigned by OpenStack to the VM we booted).

What happens when we type ping 10.0.0.2? We send a bunch of ICMP packets and wait for them to return. So:

  • We consult the routing table (route -n) and find the entry “10.0.0.0 / 0.0.0.0 / 255.255.255.0 / br100”, which says that packets to 10.0.0.x should be sent via the br100 interface. This also means that the return address will be 10.0.0.1 (as it’s the IP assigned to br100 on the controller).
  • We send an ARP broadcast request “who has IP 10.0.0.2? Tell 10.0.0.1” through br100. (Note that I had to manually delete the ARP cache entry with “arp -d 10.0.0.2” to be able to demonstrate this, because it was already cached after the DHCP exchange mentioned in the previous section. That is, this ARP exchange had already happened during that DHCP exchange, and it only happened again because I forced it to.):
  • This ARP packet gets sent through br100, which is bridged with eth2 – so it is sent to eth2, from where it is physically broadcast to all compute nodes on the same network. In our case there’s two compute nodes.
  • The first node (compute-1) receives the ARP packet on eth2, and, as it is bridged to br100 together with vnet0, the packet reaches the VM. Note that this does not involve iptables on the compute node, as ARP packets are L2 and iptables operate above that, on L3.
  • The VM’s OS kernel sees the ARP packet “who has 10.0.0.2?” and replies to 10.0.0.1 with an ARP reply packet, “That’s me!” It already knows the MAC address of 10.0.0.1 because it’s specified in the ARP packet. This ARP reply packet is sent through the guest side of vnet0, gets bridged to the host side, then to br100 and, via eth2, lands on the controller. We can see that in the tcpdump:
  • In fact, compute-2 also receives the ARP packet, but since there’s no one (including VMs) to answer an ARP request for 10.0.0.2 there, it doesn’t play any role in this interaction. Nevertheless, you would see an identical ARP request in a tcpdump on compute-2.
  • Now that we know the VM’s MAC address, we send an ICMP echo-request packet:

    This successfully reaches the VM as a result of the following sequence of iptables rules firing on the compute node:
  • This packet reaches the VM as already described, and the VM’s OS replies with an ICMP echo-reply packet:

    As the VM’s routing table includes 10.0.0.0/24 via 0.0.0.0 dev eth0, the reply is not routed through the default gateway; it is sent directly out of the VM’s eth0 and reaches 10.0.0.1 the same way the request came.

    This concludes the roundtrip and we get a nice message:

We would see similar tcpdumps if we did tcpdump -i eth2 or tcpdump -i br100 or tcpdump -i vnet0 on the compute node. They could differ only if something went wrong, which would be a good reason to check the routes and iptables and understand why, for example, a packet that exited the controller’s eth2 didn’t enter the compute node’s eth2.
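For orientation, a successful exchange captured with, say, sudo tcpdump -n -i br100 arp or icmp on the controller looks roughly like this (an illustration, not a capture from this setup; timestamps are omitted and the MAC is made up):

    ARP, Request who-has 10.0.0.2 tell 10.0.0.1, length 28
    ARP, Reply 10.0.0.2 is-at fa:16:3e:17:fb:28, length 28
    IP 10.0.0.1 > 10.0.0.2: ICMP echo request, id 1234, seq 1, length 64
    IP 10.0.0.2 > 10.0.0.1: ICMP echo reply, id 1234, seq 1, length 64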

Whew!

Ping VM from compute node

Don’t worry, this won’t be as long, because the ping will fail:

This message means that the compute node is not allowed to send ICMP packets (prohibited by iptables). In this case, the packet travels the following path along iptables:

What happened is that the ACCEPT rule from nova-compute-inst-9 didn’t fire, because the packet was going to be routed through the default gateway at eth1 and its source address was the one on eth1, 192.168.56.202, which is not inside 10.0.0.0/24.

Since the packet got dropped by iptables as described, it was never physically sent over the wire via eth1.

Ping VM from VM

This won’t be long either, as we have already seen how L3 packet routing works from/to VMs.

Basically, when one VM pings another, it uses the same kind of L2 broadcast to learn the other VM’s MAC address, and the same kind of L3 packet flow to route the ping request and reply. Even the sequence of iptables rules allowing this interaction is the same as in the “ping from controller” case.

Ping outer world from VM

If we try to ping something outside 10.0.0.0/24 from a VM, this traffic will be routed through the VM’s default gateway, 10.0.0.1, which is on the controller. There, the SNAT rule we saw in the nat table masquerades traffic from 10.0.0.0/24 as the controller itself, so such a ping can actually succeed whenever the controller itself can reach the destination (as discussed in the comments below); in any case, you will see the packets in tcpdump on the controller.

Giving VMs access to the outside world is a topic for a subsequent post.

Troubleshooting

In this section I’ll give general troubleshooting advice and pointers to useful tools. First, let’s say you cannot ping your instance from the controller (as said, believe it or not, you shouldn’t be able to ping it from the compute node).

Don’t worry and don’t dig too deep, check the most obvious things first: imagine what you’d do if you were a lot less skilled with network debugging tools. In 90% of the cases, the problem is something stupid.

First, gather all the information you can before making any changes.

When you do make changes:

  • If you’re debugging on a VM, do a snapshot so you can start from scratch if you mess up the network configuration.
  • Avoid trying to “fix” something that’s supposed to have been set up by the system (the problem is likely to be caused by incorrect input to the system, e.g. wrong config files, not by its incorrect output).
  • Avoid irreversible actions.
  • Record all your actions. This increases the chances that you’ll be able to revert them.

First of all, check if the instance has actually booted properly. Do not rely on indirect signs (like the status being ACTIVE in nova list); go and VNC or virsh console to it. Or, if you cannot do that, find which compute node is supposed to have started it, go there and look at virsh list; then at the instance’s console.log (it’s supposed to have something meaningful inside); do a virsh screenshot; copy it to somewhere where you can view it; and then see what’s going on! Perhaps you uploaded a wrong image or misconfigured virtualization, etc. This might seem like the most obvious thing, but yours truly has been through this tale of sorrows more than once.
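If you need the exact commands, something like the following works (an illustrative sequence; the instance name here is made up – take the real one from virsh list):

    sudo virsh list --all
    sudo virsh console instance-00000009
    sudo virsh screenshot instance-00000009 /tmp/instance-9.png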

If the instance didn’t boot well or at all, check nova-compute and nova-network services; perhaps they aren’t feeling well either – check their logs (typically in /var/log/nova) for errors, exceptions or signs of hangs. Try restarting them. Check if dnsmasq is running. If something’s not running, try running it by hand (under sudo of course) and see when/why it crashes or hangs. Sometimes strace may help debug permission or connection errors. It can be used to attach to a running process too (strace -p).

Then use tcpdump to see how far your packets go and to isolate a place where they disappear. Can your ICMP request packets escape the source node? Do they reach the destination physical node? Do they get lost somewhere on the way between the node’s physical interface and the VM’s virtual interface? (perhaps ICMP traffic is prohibited by the security group) Does the VM emit reply packets? And so on. Do tcpdump of the VM network interface (perhaps also bridge interface and VM virtual interface vnetX) both on the controller and the compute node. Examples of successful tcpdumps were given in the previous section.

Once you isolate a place where packets disappear, check why they disappear.

  • Are they routed properly? (check ifconfig, route -n)
  • Are they dropped by iptables? (check iptables -S; if you feel adventurous, consult the iptables flowchart and use iptables tracing, which will show you exactly how the rules fire – see the sketch after this list, and don’t forget to disable it when you’re done).
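Tracing can be enabled roughly like this for ICMP (a sketch; the TRACE target lives in the raw table, trace output ends up in the kernel log/syslog, and the rules should be deleted afterwards):

    sudo iptables -t raw -A PREROUTING -p icmp -j TRACE
    sudo iptables -t raw -A OUTPUT -p icmp -j TRACE
    # ...reproduce the problem and read the TRACE lines, then clean up:
    sudo iptables -t raw -D PREROUTING -p icmp -j TRACE
    sudo iptables -t raw -D OUTPUT -p icmp -j TRACE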

You can use the following tools to inspect the network configuration:

  • ifconfig and ip addr to show network interface configuration.
  • arp -n to show the ARP cache, which usually allows you to understand whether the L2/L3 packet flow is working as expected.
  • route -n to inspect the routes.
  • brctl to inspect the bridge configuration.
  • tcpdump to intercept packets through various interfaces.
  • iptables to show the iptables configuration and to add logging/tracing with the LOG and TRACE targets.
  • virsh to inspect all aspects of a VM – from its configuration to its current state.

What’s next

In further posts we’ll cover the remaining network modes (VLAN networking) and floating IPs. Also, the post on multi-host FlatDHCP networking will be continued with more detail.

Conclusion

This post gave a very detailed (perhaps overly so) overview of the processes happening at all levels, allowing the OpenStack network to operate. My intention was not only to explain this particular network mode, but also to equip readers with a working understanding of low-level network operation details and with a toolset to use in their own debugging scenarios, as these are issues that I struggled with a lot when doing my first steps in OpenStack network debugging.

I would like to thank Piotr Siwczak for his extremely valuable comments and corrections.


64 Responses

  1. Dror

    Hi,

    First of all, thank you for the excellent post.

    And now the questions:
    In the section about “Compute node, no VMs” there is this line in the routes table –
    192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr0

    It is unclear to me what this IP is about (192.168.122.0) and what is virbr0.

    If you can elaborate on that it would be great.

    Thanks!

    August 10, 2012 23:17
    • Eugene Kirpichev

      Hi!

      virbr0 is a virtual interface established by libvirt, but OpenStack doesn’t use it in any way; OpenStack manages networks in its own way that I just described. I don’t really remember what virbr0 is *supposed* to be used for – I’d recommend to read libvirt docs to understand that.

      192.168.122.0/24 is the libvirt network, again not used by OpenStack. You can read more here: http://www.redhat.com/archives/libvir-list/2007-February/msg00208.html

      Btw, if you google “192.168.122.0” you’ll find that you’re not the only person who wonders why this particular subnet was chosen :)

      August 14, 2012 12:36
  2. GM3D

    Great post. Since the official install/deploy manual and starter guide are rather inconsistent with the usage of their IP address settings, clear explanation like this post really hits the spot, thank you!

    August 11, 2012 19:15
    • Eugene Kirpichev

      Thanks, this is exactly what I was hoping to achieve with my post!

      August 14, 2012 12:37
  3. Tamale

    I feel the need to spread knowledge that if you choose to use public IPs for your vms, you’ll need ip_fowarding enabled in /proc/sys/net/ipv4/ip_forward / /etc/sysctl.conf

    August 14, 2012 06:10
  4. Gennaro

    This post was incredible. I learn a lot of things. This post gave me a perfectly deeper look on openstack networking issues.

    just one more question… in this configuration, how can I grant access tu public network to my VMs?

    thanks

    August 16, 2012 03:28
    • Eugene Kirpichev

      Thanks! As for access to public network: in fact you should be able to access the outside network FROM the VMs if you can access it from your nova-network node (controller in this case). And to access VM from outside network (i.e. vice versa) you need to assign a floating IP (see subsequent Piotr’s posts).

      Can you show a particular problem you’re having with accessing the network?

      August 16, 2012 11:17
  5. Gennaro

    Thanks for reply (and, above all, for this post)!
    I’m explaining the situation.
    I reproduced the configuration of your example via devstack (single-host). I have a controller with nova-compute (and, naturally, nova-network daemon) and a compute node. Every host has 2 interfaces, eth0 is on the public network, eth1 is linked to br100.
    I have 2 vms belonging to the same tenant, and the internal ping works exactly as you described!
    If I try to ping a public IP (in my case, 8.8.8.8) from the vm hosted by the controller, it works.
    For the vm hosted by the compute outside network is unreachable.
    Maybe the vm can’t ping outside because I didn’t write correctly some iptables rules that grant to vms access to internet (i’m very noob of iptables!), or maybe my configuration is wrong.
    I’d like that outside vm traffic will pass through the controller node, using the public interface (eth0 – IP:192.168.1.2), but it would be sufficient also that this traffic will pass through the eth0 interface of the compute (IP:192.168.1.3).

    To get the second case working I tried, on the compute node, to use the iptables rule “iptables -t nat -I POSTROUTING -s 10.0.0.0/24 -j SNAT --to-source 192.168.1.3”, but didn’t succeed.

    Regards,
    Gennaro

    August 17, 2012 02:47
    • Eugene Kirpichev

      Hm, so, you have 2 nodes – 1) controller+nova-network+nova-compute and 2)nova-compute, right?

      I don’t think that solving your problem will involve writing your own iptables rules, it’s more likely due to misconfiguration. So I suggest you to remove the rule you introduced and debug why it isn’t working currently – using tcpdump and iptables tracing.

      Did you try tcpdump’ing the compute node’s and the controller’s interfaces when trying a ping from the compute node VM? How far do the packets reach?

      August 17, 2012 14:12
  6. doug smith

    Problem: We have one controller and two compute nodes. At this time we are unable to spin up new VMs. When we try, they will fail in the “networking” state. dnsmasq is running on the compute nodes and the controller. I am seeing this error in the logs.
    ERROR nova.rpc.common [-] Timed out waiting for RPC response: timed out

    Seeing no dhcp requests hitting the controller from the compute node when tailing /var/log/syslog.

    Previous to this issue we had our hard drive fill up on our controller. We got rid of 160 gigs. Restarted some vms. And then this issue started happening.

    Also we are unable to associate or disassociate ip at this time as well.

    Any input would appreciated.

    Thanks.

    August 29, 2012 09:36
    • Eugene Kirpichev

      Hi Doug, did you check if your nova-network service is actually running and listening to RabbitMQ? Can you show the last few lines from nova-network.log? For example, I had a similar problem when I ran into this bug https://bugs.launchpad.net/nova/+bug/1018586

      August 29, 2012 13:26
      • Doug

        I gave in and reinstalled everything on the controller.

        We have a new problem, related to dnsmasq. I noticed that the leasetime openstack supplied to it is 120s

        /usr/sbin/dnsmasq --strict-order --bind-interfaces --conf-file= --domain=novalocal --pid-file=/var/lib/nova/networks/nova-br100.pid --listen-address=10.50.0.1 --except-interface=lo --dhcp-range=10.55.0.2,static,>>>>120s <<<<<

        We have a three node cluster with one node being the controller and the others being compute nodes. Obviously because of this VMs will try to renew their IP addresses every 120 seconds. And as of right now nova-network is only running on the controller so lots of network traffic is moving through the controller. Anyone have any ideas on changing that lease time? I saw this article. https://bugs.launchpad.net/nova/+bug/894218
        But I really don't want to go changing python code. I will if I have to but I was wondering if anyone has any other suggestions.

        September 5, 2012 07:08
        • Eugene Kirpichev

          Hm, seems like that bug already has a fix released. Do you mean it hasn’t been propagated to the native package repository you’re using?

          Honestly, I don’t quite understand your concern about network traffic – the DHCP traffic would be negligible even if it were once per 1 second instead of 120; which traffic do you mean?

          September 5, 2012 10:10
          • Doug

            Changed the following in /usr/lib/python2.7/dist-packages/nova/network/linux_net.py

            cfg.IntOpt('dhcp_lease_time',
            default=120,
            help='Lifetime of a DHCP lease in seconds'),

            Changed default to 3600.

            Restarting nova-network won’t restart dnsmasq. I had to restart the entire controller, unless someone can tell me a quicker way to restart dnsmasq via openstack…

            September 11, 2012 05:55
          • Piotr Siwczak

            Doug,
            This is a known issue. The simplest is just to stop nova-network, kill all dnsmasq instances and start nova-network.

            September 11, 2012 11:26
  7. lallau

    Hi Eugene,
    first of all thanks for this great blog post, the best we can see about Nova networking mechanism.
    Nevertheless the part concerning “Ping outer world from VM” still confused for me.

    Routing decisions are clear from VM physical network (10.0.0.0) to VM network default GW (10.0.0.1), but you just conclude with “no such ping will ever succeed, as VMs are only allowed to communicate with 10.0.0.0/24”.
    Which iptables rules are concerned?

    On my side if I try to follow the packet, I see the following sequence on the netword node firewall (controller):
    * NO PREROUTING rules in nat and filter table are matching.
    * the routing decision is made and FORWARD rules will be used but no one will match
    * POSTROUTING rules will be used:
    1) -A nova-postrouting-bottom -j nova-network-snat
    2) -A nova-network-snat -s 10.0.0.0/24 -j SNAT --to-source 192.168.56.200

    => hence packet will trigger the outer word, that’s why I have missed someting… :(

    Moreover in the following post from Piotr Siwczak http://www.mirantis.com/blog/vlanmanager-network-flow-analysis/ (scenario 2)
    it is explain that ping from VM to outside is working even with a simple fixed IP. I know that networking concept is not the same in this last post: VlanManager is used instead of flatDHCPManager and it’s a multi-host networking configuration, but I think it doesn’t matter concerning routing and firewalling decision.
    Could you tell me what I have missed?
    Best regards

    September 4, 2012 01:25
    • Eugene Kirpichev

      You’re right, this part of my post is plain wrong, which I realised after Piotr’s post but forgot to correct mine :-/

      September 4, 2012 09:39
  8. Doug

    Well, I gave in and reinstalled everything on the controller.

    We have a new problem, related to dnsmasq. I noticed that the leasetime openstack supplied to it is 120s

    /usr/sbin/dnsmasq --strict-order --bind-interfaces --conf-file= --domain=novalocal --pid-file=/var/lib/nova/networks/nova-br100.pid --listen-address=10.50.0.1 --except-interface=lo --dhcp-range=10.55.0.2,static,>>>>120s <<<<<

    We have a three node cluster with one node being the controller and the others being compute nodes. Obviously because of this VMs will try to renew their IP addresses every 120 seconds. And as of right now nova-network is only running on the controller so lots of network traffic is moving through the controller. Anyone have any ideas on changing that lease time? I saw this article. https://bugs.launchpad.net/nova/+bug/894218
    But I really don't want to go changing python code. I will if I have to but I was wondering if anyone has any other suggestions.

    September 5, 2012 07:09
    • Eugene Kirpichev

      (replied above)

      September 5, 2012 13:08
    • Jerry

      How did you changed the --listen-address to 10.50.0.1 for dnsmasq? My dnsmasq is always using 10.0.0.1 even I configured in nova.conf with “--fixed_range=10.100.0.0/22”, but when nova-network is started it always using 10.0.0.1 which is actually my physical gateway for my home network.

      nobody 3148 1 0 19:49 ? 00:00:00 /usr/sbin/dnsmasq --strict-order --bind-interfaces --conf-file= --domain=novalocal --pid-file=/opt/stack/data/nova/networks/nova-br100.pid --listen-address=10.0.0.1 --except-interface=lo --dhcp-range=set:'private',10.0.0.2,static,120s --dhcp-lease-max=256 --dhcp-hostsfile=/opt/stack/data/nova/networks/nova-br100.conf --dhcp-script=/opt/stack/nova/bin/nova-dhcpbridge --leasefile-ro

      How did you change for dnsmasq? I could not find the config file…

      thanks,

      January 9, 2013 22:25
      • Piotr Siwczak

        Jerry,

        I will try to debug the problem further. Please, provide the output of these commands:

        ifconfig br100

        nova-manage network list

        In general, dnsmasq binds to the first address on the network created in nova (the command for this is nova-manage network create) with FlatDHCPManager this address is typically set on br100. Please, check if the above commands give you the addresses for 10.0.0.0 network. If yes, and you still want to switch to 10.100.0.0, you should delete existing openstack networks and dismantle the bridge. Then create the new network with nova-manage network create.

        January 10, 2013 11:16
        • Jerry

          Hi Piotr,
          Here is the results:
          openstack@openstack2:~$ nova-manage network list
          id IPv4 IPv6 start address DNS1 DNS2 VlanID project uuid
          1 10.0.0.0/24 None 10.0.0.2 8.8.4.4 None None None 2f3b9e17-4f20-4b05-842a-8d735fbb89cb

          openstack@openstack2:~$ ifconfig
          br100 Link encap:Ethernet HWaddr 00:50:56:87:49:f9
          inet addr:10.0.0.1 Bcast:10.0.0.255 Mask:255.255.255.0
          inet6 addr: fe80::acb8:d1ff:fe27:7345/64 Scope:Link
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B) TX bytes:90 (90.0 B)

          eth0 Link encap:Ethernet HWaddr 00:50:56:87:0c:0e
          inet addr:10.0.0.171 Bcast:10.0.0.255 Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:fe87:c0e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:6128 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4557 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1052414 (1.0 MB) TX bytes:1197248 (1.1 MB)

          lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:5238 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5238 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:630581 (630.5 KB) TX bytes:630581 (630.5 KB)

          virbr0 Link encap:Ethernet HWaddr 82:33:30:92:15:88
          inet addr:192.168.122.1 Bcast:192.168.122.255 Mask:255.255.255.0
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

          You can see that the host IP is also sitting on 10.0.0.0/24, so whenever a new instance is created and dnsmasq will start listening and assigning DHCP IP addresses which causes network gateway conflicts.

          I ‘ve been tried to delete and here is the result – weird that it says network not found:
          openstack@openstack2:~$ nova-manage network delete 2f3b9e17-4f20-4b05-842a-8d735fbb89cb
          2013-01-10 15:40:39.257 INFO nova.network.driver [-] Loading network driver ‘nova.network.linux_net’
          2013-01-10 15:40:39.288 DEBUG nova.openstack.common.lockutils [-] Got semaphore “nova.servicegroup.api.new” for method “__new__”… from (pid=6216) inner /opt/stack/nova/nova/openstack/common/lockutils.py:185
          2013-01-10 15:40:39.288 DEBUG nova.servicegroup.api [-] ServiceGroup driver defined as an instance of db from (pid=6216) __new__ /opt/stack/nova/nova/servicegroup/api.py:49
          2013-01-10 15:40:39.289 DEBUG nova.openstack.common.lockutils [-] Got semaphore “nova.servicegroup.api.new” for method “__new__”… from (pid=6216) inner /opt/stack/nova/nova/openstack/common/lockutils.py:185
          2013-01-10 15:40:39.302 DEBUG nova.utils [req-753ab209-6d0b-472f-b954-88bd6730ce0b None None] Reloading cached file /etc/nova/policy.json from (pid=6216) read_cached_file /opt/stack/nova/nova/utils.py:1036
          Command failed, please check log for more info
          2013-01-10 15:40:39.599 CRITICAL nova [req-753ab209-6d0b-472f-b954-88bd6730ce0b None None] Network could not be found with cidr 2f3b9e17-4f20-4b05-842a-8d735fbb89cb.

          And /opt/stack/data/nova/networks/nova-br100.conf is a blank file, without any configurations in.

          BTW, I am running the Devstack on Ubuntu 12 which has the “reboot” bug and nova service won’t restart after system reboot.

          January 10, 2013 12:45
          • Jerry

            I figured it out something – when I install Devstack, I didn’t define Fixed_Range in my localrc file. I tried it again with Fixed_Range, then I found that after installation, Fixed_range is being written into nova.conf file, and it started to use the IP subnet that I wanted. But what I don’t understand is why the next time when I reboot devstack, it won’t use Fixed_range in nova.conf.
            Anyway, thank you very much!

            January 10, 2013 18:32
  9. Kevin, Lee

    Great Post to clarify the openstack networking mechanism. Thanks!

    One question is:
    I think that the compute node should have br100 bridge interface after VM created, but there is no bridge interface in the result of “ip a”. Why?

    October 8, 2012 23:05
    • Eugene Kirpichev

      Hi Kevin, I’m glad you liked it.

      Yes, the compute node *should* have br100. Is your VM actually working well, is it pingable?
      Which network configuration are you using – single-host or multi-host?
      Are you actually using nova-network or Quantum?

      October 9, 2012 13:01
  10. Chris Doherty

    This has been excellent so far. I’ve caught several little networking problems I didn’t even know I had.

    I still can’t get VMs to acquire IP addresses via DHCP, though. I’m using Ubuntu precise and the github Puppet Labs packages in a single controller, two compute node configuration. AFAICT everything is configured correctly on the nodes, but route -n does _not_ show the 169.254.0.0 route on any of the nodes.

    I can’t find anywhere in the puppet packages where this route is added; is it something that nova adds itself, or did you have to manually configure it on the lo interface(s) before installing OpenStack?

    October 14, 2012 00:46
  11. Tony

    Thanks for the greate post…I’ve deployed Essex with single host configuration. Hereafter is my problem: I can ping the VM but fail to SSH. The thing that raises me crazy is that SSH works for the cirros image but not for Ubuntu. I check the console.log and saw that VM can not fetch metadata from 169.254.169.254. What is the difference between cirros and ubuntu when downloading user metadata?
    Thanks,

    October 31, 2012 18:54
    • Eugene Kirpichev

      Hi Tony. Can you try VNC-ing into the machine and looking at the routes and iptables?

      October 31, 2012 23:52
      • Tony

        Thanks, Eugene Kirpichev,

        I have tried VNC-ing but I did not have the ubuntu password; I googled a lot but still failed to get the correct one. However, I think I got a hint for the solution. In fact, I tried to tap the traffic on br100 (to which the VM connects) and saw that the VM tries to get the MAC address of 192.168.100.1 without success (my fixed address block is 192.168.100.0/24). For some unknown reason my br100 listens on 192.168.100.3. Given that, the VM could not talk to 169.254.169.254 (which would be forwarded back to the Nova-API server). What I’m trying to do now is to configure dnsmasq so that it listens on 192.168.100.1. Any hint on that?

        November 3, 2012 19:42
  12. Kyle Brandt

    I’ve been trying to get Flat DHCP working, but have hit some sort of block. I put lots of detail in (https://github.com/mseknibilel/OpenStack-Folsom-Install-guide/issues/14) including tcpdump etc. Looks like all my instances ignore DHCP offers, or maybe an arp issue?

    If you have ideas, would be much appreciated :-)

    November 11, 2012 05:35
  13. Patrick

    This is easily the best explanation of the networking I’ve seen to date, but I’m still struggling with this. Banging my head for days. Thank you very much! This really should NOT have been this complicated, disappointed in openstack walkthru docs. Thanks again.

    November 14, 2012 12:35
    • Eugene Kirpichev

      Thanks Patrick!

      November 15, 2012 20:13
      • Patrick

        Do you have any thoughts or links on this issue? I cannot access my external, floating IPs from any server except the hosting compute node. dashboard, nova-list, ip addr, iptables all show that everything is configured correctly but traffic never gets to the VM. I’ve verified that traffic is arriving at the hosting node using tcpdump. Without any errors to go on, I”m having a difficult time trouble shooting.

        November 16, 2012 08:35
        • Patrick

          NM, got it answered on IRC. I hadn’t enable ip forwarding, don’t if I missed or the docs did. Thanks again!

          November 16, 2012 08:51
  14. redwane

    I am not able to ping the instance even if it is in an active status.Any help

    November 29, 2012 07:57
  15. redwane

    actually when i issue arp -n Iget this :
    10.0.0.4 (incomplete) br100
    Please i need help

    November 29, 2012 07:59
    • Eugene Kirpichev

      Hi. Can you VNC into the instance and see how things look from its side? Also, from which machine are you trying to ping it, and how does tcpdump look like during that process? Also, try using iptables TRACE facility.

      November 30, 2012 03:25
  16. Aristocrate

    Open Stack, noting else than inconsistent tones of writings scattered over the net here and there, which is a perfect example of the anti-patern big ball of mud (see wiki).

    November 29, 2012 20:30
  17. Jacob Godin

    Great write up! Much clearer than the OpenStack documentation.

    Just a quick tip for any one testing out a setup in VMWare ESXi where your compute node is separate from your controller (as in this setup). If you’re unable to DHCP an address, and can’t access the outside world from your instances, make sure that your vSwitch has Promiscous Mode set to Accept.

    It was very frustrating finding this out after two days of diagnosing, but hopefully it will prevent others from making the same mistake!

    December 6, 2012 12:09
  18. Jacopo

    Hi, this is a very usefull post.

    But I have a particular configuration that I have to use in my academic work:

    I have one controller where I run nova-api, nova-network and nova-scheduler; then I have two groups of machines that run nova-compute, each with its own subnetwork. Every machine can see the controller via another subnetwork, but the two subnetworks are not able to communicate with each other. The host controller has two other interfaces, one for each subnetwork of compute nodes. On the host controller I’ve enabled ip_forwarding and masquerading of the traffic from the two subnetworks, so the machines on the two different subnetworks can see each other in 2 hops via the host controller. The service works well, and I can also live-migrate instances from compute nodes in different subnetworks, but the problem is I cannot access the instances (ping or ssh or anything else) from the host controller.
    Because I need to access the instances via ssh, how can I do it?

    P.S. I attach the bridge on the host controller on one of the two interfaces used for communicate with the subnetworks.

    December 16, 2012 06:01
    • Eugene Kirpichev

      Hi,

      From your description of the network configuration seems like things should work. I suggest using tcpdump and the iptables TRACE target to figure out where your pings of VMs from controller are disappearing.

      December 17, 2012 11:43
  19. Ori Tzoran

    Excellent work and kudos to Eugene for this well-written effort. I have 2 requests:
    1) In order to replay the setup presented here, could you please make available the complete contents of:
    - /etc/network/interfaces of the CC, CN1, CN2
    - /etc/nova/nova.conf for those nodes
    - the “nova-manage network create” used here.
    2) “Ping outer world” – reiterating lallau’s request to clarify this point.

    January 31, 2013 05:59
    • Andrea

      Interested in answer to point 1.
      I am trying a multi-node installatoon but nova network is not creating br100 and I cant’ understand why….

      February 13, 2013 10:35
  20. Ilya

    Great post. Have you used flatDHCPManager with vms with virtual IPs? We are having trouble in reaching cross machine through vip. Is it a limitation of dnsmasq?

    February 20, 2013 20:48
    • Piotr Siwczak

      Ilya,

      The way it should be used is to assign each vm a floating IP and place a floating IP pool under the load balancer. The main reason is: with flatDHCPManager a single IP pool is shared among many projects and after you destroy your vm, there’s a chance it gets assigned to a completely different application. Floating IPs are assigned to vm-s by their users and will always stay the same.

      March 12, 2013 01:32
  21. Barry Kreiser

    We have configured OpenStack with FlatDHCPManager but our VM’s that we spawn are being served addresses from our LAN DHCP server and not the dnsmasq service on the OpenStack. Any suggestions?

    March 14, 2013 16:57
  22. I.P

    Hello Eugene, thanks very much for this well written document. After reading through the document, I got the impression, that at a minimum I need two ethernet interfaces in the controller node eth1 and eth2. I have a server machine where I have only 1 physical ethernet port: eth1 and adding another port is going to be difficult. Is it possible to use a virtual ethernet port such as eth1.0 and use that instead of eth2? Are there any other mechanisms for configuring openstack networking with just one physical ethernet port? The VMs that I deploy must be able to connect to the external network via eth1. thanks in advance for any help!!

    April 5, 2013 10:58
  23. Nicolas

    Hello, and congratulations for this post and all other who are available on this website!
    I have a very similar installation with a two NICs controller. The first NIC (eth1) is used for management and Openstack services installation. The second NIC (eth2) is used with bridge br100 and has its own subnet (for example 192.168.0.0).
    My question is, what is my “public_interface” for floating-ip in this configuration ? eth1, eth2 or a third NIC (I have one available)?

    April 10, 2013 06:43
  24. gilank

    Hi, in my case, my log for per instance isn’t working or not show any log, but vm is work well, how to solve this?

    April 17, 2013 21:54
  25. Amogh Patel

    Hi,

    As you mentioned the subsequent topic for “Ping outer world from VM”, Could you please provide the link of that post?

    Thanks in advance.

    July 1, 2013 17:19
  26. vinayus

    i am getting this error while creating an instance… i used devstack…

    failed to load names from /opt/stack/data/nova/networks/nova-br100.hosts

    can someone help me out with this??

    July 27, 2013 00:45
  27. Peter Wang

    very good knowledge for openstack tutorial

    do you have some instruction that my vm bridged on br-ex cloud not get IP ?

    any idea?

    Thanks
    Peter

    February 7, 2014 00:49
  28. Vikas

    I wish I would have stumbled upon the blogs written by your earlier .. would have saved me lot of frustration .. These blog posts are “gem on the internet” on the Flat and Floating Networking topics.

    Thanks much and keep up the good work :)

    March 20, 2014 14:07

Continuing the Discussion

  1. OpenStack Community Weekly Newsletter (July 27-Aug 3) » The OpenStack Blog

    [...] By Mirantis: OpenStack Networking Tutorial: Single-host FlatDHCPManager [...]

    August 5, 2012 08:26
  2. OpenStack NIC configuration (Openstack的网卡设置) » Chen Shake’s Blog (陈沙克日志)

    [...] Many documents set it up this way, including the well-known training organization http://www.mirantis.com/blog/openstack-networking-single-host-flatdhcpmanager/ [...]

    September 1, 2012 07:17
  3. blog@ewebs.org » Blog Archive » OpenStack Networking Tutorial: Single-host FlatDHCPManager

    [...] OpenStack Networking Tutorial: Single-host FlatDHCPManager. [...]

    June 18, 2013 14:43
  4. Getting Started with Heat, DevStack & Vagrant « Cloudsoft

    [...] Networking Tutorial: Single-host FlatDHCPManager: mirantis.com/blog/openstack-networking-single-host-flatdhcpmanager/ Share this [...]

    December 3, 2013 07:09
