How small is too small? A minimal OpenStack testbed
When I started working on OpenStack I wanted to test all kinds of things and had no budget to build even a scaled-down test environment. And I know many engineers often test changes on mission critical systems in production because of their organization’s financial constraints, even at the real risk of downtime, lost revenue, and unhappy customers. Not to mention that testing in production is not as thorough as it could be. It doesn't have to be this way.
An inexpensive testing solution?
An OpenStack testbed does not have to be a full-size representation of the cluster. The architecture should be the same, but the number of compute and storage nodes, the specs of the hardware, and the networking infrastructure can be scaled down. A cloud with a hundred compute nodes and 25 Swift or Ceph storage nodes of 20TB each can be represented by a mini cloud with the same number of controllers and Swift proxies, but with only 5-10 compute nodes and five Swift or Ceph nodes at 10% of the total cost of the cloud. This is a great solution when you consider that a single large outage in a production environment can eat up the cost of a relatively small testbed.
Of course some things cannot be adequately tested with a scaled-down testbed, such as the behavior of a large number of nodes under load or bottleneck testing and similar issues that are only seen at scale. If testing such issues you may need a full-scale replica of the existing environment; however, the vast majority of configuration changes and failure scenarios can be successfully tested in a miniature environment.
The scaled-down environment is still too expensive?
If a scaled-down version of a real environment is still too expensive, you can test specific issues in a smaller environment. For example, you can use a single-controller configuration instead of an HA configuration.
So what is the minimum size of a physical environment? For testing Nova, a single controller and one compute node can be sufficient. If a high load on the compute node is expected, you can use SSDs on the storage side, LACP on the network side, and more memory on the system side to alleviate the worst bottlenecks.
A Swift cluster can consist of a single storage node and a single proxy, which can reside on the same hardware. You can install Ceph alongside proxies or controllers, a technique Mirantis OpenStack employs to provide Ceph for internal storage without additional nodes.
Unlike an architecturally similar environment, a stripped down environment does not lend itself to running the same configuration you would run in production, so you will have to hand-prep changes tested in the small environment and plan for the impact of the scaled down version.
Still not cheap enough?
Looking at virtualization options I wondered how far I could get with a single node. I bought what I would normally consider a gaming machine - a tower case with a beefy power supply, a mainboard with a single AMD 8-core CPU, 32GB of memory, two 500GB SSDs, and an additional GbE network card. Total cost was roughly $1000. VMWare ESXi provided a number of VMs that I bound together into clouds.
Initially designed as a learning tool, this machine has seen a lot of test cases, and other developers have borrowed it on occasion. Mirantis OpenStack is preinstalled on one of the VMs, ready to deploy whatever cluster I may need for research. The machine is still sitting under my desk, waiting for the next test case.
And don’t forget one of the biggest advantages of virtualization. Before doing something potentially destructive you can make a snapshot of a VM and reinstate it if something does go awry.
I would never have believed that such a small and seemingly restricted machine would be able to do so much good in a world where scale is of paramount importance, but with a little imagination and ingenuity this mini testbed lets me try out a lot of things that would have cost time and money if I had wanted to set them up in a test environment.