Welcome to Mirantis OpenStack Training’s monthly Q&A section. Here our instructors field questions about all aspects of OpenStack, and every month we’ll be sharing some of those answers with you here on the blog. If you have a question that you would like a Mirantis technical instructor to answer, feel free to post your comments in the section below. We will do our best to cover your question in next month’s post.
What’s the best architecture for multiple OpenStack component databases? Should they be co-located or can they be on separate nodes?
Most OpenStack components store state in an SQL database. By design, the databases do not have to be on the same database server, because each component is designed independently of the others. A component can therefore point to its own database server, or to a server that also hosts the databases of other components. For operational efficiency, however, the recommended best practice is to host the databases on the same database server.
Let’s take a closer look at the details of what all that means.
Typically, OpenStack components store their respective state in an SQL database, and they access the database using the OpenStack Oslo library (specifically oslo.db). The Oslo library, in turn, uses the Python SQLAlchemy library. In theory, then, OpenStack can support any SQL database that SQLAlchemy supports.
Because the components are independent projects, each has its own configuration file, such as /etc/nova/nova.conf or /etc/cinder/cinder.conf, and the database location is defined in that individual file.
For example, the database entry in nova.conf might look similar to the following:
[database]
connection = mysql+pymysql://user:nova@<ip-address-of-database>/nova?charset=utf8
While the entry in cinder.conf might look similar to:
[database]
connection = mysql+pymysql://user:cinder@<ip-address-of-database>/cinder?charset=utf8
The database location is specified by the IP address (or hostname) in the connection string. Because each database is specified separately, each component can point to a different location. You can also use a different kind of database for each component. For example, you might have a situation in which Neutron uses SQLite, Nova uses MySQL, and Cinder uses PostgreSQL.
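To make the structure of these entries concrete, here is a minimal sketch that reads a [database] section and splits the connection string into its parts. It uses only the Python standard library rather than Oslo or SQLAlchemy, so it is an illustration of the format and not how OpenStack itself parses the value. The address 192.0.2.10 and the function name database_target are assumptions made for the example.

```python
import configparser
from urllib.parse import parse_qs, urlsplit

# Hypothetical configuration snippets; 192.0.2.10 is a documentation-only
# address standing in for <ip-address-of-database>.
NOVA_CONF = """
[database]
connection = mysql+pymysql://user:nova@192.0.2.10/nova?charset=utf8
"""

CINDER_CONF = """
[database]
connection = mysql+pymysql://user:cinder@192.0.2.10/cinder?charset=utf8
"""


def database_target(conf_text):
    """Return the driver, host, and database name from a [database] entry."""
    # Interpolation is disabled so a '%' in a password would not be
    # treated as configparser substitution syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    url = urlsplit(parser["database"]["connection"])
    return {
        "driver": url.scheme,                # dialect+driver, e.g. mysql+pymysql
        "host": url.hostname,                # where the database server lives
        "database": url.path.lstrip("/"),    # database name on that server
        "options": parse_qs(url.query),      # e.g. {"charset": ["utf8"]}
    }


if __name__ == "__main__":
    for name, text in (("nova", NOVA_CONF), ("cinder", CINDER_CONF)):
        print(name, database_target(text))
```

In this sketch both components report the same host but different database names, which is the co-located layout. Changing the host in one snippet would model the separate-server layout.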
For practical purposes, however, it is best to use a single database node or cluster and configure the components to point to that database. This is advantageous from an operations and maintenance point of view, because it gives you fewer database servers to manage. The advantage is even more evident when using a database cluster to provide high availability, rather than a single server.
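As a rough sketch of what "point to that database" means in practice, the snippet below builds one connection string per component against a single shared address and then checks that they all resolve to the same host. The address, the component list, and the helper connection_url are hypothetical. The credentials follow the user:<component> pattern of the examples above and are placeholders only.

```python
from urllib.parse import urlsplit

# Hypothetical shared database address (a single node, or the VIP of a cluster).
SHARED_DB_HOST = "192.0.2.10"

# Components taken from the examples in this post.
COMPONENTS = ["nova", "cinder", "neutron"]


def connection_url(component, host=SHARED_DB_HOST):
    """Build a connection string in the same shape as the examples above."""
    return f"mysql+pymysql://user:{component}@{host}/{component}?charset=utf8"


# One entry per component, each with its own database on the shared server.
connections = {name: connection_url(name) for name in COMPONENTS}

# Every component should resolve to the same database host.
hosts = {urlsplit(url).hostname for url in connections.values()}

if __name__ == "__main__":
    for name, url in connections.items():
        print(f"{name}: {url}")
    print("distinct database hosts:", len(hosts))
```

Passing a different host for one component models the alternative layout, in which that component's database lives on a separate server.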
The most common database used by OpenStack deployment tools is MySQL/MariaDB. Most deployment tools will also install a database cluster, usually with three servers. In this case, the primary HA component of the cluster is Galera, a tool that works with a MySQL/MariaDB cluster to synchronize data between the database servers.
You’ll also need other tools, such as Pacemaker/Corosync, to present a single IP address, a virtual IP (VIP), for access to the database cluster. A component accesses the database via the VIP and stores the data on whichever database server the VIP points to at that moment. Galera then copies the data to the other database servers.
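The flow just described can be pictured with a toy model. This is only a conceptual sketch in plain Python: it does not use or imitate the real Galera or Pacemaker/Corosync interfaces, and the class ToyCluster, its methods, and the node names are all invented for illustration. Replication is reduced to copying a value to every healthy node.

```python
class ToyCluster:
    """Toy model of database servers behind a virtual IP (not real Galera code)."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}  # per-node data store
        self.healthy = set(node_names)
        self.vip_target = node_names[0]  # node the VIP currently points to

    def write(self, key, value):
        """Write through the VIP, then copy the data to the other nodes."""
        self.nodes[self.vip_target][key] = value
        for name in self.healthy - {self.vip_target}:
            self.nodes[name][key] = value  # stands in for cluster replication
        return self.vip_target

    def read(self, key):
        """Read from whichever node the VIP points to right now."""
        return self.nodes[self.vip_target].get(key)

    def fail(self, name):
        """Mark a node as failed and move the VIP if that node was holding it."""
        self.healthy.discard(name)
        if name == self.vip_target:
            if not self.healthy:
                raise RuntimeError("no healthy database nodes left")
            self.vip_target = sorted(self.healthy)[0]


if __name__ == "__main__":
    cluster = ToyCluster(["db1", "db2", "db3"])
    cluster.write("instance-1", "ACTIVE")
    cluster.fail("db1")  # the VIP moves to a surviving node
    print(cluster.vip_target, cluster.read("instance-1"))
```

The point of the model is that the component only ever knows the VIP. Which server sits behind it can change without any change to the component's configuration file.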
Are you required to do it this way? Of course not; OpenStack is designed to be flexible and modular so it can work with your own specific situation. But current best practices recommend using a single database server, or a cluster of database servers to provide high availability. That lets you start with the most stable, easiest-to-manage architecture, and take advantage of the greater flexibility the OpenStack design allows if the need arises over time.