Improving Trove Architecture Design with Conductor Service

Denis Makogon - January 15, 2014

This is the second in a series of blog posts discussing OpenStack Trove. Trove is OpenStack’s Database as a Service (DBaaS) project, whose intent is to provide all of the capabilities of a relational database without the hassle of having to handle complex administrative tasks. Here we explain the current Trove design issues and our decision to create a conductor service for Trove.

The current Trove architecture design has one issue: the guest service inside a provisioned VM requires connectivity to the Trove back-end in order to update its status from BUILD to ACTIVE. Because the Trove back-end and the provisioned guests may live in different networks, it is possible that the guest service would not be able to communicate with the back-end at all. The solution we present is one already well known to OpenStack services: the message queue (MQ). That's why we designed and created a conductor service for Trove, called Trove-conductor.

The Trove Conductor is a service that runs on the host and is responsible for receiving messages from guest instances to update the information on the host, for example, an instance status or current backup status. When you have a Trove Conductor, guest instances do not require a direct connection to the host’s database. The conductor listens for RPC messages through the message bus and performs the corresponding operation.

The conductor is like the guestagent in that it is a service that listens to a RabbitMQ topic, with the difference that the conductor lives on the host and not the guest. The MQ service has several topics, and each topic is a queue of messages. Guest agents communicate with the conductor by putting messages on the topic defined in the conductor configuration file as the conductor_queue. By default, the conductor reads a topic named trove-conductor (conductor_queue = trove-conductor).
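
To make the message flow concrete, here is a minimal, illustrative sketch of how a guest-side component could cast a heartbeat onto the trove-conductor topic. It uses the oslo.messaging library for brevity; the actual Trove guestagent goes through its own RPC wrapper, and the exact arguments (instance_id, payload) are assumptions made for illustration.

    # Illustrative sketch only -- not the actual Trove guestagent code.
    # Assumes cfg.CONF already points at the same RabbitMQ broker that
    # the conductor service listens on.
    import oslo_messaging as messaging
    from oslo_config import cfg

    transport = messaging.get_transport(cfg.CONF)
    # The topic must match conductor_queue in trove-conductor.conf
    # (trove-conductor by default).
    target = messaging.Target(topic='trove-conductor')
    client = messaging.RPCClient(transport, target)

    # cast() is fire-and-forget: the guest does not wait for a reply,
    # so it never needs a connection to the Trove database.
    client.cast({}, 'heartbeat',
                instance_id='some-instance-uuid',
                payload={'service_status': 'running'})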

The service start script (Trove/bin/trove-conductor) works as follows:

  • It runs as an RpcService configured by Trove/etc/trove/trove-conductor.conf.sample, which defines trove.conductor.manager.Manager as the manager. This is the entry point for requests arriving in the queue. The sample file helps the deployer understand which parameters should be listed in the real trove-conductor.conf. Note that trove-conductor.conf.sample contains the set of parameters that are required to start the conductor service.

  • Just as in the case of the guestagent, requests are pushed asynchronously onto the MQ from another component using _cast, generally in the form of {"method": "", "args": {}}. The RPC layer exposes call() and cast() methods, which take the name of the remote method to be executed and a mapping of its parameters.

  • trove/conductor/manager.py does the actual database update. The heartbeat method updates the status of an instance: the guest checks whether the database service is running and sends the resulting status (ACTIVE, FAILED, ERROR, or SHUTDOWN) to the conductor, which records it. This is how an instance is reported as moving from NEW to BUILDING to ACTIVE, and so on.

  • The update_backup method changes the details of a backup, including its current status, size, type, and checksum. A sketch of both handlers follows this list.
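
To show how those messages are consumed, here is a stripped-down, illustrative sketch of a conductor-style manager and its RPC server, again using oslo.messaging. The real trove.conductor.manager.Manager persists these updates through Trove's database models; the handler bodies and exact argument names below are assumptions made for illustration.

    # Illustrative conductor-side sketch -- the real manager lives in
    # trove/conductor/manager.py and writes to the Trove database.
    import oslo_messaging as messaging
    from oslo_config import cfg


    class Manager(object):
        """Endpoint whose method names match the 'method' field of casts."""

        def heartbeat(self, context, instance_id, payload):
            # The real implementation updates the instance status model;
            # here we only show the shape of the handler.
            print('instance %s is now %s'
                  % (instance_id, payload.get('service_status')))

        def update_backup(self, context, instance_id, backup_id,
                          **backup_fields):
            # backup_fields typically carries status, size, type, checksum.
            print('backup %s updated: %s' % (backup_id, backup_fields))


    transport = messaging.get_transport(cfg.CONF)
    # Listen on the same topic the guests cast to (conductor_queue).
    target = messaging.Target(topic='trove-conductor', server='conductor-host')
    server = messaging.get_rpc_server(transport, target, [Manager()])
    server.start()
    server.wait()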

Conductor integration

In phase 1 of conductor integration, the guestagent service communicates with the back-end through the conductor service. The guestagent service casts specific calls, such as heartbeat and update_backup, to the conductor service. Each of these tasks requires a db connection for persisting models. The schema is shown in Figure 1.

The benefit of such a solution is that the compute instance and the Trove back-end are no longer directly connected. The back-end can therefore live in another network or data center that the guest service cannot reach directly. Without the conductor, that would mean the guest could not update its own status in the Trove back-end, and after a predefined timeout the taskmanager service would mark the instance with an ERROR status, treating the guest as broken.

Figure 1: Phase 1 conductor usage schema

In phase 2 of conductor integration, the conductor service also takes tasks from the taskmanager. The conductor thus becomes the single entry point for tasks requiring a db-connection from both the guestagent and taskmanager services: it is the one executor with a direct connection to the back-end.

Figure 2: Phase 2 conductor usage schema

Phase 2 assumes that the deployer can launch more than one instance of the conductor service; because every instance consumes from the same conductor_queue topic, the message bus distributes the work among them.

With the db-connection in a single place, the persistence models are now managed by only two services: the trove-api and the conductor itself.

In phase 3, the conductor becomes the single entry point for any task that requires a db-connection, including those of the API service.
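
A synchronous query through the conductor could then look like the following sketch. The method name get_instance_status is purely hypothetical and is used only to illustrate the pattern; the point is that the API service issues a blocking call() over the MQ instead of opening its own database connection.

    # Hypothetical phase-3 usage: the API service asks the conductor for
    # data rather than querying the database itself. call() blocks until
    # the conductor returns a result, unlike the fire-and-forget cast()
    # used for heartbeats.
    import oslo_messaging as messaging
    from oslo_config import cfg

    transport = messaging.get_transport(cfg.CONF)
    target = messaging.Target(topic='trove-conductor')
    client = messaging.RPCClient(transport, target)

    # 'get_instance_status' is an illustrative, hypothetical method name.
    status = client.call({}, 'get_instance_status',
                         instance_id='some-instance-uuid')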

The remaining part of the integration is the db interactions of the trove-api service. Figure 3 shows those interactions.

Figure 3: Phase 3 conductor usage schema

After such changes, the conductor can deal with everything related to the back-end: the models' CRUD operations. Let's take a look at the final schema.

Figure 4: Final conductor usage schema

As you can see from the last schema, the conductor becomes the single entry point for the models' CRUD operations. Such an architectural design allows all Trove services to communicate with the conductor service over the MQ service (any AMQP protocol implementation), which leaves room to scale and evolve the back-end independently in the future.

Summary

The Trove Conductor is a service that runs on the host and is responsible for receiving messages from guest instances to update the information on the host. It removes the guests' need for direct connectivity to the Trove back-end, eliminating the problem of an unreachable back-end host. The conductor service becomes the single entry point for any task that requires the back-end.
