Mirantis | The #1 Pure Play OpenStack Company

The Present and the Future of OpenStack Trove Architecture

This is the first in a series of blog posts discussing the OpenStack Trove project. Here we provide a taste of the concepts behind Trove, while future posts will dive into the installation, explain the conductor’s inner workings, discuss the Clustering API, and finally provide an in-depth treatment of relationships between Trove and various databases.


If you aren’t already familiar with the OpenStack Database-as-a-Service project, code-named Trove, it consists of three major components:

  1. Trove itself (on the server side, trove-api and taskmanager; on the VM-side, trove-guestagent).
  2. Python-troveclient.
  3. Trove integration (a combination of Trove and DevStack, designed for development and testing).

In this post, we’ll discuss how different Trove components interact to provide Database-as-a-Service, the current state of Trove architecture, and the goals for the future Trove architecture.

The current state of Trove implementation

The major Trove services are (for the current implementation, see Figure 1):

  • Trove API–A service that exposes a RESTful API, supporting both JSON and XML, for provisioning and managing Trove instances. Its entry point is Trove/bin/trove-api.

    The Trove API uses a WSGI launcher configured by Trove/etc/trove/api-paste.ini. That file defines the pipeline of filters (tokenauth, ratelimit, and so on) and sets the app_factory for the troveapp to trove.common.api:app_factory. The API class (a WSGI router) wires the REST paths to the appropriate controllers.

    Each controller is implemented in the service.py module of the relevant package (versions, instance, flavor, or limits). Controllers usually delegate the actual work to a class in the models.py module.

    At this point, the API module of another component (taskmanager, guestagent, and so on) is used to send the request onward through RabbitMQ.

  • Trove taskmanager–A service that does the heavy lifting: provisioning instances, managing their lifecycle, and performing operations on the database instance.

    The Trove taskmanager listens for RabbitMQ topics and serves as an entry point (Trove/bin/trove-taskmanager) for requests arriving through the queue. It runs as an RpcService configured by Trove/etc/trove/trove-taskmanager.conf.sample, which defines trove.taskmanager.manager.Manager as the manager.

    As described above, other components push requests for this component to the message queue through the taskmanager’s API module, using _cast() (asynchronous) or _call() (synchronous) and passing the method’s name as a parameter.

    RpcDispatcher.dispatch() in Trove/openstack/common/rpc/dispatcher.py looks up the proper method on the manager by name, in a way similar to reflection, and invokes it. The manager then loads an object of the relevant class from the models.py module, using the context and instance_id, and hands the request to it. The actual handling is usually done in the models.py module.

  • Trove guestagent–A service that runs within the guest instance, responsible for managing and performing operations on the database itself. The guestagent listens for RPC messages through the message bus and performs the requested operation.

    The Trove guestagent is similar to the taskmanager in the sense that it also listens for RabbitMQ topics. It runs on every DB instance, and a dedicated MQ topic is used (identified as the instance’s id). It serves as an entry point (Trove/bin/trove-guestagent) and runs as an RpcService configured by Trove/etc/trove/trove-guestagent.conf.sample, which defines trove.guestagent.manager.Manager as the manager.

    As described above, other components push requests for this component to the message queue through the guestagent’s API module, using _cast() (asynchronous) or _call() (synchronous) and passing the method’s name as a parameter.

    As in the taskmanager case, RpcDispatcher.dispatch() looks up the proper method on the Manager by name, in a way similar to reflection. The Manager then usually redirects the handling to an object from the dbaas.py module, where the actual work is done.

  • Trove conductor–A service that collects statuses from the guestagents over RPC and writes them into the Trove back-end. The benefit is a single interaction channel, AMQP, between the taskmanager, guestagent, and conductor. With the conductor, the guestagent would be able to write statuses to the Trove back-end through RPC instead of accessing it directly. The beta is in the review process.
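The cast/call pattern and the name-based dispatch described in the list above can be sketched in plain Python. This is only an illustration under stated assumptions, not Trove's code: the in-memory FakeBus stands in for RabbitMQ, and the Dispatcher and GuestManager classes, the topic format, the instance id, and the method names are all hypothetical.

```python
import queue


class GuestManager:
    """Hypothetical manager; real managers delegate to models.py or dbaas.py."""

    def __init__(self):
        self.databases = []

    def create_database(self, context, name):
        self.databases.append(name)

    def list_databases(self, context):
        return list(self.databases)


class Dispatcher:
    """Looks up the requested method on the manager by name (reflection)."""

    def __init__(self, manager):
        self.manager = manager

    def dispatch(self, context, method, **kwargs):
        handler = getattr(self.manager, method, None)
        if not callable(handler):
            raise AttributeError("no such RPC method: %s" % method)
        return handler(context, **kwargs)


class FakeBus:
    """In-memory stand-in for the message broker: one queue per topic."""

    def __init__(self):
        self.topics = {}
        self.consumers = {}

    def register(self, topic, dispatcher):
        self.topics[topic] = queue.Queue()
        self.consumers[topic] = dispatcher

    def cast(self, topic, context, method, **kwargs):
        # Asynchronous: enqueue the message and return immediately.
        self.topics[topic].put((context, method, kwargs))

    def call(self, topic, context, method, **kwargs):
        # Synchronous: deliver anything already queued, then return the reply.
        self.drain(topic)
        return self.consumers[topic].dispatch(context, method, **kwargs)

    def drain(self, topic):
        # Deliver all pending messages for the topic to its consumer.
        pending = self.topics[topic]
        while not pending.empty():
            context, method, kwargs = pending.get()
            self.consumers[topic].dispatch(context, method, **kwargs)


bus = FakeBus()
instance_id = "instance-1234"  # hypothetical; each guest listens on its own topic
topic = "guestagent." + instance_id
bus.register(topic, Dispatcher(GuestManager()))

bus.cast(topic, {"user": "demo"}, "create_database", name="orders")
result = bus.call(topic, {"user": "demo"}, "list_databases")
print(result)
```

The caller only names the method as a string. The receiving side decides which manager method handles it, which is why the same mechanism can serve both the taskmanager and the per-instance guestagent topics.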

Figure 1 The current Trove architecture

There is, however, something wrong with this architecture: the VM accesses the Trove back-end directly. Problems occur when the VM has no connectivity to the host where the Trove back-end runs. As you may know, that’s a common OpenStack problem–without connectivity among the services, there would be no magic.

The future of Trove architecture

The Trove community has designed a different architecture that adds a new component, the Trove scheduler. The scheduler is a service for registering scheduled tasks (backup, restore, provisioning, and so on). It is still in the design stage.
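Because the scheduler is still being designed, no real API can be shown. The idea of registering tasks to run later can nevertheless be sketched with Python's standard sched module. Everything here is an assumption made for illustration: the task names, the instance ids, the delays, and the FakeClock that lets the example finish instantly.

```python
import sched


class FakeClock:
    """Deterministic clock so the example runs without real waiting."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


clock = FakeClock()
scheduler = sched.scheduler(clock.time, clock.sleep)
executed = []


def run_task(task_name, instance_id):
    # A real scheduler would hand the task to another service;
    # this sketch only records when it ran.
    executed.append((clock.time(), task_name, instance_id))


# Register tasks: delay in seconds, priority, callable, arguments.
scheduler.enter(3600, 1, run_task, argument=("backup", "instance-1234"))
scheduler.enter(60, 1, run_task, argument=("provision", "instance-5678"))

scheduler.run()  # runs tasks in time order, advancing the fake clock
print(executed)
```

Tasks run in order of their due time rather than in registration order, so the provisioning task registered second runs first.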

Figure 2 below represents the future Trove architecture. Trove will have a rather different structure–the provisioned instances will have no direct access to the Trove back-end.

Figure 2 Future Trove architecture
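The key property of the future architecture, that a provisioned instance never touches the Trove back-end itself, can be sketched as follows. This is a simplified model under assumptions, not the conductor's implementation: the Backend, Conductor, and GuestAgent classes, the heartbeat method name, and the status strings are hypothetical, and a queue.Queue stands in for the conductor's AMQP topic.

```python
import queue


class Backend:
    """Stand-in for the Trove back-end database (hypothetical)."""

    def __init__(self):
        self.instance_status = {}


class Conductor:
    """Runs next to the back-end; the only component here that writes statuses."""

    def __init__(self, backend):
        self.backend = backend

    def heartbeat(self, instance_id, status):
        self.backend.instance_status[instance_id] = status


class GuestAgent:
    """Runs inside the VM; it holds a queue handle but no back-end reference."""

    def __init__(self, instance_id, outbox):
        self.instance_id = instance_id
        self.outbox = outbox

    def report(self, status):
        self.outbox.put({
            "method": "heartbeat",
            "args": {"instance_id": self.instance_id, "status": status},
        })


def consume(outbox, conductor):
    # Deliver queued messages to the conductor, dispatching by method name.
    while not outbox.empty():
        message = outbox.get()
        getattr(conductor, message["method"])(**message["args"])


backend = Backend()
conductor_queue = queue.Queue()  # stand-in for the conductor's AMQP topic
guest = GuestAgent("instance-1234", conductor_queue)

guest.report("BUILDING")
guest.report("ACTIVE")
consume(conductor_queue, Conductor(backend))
print(backend.instance_status)
```

The guest only needs to reach the message queue, so the connectivity problem between the VM and the back-end host no longer applies to status updates.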

Future improvements are expected to turn Trove into a real database-on-demand solution like Amazon RDS.

Conclusion

Trove’s current architecture is immature. It’s characterized by tightly coupled modules and direct access to the Trove back-end from the provisioned VM. The planned architecture improvements are intended to bring Trove to a production-ready state.

We’re planning to share more information about Trove in upcoming posts. Please share your thoughts and observations with us.


6 Responses

  1. name

    Very confusing!

    Say you are provisioning an instance. What is the thing that each of task manager, scheduler, and conductor does?

    The names for the services are so bad, that they don’t really convey any meaning.

    December 3, 2013 17:57
    • Nick Chase

      Wow, you sure are passionate about this! Trove is a little complicated and I can see why you’re finding it a little confusing. This is meant to be more conceptual in nature, but in the coming weeks we’ll be posting content that’s more hands-on and concrete, so it should be a little easier to understand.

      Thanks!

      December 4, 2013 06:42
    • Denis Makogon

      If you are familiar with OpenStack, then you should know that Nova has its own conductor, a service that interacts with the back-end.
      Trove-conductor does the same: it collects Trove instance statuses and backup-process statuses, and performs every interaction with the Trove back-end.
      The scheduler executes scheduled tasks (as mentioned in the current topic).
      Taskmanager is a service that interacts with OpenStack services (nova, cinder, heat, swift); it also provisions stacks, instances, and volumes.

      December 4, 2013 08:12
      • name

        I thought Scheduler was to pick the right resource? Nevermind, that was in Nova.

        December 4, 2013 15:00

