Implementing Encryption Architecture with Cisco Webex for OpenStack Swift object storage

on September 28, 2012

This post is the first of an ongoing series on our engagement with Cisco Webex on their OpenCloud platform.

One of the requirements for data center security is protection of “at rest” data. Usually this protection is about encrypting client-generated contents, including objects stored in the Swift cluster. In most cases, clients themselves could carefully encrypt their data; however, this requires the client to establish and support encryption infrastructure. A cloud provider can create value by offering transparent server-side on-disk encryption.

We have been working on this design as part of our current engagement with Cisco Webex. Their requirements include encryption of data stored on Swift devices and a clear separation of code to simplify code base maintenance. These are the requirements we aim to address with our proposal at the upcoming OpenStack design conference for the Grizzly release of OpenStack. (If you’re attending the conference, Cisco is presenting an overview of our work at the Wednesday morning General Session, 17 October.)

Basic concepts

I’ll refer to three entities frequently—encrypted object, encryption key, and key storage—so let me briefly define them:

Encrypted object

This is a data BLOB stored in OpenStack Swift that has been encrypted using a symmetric encryption algorithm with an encryption key. Metadata stored for the encrypted object allows identification of the key string used for encryption, but the key itself is stored separately.

Swift servers read and write objects to disk not in a single piece, but in chunks. Thus, every chunk must be encrypted separately and written to disk already encrypted.

Encryption key

An encryption key is a randomly generated string that is used by the encryption algorithm to encrypt and decrypt the data chunk. Each key has a number of parameters, but the most important is the key ID, which is a unique identifier. The Swift Object Server stores the key ID in the metadata of the encrypted object. This allows decryption of the object before sending it to the client upon receipt of a read request.

Keys can have different scopes. The simplest case is to use one key to encrypt all objects in the cluster. Special key parameters can tie a key to a single tenant, or even a single user in the tenant.

An encryption key can have a limited lifespan and can become inactive, or expire, after a specified period of time. When this happens, a new key must be generated to write new objects. Inactive keys are kept to provide access to previously written objects.
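The expiry rule above can be sketched in a few lines of Python. This is only an illustration: the `EncryptionKey` class, the helper names, and the 90-day lifespan are assumptions made for the example and are not part of the proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

# Assumed lifespan; the proposal does not specify a value.
KEY_LIFESPAN = timedelta(days=90)


@dataclass
class EncryptionKey:
    key_id: str
    key_string: str
    tenant_id: str
    created_at: datetime
    active: bool = True


def expire_old_keys(keys: Iterable[EncryptionKey], now: datetime) -> None:
    """Mark keys older than KEY_LIFESPAN inactive. They are kept, not deleted,
    so previously written objects can still be decrypted."""
    for key in keys:
        if key.active and now - key.created_at > KEY_LIFESPAN:
            key.active = False


def active_key_for(keys: Iterable[EncryptionKey],
                   tenant_id: str) -> Optional[EncryptionKey]:
    """Return the tenant's active key, or None if a new key must be generated."""
    for key in keys:
        if key.tenant_id == tenant_id and key.active:
            return key
    return None
```

When `active_key_for` returns `None`, a new key has to be generated before new objects can be written for that tenant.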

To provide protection from compromised keys, it is possible to implement rekey. This process asynchronously re-encrypts all objects in a particular tenant with a new key generated as part of the rekey process.

Key store

To ensure security of data, we need to keep the keys in a special store that is not part of the Swift cluster. Key stores can vary greatly in nature, and your particular choice always depends on the requirements and level of security you need. For the purposes of this article we’ll assume that keys are stored in a simple database, for example, MySQL. This database should be separated from the Swift cluster physically and logically, and be accessible only over SSL-protected connections.

Let’s list the data requirements for a key store, based on our definition of an encryption key. Each key has to:

  • be identified by a unique key ID in UUID format,
  • be user-specific or tenant-specific,
  • have a creation time stamp to provide expiration capability, and
  • have a Boolean parameter that shows whether or not the key is active.

These requirements can be satisfied with a database table with these fields:

Table 1. Keys table

Field Name   Type        Size
key_id       string      32
key_string   string      32
created_at   timestamp   -
active       boolean     -
tenant_id    string      32
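
As a rough illustration of Table 1, here is a minimal sketch of the key store using Python's built-in sqlite3 module as a stand-in for MySQL. The column names come from the table; the helper functions `create_key` and `get_active_key_id`, the default values, and the use of hex strings for key material are assumptions made for the example.

```python
import secrets
import sqlite3
import uuid

# Columns follow Table 1; constraints and defaults are assumptions.
SCHEMA = """
CREATE TABLE keys (
    key_id     CHAR(32) PRIMARY KEY,
    key_string CHAR(32) NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    active     BOOLEAN NOT NULL DEFAULT 1,
    tenant_id  CHAR(32) NOT NULL
)
"""


def create_key(conn, tenant_id):
    """Generate a random key for a tenant and store it as the active key."""
    key_id = uuid.uuid4().hex           # UUID as 32 hex characters
    key_string = secrets.token_hex(16)  # 16 random bytes as 32 hex characters
    conn.execute(
        "INSERT INTO keys (key_id, key_string, tenant_id) VALUES (?, ?, ?)",
        (key_id, key_string, tenant_id),
    )
    conn.commit()
    return key_id


def get_active_key_id(conn, tenant_id):
    """Return the newest active key ID for a tenant, or None if there is none."""
    row = conn.execute(
        "SELECT key_id FROM keys WHERE tenant_id = ? AND active = 1 "
        "ORDER BY created_at DESC LIMIT 1",
        (tenant_id,),
    ).fetchone()
    return row[0] if row else None
```

A quick way to try it is `conn = sqlite3.connect(":memory:")` followed by `conn.execute(SCHEMA)`. A real deployment would use a separate, access-controlled database server as described above.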

Object encryption and decryption

Two Swift server entities participate in the process of object transfer to and from the client: the proxy server and the object server. It’s obvious that encryption should be performed on one of these servers. Our first thought was to implement it on the proxy server. However, this would expose keys on the publicly available servers, and, even more important, would force us to download a file from the object server and cache it to perform any operation on it (for example, a rekey operation). So, it seems that the object server performing encryption leads to simpler and more transparent data flows.

Our final version of the proposal adds two components to the Swift application. Both are middleware for existing servers. Our implementation plan assumes that the encryption middleware is distributed and maintained independently from the main Swift code base.

  • The first is the keymanage middleware for the proxy-server application. This middleware is responsible for identifying keys for the object server, depending on the authZ context of the request.
  • The second is the crypto-object-server middleware for the object-server application. This middleware performs the actual encryption, decryption, and file system I/O, and effectively replaces the object server for certain requests.

In the following sections, I’ll detail how all these new components interact with existing ones and how data is passed and transformed inside the system.

Encryption

The object server encrypts a new object during upload to prevent wrong ETag calculation (because the ETag stored in metadata must match the file’s MD5 hash). It also solves the problem of an unencrypted temporary file on the device. Proxy and object servers process objects chunk by chunk, and each chunk gets encrypted by the object server before it is written to disk. The server returns HTTP 201 once all chunks are encrypted, written to a temporary file, and the file is placed in the final destination partition.

A more detailed view of the moving parts involved in an upload includes the following steps:

  1. The client sends a PUT request as usual. This request initiates object transfer through the proxy server.
  2. The proxy server requests an authorization from Keystone with the user’s token using auth_token middleware.
  3. If the user is an authorized member of the tenant, the proxy server passes the request to the keymanage middleware, which is implemented in the scope of this proposal. This middleware connects to the key store, identifies the key_id of the active key and includes it in the X-Object-Meta-Encryption-Key-ID header of the request. If no active key is found, this header is set to a special value, for example, the “undef” string. After this header is added, the request is passed to the proxy-server application. X-Object-Meta-* headers are passed and stored with the object without modifications, so we don’t need to change the proxy-server code.
  4. The proxy server initiates an internal Swift PUT request to the object servers defined by Ring. The crypto-object-server middleware intercepts requests that contain the X-Object-Meta-Encryption-Key-ID header and performs encryption of the object using the key specified by the key ID.
  5. The crypto-object-server middleware retrieves the key directly from the key store. If X-Object-Meta-Encryption-Key-ID has a special value that indicates that no active key was found in the key store (e.g., “undef”), the crypto-object-server generates a key from a random string and sends it to the key store.
  6. Each object server encrypts the object using the key string retrieved in the previous step and puts the object in the destination partition directory. X-Object-Meta-Encryption-Key-ID goes into the object’s metadata, and the key string is discarded from memory. The object server then sends a response with a 201 Created HTTP code to the proxy server, and the proxy server confirms the successful upload to the client application.
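
Step 3 above can be sketched as a plain WSGI middleware. This is a hypothetical illustration, not the proposed implementation: the class name, the `lookup_active_key_id` callable, and the assumption that the tenant ID arrives in an `X-Tenant-Id` header set by the auth middleware are all made up for the example. It also does not distinguish object PUTs from container PUTs, which a real middleware would need to do.

```python
UNDEF = "undef"  # special value used when no active key exists
# WSGI environ name for the X-Object-Meta-Encryption-Key-ID request header
HEADER_ENV = "HTTP_X_OBJECT_META_ENCRYPTION_KEY_ID"


class KeyManageMiddleware:
    """Adds the encryption key ID header to PUT requests (simplified sketch)."""

    def __init__(self, app, lookup_active_key_id):
        self.app = app
        # Hypothetical callable: tenant_id -> active key ID or None.
        self.lookup_active_key_id = lookup_active_key_id

    def __call__(self, environ, start_response):
        if environ.get("REQUEST_METHOD") == "PUT":
            # Assumption: the auth middleware has already put the tenant ID
            # into the request headers.
            tenant_id = environ.get("HTTP_X_TENANT_ID")
            key_id = self.lookup_active_key_id(tenant_id) if tenant_id else None
            environ[HEADER_ENV] = key_id or UNDEF
        # The wrapped proxy-server application is left unchanged.
        return self.app(environ, start_response)
```

Because the header is an ordinary X-Object-Meta-* header, the wrapped application forwards and stores it like any other object metadata.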

Decryption

The object decryption process is similar. The object server reads the object’s body chunk by chunk and decrypts each chunk before it sends this chunk to the proxy server. ETag is updated for every decrypted chunk, and thus ETag for the whole decrypted object received by the client matches the object’s MD5 hash.

Actually, download of an encrypted object is a less complicated process in this model, because the key ID used for decryption is stored with the object. Thus, no interaction between the proxy server and the key store is necessary.

Here is an overview of steps taken by Swift to decrypt and send an object to the client upon request:

  1. The client downloads an object from Swift with a GET request to the proxy server.
  2. The Swift proxy server checks the user authorization with Keystone.
  3. The proxy server requests the object from the object server defined by Ring.
  4. The crypto-object-server middleware of the object server reads the object’s metadata, learns the ID of the key used to encrypt the object from the X-Object-Meta-Encryption-Key-ID attribute and requests the key string with this ID from the key store.
  5. The key store sends the key string in response, and crypto-object-server middleware uses this string to decrypt the object and send it back to the proxy server as usual. The proxy server sends the object to the client.
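
To make the chunk-by-chunk processing and the ETag handling more concrete, here is a small round-trip sketch. Two assumptions need stating loudly. First, the ETag is taken to be the MD5 of the plaintext, since that is what the client sees. Second, the "cipher" is a toy XOR keystream built from SHA-256, used only so the example runs with the standard library; it is a placeholder for a real symmetric algorithm such as AES and must not be used to protect data. All function names are hypothetical.

```python
import hashlib


def _keystream(key: bytes, chunk_index: int, length: int) -> bytes:
    """Toy keystream (NOT secure): SHA-256 over key, chunk index and counter."""
    out = bytearray()
    block = 0
    while len(out) < length:
        out += hashlib.sha256(
            key + chunk_index.to_bytes(8, "big") + block.to_bytes(8, "big")
        ).digest()
        block += 1
    return bytes(out[:length])


def xor_chunk(key: bytes, chunk_index: int, chunk: bytes) -> bytes:
    """Placeholder cipher: the same call encrypts and decrypts a chunk."""
    stream = _keystream(key, chunk_index, len(chunk))
    return bytes(a ^ b for a, b in zip(chunk, stream))


def encrypt_object(key, chunks):
    """Encrypt each chunk before it is 'written'; ETag covers the plaintext."""
    md5 = hashlib.md5()
    stored = []
    for index, chunk in enumerate(chunks):
        md5.update(chunk)
        stored.append(xor_chunk(key, index, chunk))
    return stored, md5.hexdigest()


def decrypt_object(key, stored_chunks):
    """Decrypt each chunk as it is 'read', updating the ETag per chunk."""
    md5 = hashlib.md5()
    plain = []
    for index, chunk in enumerate(stored_chunks):
        data = xor_chunk(key, index, chunk)
        md5.update(data)
        plain.append(data)
    return plain, md5.hexdigest()
```

Note that this sketch relies on the same chunk boundaries being used for writing and reading, because the chunk index feeds into the keystream.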

Rekey

The rekey process decrypts and re-encrypts every object from the rekeyed tenant on each object server, chunk by chunk. The main complication of rekeying is that it can affect a very large number of objects, and thus can take much longer than any client can wait for a response. So, the idea is that the crypto-object-server triggers an asynchronous rekey process over the whole list of objects stored in the object server. The list of objects must be composed on the proxy server, as multiple containers have to be parsed for that.

Another complication brought up by rekeying is a possible race with replication. A workaround for this race condition is simple: Update the time stamp of the object upon starting rekeying. This ensures replication won’t overwrite an already processed object with an original one.
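
The timestamp workaround can be illustrated with a deliberately simplified model in which replication always keeps the copy with the newest timestamp. The `StoredObject` class and both functions are hypothetical and only model the idea; they are not Swift's replicator.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    name: str
    timestamp: float
    key_id: str
    body: bytes


def rekey(obj, new_key_id, reencrypt, now):
    """Re-encrypt an object and give it a fresh timestamp.

    `reencrypt` is a hypothetical callable that turns the old ciphertext
    into ciphertext under the new key.
    """
    return StoredObject(obj.name, now, new_key_id, reencrypt(obj.body))


def replicate(local, remote):
    """Simplified replication rule: the copy with the newest timestamp wins."""
    return local if local.timestamp >= remote.timestamp else remote
```

Because the rekeyed copy carries a newer timestamp, a replica that still holds the original object can no longer overwrite it, whichever side initiates the sync.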

Conclusion

There are still several open questions in this proposal to discuss. First of all, the nature of the key store and interface to it are not obvious. Our assumption is that it is a simple SQL database for initial implementation. The plugin mechanism lets us use different specialized key storage solutions; the encryption algorithms are also configurable.

Another significant problem is the asynchronous nature of the rekey process. This may require creation of a service to check the status of rekeying. Rekey status check can be implemented as a separate proxy server middleware (similar to Recon). This middleware must parse the metadata of all objects in the account and detect if key IDs are inconsistent.
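
The status check described above could look roughly like this, assuming the metadata of every object in the account is available as a list of dictionaries. The function name, its return format, and the idea of passing the expected key ID explicitly are assumptions made for the sketch.

```python
from collections import Counter

KEY_ID_META = "X-Object-Meta-Encryption-Key-ID"


def rekey_status(object_metadata, expected_key_id):
    """Report how many objects still carry a key ID other than the new one.

    `object_metadata` is an iterable of per-object metadata dicts.
    """
    counts = Counter(meta.get(KEY_ID_META) for meta in object_metadata)
    pending = sum(n for key_id, n in counts.items() if key_id != expected_key_id)
    return {
        "complete": pending == 0,
        "pending": pending,
        "key_ids": dict(counts),
    }
```

Rekeying is finished for the account once `pending` drops to zero, that is, when all objects report the same, new key ID.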

Webex and Mirantis are introducing this proposal as a part of our joint activity at the Grizzly Design Summit.

6 Responses

  1. Malini

    Well thought solution and written.
    Would the encryption in this design happen at each object replication site for a single object? (Swift uses rsync to transfer data, which in conjunction with encrypted data would result in a larger payload than actual change in an “update”. If few updates, a single encryption point adequate).
    Will this be part of OpenStack? How does one get involved? Much thanks.

    December 5, 2012 18:34
  2. Caitlin Bestler

    What advantage does this have over self-encrypted drives?

    Each object server has the capacity to decrypt objects, so there is no added security over self encrypted drives. But you are adding an additional component that could fail.

    December 18, 2012 15:09
  3. tao.cai

    This is a good design.
    But, I don’t know what kind of security situation that you need to encrypt these data.
    Perhaps I have some misunderstanding here.
    For my understanding.
    no encryption designing: Object-server can not be accessed from the external network, So the data in the inner network is safe enough if there is no physical attack or some unauthorized user can access into the inner network.

    For this encryption designing: Object-servers are responsible for encrypting these data, So if someone can physically attack these server, The data pass through the network is still plain text. If someone has a root user in the storage nodes server, (For example a administrator of these storage nodes want to hack these data) , He can easily ask for a key from the object-server and get the key to decrypt these data.
    The only thing this encryption can protect is that the disks are removed by one administrator who can access these data physically(for example he takes one home)
    But the data in the network is still unsafe.

    So I want to know what kind of unsafe situation you are going to deal with.

    Thanks.

    March 22, 2013 01:28
  4. Edward

    if we have 3 replicas for each object, will each replica be separately encrypted/decrypted? which means 3 times encryption/decryption workload from my understanding.

    January 9, 2014 22:42

Continuing the Discussion

  1. OpenStack Community Weekly Newsletter (Jan 18 – 25) » The OpenStack Blog

    [...] few months ago, Mirantis engineers described the design of on-disk per-user encryption of objects in Swift. They released a first working prototype of this feature. It’s still very [...]

    January 25, 2013 07:16
