Implementing Encryption Architecture with Cisco Webex for OpenStack Swift object storage
September 28, 2012
This post is the first of an ongoing series on our engagement with Cisco Webex on their OpenCloud platform.
One of the requirements for data center security is protection of “at rest” data. Usually this means encrypting client-generated content, including objects stored in the Swift cluster. In most cases, clients could encrypt their data themselves; however, this requires each client to establish and support its own encryption infrastructure. A cloud provider can create value by offering transparent server-side on-disk encryption.
We have been working on this design as a part of our current engagement with Cisco Webex. Their requirements include encryption of data stored on Swift devices, and clear separation of code to simplify code base maintenance. These are the requirements we aim to address with our proposal at the upcoming OpenStack design conference for the Grizzly release of OpenStack. (If you’re attending the conference, Cisco is presenting an overview of our work at the Wednesday morning General Session, 17 October.)
I’ll refer to three entities frequently—encrypted object, encryption key, and key storage—so let me briefly define them:
An encrypted object is a data BLOB stored in OpenStack Swift that has been encrypted using a symmetric encryption algorithm with an encryption key. Metadata stored with the encrypted object identifies the key used for encryption, but the key itself is stored separately.
Swift servers read and write objects to disk not in a single piece, but in chunks. Thus, every chunk must be encrypted separately and written to disk already encrypted.
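To make the chunk-by-chunk idea concrete, here is a minimal sketch. It is not the proposal's code: the function names are hypothetical, and the cipher is a toy SHA-256 keystream built from the standard library so the example runs anywhere. A real deployment would use a vetted cipher (for example AES in a streaming mode) instead.

```python
import hashlib
from typing import Iterable, Iterator


def xor_keystream(key: bytes, chunk_index: int, data: bytes) -> bytes:
    """Toy symmetric cipher (stand-in for a real one, NOT for production).

    XORs the data with a keystream derived from SHA-256(key + counter).
    Applying it twice with the same key and chunk index restores the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = chunk_index.to_bytes(8, "big") + (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def encrypt_chunks(key: bytes, chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Encrypt each chunk independently, so nothing unencrypted reaches disk."""
    for index, chunk in enumerate(chunks):
        yield xor_keystream(key, index, chunk)


def decrypt_chunks(key: bytes, chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Decrypt chunk by chunk; the toy cipher is its own inverse."""
    for index, chunk in enumerate(chunks):
        yield xor_keystream(key, index, chunk)
```

Because every chunk is processed on its own, the server never needs to hold the whole object in memory or on disk in plaintext.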
An encryption key is a randomly generated string that is used by the encryption algorithm to encrypt and decrypt the data chunk. Each key has a number of parameters, but the most important is the key ID, which is a unique identifier. The Swift Object Server stores the key ID in the metadata of the encrypted object. This allows decryption of the object before sending it to the client upon receipt of a read request.
Keys can have different scopes. The simplest case is to use one key to encrypt all objects in the cluster. Special key parameters can tie a key to a single tenant, or even a single user in the tenant.
An encryption key can have a limited lifespan and can become inactive, or expire, after a specified period of time. When this happens, a new key must be generated to write new objects. Inactive keys are kept to provide access to previously written objects.
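The key properties described so far (unique ID, scope, limited lifespan, inactive keys kept for reads) can be sketched roughly as follows. The class and method names are illustrative assumptions, not part of the proposal, and the in-memory dictionary stands in for a real key store.

```python
import secrets
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


@dataclass
class EncryptionKey:
    key_id: str                   # unique identifier, stored in object metadata
    secret: bytes                 # randomly generated key material
    expires_at: datetime          # after this moment the key is inactive
    tenant: Optional[str] = None  # None means a cluster-wide key (assumption)

    def is_active(self, now: datetime) -> bool:
        return now < self.expires_at


class KeyRing:
    """Hypothetical in-memory key registry; a real one would use the key store."""

    def __init__(self, lifespan: timedelta) -> None:
        self.lifespan = lifespan
        self.keys: Dict[str, EncryptionKey] = {}

    def key_for_write(self, tenant: Optional[str], now: datetime) -> EncryptionKey:
        """Return an active key for the tenant, generating one if none is left."""
        for key in self.keys.values():
            if key.tenant == tenant and key.is_active(now):
                return key
        key = EncryptionKey(uuid.uuid4().hex, secrets.token_bytes(32),
                            now + self.lifespan, tenant)
        self.keys[key.key_id] = key
        return key

    def key_for_read(self, key_id: str) -> EncryptionKey:
        """Inactive keys stay available so older objects can still be decrypted."""
        return self.keys[key_id]
```

A write after expiry silently gets a fresh key, while reads always go through the key ID recorded with the object.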
To provide protection from compromised keys, it is possible to implement rekeying. This process asynchronously re-encrypts all objects in a particular tenant with a new key generated as a part of the rekey process.
To ensure the security of the data, we need to keep the keys in a special store that is not part of the Swift cluster. Key stores can vary greatly in nature, and your particular choice always depends on the requirements and level of security you need. For the purposes of this article we’ll assume that keys are stored in a simple database, for example, MySQL. This database should be separated from the Swift cluster physically and logically, and be accessible only over an SSL-protected connection.
Let’s list the data requirements for a key store, based on our definition of an encryption key. Each key has to:
These requirements can be satisfied with a database table with these fields:
Table 1. Keys table
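As a rough illustration of a database-backed key store, the sketch below uses Python's built-in sqlite3 module in place of MySQL. The column names (key_id, secret, tenant, created_at, expires_at) and the helper functions are assumptions derived from the key properties described earlier, not the fields of Table 1.

```python
import secrets
import sqlite3
import uuid
from typing import Optional

# Hypothetical schema; the column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS keys (
    key_id     TEXT PRIMARY KEY,
    secret     BLOB NOT NULL,
    tenant     TEXT,
    created_at INTEGER NOT NULL,
    expires_at INTEGER NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def create_key(conn: sqlite3.Connection, tenant: Optional[str],
               now: int, lifespan: int) -> str:
    """Generate a random key for the tenant and return its ID."""
    key_id = uuid.uuid4().hex
    conn.execute("INSERT INTO keys VALUES (?, ?, ?, ?, ?)",
                 (key_id, secrets.token_bytes(32), tenant, now, now + lifespan))
    conn.commit()
    return key_id


def active_key_id(conn: sqlite3.Connection, tenant: Optional[str],
                  now: int) -> Optional[str]:
    """Newest key for the tenant that has not expired yet, if any."""
    row = conn.execute(
        "SELECT key_id FROM keys WHERE tenant IS ? AND expires_at > ? "
        "ORDER BY created_at DESC LIMIT 1", (tenant, now)).fetchone()
    return row[0] if row else None


def secret_for(conn: sqlite3.Connection, key_id: str) -> Optional[bytes]:
    """Look up key material by ID; expired keys remain readable."""
    row = conn.execute("SELECT secret FROM keys WHERE key_id = ?",
                       (key_id,)).fetchone()
    return row[0] if row else None
```

Timestamps are plain integers (seconds) here to keep the example short; the transport security between Swift and the database is outside the scope of this sketch.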
Object encryption and decryption
Two Swift server entities participate in the process of object transfer to and from the client: the proxy server and the object server. It’s obvious that encryption should be performed on one of these servers. Our first thought was to implement it on the proxy server. However, this would expose keys on the publicly available servers, and, even more important, would force us to download a file from the object server and cache it to perform any operation on it (for example, a rekey operation). So, it seems that the object server performing encryption leads to simpler and more transparent data flows.
Our final version of the proposal adds two components to the Swift application. Both of them are middleware for existing servers. Our implementation plan assumes that the encryption middleware is distributed and maintained independently from the main Swift code base.
In the following sections, I’ll detail how all these new components interact with existing ones and how data is passed and transformed inside the system.
The object server encrypts a new object during upload to avoid an incorrect ETag calculation (because the ETag stored in metadata must match the file’s MD5 hash). It also solves the problem of an unencrypted temporary file on the device. Proxy and object servers process objects chunk by chunk, and each chunk gets encrypted by the object server before it is written to disk. The server returns HTTP 201 once all chunks have been encrypted and written to a temporary file, and the file has been moved to its final destination partition.
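The upload path just described might look roughly like the sketch below. Everything here is illustrative: put_object, the metadata names, and the toy SHA-256 keystream cipher (a stand-in for a real symmetric cipher) are assumptions, and so is the choice to compute the ETag over the unencrypted chunks.

```python
import hashlib
import os
import tempfile
from typing import Dict, Iterable, Tuple


def xor_keystream(key: bytes, chunk_index: int, data: bytes) -> bytes:
    """Toy cipher, NOT for production: XOR with a SHA-256 derived keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = chunk_index.to_bytes(8, "big") + (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def put_object(chunks: Iterable[bytes], key_id: str, key: bytes,
               final_path: str, tmp_dir: str) -> Tuple[int, Dict[str, str]]:
    """Encrypt each incoming chunk, write it to a temp file, then move it."""
    etag = hashlib.md5()
    fd, tmp_path = tempfile.mkstemp(dir=tmp_dir)
    with os.fdopen(fd, "wb") as tmp:
        for index, chunk in enumerate(chunks):
            etag.update(chunk)  # assumption: ETag covers the plaintext
            tmp.write(xor_keystream(key, index, chunk))  # only ciphertext hits disk
    os.replace(tmp_path, final_path)  # move into the final location
    # Hypothetical metadata names; the key ID travels with the object.
    metadata = {"ETag": etag.hexdigest(), "X-Object-Meta-Key-Id": key_id}
    return 201, metadata
```

The temporary file only ever contains encrypted bytes, which is the point of encrypting on the object server during the upload rather than afterwards.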
This diagram shows a more detailed view of the moving parts involved:
This diagram includes the following steps:
The object decryption process is similar. The object server reads the object’s body chunk by chunk and decrypts each chunk before sending it to the proxy server. The ETag is updated for every decrypted chunk, and thus the ETag for the whole decrypted object received by the client matches the object’s MD5 hash.
Actually, download of an encrypted object is a less complicated process in this model, because the key ID used for decryption is stored with an object. Thus, no interaction between the proxy server and the key store is necessary.
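A hedged sketch of the read path: the key ID comes from the object's own metadata, each chunk is decrypted before being passed on, and the running MD5 is compared with the stored ETag at the end. The function name, the metadata layout, the dictionary used as a key store, and the toy keystream cipher are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, Iterable, Iterator


def xor_keystream(key: bytes, chunk_index: int, data: bytes) -> bytes:
    """Toy cipher, NOT for production: XOR with a SHA-256 derived keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = chunk_index.to_bytes(8, "big") + (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def read_object(stored_chunks: Iterable[bytes], metadata: Dict[str, str],
                key_store: Dict[str, bytes]) -> Iterator[bytes]:
    """Yield decrypted chunks; raise if the final MD5 does not match the ETag."""
    key = key_store[metadata["key_id"]]  # key ID is stored with the object
    etag = hashlib.md5()
    for index, chunk in enumerate(stored_chunks):
        plain = xor_keystream(key, index, chunk)
        etag.update(plain)  # ETag is updated for every decrypted chunk
        yield plain
    if etag.hexdigest() != metadata["etag"]:
        raise ValueError("ETag mismatch after decryption")
```

Only the object server talks to the key store here; the caller just consumes plaintext chunks.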
Here is an overview of steps taken by Swift to decrypt and send an object to the client upon request:
The rekey process decrypts and re-encrypts every object from the rekeyed tenant on each object server, chunk by chunk. The main complication of rekeying is that it can affect a very large number of objects, and thus can last much longer than any client can wait for a response. So, the idea is that the
Another complication brought up by rekeying is a possible race with replication. A workaround for this race condition is simple: Update the time stamp of the object upon starting rekeying. This ensures replication won’t overwrite an already processed object with an original one.
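The rekey step and the timestamp workaround can be illustrated with a small in-memory model. The dictionary layout, the function names, the "newest timestamp wins" replication rule as written here, and the toy keystream cipher are simplifying assumptions, not Swift's actual replicator or the proposal's code.

```python
import hashlib
from typing import Any, Dict


def xor_keystream(key: bytes, chunk_index: int, data: bytes) -> bytes:
    """Toy cipher, NOT for production: XOR with a SHA-256 derived keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = chunk_index.to_bytes(8, "big") + (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def rekey_object(obj: Dict[str, Any], keys: Dict[str, bytes],
                 new_key_id: str, now: float) -> None:
    """Re-encrypt one object in place, chunk by chunk, with a new key."""
    obj["timestamp"] = now  # bump first so replication prefers this copy
    old_key, new_key = keys[obj["key_id"]], keys[new_key_id]
    obj["chunks"] = [
        xor_keystream(new_key, i, xor_keystream(old_key, i, chunk))
        for i, chunk in enumerate(obj["chunks"])
    ]
    obj["key_id"] = new_key_id


def replicate(local: Dict[str, Any], remote: Dict[str, Any]) -> Dict[str, Any]:
    """Simplified replication rule: the copy with the newer timestamp wins."""
    return local if local["timestamp"] >= remote["timestamp"] else remote
```

With the timestamp updated at the start of rekeying, a replica still encrypted with the old key loses the comparison regardless of which side initiates replication.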
There are still several open questions in this proposal to discuss. First of all, the nature of the key store and interface to it are not obvious. Our assumption is that it is a simple SQL database for initial implementation. The plugin mechanism lets us use different specialized key storage solutions; the encryption algorithms are also configurable.
Another significant problem is the asynchronous nature of the rekey process. This may require creation of a service to check the status of rekeying. Rekey status check can be implemented as a separate proxy server middleware (similar to Recon). This middleware must parse the metadata of all objects in the account and detect if key IDs are inconsistent.
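One possible shape for such a status check, sketched under the assumption that each object's metadata is available as a dictionary with a key_id entry (a hypothetical layout), is to count the key IDs seen across an account and report how many objects still use a key other than the target one:

```python
from collections import Counter
from typing import Any, Dict, Iterable


def rekey_status(object_metadata: Iterable[Dict[str, str]],
                 target_key_id: str) -> Dict[str, Any]:
    """Summarize rekey progress from per-object metadata."""
    counts = Counter(meta["key_id"] for meta in object_metadata)
    pending = sum(n for key_id, n in counts.items() if key_id != target_key_id)
    return {
        "complete": pending == 0,  # every object already uses the target key
        "pending": pending,        # objects still encrypted with another key
        "key_ids": dict(counts),   # distribution of key IDs in the account
    }
```

A middleware built on this idea would have to walk every object in the account, so the cost of the check grows with the account size.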
Webex and Mirantis are introducing this proposal as a part of our joint activity at the Grizzly Design Summit.