Implementing Encryption Architecture with Cisco Webex for OpenStack Swift object storage
This post is the first of an ongoing series on our engagement with Cisco Webex on their OpenCloud platform.
One of the requirements for data center security is protection of "at rest" data. Usually this protection is about encrypting client-generated contents, including objects stored in the Swift cluster. In most cases, clients themselves could carefully encrypt their data; however, this requires the client to establish and support encryption infrastructure. A cloud provider can create value by offering transparent server-side on-disk encryption.
We have been working on this design as a part of our current engagement with Cisco Webex. Their requirements include encryption of data stored on Swift devices, and clear separation of code to simplify code base maintenance. We aim to address these requirements with the proposal we are bringing to the upcoming OpenStack design conference for the Grizzly release of OpenStack. (If you're attending the conference, Cisco is presenting an overview of our work at the Wednesday morning General Session, 17 October.)
I'll refer to three entities frequently—encrypted object, encryption key, and key storage—so let me briefly define them:
An encrypted object is a data BLOB stored in OpenStack Swift that has been encrypted using a symmetric encryption algorithm with an encryption key. Metadata stored with the encrypted object identifies the key used for encryption, but the key itself is stored separately.
Swift servers read and write objects to disk not in a single piece, but in chunks. Thus, every chunk must be encrypted separately and written to disk already encrypted.
An encryption key is a randomly generated string that is used by the encryption algorithm to encrypt and decrypt the data chunk. Each key has a number of parameters, but the most important is the key ID, which is a unique identifier. The Swift Object Server stores the key ID in the metadata of the encrypted object. This allows decryption of the object before sending it to the client upon receipt of a read request.
Keys can have different scopes. The simplest case is to use one key to encrypt all objects in the cluster. Special key parameters can tie a key to a single tenant, or even a single user in the tenant.
An encryption key can have a limited lifespan and can become inactive, or expire, after a specified period of time. When this happens, a new key must be generated to write new objects. Inactive keys are kept to provide access to previously written objects.
To provide protection from compromised keys, it is possible to implement a rekey operation. This process asynchronously re-encrypts all objects in a particular tenant with a new key generated as a part of the rekey process.
To ensure security of data, we need to keep the keys in a special key store that is not part of the Swift cluster. Key stores can vary greatly in nature, and your particular choice always depends on the requirements and level of security you need. For the purposes of this article we’ll assume that keys are stored in a simple database, for example, MySQL. This database should be separated from the Swift cluster physically and logically, and be accessible only over SSL-protected network connections.
Let’s list the data requirements for a key store, based on our definition of an encryption key. Each key has to:
- be identified by a unique key ID in UUID format,
- be user-specific or tenant-specific,
- have a creation time stamp to provide expiration capability, and
- have a Boolean parameter that shows whether or not the key is active.
These requirements can be satisfied with a database table with these fields:
Table 1. Keys table
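To make the requirements above concrete, here is a minimal sketch of such a table, using Python's built-in sqlite3 module as a stand-in for MySQL. The column names and helper functions are our own illustrative assumptions, not part of the proposal.

```python
import sqlite3
import time
import uuid

# Hypothetical schema: column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE keys (
    key_id     TEXT PRIMARY KEY,  -- unique key ID in UUID format
    tenant_id  TEXT NOT NULL,     -- tenant the key belongs to
    user_id    TEXT,              -- optional narrower scope: a single user
    key_string TEXT NOT NULL,     -- the key material itself
    created_at REAL NOT NULL,     -- creation time stamp, used for expiration
    active     INTEGER NOT NULL   -- Boolean flag: 1 = active, 0 = inactive
)
"""


def open_store():
    """Create an in-memory key store (a real one would be a remote database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_key(conn, tenant_id, key_string, user_id=None, now=None):
    """Insert a new active key and return its UUID."""
    key_id = str(uuid.uuid4())
    created = time.time() if now is None else now
    conn.execute(
        "INSERT INTO keys VALUES (?, ?, ?, ?, ?, 1)",
        (key_id, tenant_id, user_id, key_string, created),
    )
    return key_id


def expire_old_keys(conn, max_age, now=None):
    """Mark keys older than max_age seconds as inactive; rows are never deleted."""
    now = time.time() if now is None else now
    conn.execute(
        "UPDATE keys SET active = 0 WHERE active = 1 AND created_at < ?",
        (now - max_age,),
    )


def active_key_id(conn, tenant_id):
    """Return the newest active key ID for a tenant, or None."""
    row = conn.execute(
        "SELECT key_id FROM keys WHERE tenant_id = ? AND active = 1 "
        "ORDER BY created_at DESC LIMIT 1",
        (tenant_id,),
    ).fetchone()
    return row[0] if row else None


def key_string(conn, key_id):
    """Look up a key by ID; inactive keys stay readable for old objects."""
    row = conn.execute(
        "SELECT key_string FROM keys WHERE key_id = ?", (key_id,)
    ).fetchone()
    return row[0] if row else None
```

Note that expiring a key only flips the active flag. The row stays in place so that objects written with that key can still be decrypted.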
Object encryption and decryption
Two Swift server entities participate in the process of object transfer to and from the client: the proxy server and the object server. It’s obvious that encryption should be performed on one of these servers. Our first thought was to implement it on the proxy server. However, this would expose keys on the publicly available servers, and, even more important, would force us to download a file from the object server and cache it to perform any operation on it (for example, a rekey operation). So, it seems that the object server performing encryption leads to simpler and more transparent data flows.
Our final version of the proposal adds two components to the Swift application. Both of them are middlewares for existing servers. Our implementation plan assumes that the encryption middlewares are distributed and maintained independently from the main Swift code base.
- The first is the keymanage middleware for the proxy-server application. This middleware is responsible for identifying keys for the object server, depending on the authZ context of the request.
- The second is the crypto-object-server middleware for the object-server application. This middleware performs the actual encryption, decryption, and file system I/O, and effectively replaces object-server for certain requests.
In the following sections, I'll detail how all these new components interact with existing ones and how data is passed and transformed inside the system.
The object server encrypts a new object during upload to prevent wrong ETag calculation (because the ETag stored in metadata must match the file's MD5 hash). It also solves the problem of an unencrypted temporary file on the device. Proxy and object servers process objects chunk by chunk, and each chunk gets encrypted by the object server before it is written to disk. The server returns HTTP 201 once all chunks are encrypted, written to a temporary file, and the file is placed in the final destination partition.
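The chunk-by-chunk flow can be sketched as follows. This is a simplified illustration, not the proposed implementation: it assumes the ETag is the MD5 of the plaintext the client sent, and `toy_cipher` is a deliberately insecure placeholder for whatever symmetric algorithm is configured. All function names here are hypothetical.

```python
import hashlib


def toy_cipher(key: bytes, chunk_index: int, chunk: bytes) -> bytes:
    """Placeholder XOR cipher. NOT secure: it only stands in for a real
    symmetric algorithm so that the sketch runs with the standard library.
    XOR is its own inverse, so the same call also decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(chunk):
        stream += hashlib.sha256(
            key + chunk_index.to_bytes(8, "big") + counter.to_bytes(8, "big")
        ).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(chunk, stream))


def encrypt_upload(key, chunks, write):
    """Encrypt each incoming chunk before it is written, and return the ETag.

    `chunks` is an iterable of plaintext byte strings; `write` is a callable
    that persists one encrypted chunk (for example, to a temporary file).
    """
    etag = hashlib.md5()
    for index, chunk in enumerate(chunks):
        etag.update(chunk)                     # hash the plaintext for the ETag
        write(toy_cipher(key, index, chunk))   # only ciphertext reaches the disk
    return etag.hexdigest()
```

Because every chunk is encrypted before `write` is called, plaintext never lands in the temporary file, while the ETag still matches what the client computed.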
This diagram shows a more detailed view of the moving parts involved:
This diagram includes the following steps:
- Client sends a PUT request as usual. This request initiates object transfer through the proxy server.
- The proxy server requests an authorization from Keystone with the user's token using auth_token middleware.
- If the user is an authorized member of the tenant, the proxy server passes the request to the keymanage middleware, which is implemented in the scope of this proposal. This middleware connects to the key store, identifies the key_id of the active key and includes it in the X-Object-Meta-Encryption-Key-ID header of the request. If no active key is found, this header is set to a special value, for example, the "undef" string. After this header is added, the request is passed to the proxy-server application. X-Object-Meta-* headers are passed through and stored with the object without modification, so we don't need to change the proxy-server code.
- The proxy server initiates an internal Swift PUT request to the object servers defined by Ring. The crypto-object-server middleware intercepts requests that contain the X-Object-Meta-Encryption-Key-ID header and encrypts the object using the key specified by the key ID.
- The crypto-object-server middleware retrieves the key directly from the key store. If X-Object-Meta-Encryption-Key-ID has a special value that indicates that no active key was found in the key store (e.g., "undef"), the crypto-object-server generates a key from a random string and sends it to the key store.
- Each object server encrypts the object using the key string obtained in the previous step and puts the object in the destination partition directory. X-Object-Meta-Encryption-Key-ID is stored in the object's metadata, and the key string is discarded from memory. The object server then sends a response to the proxy server with a 201 Created HTTP code, and the proxy server confirms successful upload to the client application.
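The header handling in the steps above can be sketched as a small WSGI filter plus a helper for the object-server side. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, the tenant ID is assumed to arrive in an X-Tenant-Id header set by auth_token, and a plain dict stands in for the key store.

```python
import secrets
import uuid

# WSGI exposes the X-Object-Meta-Encryption-Key-ID header under this key.
KEY_ID_HEADER = "HTTP_X_OBJECT_META_ENCRYPTION_KEY_ID"
UNDEF = "undef"  # special value meaning "no active key found"


class KeyManageMiddleware:
    """Proxy-side filter that tags each PUT with the tenant's active key ID."""

    def __init__(self, app, find_active_key_id):
        self.app = app
        # Hypothetical callable: tenant ID -> active key ID, or None.
        self.find_active_key_id = find_active_key_id

    def __call__(self, environ, start_response):
        # A real filter would also check that the path names an object.
        if environ.get("REQUEST_METHOD") == "PUT":
            # Assumption: auth_token has placed the tenant ID in X-Tenant-Id.
            tenant = environ.get("HTTP_X_TENANT_ID")
            environ[KEY_ID_HEADER] = self.find_active_key_id(tenant) or UNDEF
        return self.app(environ, start_response)


def resolve_key(key_id, store):
    """Object-server side: fetch the key string, creating a key on "undef".

    `store` is a dict mapping key ID -> key string, standing in for the
    remote key store. Returns the (key_id, key_string) pair to use.
    """
    if key_id == UNDEF:
        key_id = str(uuid.uuid4())
        store[key_id] = secrets.token_hex(32)  # random key string, then stored
    return key_id, store[key_id]
```

Keeping the lookup in a filter means the proxy-server application itself only sees one more X-Object-Meta-* header and needs no changes.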
The object decryption process is similar. The object server reads the object’s body chunk by chunk and decrypts each chunk before sending it to the proxy server. The ETag is updated for every decrypted chunk, and thus the ETag for the whole decrypted object received by the client matches the object’s MD5 hash.
Download of an encrypted object is in fact the less complicated process in this model, because the key ID used for decryption is stored with the object. Thus, no interaction between the proxy server and the key store is necessary.
Here is an overview of steps taken by Swift to decrypt and send an object to the client upon request:
- The client downloads an object from Swift with a GET request to the proxy server.
- The Swift proxy server checks the user authorization with Keystone.
- The proxy server requests the object from the object server defined by Ring.
- The crypto-object-server middleware of the object server reads the object’s metadata, learns the ID of the key used to encrypt the object from the X-Object-Meta-Encryption-Key-ID attribute and requests the key string with this ID from the key store.
- The key store sends the key string in response, and crypto-object-server middleware uses this string to decrypt the object and send it back to the proxy server as usual. The proxy server sends the object to the client.
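The read path can be sketched as a generator that looks up the key ID in the object's metadata, fetches the key, and decrypts chunk by chunk while updating the ETag. The function names are hypothetical, and `xor_placeholder` is only a stand-in so the example runs; it is not real encryption.

```python
import hashlib

METADATA_KEY = "X-Object-Meta-Encryption-Key-ID"


def xor_placeholder(key: bytes, index: int, chunk: bytes) -> bytes:
    """Stand-in for the configured cipher. NOT secure; XOR is its own inverse,
    so the same function "encrypts" and "decrypts". `index` is unused here but
    kept so a real per-chunk cipher can be swapped in."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(chunk))


def serve_object(metadata, encrypted_chunks, fetch_key, decrypt_chunk, etag):
    """Yield decrypted chunks of one stored object.

    `fetch_key` maps a key ID to its key string (a call to the key store);
    `etag` is a hashlib object updated with each decrypted chunk.
    """
    key = fetch_key(metadata[METADATA_KEY])  # key ID comes from the metadata
    for index, chunk in enumerate(encrypted_chunks):
        plain = decrypt_chunk(key, index, chunk)
        etag.update(plain)                   # running MD5 of the plaintext
        yield plain
```

The proxy server is not involved in key handling at all here: everything the object server needs is in the object's own metadata.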
The rekey process decrypts and re-encrypts every object from the rekeyed tenant on each object server, chunk by chunk. The main complication of rekeying is that it can affect a very large number of objects, and thus can last much longer than any client can wait for a response. So, the idea is that the crypto-object-server triggers an asynchronous rekey process over the whole list of objects stored in the object server. The list of objects must be composed on the proxy server, because multiple containers have to be parsed to build it.
Another complication brought up by rekeying is a possible race with replication. A workaround for this race condition is simple: update the time stamp of the object upon starting rekeying. This ensures replication won't overwrite an already processed object with the original one.
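The time stamp workaround can be illustrated with a toy model. Objects are plain dicts here, `replication_winner` is a simplified stand-in for the rule that the newest copy wins, and `xor_placeholder` is an insecure placeholder cipher. All names are assumptions made for the sketch.

```python
import time


def xor_placeholder(key: bytes, chunk: bytes) -> bytes:
    """Stand-in cipher, NOT secure; XOR is its own inverse."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(chunk))


def rekey_object(obj, old_key, new_key_id, new_key, decrypt, encrypt, now=None):
    """Return a re-encrypted copy of `obj` carrying a fresh time stamp."""
    chunks = [encrypt(new_key, decrypt(old_key, c)) for c in obj["chunks"]]
    return {
        "chunks": chunks,
        "key_id": new_key_id,
        # The new time stamp makes this copy newer than any unprocessed replica.
        "timestamp": time.time() if now is None else now,
    }


def replication_winner(local, remote):
    """Simplified replication rule: keep the copy with the newest time stamp."""
    return local if local["timestamp"] >= remote["timestamp"] else remote
```

With the time stamp bumped, a replica that still holds the old ciphertext loses the comparison, so the rekeyed copy propagates instead of being overwritten.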
There are still several open questions in this proposal to discuss. First of all, the nature of the key store and the interface to it are not obvious. Our assumption is that it is a simple SQL database for the initial implementation. The plugin mechanism lets us use different specialized key storage solutions; the encryption algorithms are also configurable.
Another significant problem is the asynchronous nature of the rekey process. This may require creation of a service to check the status of rekeying. A rekey status check can be implemented as a separate proxy server middleware (similar to Recon). This middleware must parse the metadata of all objects in the account and detect whether key IDs are inconsistent.
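The consistency check itself is simple once the metadata is in hand. Here is a minimal sketch, assuming the metadata of every object in the account has already been collected into dicts; the function name and the shape of the result are hypothetical.

```python
from collections import Counter

HEADER = "X-Object-Meta-Encryption-Key-ID"


def rekey_status(object_metadata, target_key_id):
    """Summarize rekey progress from an iterable of per-object metadata dicts.

    An object counts as pending if its key ID differs from the key the
    rekey is moving to (including objects with no key ID at all).
    """
    counts = Counter(meta.get(HEADER) for meta in object_metadata)
    pending = sum(n for key_id, n in counts.items() if key_id != target_key_id)
    return {
        "total": sum(counts.values()),
        "pending": pending,
        "complete": pending == 0,
    }
```

For example, an account with three objects of which one still carries the old key ID would report one pending object and an incomplete rekey.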
Webex and Mirantis are introducing this proposal as a part of our joint activity at the Grizzly Design Summit.