OpenStack Swift and the hash_path_suffix – what can go wrong?

Christian Huebner - September 25, 2014

OpenStack Swift uses hash values to store objects. Hashing uses a mathematical algorithm to transform data, for instance a string, into a numeric representation. If the underlying data changes, the hash changes, so hashing can be used to detect changes in the data.

Swift uses the well-known MD5 hashing algorithm to transform the path of a Swift object into a hash value. A segment of the hash generated from the path to the inbound or requested object is used to specify the partition used to store the object. The complete hash is then used to position the object inside this partition.
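
As a rough illustration of the idea, here is a minimal Python sketch, not Swift's actual code. The account, container, and object names and the partition power of 10 are made up for the example, and the secret suffix discussed later in this article is left out for now. It hashes an object path with MD5 and takes the leading bits of the digest as the partition number.

```python
import hashlib
import struct


def path_hash(account, container, obj):
    """Return the raw 16-byte MD5 digest of an object path."""
    path = "/" + "/".join([account, container, obj])
    return hashlib.md5(path.encode("utf-8")).digest()


def partition_for(digest, part_power):
    """Use the top `part_power` bits of the digest as the partition number.

    part_power is an assumed example parameter; it must be between 1 and 32
    because only the first four bytes of the digest are inspected here.
    """
    top32 = struct.unpack_from(">I", digest)[0]  # first 4 bytes, big-endian
    return top32 >> (32 - part_power)


digest = path_hash("AUTH_demo", "photos", "cat.jpg")
part = partition_for(digest, part_power=10)
print("full hash:", digest.hex())
print("partition:", part)
```

With a partition power of 10 there are 1024 possible partitions, so the partition number only narrows down where the object lives; the full hash does the rest.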

In a perfect hash, every possible input string would be represented by a unique hash value, but the hash function used in Swift cannot be perfect. The MD5 hash Swift uses is only 16 bytes long, yet it has to represent strings of arbitrary length. There are far more possible strings than possible hash values, so there’s no guarantee that two different strings have different hash representations.
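
To make the counting argument concrete, the following sketch shrinks the hash to its first two bytes, so there are only 65,536 possible values and a collision turns up after a few hundred tries. This is purely illustrative: the object paths are invented, and finding a collision on the full 16-byte MD5 value is a very different undertaking.

```python
import hashlib
from itertools import count


def short_hash(text, nbytes=2):
    """Return only the first `nbytes` bytes of the MD5 digest of `text`."""
    return hashlib.md5(text.encode("utf-8")).digest()[:nbytes]


def find_collision(nbytes=2):
    """Generate example paths until two of them share a truncated hash."""
    seen = {}
    for i in count():
        path = f"/AUTH_demo/photos/object-{i}"  # hypothetical object paths
        h = short_hash(path, nbytes)
        if h in seen:
            return seen[h], path
        seen[h] = path


first, second = find_collision()
print(first, "and", second, "share the truncated hash", short_hash(first).hex())
```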

Malicious use of hash collisions

Using non-unique hash values for placing objects bears a risk. It’s possible for two objects with different object paths to translate into the same hash, and thus be stored in the same place in Swift. In that case, the second object stored overwrites the first.

This flaw means that a malicious attacker can handcraft an object path whose hash matches the hash of the path of the object they want to replace with their own copy. The object they insert, even though it has a completely different object path from the original, would then overwrite the original object. A user retrieving the original object would not be able to tell that it’s been replaced.

Protecting the hash

Early on, the Swift developers realized that this attack vector posed a substantial risk and implemented protection against it: the hash_path_suffix. This value is defined in the Swift configuration files on each storage node. It must be the same across the whole cluster and, just as importantly, must be kept secret.

To calculate the hash with the hash_path_suffix, the suffix is appended to the object path of the requested object, and the hash is then computed from the resulting string. Thus, if an attacker tries to insert an object whose path hashes to the same value as that of an existing object, the hashes of the strings extended with the hash_path_suffix will not match, because the attacker does not know the secret hash_path_suffix that goes into the hash.

This means that without knowing the hash_path_suffix, it is no longer possible for an external attacker to knowingly create an alternate URI that produces the same hash as the original.

(There is still a tiny risk that the hashes will match because of the general risk of MD5 collisions. This risk, though, is exceedingly small and does not provide a viable attack vector for the malicious attacker.)
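
The mechanism described in this section can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than Swift's own implementation: the function name, the example path, and both suffix values are made up here.

```python
import hashlib


def hash_path(suffix, account, container=None, obj=None):
    """Hash an object path with the secret suffix appended (simplified sketch)."""
    parts = [p for p in (account, container, obj) if p is not None]
    path = "/" + "/".join(parts)
    return hashlib.md5((path + suffix).encode("utf-8")).hexdigest()


# The cluster hashes with its secret suffix; an outsider can only guess.
with_cluster_suffix = hash_path("cluster-secret", "AUTH_demo", "photos", "cat.jpg")
with_guessed_suffix = hash_path("attacker-guess", "AUTH_demo", "photos", "cat.jpg")

print(with_cluster_suffix)
print(with_guessed_suffix)
```

The same path yields two unrelated hashes, so a path that an attacker crafted against their own guess tells them nothing about where the cluster will actually place the object.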

The risk of misconfiguration

Unlike the rings, the hash_path_suffix cannot change over the life of the cluster. New nodes that are added must use the same hash_path_suffix as the pre-existing nodes.

Now what would happen if a node was added with a different hash_path_suffix?

For an explanation we must look at the auditors. Auditors are processes that constantly scour the data space of a Swift cluster, comparing hash values and MD5 sums with the desired values to detect both corruption of copies and objects that are written in the wrong place.

For our purposes, the hash comparison is the interesting part. The auditor compares the hash of the object path plus the hash_path_suffix with the hash encoded in the object’s location. Because the newly added server uses an incorrect hash_path_suffix, the hash it computes does not match the location, so the auditor removes the seemingly “broken” copy from its location and moves it to a quarantine space.
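
The comparison can be sketched as follows. This is a deliberately simplified model of the check, not the auditor's real code: the function names, the object path, and both suffix values are assumptions made for the example.

```python
import hashlib


def expected_hash(object_path, suffix):
    """Hash the object path together with a node's configured suffix."""
    return hashlib.md5((object_path + suffix).encode("utf-8")).hexdigest()


def audit(object_path, location_hash, local_suffix):
    """Return 'ok' if the location matches the recomputed hash, else 'quarantine'."""
    if expected_hash(object_path, local_suffix) == location_hash:
        return "ok"
    return "quarantine"


cluster_suffix = "correct-secret"   # value used by the rest of the cluster
bad_suffix = "typo-secret"          # value mistakenly set on the new node
path = "/AUTH_demo/photos/cat.jpg"

# The copy was placed according to the cluster-wide suffix.
location = expected_hash(path, cluster_suffix)

print(audit(path, location, cluster_suffix))  # prints "ok"
print(audit(path, location, bad_suffix))      # prints "quarantine"
```

Note that the data itself is intact in the second case. Only the node's idea of where the object belongs is wrong.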

How to fix what is broken

Once the cluster operator detects the fault, usually through log messages, the hash_path_suffix must be corrected and all Swift services on the defective node restarted. Once that’s done, internal replication can sync a good copy of the respective objects into their correct positions from another node.

It’s important to note that while this process can even work if multiple hosts have objects quarantined with the incorrect hash_path_suffix, it does require that there’s at least one good copy of the object remaining to be synced once the configuration has been corrected.

If the number of defective storage nodes is equal to or larger than the replication factor, some objects can end up with all of their copies quarantined. Swift then cannot automatically replace the missing objects, because no good copies are left to replicate.

When that happens, you have two options for recovery:

It is possible to reinsert the objects manually into at least one correct location, and then let the normal internal replication sync. This requires a significant amount of calculation to determine the correct positions, and as internal replication is not designed to be fast, will also take a significant amount of time until consistency is reached.

Often it will be easier to extract the objects from the quarantine and upload them again with the original object path. Not only will this transfer the burden of location calculation to the cluster, but it will also quickly write at least a quorum of objects and thus better protect you from losing the object again, for instance due to hardware failure before internal replication can be achieved.

The easiest method, though, is prevention. Scripting the install, or at least copying the configuration files instead of manually editing them, helps eliminate this kind of mishap, and keeps valuable data safe.
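
One cheap safeguard is to compare the configured suffix across nodes before a new node goes into service. The sketch below assumes the configuration text of each node has already been collected by some other means, and that the value lives in the [swift-hash] section of swift.conf under the key swift_hash_path_suffix. The node names and suffix values are invented for the example.

```python
import configparser
from collections import defaultdict


def read_suffix(conf_text):
    """Extract the hash path suffix from the text of a swift.conf file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.get("swift-hash", "swift_hash_path_suffix")


def group_nodes_by_suffix(conf_by_node):
    """Map each distinct suffix value to the list of nodes that use it."""
    groups = defaultdict(list)
    for node, text in conf_by_node.items():
        groups[read_suffix(text)].append(node)
    return dict(groups)


# Hypothetical configuration contents gathered from three storage nodes.
configs = {
    "storage-01": "[swift-hash]\nswift_hash_path_suffix = s3cr3t\n",
    "storage-02": "[swift-hash]\nswift_hash_path_suffix = s3cr3t\n",
    "storage-03": "[swift-hash]\nswift_hash_path_suffix = s3cr3t-typo\n",
}

groups = group_nodes_by_suffix(configs)
consistent = len(groups) == 1

if not consistent:
    # Report only node names so the secret itself never ends up in logs.
    for nodes in groups.values():
        print("nodes sharing one suffix value:", ", ".join(nodes))
```

A check like this fits naturally into the same automation that installs the nodes, so a mismatch is caught before any object is written.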

Conclusion

The hash_path_suffix mechanism is useful for protecting a Swift cluster from a certain class of malicious attacks, but at the price of additional configuration effort. As with other OpenStack components, automation of installation and configuration significantly reduces the risk of misconfiguration. Recovery is possible but time-consuming, so prevention should be prioritized.