I frequently get the same question from customers who say, “We heard this Ceph thing replaces all other storage. Can’t we use that for everything?”
I’ll be discussing Ceph vs Swift from an architectural standpoint at the OpenStack Summit in Vancouver, sharing details on how to decide between them, and advising on solutions including both platforms. For now, let’s look at some of their architectural details and differences.
A Closer Look
Swift has been around since the dawn of OpenStack time, which was barely five years ago. It is one of the core projects of OpenStack and has been tested and found stable and useful time and again.
Trouble is, Swift’s design comes up short in both transfer speed and latency. A major reason for these issues is that the traffic to and from the Swift cluster flows through the proxy servers.
Another reason many people think Ceph is the better alternative is that Swift does not provide block or file storage.
Finally, stale reads rear their ugly head: object replicas aren’t necessarily updated at the same time, so a requester can receive an old version of an object shortly after a new version has been written. This behavior is known as eventual consistency.
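To make the stale-read behavior concrete, here is a toy model of an eventually consistent store. This is an illustrative sketch, not Swift code: the class name, the `replicated` parameter, and the background `replicate()` step are all invented for the example.

```python
import random

class EventuallyConsistentStore:
    """Toy model of eventual consistency (not actual Swift internals)."""

    def __init__(self, replica_count=3):
        # Each replica is just a dict mapping object name to contents.
        self.replicas = [dict() for _ in range(replica_count)]

    def write(self, key, value, replicated=1):
        # Only the first `replicated` replicas see the new value right away;
        # the rest still hold the old version until replication catches up.
        for replica in self.replicas[:replicated]:
            replica[key] = value

    def read(self, key):
        # A reader may be served by any replica, so it can get a stale value.
        return random.choice(self.replicas).get(key)

    def replicate(self, key):
        # Background replication eventually brings all replicas in sync.
        newest = self.replicas[0].get(key)
        for replica in self.replicas:
            replica[key] = newest
```

Immediately after a write that reached only one replica, `read()` can still return the old version; only after `replicate()` runs do all readers see the new one.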
Ceph, on the other hand, has its own set of issues, especially in a cloud context. Its multi-region support, while often cited as an advantage, follows a master-slave model. Since replication is possible only from master to slave, load is distributed unevenly in any infrastructure that covers more than two regions.
Ceph’s two-region design is also impractical: writes are supported only on the master, and there is no provision to block writes on the slave. In the worst case, such a configuration can corrupt the cluster.
Another drawback to Ceph is security. RADOS clients on cloud compute nodes communicate directly with the RADOS servers over the same network Ceph uses for unencrypted replication traffic. If a Ceph client node gets compromised, an attacker could observe traffic on the storage network.
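Ceph does let you put replication traffic on a separate network, which at least narrows what a compromised client can observe. A ceph.conf fragment along these lines separates the two (the subnets here are placeholders, and note that replication traffic itself remains unencrypted):

```ini
[global]
# Clients (e.g. RADOS clients on compute nodes) talk to monitors and
# OSDs over the public network only.
public network = 10.0.0.0/24
# OSD replication and heartbeat traffic is confined to a cluster
# network that client nodes have no route to.
cluster network = 10.0.1.0/24
```

This mitigates, but does not eliminate, the exposure described above: a compromised client still shares the public network with all other storage traffic.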
In light of Ceph’s drawbacks, you might ask why we don’t just build a Ceph cluster that spans two regions. One reason is that Ceph writes synchronously and requires a quorum of write acknowledgments before a write returns successfully.
With those issues in mind, let’s imagine a cluster with two regions, separated by a thousand miles, 100ms latency, and a fairly slow network connection. Let’s further imagine we are writing two copies into the local region and two more to the remote region. Now the quorum of our four copies is three, which means the write request is not going to return before at least one remote copy is written. It also means that even a small write will be delayed by 0.2 seconds, and larger writes are going to be seriously hampered by the throughput restriction.
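The arithmetic in that thought experiment can be sketched in a few lines. The numbers are the hypothetical values from the text, not measurements:

```python
# Back-of-the-envelope cost of a synchronous quorum write across two
# regions, using the hypothetical figures from the scenario above.

total_copies = 4                # 2 copies local + 2 copies remote
quorum = total_copies // 2 + 1  # majority: 3 of 4 copies must ack

one_way_latency = 0.100         # seconds between the two regions

# With only 2 local copies, at least one ack must come from the remote
# region, so every write pays at least one network round trip.
min_write_delay = 2 * one_way_latency

print(quorum)           # 3
print(min_write_delay)  # 0.2 seconds added to even the smallest write
```

Larger writes fare even worse, since the slow inter-region link also caps throughput for the remote copies.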
On the other hand, Swift in the same two-region architecture can write locally first and then replicate to the remote region over time, thanks to its eventual consistency design. Swift also requires a write quorum, but the write_affinity setting can direct the quorum of writes to the local region, so the request returns a success status as soon as the local writes finish.
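In Swift, this behavior is configured on the proxy servers. A proxy-server.conf fragment along these lines keeps writes (and reads) local to a region; the region name `r1` and the specific values are illustrative:

```ini
[app:proxy-server]
use = egg:swift#proxy
# Prefer servers in region 1 when choosing where to read from.
sort_method = affinity
read_affinity = r1=100
# Send the quorum of writes to region 1 first; replication moves the
# data to the remote region asynchronously afterward.
write_affinity = r1
write_affinity_node_count = 2 * replicas
```

With write_affinity set, the write returns as soon as the local quorum is satisfied, which is exactly what makes Swift tolerable over a slow inter-region link.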
So how do we decide between Ceph and Swift?
How To Choose
In a single-region deployment without plans for multi-region expansion, Ceph can be the obvious choice. Mirantis OpenStack offers it as a backend for both Glance and Cinder; however, once larger scale comes into play, Swift becomes more attractive as a backend for Glance. Its multi-region capabilities may trump Ceph’s speed and stronger consistency model.
In many cases, speed is not the deciding factor; security may be a bigger issue, and that favors Swift with its closed-off replication network. On the other hand, if the cloud infrastructure is well protected, security may be a lower priority, putting Ceph back in the running.
Rather than choosing one over the other, it may make sense to have both alternatives in the same cloud infrastructure. For example, you could use Ceph for local high performance storage while Swift could serve as a multi-region Glance backend where replication is important but speed is not critical. However, a solution with both components incurs additional cost, so it may be desirable to standardize on one of the options.
My personal recommendation, based on many customer engagements, is a thorough assessment of all business, technical, and operational factors. (Mirantis offers architectural design assessments to help collect requirements and parameters and to arrive at a solution that fits individual use cases and business drivers.) You can then weigh those factors against the capabilities and drawbacks of both options. And who knows? You may be surprised at the winner.
Of course, this is a pretty simplistic view of the topic. I will be discussing this topic in depth on Monday, May 18 at 5:30 at the OpenStack Summit in Vancouver. I’d love to know what you’d like to hear about; please let me know in the comments below.