Why We Chose Ceph to Build Block Storage
In January 2013, DigitalOcean became one of the first cloud providers to offer SSD storage. For several years, the storage available to Droplets was a slice of the hypervisor's local drives. This approach worked well but had limitations, such as:
- Volume size and growth were limited by the hypervisor's complement of drives, which was shared with other Droplets.
- Storage was released once a Droplet was destroyed. The term “ephemeral” is sometimes used to describe this virtualization strategy.
- Storage volumes could not be easily moved or reattached to different Droplets.
For these and other reasons, we introduced Block Storage in July 2016. Since then, we’ve steadily increased capacity and have deployed into all service regions. In this post, we'll explore the underlying technology behind our Block Storage offering.
Creating Block Storage That Can Scale
In the past, portable, scalable block storage was usually provided by a traditional SAN (Storage Area Network). SANs tended to be expensive and difficult to manage, scale, and upgrade, and the architecture was susceptible to considerable vendor lock-in.
At DigitalOcean, we love and support open-source software. So when the time came to architect our Block Storage service, we used these guiding criteria:
- Open-source software, available to a wide community of users, testers, and developers
- Widespread deployment in production at scale
- Ease of scaling up and out
- Freedom from scalability barriers
- Freedom from vendor lock-in and product obsolescence
- Fault tolerance
- RAS: Reliability, Availability, Serviceability
- Transparent maintenance and upgrade operations
- Strong protection of customer data integrity
The solution that best meets all of these criteria is the leader in open, widely adopted distributed storage: Ceph.
Ceph in Production
In the 15 years since Ceph began, it has steadily grown in popularity, performance, stability, scalability, and features. As GNU Lesser General Public License (LGPL) open-source software, Ceph enjoys a rich community of users and developers, including multiple DigitalOcean engineers who've contributed upstream code to the core Ceph project.
The RBD (RADOS Block Device) service provided by Ceph slots right into the popular KVM/QEMU virtualization technology we employ. Droplets enjoy flexible block storage that is presented just like a local drive.
Our Ceph-backed Block Storage service is also 100% SSD-based. Ceph is built for redundancy, and we carefully ensure that the loss of a single drive, server, or even an entire data center rack does not compromise data integrity or availability.
Ceph gracefully heals itself when individual components fail, ensuring continuity of service with uncompromised data protection. Additionally, we use sophisticated monitoring systems built around tools including Icinga, Prometheus, and our own open-source ceph_exporter. These help us respond immediately to any issues with our Ceph infrastructure to ensure continuous availability.
Our Block Storage deployment into each new Droplet region brings hundreds of enterprise-class SSDs managed by the Luminous release of Ceph. We keep three copies of your data to ensure the highest data durability and availability. These replicas are carefully distributed across separate servers and racks to eliminate any single point of failure.
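To make the rack-level distribution concrete, here is a minimal sketch in Python. It is not Ceph's CRUSH algorithm and not our production configuration; it is a simplified stand-in that uses deterministic hashing to pick three hosts, each in a different rack, and then counts how many copies remain if one rack fails. The cluster layout, the names `EXAMPLE_CLUSTER`, `place_replicas`, and `copies_after_rack_loss`, and the chunk identifier are all hypothetical.

```python
import hashlib

# Hypothetical cluster layout: rack name -> storage hosts in that rack.
EXAMPLE_CLUSTER = {
    "rack-a": ["host-a1", "host-a2"],
    "rack-b": ["host-b1", "host-b2"],
    "rack-c": ["host-c1", "host-c2"],
    "rack-d": ["host-d1", "host-d2"],
}


def _score(object_id, name):
    """Deterministic pseudo-random score for an (object, location) pair."""
    return hashlib.sha256(f"{object_id}:{name}".encode()).hexdigest()


def place_replicas(object_id, hosts_by_rack, copies=3):
    """Choose `copies` hosts for an object, each in a different rack.

    Simplified illustration only: real Ceph placement is computed by CRUSH.
    """
    if len(hosts_by_rack) < copies:
        raise ValueError("need at least as many racks as copies")
    # Rank racks by score and keep the first `copies` of them.
    racks = sorted(hosts_by_rack, key=lambda rack: _score(object_id, rack))[:copies]
    # Within each chosen rack, pick the host with the lowest score.
    return [
        (rack, min(hosts_by_rack[rack], key=lambda host: _score(object_id, host)))
        for rack in racks
    ]


def copies_after_rack_loss(placement, failed_rack):
    """Count the replicas that are still reachable if one rack goes down."""
    return sum(1 for rack, _host in placement if rack != failed_rack)


placement = place_replicas("volume-123/chunk-0", EXAMPLE_CLUSTER)
for rack in EXAMPLE_CLUSTER:
    print(rack, "down ->", copies_after_rack_loss(placement, rack), "copies left")
```

Because no two replicas share a rack, losing any one rack removes at most one copy, which is the property the paragraph above describes.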
Each Ceph cluster's performance and utilization are carefully monitored so that we can add resources as needed. Ceph's flexibility allows us to expand existing storage clusters or even add new ones to a region completely transparently. We are also able to upgrade Ceph and complete other types of fleet-wide maintenance in a rolling fashion, without downtime or other impacts to our valued customers.
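The idea behind rolling maintenance can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our actual tooling or anything built into Ceph: it assumes a hypothetical map from data chunks to the racks holding their copies, and it only takes a rack offline when every chunk would still have at least two copies online. The names `safe_to_take_offline`, `rolling_maintenance`, and `EXAMPLE_PLACEMENTS` are invented for this example.

```python
# Hypothetical placement map: chunk name -> set of racks holding a copy.
EXAMPLE_PLACEMENTS = {
    "chunk-0": {"rack-a", "rack-b", "rack-c"},
    "chunk-1": {"rack-b", "rack-c", "rack-d"},
    "chunk-2": {"rack-a", "rack-c", "rack-d"},
}


def safe_to_take_offline(rack, placements, min_copies_online=2):
    """True if every chunk keeps enough copies online without `rack`."""
    return all(
        len(holders - {rack}) >= min_copies_online
        for holders in placements.values()
    )


def rolling_maintenance(racks, placements, do_maintenance, min_copies_online=2):
    """Service one rack at a time, refusing any step that is not safe.

    `do_maintenance` is a caller-supplied function that performs the work
    on a single rack and returns once that rack is back in service.
    """
    completed = []
    for rack in racks:
        if not safe_to_take_offline(rack, placements, min_copies_online):
            raise RuntimeError(f"taking {rack} offline would leave too few copies")
        do_maintenance(rack)
        completed.append(rack)
    return completed


rolling_maintenance(
    ["rack-a", "rack-b", "rack-c", "rack-d"],
    EXAMPLE_PLACEMENTS,
    do_maintenance=lambda rack: print("servicing", rack),
)
```

With three copies spread across distinct racks, any single rack can be serviced while two copies of every chunk stay online. If a chunk were already down to two copies, the check would block maintenance on either rack holding it.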
It is important to note, however, that this replication happens entirely behind the scenes. It prevents us from losing your Block Storage volume data, but it does not protect your Droplet itself, nor does it allow recovery from accidental deletion on your end. Thus, backups of critical data are still important. See these articles for help on Block Storage volume snapshots and data backups:
- Introduction to DigitalOcean Backups
- Understanding DigitalOcean Droplet Backups
- Creating a Snapshot from a Block Storage Volume
And if you haven’t already, create your own Block Storage volume on DigitalOcean.
Anthony D’Atri is a veteran sysadmin who's been working with Ceph for four years, starting with the Dumpling release. He is the co-author, along with Vaibhav Bhembre, of Learning Ceph, which outlines architecting, deploying, and managing Ceph at scale. He enjoys photography and a never-ending quest for exotic fruit. He lives in Portland, Oregon, with his wife and son.