OpenStack Swift software enables users to store data on inexpensive commodity server hardware, but the default setting calls for three replicas of every object to be stored in separate locations, requiring extra storage capacity. Erasure coding would offer customers the chance to reduce the number of servers and drives they need to buy.
Erasure codes work by breaking data into fragments and encoding them with redundant parity pieces; the system then stores the fragments in separate locations, such as disks, storage nodes or geographic sites. The original data can be reconstructed from a subset of the fragments.
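The fragment-and-parity idea can be sketched with the simplest possible erasure code: a single XOR parity fragment, which lets any one lost fragment be rebuilt from the survivors. (Production systems use schemes such as Reed-Solomon that tolerate multiple losses; the helpers below are illustrative, not Swift's implementation.)

```python
from functools import reduce

def encode(data: bytes, k: int) -> list:
    """Split data into k equal fragments and append one XOR parity fragment."""
    size = -(-len(data) // k)                   # ceiling division
    padded = data.ljust(size * k, b"\x00")      # pad so fragments align
    frags = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*frags))
    return frags + [parity]

def recover(frags: list, missing: int) -> bytes:
    """Rebuild the fragment at index `missing` by XOR-ing all the others."""
    survivors = [f for i, f in enumerate(frags) if i != missing]
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*survivors))

fragments = encode(b"hello swift!", 4)   # 4 data fragments + 1 parity
rebuilt = recover(fragments, 2)          # pretend fragment 2 was lost
assert rebuilt == fragments[2]
```

Because the XOR of all five fragments is zero in every byte position, XOR-ing any four of them reproduces the fifth; that is exactly the "recover from a subset" property, in miniature.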
The development community announced the initiative last week. Box Inc., EVault Inc., Intel Corp. and SwiftStack Inc. are key collaborators on the open source development effort to add erasure coding to Swift object storage. The companies are seeking ways to drive down operational costs and capital expenses while retaining high degrees of durability, and Intel has already worked on some of the initial proposals and prototypes, according to Joe Arnold, CEO of OpenStack-powered cloud storage vendor SwiftStack.
Arnold said the goal will be to allow erasure-encoded data and replicated data to coexist in the same cluster. Replicas will remain the default setting in Swift since that approach works better at a smaller scale, according to Arnold. The replica model requires less CPU effort, taxes the network to a lesser degree and offers simpler failure-handling than the erasure-coding approach, he said.
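One way such coexistence could surface to operators is as named per-cluster policies, with each container pinned to either replication or erasure coding. The configuration below is a purely hypothetical sketch (section and parameter names invented for illustration), not an announced Swift interface:

```ini
# swift.conf (hypothetical) -- two data-placement policies in one cluster
[storage-policy:0]
name = triple-replica          # the default: three full copies
policy_type = replication
replicas = 3

[storage-policy:1]
name = ec-10-4                 # erasure coded: 10 data + 4 parity fragments
policy_type = erasure_coding
ec_num_data_fragments = 10
ec_num_parity_fragments = 4
```

Under a model like this, latency-sensitive data could stay on the replica policy while colder data moves to the space-efficient erasure-coded one, matching Arnold's stated goal.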
"We love the replica model. It's operationally simple, low-latency and highly available. This pairs up nicely with the use cases of many production Swift clusters," Arnold wrote in an email. "But why not provide replicas when there is an advantage and erasure codes to save space when the demands on the data are less intensive?"
The main downside of erasure coding is the CPU overhead of encoding data on writes and decoding it on reads. On the flip side, David Floyer, chief technology officer at Wikibon, a community-focused research and analysis firm in Marlborough, Mass., said that given adequate compute power, erasure coding can recover data faster and more reliably than replication, withstanding multiple simultaneous failures.
"Do you want to use processing power as a cheaper way of providing protection than extra disks? That's really the tradeoff," Floyer said, noting that Moore's Law will continue to reduce the cost of processing power.
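The capacity side of that tradeoff is easy to quantify. Comparing the article's three-replica default against a hypothetical 10-data/4-parity erasure-coding layout (the exact fragment counts are illustrative, not Swift's):

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data under n-way replication."""
    return float(copies)

def ec_overhead(data_frags: int, parity_frags: int) -> float:
    """Raw bytes stored per byte of user data under k+m erasure coding."""
    return (data_frags + parity_frags) / data_frags

# Three replicas: 3.0x raw capacity, survives the loss of 2 copies.
# 10+4 erasure coding: 1.4x raw capacity, survives the loss of any 4 fragments.
print(replication_overhead(3))   # 3.0
print(ec_overhead(10, 4))        # 1.4
```

The erasure-coded layout here stores less than half the raw bytes while tolerating more failures, at the price of the CPU work Floyer describes.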
He said the introduction of erasure coding represents a significant enhancement for OpenStack Swift object storage and opens the door for independent software vendors (ISVs) to use it with more types of applications. He expects ISVs to try the technology and increasingly pair it with new applications over the next five to 10 years.
"Early adoption is not going to be fantastic," Floyer predicted.
Ashish Nadkarni, a research director in the storage systems practice at Framingham, Mass.-based International Data Corp., said vendors of commercial object-based storage products often "pooh-poohed" OpenStack Swift due to its lack of support for erasure coding. OpenStack Swift poses a threat to object storage vendors such as Amplidata, Cleversafe Inc., EMC Corp. and Scality Inc., he said.
Although erasure coding represents good news for proponents of Swift object storage, potential users will need to wait for the actual implementation to come through before they get too excited because "there are different ways to do erasure codes, and ultimately the devil is in the details," Nadkarni said.
The OpenStack Swift announcement did not specify a timetable for the completion of the erasure code work. SwiftStack's Arnold said only that the last major development effort, for a globally replicated cluster, kicked off last September and finished in July of this year. He said erasure coding also represents a "major leap for Swift."
Once the project is complete, OpenStack Swift developers plan to provide recommendations on when to use erasure codes, taking into consideration workload, data lifecycle and file size, Arnold said.