The Tahoe Least-Authority File System, or Tahoe-LAFS, is an open source cloud storage option designed to address common security and reliability concerns with storing data in public clouds. Just as a RAID array stripes data across multiple disks, Tahoe-LAFS stripes data across multiple cloud storage providers. Security is improved because individual cloud storage providers store only data fragments. Tahoe-LAFS also enhances reliability because data is stored with sufficient redundancy to guard against the failure of one or more providers.
Data storage redundancy is achieved through a technique known as erasure coding. Erasure coding splits data into encoded fragments with enough redundancy that the original can be rebuilt from a subset of them. In practice, this lets you specify how many drives (or, in this case, cloud providers) can fail without affecting the functionality of the file system.
Erasure coding uses the variables K and N. K is the minimum number of providers that must be functional at any given time, while N is the total number of providers used. Hence, recovery goals can be expressed as K of N. Put into practice, each of your N cloud providers will store a volume of data equal to the total size of your data set divided by K.
To further illustrate this concept, let's examine the default Tahoe-LAFS parameters, in which K=3 and N=10. These values, which can be changed, specify that 10 different cloud service providers are being used and that up to seven of them (N minus K) can fail at any given time. Put another way, at least three providers must remain online for the file system to remain functional.
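To see why any three of the 10 providers are enough, it can help to look at a toy K-of-N erasure code. The sketch below is an illustration only and not the codec Tahoe-LAFS actually uses; the function names (`encode`, `decode`) and the small prime field are assumptions made for the example. It treats the K data symbols as points on a polynomial and hands each provider one additional point on the same curve.

```python
from itertools import combinations

P = 257  # prime modulus, chosen so any byte value (0..255) fits in the field


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through the given (x, y) points, working modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Turn K data symbols into N shares, one per provider."""
    points = list(enumerate(data, start=1))  # symbol i sits at x = i
    return [(x, _interpolate(points, x)) for x in range(1, n + 1)]


def decode(shares, k):
    """Rebuild the K data symbols from any K surviving shares."""
    if len(shares) < k:
        raise ValueError("fewer than K shares survived; data is unrecoverable")
    points = shares[:k]
    return [_interpolate(points, x) for x in range(1, k + 1)]


data = [72, 105, 33]           # K = 3 data symbols
shares = encode(data, n=10)    # N = 10 shares

# Every possible set of 3 surviving providers recovers the original data.
recoverable = all(
    decode(list(subset), k=3) == data
    for subset in combinations(shares, 3)
)
print("any 3 of 10 shares recover the data:", recoverable)
```

Because a polynomial of degree K-1 is fully determined by any K of its points, it does not matter which seven providers are lost, only that three remain.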
Now suppose you need to store 1 TB (1,024 GB) of data in the cloud using the default Tahoe-LAFS parameters. Each of the 10 cloud providers must store enough data to insulate against the failure of any seven providers. The volume of data that must be stored with each provider is the total size of the data set (1,024 GB) divided by K (3). In this case, each of the 10 cloud providers would have to store approximately 341.3 GB of data.
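The same arithmetic can be written as a short helper for trying other data set sizes or parameters. This is a minimal sketch; the function name `share_plan` is hypothetical, and it ignores any per-share overhead a real deployment would add.

```python
def share_plan(dataset_gb, k=3, n=10):
    """Return the per-provider volume and the number of provider
    failures that a K-of-N layout can tolerate."""
    if not 1 <= k <= n:
        raise ValueError("K must be between 1 and N")
    return {
        "per_provider_gb": dataset_gb / k,  # each share is 1/K of the data set
        "tolerated_failures": n - k,        # any K remaining providers suffice
    }


plan = share_plan(1024)  # 1 TB with the default K=3, N=10
print(f"{plan['per_provider_gb']:.1f} GB per provider, "
      f"{plan['tolerated_failures']} providers may fail")
```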
It is important to consider what this level of reliability does to your storage costs. Cloud storage providers charge based on the volume of data being stored (some also charge for input/output). Using the example above, the redundancy requirements would more than triple the total volume of data being stored in the cloud: about 3,413 GB spread across 10 providers instead of 1,024 GB stored with a single provider, an increase by a factor of N divided by K (roughly 3.3).
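Since K and N can be changed, it is worth comparing how different settings trade fault tolerance against stored volume. The sketch below does this for a 1,024 GB data set; the helper name `compare_layouts` and the alternative K/N pairs are assumptions picked for illustration, not recommended settings.

```python
def compare_layouts(dataset_gb, layouts):
    """For each (K, N) pair, compute tolerated failures, total stored
    volume across all providers, and the expansion factor N/K."""
    rows = []
    for k, n in layouts:
        if not 1 <= k <= n:
            raise ValueError(f"invalid layout K={k}, N={n}")
        rows.append({
            "k": k,
            "n": n,
            "tolerated_failures": n - k,
            "total_stored_gb": dataset_gb * n / k,
            "expansion": n / k,
        })
    return rows


rows = compare_layouts(1024, [(3, 10), (5, 10), (7, 10)])
for r in rows:
    print(f"K={r['k']} N={r['n']}: survives {r['tolerated_failures']} failures, "
          f"stores {r['total_stored_gb']:.0f} GB ({r['expansion']:.2f}x)")
```

Raising K while keeping N fixed lowers the total volume you pay for, but it also reduces the number of provider failures the file system can survive.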