Primary data storage options for the cloud
Using object storage and commodity hardware is certainly one option for getting started in the cloud, but that is not always the best approach. Object repositories make a lot of sense when it comes to supporting the storage needs of new, cloud-architected applications that are written to a RESTful application programming interface. However, for existing applications, customers are usually better off pursuing other approaches, such as taking advantage of public cloud block storage services, cloud storage gateways, or third-party products that are closely aligned to one or more public clouds.
So, how do you determine which workloads and use cases are suitable for public cloud storage? The cloud may look attractive from a cost and scalability standpoint, but certain types of workloads and storage use cases are much better served in the cloud than others.
First, a bit of background. Public cloud storage options for business and enterprise customers have come a long way since Taneja Group first wrote about them in 2008. While Amazon remains the king of the hill, customers today have more providers and storage options to choose from, some of which have better functionality and reduced prices. Ongoing innovation and escalating competition are making cloud storage options more attractive than ever before.
Major public cloud providers serving the enterprise market today tend to offer three basic storage services: block storage tightly coupled with a cloud computing platform (e.g., Amazon Web Services [AWS] Elastic Block Store [EBS], Hewlett-Packard [HP] Cloud Block Storage and Windows Azure Block Blob Storage); independent object storage (e.g., Amazon Simple Storage Service, HP Cloud Object Storage and Windows Azure Blob Storage); and a content delivery network service (e.g., Amazon CloudFront, HP Cloud CDN and Windows Azure CDN). The object storage options are primarily of interest to developers writing cloud-enabled apps based on next-generation frameworks. The block storage option, on the other hand, exists primarily to support the legacy storage needs of existing applications.
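To make the block-versus-object distinction concrete, here is a minimal in-memory sketch of the two access models. The class and method names (ToyBlockVolume, ToyObjectStore, write_block, put, head) are hypothetical and do not correspond to any provider's API; the sketch only illustrates that block storage is addressed by block number, which is what existing applications and file systems expect, while object storage handles whole objects by key along with their metadata.

```python
class ToyBlockVolume:
    """Hypothetical stand-in for a block volume: fixed-size blocks addressed by number."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        # Every block starts out zero-filled, like a freshly provisioned volume.
        self._blocks = [bytes(block_size) for _ in range(num_blocks)]

    def write_block(self, index, data):
        # Block devices read and write whole blocks, never partial ones.
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self._blocks[index] = data

    def read_block(self, index):
        return self._blocks[index]


class ToyObjectStore:
    """Hypothetical stand-in for an object store: whole objects addressed by key."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        # An object is stored and replaced as a unit, together with its metadata.
        self._objects[key] = (data, dict(metadata or {}))

    def get(self, key):
        return self._objects[key][0]

    def head(self, key):
        # Return a copy of the metadata without the object body.
        return dict(self._objects[key][1])
```

An application written for a file system sits naturally on top of the first model, while an application written to a RESTful interface maps onto the put/get/head operations of the second.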
Alongside these basic cloud storage options, public cloud storage providers often offer additional storage-related services to meet the needs of particular use cases, such as relational and non-relational database offerings, database replication, long-term archival storage (e.g., Amazon Glacier), and cloud storage gateways (e.g., AWS Storage Gateway or Microsoft/StorSimple Cloud-integrated Enterprise Storage). A number of third-party cloud backup products are also available.
Given this backdrop, the workloads and storage use cases that are most suitable for the cloud will largely be dictated by your application and service-level requirements. For planning and evaluation purposes, it's useful to qualify your storage requirements based on two key dimensions: the value/sensitivity of your data, and the nature of your storage use case and workload.
Simply put, the greater the value and/or sensitivity of your data, the higher the requirements for security and compliance. Though public cloud security has been steadily improving, it's not capable of addressing the storage needs of highly regulated data, which must meet stringent security, privacy and/or other compliance standards. Examples include sensitive medical and financial information that is subject to regulations such as the Health Insurance Portability and Accountability Act in the U.S., or to industry standards such as the Payment Card Industry Data Security Standard. Many types of federal and state government data also fall into this category. If your data can't withstand even the slightest probability of a security breach, unauthorized access or data loss, then it shouldn't be stored in a public cloud.
The second dimension is the nature of your storage use case and workload. Public cloud storage offerings typically can't deliver the level of performance and availability required by production applications; as a result, providers aren't willing to include anything beyond minimal availability commitments (and nothing to cover performance) in their service-level agreements. AWS, for example, guarantees 99.95% availability for its Elastic Compute Cloud/EBS infrastructure (i.e., uptime for all but about 21 minutes each month), but this guarantee can often be satisfied even if particular customer workloads crash, since availability is defined as at least one instance having external connectivity and at least one attached volume performing some read/write input/output.
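The downtime implied by an availability percentage is simple arithmetic, and it's worth running the numbers for any service-level agreement you evaluate. The short sketch below assumes a 30-day month; the function name and its default are illustrative choices, not part of any provider's tooling.

```python
def allowed_downtime_minutes(availability_pct, days_in_month=30):
    """Return the minutes of downtime per month an availability target permits."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    minutes_in_month = days_in_month * 24 * 60  # 43,200 for a 30-day month
    return minutes_in_month * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Compare a few common availability targets side by side.
    for pct in (99.0, 99.9, 99.95, 99.99):
        print(f"{pct}% -> {allowed_downtime_minutes(pct):.1f} minutes/month")
```

For 99.95% the result is about 21.6 minutes per 30-day month, in line with the figure quoted above. Note that this is only the ceiling on downtime as the provider defines it; it says nothing about whether a particular workload was actually usable during the remaining time.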
The sweet spot for public cloud storage is the area in which data value/sensitivity is relatively low and the focus is on Tier-2 or Tier-3 use cases, such as dev/test, backup, disaster recovery or archiving. Public cloud storage provides the ideal solution in such cases: storage performance, availability and security requirements are relatively relaxed, yet customers can still take advantage of the extreme scalability and compelling economics of the cloud.
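The two-dimensional qualification described above can be sketched as a small screening function. Everything here is an assumption made for illustration: the three sensitivity levels, the numeric tiers (1 for production, 2 and 3 for secondary use cases), and the "case by case" outcome for combinations the discussion above doesn't address. Treat it as a first-pass planning aid, not a substitute for a proper compliance and performance review.

```python
from enum import Enum


class Sensitivity(Enum):
    LOW = "low"
    MODERATE = "moderate"
    REGULATED = "regulated"  # e.g., data subject to HIPAA or PCI DSS


class Fit(Enum):
    GOOD = "good fit for public cloud storage"
    CASE_BY_CASE = "evaluate case by case"
    POOR = "poor fit for public cloud storage"


def public_cloud_fit(sensitivity, tier):
    """Screen a workload by data sensitivity and storage tier (1, 2 or 3)."""
    if tier not in (1, 2, 3):
        raise ValueError("tier must be 1, 2 or 3")
    # Highly regulated data is ruled out regardless of workload.
    if sensitivity is Sensitivity.REGULATED:
        return Fit.POOR
    # Tier-1 production workloads need performance and availability
    # commitments that typical provider SLAs don't cover.
    if tier == 1:
        return Fit.POOR
    # The sweet spot: low sensitivity combined with a Tier-2 or Tier-3 use case.
    if sensitivity is Sensitivity.LOW:
        return Fit.GOOD
    # Assumption: moderate sensitivity on lower tiers needs a closer look.
    return Fit.CASE_BY_CASE
```

For example, a backup workload holding low-sensitivity data (Sensitivity.LOW, tier 3) screens as a good fit, while any workload holding regulated data screens as a poor fit whatever its tier.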