Adopting cloud storage technology requires a solid understanding of the business drivers behind the technology’s ROI and a clear strategy for evaluating cloud gateways, service-level agreements (SLAs) and security requirements. In these three Storage Decisions video presentations from our 2011 New York City event, watch cloud storage explained by veteran industry experts Marc Staimer, Howard Marks and Arun Taneja. While each has his own perspective on private, public and hybrid cloud storage, they agree on some major points: Cloud storage can save you money, it’s still not a good bet for primary storage, and everyone is working a cloud angle these days.
Marc Staimer, president, Dragon Slayer Consulting
Emphasis on scalability and object storage
Almost any discussion around cloud storage starts with scalability.
When Staimer talks about scalability, he's referring to “performance scalability”: performance that increases linearly as capacity scales, so that throughput and IOPS per terabyte improve (or at least hold steady) as storage grows.
“It’s got to scale into the zettabytes and they have to do it with a single image. By the end of next year, you will see 100 exabyte single storage containers,” Staimer predicted. “The way cloud storage minimizes the container is through object storage. Every time you add a node, it positively increases capacity, performance and objects.”
For Staimer, any technical explanation of cloud storage starts with a solid understanding of object storage. Because object storage puts emphasis on individual chunks of data that are loosely federated, you don’t need to have a single or aggregated namespace governing all the data. Instead, you have a looser federation of individual data elements. This eliminates the need for cache coherency across the entire system, the need for every node to be aware of the objects owned by other nodes, and even the concept of ownership of a piece of data by a physical node.
As long as the data meets specified policies about how many copies it needs to have and where it can live, the system can grow and scale nearly indefinitely. “There is no cluster,” Staimer explained. “The nodes don’t need to be aware of what’s in every other node. Every piece of data has its own unique hash. This is what we mean by regional awareness. It’s like a hand-off system. It’s one of the reasons it’s designed not to be a primary storage, but a secondary storage. It doesn’t scale based on the storage system, it scales based on the data.”
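Staimer's model of hash-identified objects with no central owner can be sketched in a few lines of Python. This is a toy illustration, not any vendor's implementation, and the class and method names (`ObjectStore`, `_owners`) are invented for the example: each object gets a unique hash, and placement is computed from that hash alone, so no node needs cluster-wide awareness of what every other node holds.

```python
import hashlib

class ObjectStore:
    """Toy content-addressed store: each object is identified by its hash,
    so there is no single namespace and no cluster-wide cache coherency."""

    def __init__(self, nodes):
        self.nodes = list(nodes)                    # node identifiers
        self.data = {node: {} for node in self.nodes}

    def _owners(self, key, copies):
        # Deterministically derive `copies` nodes from the object's hash;
        # any client can recompute placement without asking a master node.
        start = int(key, 16) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(copies)]

    def put(self, blob, copies=3):
        key = hashlib.sha256(blob).hexdigest()      # unique per-object hash
        for node in self._owners(key, copies):
            self.data[node][key] = blob
        return key

    def get(self, key, copies=3):
        for node in self._owners(key, copies):
            if key in self.data[node]:
                return self.data[node][key]
        return None

store = ObjectStore(["node-a", "node-b", "node-c", "node-d"])
key = store.put(b"backup-2011-10-01.tar")
assert store.get(key) == b"backup-2011-10-01.tar"
```

Because placement is a pure function of the object's hash, the "hand-off" Staimer describes needs no coordination: every node can answer for the objects it owns and redirect for the rest.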
Adding nodes online:
- New nodes are auto discovered and integrated into the system
- Older nodes can be removed from the system at leisure, online
- Each object storage system can be a mix of old and new nodes
- Data is copied seamlessly
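The add/remove behavior in the list above is usually implemented with consistent hashing, so a joining node takes over only a slice of the keyspace while everything else stays put. A rough sketch, with illustrative class and node names (not any product's actual mechanism):

```python
import bisect
import hashlib

def _h(s):
    # Hash a string to a position on the ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    """Toy consistent-hash ring: adding a node claims only the keys between
    the new node and its predecessor, so older nodes keep serving the rest
    online, with no scheduled downtime."""

    def __init__(self, nodes):
        self._points = sorted((_h(n), n) for n in nodes)

    def owner(self, key):
        hashes = [p[0] for p in self._points]
        i = bisect.bisect(hashes, _h(key)) % len(self._points)
        return self._points[i][1]

    def add_node(self, node):
        bisect.insort(self._points, (_h(node), node))

keys = [f"object-{i}" for i in range(1000)]
ring = Ring(["node-a", "node-b", "node-c"])
before = {k: ring.owner(k) for k in keys}
ring.add_node("node-d")                      # auto-discovered node joins online
moved = sum(1 for k in keys if ring.owner(k) != before[k])
print(f"{moved} of {len(keys)} objects move to the new node")
```

Only the objects that now map to `node-d` are copied; removing an old node works the same way in reverse, which is what makes rip-and-replace migrations unnecessary.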
That process eliminates scheduled downtime, rip-and-replace strategies and the need for massive data migrations, he said. The bottom line, Staimer said, is that “cloud storage is much, much lower cost than traditional storage.”
But, he warned, not all cloud storage is created equal. “The [public] cloud is nothing more than someone else’s data center,” he said. “And they're as vulnerable to hurricanes, floods and fires, so they'd better have multiple data sites with really good SLAs for you.”
Howard Marks, chief scientist, DeepStorage.net
Key criteria: Elasticity and location awareness
“I haven't done a product briefing with a vendor in the past year where they haven't talked about how their product is perfectly suited to the cloud,” Marks said, referring to the cloud washing phenomenon that can make it difficult to get a grasp on cloud storage. Still, Marks said, for “those of you drowning in files, especially those of you drowning in old, stale files, cloud storage may actually be the solution.”
For anyone looking to mimic a public cloud offering in their own storage environment, Marks suggested starting with a list of technical features a public cloud service offers. For data storage managers, the most likely scenario for considering any sort of private cloud means a hybrid cloud storage scenario, or the best of both worlds, Marks said.
Any combination of on-premises and Internet storage could be called hybrid cloud:
- Cluster on-premises replicates to public provider
- Atmos to Atmos (AT&T Synaptic Cloud)
- On-premises replicates to colocation cluster
- Gateway/archiving system writes to both
- Also used for dedicated infrastructure by public provider
But Marks issued a warning to IT pros determined to pursue a private cloud strategy: Data storage managers have to remember they won't be able to shrink, or eliminate, data stores and their associated costs the way a public cloud account would allow.
“If I decide to store 4 TB for a month and then destroy it -- if I built the infrastructure to do that, I don’t stop paying for those disk drives because I deleted the data,” he said. “So if you're building a private cloud infrastructure, you do have to look for scalability. But scalability is different from elasticity because it doesn’t really shrink.”
One of the crucial points to building a private cloud infrastructure is data protection, or location awareness, Marks said. “I want to be able to create a policy that says I'm going to set up this storage system in four of my data centers, and when I write data to this system I want it to maintain three copies of the data and I want those three copies to be in at least two different data centers,” he said. "That’s so the system will manage the protection and replication for that data across multiple locations.”
That location awareness, Marks said, is a key criterion for any product or system to qualify as cloud storage.
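Marks' example policy (three copies spanning at least two data centers) can be expressed as a simple placement check. A minimal sketch with hypothetical node and site names; the function is illustrative, not a real product's policy engine:

```python
from itertools import combinations

def place_replicas(nodes, copies=3, min_sites=2):
    """Pick `copies` nodes such that the chosen set spans at least
    `min_sites` distinct data centers. `nodes` maps node -> data center."""
    for combo in combinations(nodes, copies):
        sites = {nodes[n] for n in combo}
        if len(sites) >= min_sites:
            return list(combo)
    raise ValueError("policy cannot be satisfied with these nodes")

cluster = {
    "ny-1": "new-york", "ny-2": "new-york",
    "chi-1": "chicago", "dal-1": "dallas",
}
replicas = place_replicas(cluster)
print(replicas, {cluster[n] for n in replicas})
```

A real system would re-run a check like this continuously and re-replicate when a site goes dark, which is exactly the "manage the protection and replication" behavior Marks describes.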
Arun Taneja, founder and consulting analyst, Taneja Group
'Disaster recovery-type capability' and public cloud limitations
Taneja believes one of the most compelling selling points behind a cloud storage service is what he calls “disaster recovery-type capability.”
“You automatically get DR-type capability because you are, by definition, going off your premises. You don’t need your second site. You don’t need a DR site. You don’t need a second office location,” Taneja said. “You're using cloud as a DR site; that’s probably the most important aspect of a cloud that I’ve come across.”
For those interested in integrating their on-site storage with a cloud storage service, Taneja believes cloud gateways, or appliances, are a natural starting point. “But many gateways that are available in the market right now are just simply gateways,” he said. “That means they're just something on the on-ramp for the cloud.
“They don’t do anything other than API translations or format changes,” Taneja warned. “They will take something at the enterprise level and shove it out at the HTTP command level on the other side. But if the gateway is done appropriately, it can give you local-like performance at cloud-like prices.”
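The "API translation" role Taneja describes can be sketched as a function that turns a file write into the HTTP request a cloud provider would accept. This is a minimal illustration; the function name, bucket URL and header set are assumptions, not any vendor's actual gateway API, and the request is built but not sent:

```python
def file_write_to_http_put(path, data, bucket_url="https://cloud.example.com/bucket"):
    """Translate an enterprise-level file write into the HTTP PUT a
    gateway would issue. Returns (method, url, headers, body) unsent."""
    # Encode the file path as a single object name for the cloud side.
    object_name = path.lstrip("/").replace("/", "%2F")
    headers = {"Content-Length": str(len(data))}
    return ("PUT", f"{bucket_url}/{object_name}", headers, data)

method, url, headers, body = file_write_to_http_put("/exports/report.pdf", b"%PDF-...")
print(method, url)
```

The "done appropriately" part Taneja alludes to is everything a bare translator lacks: local caching, write coalescing and deduplication in front of this translation step.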
Data storage managers need to be aware that they'll be responsible for setting their own internal SLAs if they choose to build private cloud storage offerings.
“It’s service oriented, easy to use and based on SLAs. That means if you're a user and you ask for SLA No. 1 [in public], you pay a certain price; you ask for SLA No. 2, you pay a different price,” Taneja said. "When you do a private cloud, you need to charge based on the SLAs the business division wants. Let the guys pay for higher SLAs if that’s what they choose.” Taneja also provided a list of public cloud limitations that are at risk of being ignored amid all the excitement about a technology often described as having no limits.
- Most legacy applications can't run in a public cloud
- Public clouds don't offer the guaranteed availability required for business-critical applications
- Security remains a top concern among users, although no major breaches have occurred to date
- Latency-sensitive applications are a poor fit for cloud-backed storage
“If you’re going to store tier 1 data in the cloud [for applications you’re running in-house], make sure those applications aren't latency sensitive,” Taneja said. “When you talk about tier 1-type availability, you aren’t going to get that from a public cloud today. That’s why the No. 1 issue with public clouds right now is availability.
“Make sure that 99.9% availability doesn’t translate to 10 hours a month downtime for you,” he cautioned. “I’m a super believer in cloud, but I’m also a super believer of walking before running.”
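Taneja's closing caution is easy to check with arithmetic: an availability percentage implies a monthly downtime budget. A quick sketch, assuming a 30-day month:

```python
def monthly_downtime_minutes(availability_pct, hours_per_month=30 * 24):
    """Downtime allowed per month at a given availability percentage."""
    return hours_per_month * 60 * (1 - availability_pct / 100)

for pct in (99.0, 99.9, 99.99):
    print(f"{pct}% availability -> {monthly_downtime_minutes(pct):.1f} min/month")
# 99.9% works out to about 43 minutes a month
```

Ten hours of monthly downtime corresponds to roughly 98.6% availability, well short of "three nines," which is why the measurement windows and exclusions in an SLA's fine print deserve a close read.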