Primary data storage options for the cloud
A new wave of hybrid cloud storage appliances is attracting attention because the products take aim at the thorniest problems that cause IT shops to remain skittish about transferring their precious primary data to an off-site cloud storage provider over an unsecured public network.
Also known as cloud gateways, the physical and virtual appliances do more than simply provide support for proprietary APIs to cloud storage services. They also tackle worries that latency, network outages or dropped connections could suspend access to business-critical data, and address concerns about storing data securely in the cloud.
First and foremost, hybrid cloud storage appliances supply an on-premises cache that allows them to retrieve the latest or most active data from the local system rather than from the cloud. Algorithms figure out which data the users are most likely to need.
Secondly, deduplication and/or compression reduce bandwidth consumption and lower fees for over-the-wire data transfers and cloud storage capacity. So, in those cases where users may need to recover data from the cloud, they're pulling only deduplicated data, which rehydrates at the local appliance or storage system.
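As a rough illustration of the deduplicate-then-rehydrate cycle described above, the following minimal Python sketch splits data into fixed-size chunks, stores each unique chunk once in compressed form, and rebuilds the original from a list of chunk fingerprints. The chunk size, the function names and the in-memory dictionary standing in for the cloud store are all assumptions made for illustration; the appliances discussed in this article may use entirely different, likely more sophisticated, techniques.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for illustration


def dedupe_and_compress(data: bytes, store: dict) -> list:
    """Split data into fixed-size chunks and store each unique chunk once.

    `store` is a plain dict standing in for the cloud object store
    (an assumption; a real gateway would call a provider API here).
    Returns the ordered list of chunk fingerprints needed to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            # Only chunks not seen before are compressed and "sent over the wire".
            store[digest] = zlib.compress(chunk)
        recipe.append(digest)
    return recipe


def rehydrate(recipe: list, store: dict) -> bytes:
    """Rebuild the original data locally from the stored, compressed chunks."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


if __name__ == "__main__":
    cloud = {}
    payload = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE
    recipe = dedupe_and_compress(payload, cloud)
    print(len(recipe), "chunks referenced,", len(cloud), "chunks stored")
    print("round trip ok:", rehydrate(recipe, cloud) == payload)
```

In this sketch, repeated chunks cost nothing extra to upload or store, which is the source of the bandwidth and capacity savings the vendors describe.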
Prominent hybrid cloud offerings also deliver data encryption for security and feature extras such as snapshots to back up data and lessen the demand on overtaxed backup systems.
"They make the public cloud just another tier of storage," said Adam Couture, a research director at Stamford, Conn.-based Gartner Inc., adding that hybrid cloud storage appliance vendors "want you to think of them as alternatives to a midrange or low-end storage array, like [an EMC Corp.] Clariion."
The early cloud gateway market is dominated by startup vendors. They include CTERA Networks Ltd., Panzura Inc., StorSimple Inc. and TwinStrata Inc. on the hardware side. TwinStrata also offers a software option, as does Nasuni Corp. with its virtual NAS appliance that requires customers to have an on-premises storage system for local caching.
Analysts: Cloud product growth predictions
Even though one startup with a hybrid cloud appliance, Cirtas Systems Inc., last month pulled back from the market and laid off much of its staff, some of the industry analysts who track the market aren’t interpreting that turn of events as a signpost of the technology's future outlook.
Hopkinton, Mass.-based Taneja Group Inc. predicts the nascent cloud gateway appliance market will grow to slightly more than $400 million in 2014, from a mere $11 million at the end of 2010. Taneja's emerging market forecast, completed in January 2011, focuses on appliances that serve as general-purpose storage, from primary I/O to archived data, and excludes devices created specifically as backup targets, such as Riverbed Technology Inc.'s Whitewater.
Jeff Boles, a senior analyst and director of validation services at Taneja Group, said his research firm factors into its models the prospect that a certain percentage of vendors will fail. He acknowledges the “black eye” left by the demise of Cirtas in April, Amazon.com Inc.’s major outage with its Elastic Compute Cloud (EC2) and Elastic Block Store (EBS) services, and Iron Mountain Inc.’s decision to shutter its file and archiving cloud storage services.
But Boles said cloud products continue to hold out such “tremendous value” over the long haul that Taneja Group’s market predictions aren't even especially aggressive. He predicts archival and near-line storage will drive adoption during the next 12 to 18 months, and then primary storage will see an uptick in two years for uses such as distributed access, "follow the sun" work processes and disaster protection of critical workloads.
Terri McClure, a senior analyst at Enterprise Strategy Group in Milford, Mass., suggests a more conservative time frame. She said most early adopters are using hybrid cloud storage appliances as backup targets and estimates it will take three to five years for them to catch on for primary storage, largely for unstructured data, user directories, content distribution and collaborative activities.
“Primary cloud storage isn’t quite ready for prime time yet, and that’s one of the reasons Cirtas had some problems. Unstructured data is a better candidate for off-site storage” than the primary block-based storage on which Cirtas focused, McClure said. “Of all the cloud stuff, they were taking on the hardest angle. We have a hard enough time with primary storage inside the data center, and they were going to store it off-site.”
Early adopters of hybrid cloud storage appliances remain hopeful
Yet even the early demise of Cirtas hasn’t soured some of the vendor’s early adopters on the notion of hybrid cloud storage appliances.
“We’re not leaving the concept, but we’re definitely leaving that site and company,” said David Jones, IT operations manager at Alexza Pharmaceuticals Inc. in Mountain View, Calif. “The concept to us, which is block-level and file-level storage displaced out to the cloud, is still something we need. We’ve just got to find someone else that can do it.”
Alexza Pharmaceuticals had used the Cirtas appliance for archives and for data from underutilized and orphaned storage. The company planned to expand its use of the hybrid cloud appliance, in connection with a shift to a faster 100 Mbps WAN, to handle primary storage of non-business-critical data, such as user and group folders and scratch pads for early stage development work.
“The thing was going nowhere near all our high-priority systems,” Jones insisted.
But Jones had expressed hopes of shifting 20% of the company’s least important primary data to Amazon.com’s Simple Storage Service (Amazon S3) cloud storage service to ease the load on its NetApp Inc. FAS3020s and slow the need to purchase additional disk shelves.
The main problem Jones had with the hybrid cloud appliance was bandwidth. The initial data uploads from the Cirtas appliance to Amazon’s S3 over a T1 line were painful, he said.
“Simply put, if you throw a lot of data at it but don’t give it a big enough pipe, it’s not going to be happy,” Jones said.
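A back-of-the-envelope calculation shows why the size of the pipe matters so much for an initial upload. The Python sketch below compares a T1 line (1.544 Mbps) with the 100 Mbps WAN mentioned earlier. The 1 TB data set is a hypothetical figure, not one reported by Alexza, and the estimate ignores protocol overhead, link contention and any savings from deduplication or compression.

```python
T1_BPS = 1.544e6   # nominal T1 line rate in bits per second
WAN_BPS = 100e6    # the 100 Mbps WAN mentioned in the article

SECONDS_PER_DAY = 86400


def transfer_days(size_bytes: float, link_bps: float, efficiency: float = 1.0) -> float:
    """Estimate days needed to push size_bytes over a link.

    `efficiency` is a hypothetical fudge factor (0 < efficiency <= 1) for
    protocol overhead and competing traffic; 1.0 means the full line rate.
    """
    seconds = size_bytes * 8 / (link_bps * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    one_terabyte = 1e12  # hypothetical data set size, in bytes
    print(f"T1:       {transfer_days(one_terabyte, T1_BPS):.1f} days")
    print(f"100 Mbps: {transfer_days(one_terabyte, WAN_BPS):.1f} days")
```

Under these assumptions, seeding 1 TB takes roughly 60 days over a T1 line and a little under one day over a 100 Mbps link.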
Timothy Seto, a storage administrator at Gilead Sciences Inc. in Foster City, Calif., said his anxiety over sending primary data to the cloud had more to do with the uncontrollable nature of the Internet and his company's WAN link than with the buggy Cirtas Bluejet Cloud Storage Controller that the biotechnology company was testing with backup and archive data.
Seto chalked up the product’s unreliability to immaturity, and although he won’t proactively seek out a Cirtas-like replacement, he said he would consider another hybrid cloud storage appliance if a good one should cross his path. He indicated he would even leave the door open to using it for primary storage, although he’ll limit any recommendations on hybrid cloud appliances strictly to backups and archives.
“The idea makes sense. There’s a market for it. I just don’t know if everyone’s ready,” he said.
Because Gilead Sciences' primary storage needs exceeded the limits of the Cirtas appliance’s 5 TB local cache, Seto knew he wouldn’t be able to guarantee that every piece of application data a user might need would be in the cache. If the Internet connection or cloud service failed, the user might be out of luck.
"We don't want the end-user experience to suffer just because we're trying to save some money on storage," he said.
Rather than rely on Cirtas’ algorithms to determine the most frequently accessed data to store in the local cache, Seto preferred a policy option to allow him to exert some level of control over the volumes or data stores to keep in cache.
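The kind of policy control Seto describes could be layered on top of a standard least-recently-used (LRU) cache. The Python sketch below is one hypothetical way to do it: blocks from administrator-pinned volumes are never evicted, while everything else is evicted oldest-first. The class name, the volume names and the choice of LRU are assumptions for illustration only and do not represent how Cirtas or any other vendor implements caching.

```python
from collections import OrderedDict


class PinnedLRUCache:
    """LRU cache of (volume, block_id) entries with admin-pinned volumes.

    Hypothetical sketch: pinned volumes are exempt from eviction, so the
    cache can exceed its capacity if pinned data alone fills it.
    """

    def __init__(self, capacity_blocks, pinned_volumes=()):
        self.capacity = capacity_blocks
        self.pinned = set(pinned_volumes)
        self.blocks = OrderedDict()  # least recently used entries come first

    def get(self, volume, block_id):
        key = (volume, block_id)
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            return self.blocks[key]
        return None  # cache miss: the caller would fetch from the cloud

    def put(self, volume, block_id, data):
        key = (volume, block_id)
        self.blocks[key] = data
        self.blocks.move_to_end(key)
        while len(self.blocks) > self.capacity:
            # Evict the least recently used block that is not on a pinned volume.
            victim = next((k for k in self.blocks if k[0] not in self.pinned), None)
            if victim is None:
                break  # only pinned data remains; nothing may be evicted
            del self.blocks[victim]


if __name__ == "__main__":
    cache = PinnedLRUCache(capacity_blocks=2, pinned_volumes={"finance"})
    cache.put("finance", 1, b"ledger")
    cache.put("scratch", 1, b"tmp-1")
    cache.put("scratch", 2, b"tmp-2")  # forces eviction of ("scratch", 1)
    print(cache.get("finance", 1) is not None)  # pinned block is still cached
    print(cache.get("scratch", 1) is None)      # unpinned block was evicted
```

The trade-off in a design like this is that pinned volumes shrink the space left for the algorithm to manage, so an administrator would need to pin sparingly.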
George Crump, founder and president of analyst firm Storage Switzerland LLC, said the level of intelligence for determining which data is stored in the local cache and in the cloud varies by vendor and by application.
Nasuni CEO Andres Rodriguez contends his company's file-based virtual appliance has a better sense of what to keep in cache and what to send out to the cloud than the block-based products. The NAS appliance recognizes the actual files users are working on, as opposed to ferreting out the blocks that might be "all over the map," he said.