Enterprises are watching cloud providers for a glimpse of how they might solve some of their more vexing data center challenges. Many enterprises are being inundated with a tsunami of unstructured data that grows each day and requires longer retention periods. Since cloud providers have already experienced the first waves, there are four lessons enterprises can glean from their struggles and use to help them build their own private cloud infrastructures.
Lesson 1: Architecture
A cloud storage provider has to cost-effectively provide storage to its subscribers, and an enterprise needs to do the same for its users. There are two keys to this architecture:
- It is built on commodity hardware and the infrastructure often leverages internal server-class storage. These servers and their storage are then clustered so that the capacity of each can be aggregated into a single storage pool. Expansion of capacity is done by adding another server to the cluster.
- These storage clusters tend to have object-based file systems running on them instead of a traditional block or file system. The object nature of these designs is important as it provides many of the features enterprises are looking for when they try to solve their own unstructured data problems.
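The scale-out idea in the first point can be sketched in a few lines. This is only an illustrative model, not any vendor's implementation: the class names `StorageNode` and `StorageCluster` and the 48 TB per-server figure are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """One commodity server contributing its internal drives (hypothetical model)."""
    name: str
    capacity_tb: float


@dataclass
class StorageCluster:
    """Aggregates the capacity of every member node into a single pool."""
    nodes: list = field(default_factory=list)

    def add_node(self, node: StorageNode) -> None:
        # Scale-out expansion: a new server joins and the pool grows.
        self.nodes.append(node)

    @property
    def pool_capacity_tb(self) -> float:
        return sum(n.capacity_tb for n in self.nodes)


cluster = StorageCluster()
for i in range(3):
    cluster.add_node(StorageNode(f"node-{i}", capacity_tb=48.0))
print(cluster.pool_capacity_tb)  # 144.0

# Expanding capacity means adding one more server to the cluster.
cluster.add_node(StorageNode("node-3", capacity_tb=48.0))
print(cluster.pool_capacity_tb)  # 192.0
```

A real system would also rebalance existing objects onto the new node; that step is left out here.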
Lesson 2: Unlimited file count
Most data centers today will not reach the file count limits of an enterprise network-attached storage system, but they may exceed the capabilities of a Linux NFS or Windows SMB server, which are more likely to be used for cost-effective storage of unstructured data. As it has for cloud providers, the file count concern will grow worse within enterprises as more machine and sensor data needs to be captured and stored. The Internet of Things is creating trillions of files and objects, and those objects need to be stored and managed. An object-based storage system is designed to do just that.
Lesson 3: Data protection
Another feature of object-based storage that should capture the attention of IT planners is the system’s ability to significantly enhance data resiliency. An object-based system understands data at the object (think file) level. That means if there is a disk failure in the storage cluster, only the objects stored on the failed drive or node need to be restored. Additionally, recovery of that data can come from multiple nodes. The result is a much faster rebuild than with traditional RAID 5 or RAID 6 data protection schemes, which rebuild the entire drive and are typically built on high-capacity hard drives whose size further lengthens the rebuild.
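A back-of-the-envelope calculation shows why this matters. The sketch below uses a deliberately simplified model, and every number in it is an assumption chosen for illustration: a 10 TB drive that is 60 percent full, 100 MB/s of sustained rebuild throughput per stream, and ten nodes sharing the recovery work. It ignores parity computation, network contention and production load.

```python
def rebuild_hours(data_tb: float, throughput_mb_s: float, parallel_streams: int = 1) -> float:
    """Hours needed to re-create data_tb of data at throughput_mb_s per stream."""
    total_mb = data_tb * 1_000_000  # decimal terabytes to megabytes
    seconds = total_mb / (throughput_mb_s * parallel_streams)
    return seconds / 3600


# RAID: the whole 10 TB drive is rebuilt onto a single replacement drive.
raid_hours = rebuild_hours(data_tb=10, throughput_mb_s=100)

# Object storage: only the 6 TB of objects actually on the drive are restored,
# and ten nodes each re-create a share of them in parallel.
object_hours = rebuild_hours(data_tb=6, throughput_mb_s=100, parallel_streams=10)

print(f"RAID rebuild:   {raid_hours:.1f} h")    # 27.8 h
print(f"Object rebuild: {object_hours:.1f} h")  # 1.7 h
```

The absolute figures will differ in any real environment; the point is that restoring less data from more sources shortens the exposure window.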
Lesson 4: Intelligent data placement
Cloud providers also leverage object storage to intelligently place data throughout their unstructured storage infrastructure. There are multiple implementations of this feature, but the most common is to ensure that data is placed as close to the user as possible. For example, if a cloud subscriber has four locations and one location suddenly starts using a particular data set, that information can be transferred to increase performance, while an alternate copy is maintained in another location. This allows the active location to experience local access performance to the data while maintaining disaster recovery (DR) status.
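One way to picture the four-location example is as a small planning function. This is a hypothetical sketch rather than how any particular product decides: the function name `plan_copies`, the site names and the access counts are invented, and it assumes at least two sites exist.

```python
def plan_copies(access_counts: dict, current_sites: list) -> tuple:
    """Return (primary_site, dr_site) for one data set.

    The primary copy goes to the site reading the data most often. The
    alternate copy stays at a site that already holds one, if possible,
    so no extra transfer is needed for disaster recovery.
    """
    primary = max(access_counts, key=access_counts.get)
    dr_candidates = [s for s in current_sites if s != primary]
    if not dr_candidates:
        # No existing copy elsewhere: fall back to any other known site.
        dr_candidates = [s for s in access_counts if s != primary]
    return primary, dr_candidates[0]


# Recent reads per location (made-up numbers); copies currently sit in two sites.
reads = {"new-york": 3, "london": 120, "tokyo": 0, "sydney": 1}
print(plan_copies(reads, current_sites=["new-york", "tokyo"]))
# ('london', 'new-york')
```

Here the London office suddenly becomes the heavy user, so the data set moves there while the New York copy remains as the alternate.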
Intelligent data placement can also be used within one data center. Using similar logic, an object storage system can place data on a faster media type, such as solid-state drives, while it is being actively accessed and then move it to hard disk storage when it is not.
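The tiering decision can be as simple as an age check on the last access time. The sketch below assumes a seven-day "hot" window and the tier labels `ssd` and `hdd`; both are illustrative choices, and real systems typically weigh access frequency and capacity as well.

```python
SECONDS_PER_DAY = 24 * 3600


def choose_tier(last_access_ts: float, now: float,
                hot_window_s: float = 7 * SECONDS_PER_DAY) -> str:
    """Pick a media type from how recently the object was accessed."""
    return "ssd" if now - last_access_ts <= hot_window_s else "hdd"


now = 1_000_000_000.0  # fixed timestamp so the example is repeatable
print(choose_tier(now - 2 * SECONDS_PER_DAY, now))   # ssd (read two days ago)
print(choose_tier(now - 30 * SECONDS_PER_DAY, now))  # hdd (idle for a month)
```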
Of course, intelligent data placement can also be leveraged for data protection and DR purposes. For example, it can make sure that a set number of copies of each object is stored across a user-defined number of data centers before the object is considered protected. Even within a data center, it can ensure that objects are protected locally and that redundant copies are not on the same node within the storage cluster, or even in the same rack row.
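A protection policy of this kind can be sketched as a greedy placement routine. Everything here is assumed for illustration: the `Node` fields, the function name `place_replicas`, and the rule that no two copies may share a rack. It is not the algorithm of any specific object store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    rack: str
    data_center: str


def place_replicas(nodes: list, copies: int, min_data_centers: int) -> list:
    """Choose nodes for each copy of an object, or raise if the policy cannot be met."""
    chosen = []
    # Pass 1: spread copies across distinct data centers first.
    for node in nodes:
        seen_dcs = {c.data_center for c in chosen}
        if len(chosen) < min(copies, min_data_centers) and node.data_center not in seen_dcs:
            chosen.append(node)
    # Pass 2: fill the remaining copies without reusing a rack (and so a node).
    for node in nodes:
        seen_racks = {(c.data_center, c.rack) for c in chosen}
        if len(chosen) < copies and (node.data_center, node.rack) not in seen_racks:
            chosen.append(node)
    if len(chosen) < copies or len({c.data_center for c in chosen}) < min_data_centers:
        raise ValueError("policy cannot be satisfied with the available nodes")
    return chosen


nodes = [
    Node("a1", rack="r1", data_center="east"),
    Node("a2", rack="r1", data_center="east"),
    Node("a3", rack="r2", data_center="east"),
    Node("b1", rack="r1", data_center="west"),
]
placement = place_replicas(nodes, copies=3, min_data_centers=2)
print([n.name for n in placement])  # ['a1', 'b1', 'a3']
```

The object would be reported as protected only once all three writes succeed; `a2` is skipped because it shares a rack with `a1`.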
As unstructured data and long-term retention requirements continue to grow, enterprises will face the same storage challenges cloud providers have had to deal with. They will need to drive down the cost of storing this information, which is often achieved through the use of commodity hardware. That in turn requires an infrastructure that can intelligently respond to drive failures and to the demands of a distributed workforce. A private cloud storage solution meets many of these needs and can add value to a data center of almost any size, not just those of large enterprises.