When deciding whether to deploy an object storage platform in your environment, it makes sense to first outline what kind of data you're storing and how it's typically used. I suggest you start by answering the following questions:
- Is it time to develop a genuine content store? In other words, do I need to keep a lot of data online for compliance or other regulatory reasons or for its historical value?
- Am I dealing with massive amounts of data that are overwhelming my file shares? Does this data become inactive shortly after it is created but stay on primary storage for long periods of time? When I need it, does it need high-speed access or would slower access be OK?
- Do I use backups as my archives today? Is that causing backup window issues?
- Are most of my applications disaster recovery (DR) protected with geographic separation? If not, do I want them to be?
- Do I have a large amount of content that is born static, such as photos or videos, that needs to be kept online for extended periods of time?
- Does my company want to run serious analytics on this data, now or in the future?
If "yes" is the answer to most of these questions, then you need to look seriously into object storage. Given the characteristics of object stores, it is easy to see why the use cases that have risen to the top include content stores, long-term archives, the back end of backup applications, backups with geographic separation for DR purposes and Web 2.0 applications. Web 2.0 apps had the distinct advantage of being written for object storage from the get-go. However, the large majority of these were written by the likes of Facebook, Twitter, Google and eBay for their own use, on object storage architectures they developed themselves that are not available to the outside world. Fortunately, many vendors today specialize in object storage targeted directly at enterprises.
Object storage platform options: Types, vendors that provide them
You essentially have four choices:
- Purchase a fully functional object storage platform from one of the many vendors that offer them today.
- Purchase object storage software and install it on hardware (servers and storage) of your choice.
- Install software on select file and block storage arrays that adds an object interface to existing storage.
- Use a gateway solution that interfaces your existing application to a public cloud.
Fully functional object storage is available from EMC (Atmos-based), Cleversafe, Compuverde, DDN, Dell (DX, based on Caringo), HP StoreAll, NetApp StorageGrid, Quantum Lattus (Amplidata OEM), Scality, Tarmin and others. Most of these vendors also offer object storage software (or a virtual machine version). Open source software (OpenStack Swift, Ceph, Gluster) can be downloaded for free, typically without support, and is also available in commercial, fully supported versions from the likes of Inktank (Ceph) and Red Hat Storage.
Software for option 3 above is mostly available from major players such as EMC Isilon or ViPR, HDS and HP 3PAR. Option 4 is interesting in that many backup and archival platform vendors have modified their software so that the back end can be a public cloud, such as Amazon Web Services or Microsoft Azure. In this case, you essentially get the benefits of an object storage platform without having to build a system yourself. It may be the best way to get started while you learn more about the capabilities and limitations of object storage.
In the past three years, a large number of small backup and DR players, too numerous to list here, have sprung up that specialize in these areas and use the public cloud as the repository. Of the major players, Riverbed offers Whitewater appliances that keep deduplicated backup data on-premises for immediate restores and use the public cloud back end to store older backups and to enable DR in the cloud or at a third site.
Symantec also offers a way to store backups in public clouds. Microsoft, via its acquisition of StorSimple, has an appliance that sits in the data center and presents an iSCSI interface to the application while optimizing the data, which includes performing all the protocol conversions for Microsoft Azure. No changes to the application are necessary in this scenario.
If you choose option 1 or 2, the work involved is nontrivial. You will need to decide which applications will run on object storage and how these applications will be modified to make REST-based calls to object storage. If you have no control over changing the source code for these applications, your options are then limited to using a gateway. However, if you do have the ability to modify these applications, then go ahead and survey the object storage products in the market to see which ones make sense for you.
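To give a sense of what "REST-based calls" means for an application, here is a minimal Python sketch of how a file write might turn into an HTTP PUT. The endpoint `objects.example.com`, the `X-Meta-` header prefix and the function name are assumptions made for illustration only; every object storage product defines its own URL layout, metadata headers and authentication, none of which is shown here. The request is built but never sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical values: real products define their own endpoint and header names.
ENDPOINT = "https://objects.example.com"
META_PREFIX = "X-Meta-"


def build_put_request(container, key, data, metadata):
    """Build (but do not send) an HTTP PUT that would store `data` as one object."""
    # The object is addressed by a URL rather than a file system path.
    url = f"{ENDPOINT}/{quote(container)}/{quote(key)}"
    headers = {"Content-Type": "application/octet-stream"}
    # User-defined metadata travels with the object as extra HTTP headers.
    for name, value in metadata.items():
        headers[META_PREFIX + name] = value
    return Request(url, data=data, headers=headers, method="PUT")


request = build_put_request(
    "backups",
    "2014/01/payroll.bak",
    b"example bytes",
    {"Department": "finance"},
)
print(request.get_method(), request.full_url)
```

The point of the sketch is the shape of the change: where the application used to open a path and write bytes, it now issues an HTTP request against a container and key. A real integration would also need authentication, error handling and retries.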
My suggestion, however, is to experiment with the public cloud before embarking on building a large private cloud. Backup applications lend themselves readily to this approach, so perhaps start with those. In the process, you will end up getting offsite DR for "free." Then consider large content stores, perhaps initially using a gateway so that no change to the application is required. When you do modify the application, you will be able to use the metadata "magic" that object storage enables, because each object can carry descriptive attributes alongside its data. That opens up kinds of analysis that were not practical before.
Of course, if you are developing a Web 2.0 application yourself, I suggest you dive right into implementing full-blown object storage from the start.
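The metadata "magic" mentioned above can be illustrated with a toy model. The Python sketch below is an in-memory stand-in, not any vendor's product or API: the class `ToyObjectStore`, its methods and the sample attributes are all hypothetical. It shows only the idea that objects live in a flat namespace of keys and can be selected by their attributes rather than by where they sit in a directory tree.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str
    data: bytes
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """In-memory illustration: a flat key namespace plus per-object metadata."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        # Data and its descriptive attributes are stored together.
        self._objects[key] = StoredObject(key, data, dict(metadata))

    def get(self, key):
        return self._objects[key].data

    def find(self, **criteria):
        """Return the keys of objects whose metadata matches every criterion."""
        return sorted(
            obj.key
            for obj in self._objects.values()
            if all(obj.metadata.get(name) == value for name, value in criteria.items())
        )


store = ToyObjectStore()
store.put("scan-001.jpg", b"image bytes", patient="A17", modality="xray", year=2013)
store.put("scan-002.jpg", b"image bytes", patient="A17", modality="mri", year=2014)
store.put("scan-003.jpg", b"image bytes", patient="B42", modality="xray", year=2014)

xrays = store.find(modality="xray")
recent_for_a17 = store.find(patient="A17", year=2014)
print(xrays)
print(recent_for_a17)
```

Real platforms differ in how, and whether, they index and search metadata, and some rely on an external index, so treat this as a picture of the concept rather than of any particular implementation.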
Object storage can no longer be ignored
It is time to start seriously exploring object storage if you haven’t already. Exactly what you select and how you go about implementing it will vary with your environment. I suggest you let the use case determine the best method and start small. If possible, learn the idiosyncrasies of object storage before you make a big strategic decision that will stay with you for five years or more.
Another option is to survey the unified offerings, mostly from the larger legacy vendors, where you get all three methods of access (file, block and object) and don’t have to worry about building a separate object storage box. Keep in mind, however, that such a unified system ultimately has either file or object underpinnings and will perform, cost and scale accordingly. But for convenience it is unbeatable.
This was first published in January 2014