Working with OpenStack storage: Tips on Cinder, Swift and the cloud
Moving to a private cloud infrastructure that enables authorized employees to self-provision storage, servers and networking resources is no small undertaking. Just consider the case of Latin American e-commerce specialist MercadoLibre Inc.
The Buenos Aires, Argentina-based company (which offers eBay-like services in 14 countries and counts eBay among its investors) spent the last year working on its own open source cloud storage project -- a private cloud storage and compute infrastructure using open source software (OSS) from the OpenStack community founded by Rackspace Hosting Inc. and NASA.
The project team might need another year or more to fine-tune and complete the transformation to the infrastructure-as-a-service (IaaS) model it hopes will enable faster delivery of IT resources and help the company’s developers update features and applications for its websites more quickly.
“The hardest thing for us was to change the way the whole company used to do their tasks; for example, to request a server, run an application or run [quality assurance] QA testing,” said Leandro Reox, a senior infrastructure engineer at MercadoLibre.
Private cloud storage environments bring scalability and savings
By early last year, the IT infrastructure team had recognized that it simply couldn’t deliver servers fast enough to the company’s developers and internal clients. The scaling problems also extended to the NFS-based NetApp FAS6280 and FAS6080 filers.
Implementing a service-based private cloud produced a near-immediate impact. System administrators had delivered approximately 2,000 virtual machines (VMs) in the 18 months before implementing the private cloud. Since making self-deployment options available last August, the infrastructure team has watched the VM count grow to more than 6,000, according to Reox.
But VM delivery is only one piece of the puzzle. When it implements its open source cloud storage project, MercadoLibre wants to make every part of its infrastructure, including storage systems and databases, available as a service through private and public cloud resources. The vision also extends to enabling applications, or at least their front ends, to run on public clouds, such as those operated by Amazon.com or Rackspace.
“Our business can grow like a monster just for a marketing campaign, so we need to be prepared to scale up automatically, and that [application architecture] change gave that to us -- the ability to scale faster in a stable way,” Reox said.
The new approach had consequences for the storage infrastructure. To compensate for the network-attached storage (NAS) and network file system (NFS) scaling limitations, the project team decided to implement more scalable object storage for the customer-supplied product images for its websites and other static information. The team also expected the redundant object copies kept by the OpenStack system to serve, in effect, as automatic backups.
Reox said MercadoLibre converted its high-end NetApp FAS6280s and FAS6080s from file to block storage for the sake of speed and reliability for its major databases. The team purchased less expensive NetApp FAS3270s for block storage of VMs and MySQL databases. Developers can write batch jobs to transfer any data they need from the NetApp filers to the OpenStack Object Storage.
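A batch job of the kind Reox describes could be sketched roughly as follows. This is a minimal illustration in Python, not MercadoLibre's code: the `plan_uploads` helper, the "product-images" container name and the temporary directory standing in for a filer mount are all hypothetical. The upload itself (an HTTP PUT to the object store) is left out, so the sketch only shows how files on a mounted filer might be mapped to object names.

```python
import tempfile
from pathlib import Path

def plan_uploads(mount_root, container):
    """Map every file under mount_root to an object path in the container."""
    root = Path(mount_root)
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Object names use forward slashes regardless of the host OS.
            object_name = path.relative_to(root).as_posix()
            plan.append((path, f"/{container}/{object_name}"))
    return plan

# Demo against a throwaway directory standing in for an NFS mount (assumption).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "items").mkdir()
    (Path(tmp) / "items" / "123.jpg").write_bytes(b"fake image bytes")
    (Path(tmp) / "logo.png").write_bytes(b"fake logo bytes")
    demo_plan = [target for _, target in plan_uploads(tmp, "product-images")]

print(demo_plan)
```

Keeping the planning step separate from the upload step makes it possible to review what a job would copy before anything is written to the object store.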
To enable applications to run in a public cloud, developers will need to decouple them from the NAS systems they use for data access. That will mean rewriting portions of the code to enable the applications to access data through API calls to the object storage system.
So far, MercadoLibre has tested only a limited number of front-end Web and application servers with Amazon’s public compute cloud. Developers will address the recoding work during the coming months, according to Reox.
Under the new model, the front-end Web servers that present the pages a visitor views could run on a public cloud, but they would access any data they need through an external API published on the Internet via a URL. That URL points to the private cloud where the VMs run and the data is stored.
“We can retrieve info on any spot on the planet with just an HTTP API call,” Reox said.
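To make the idea of "just an HTTP API call" concrete, here is a minimal Python sketch of how a front-end server might prepare a read request against an object store. It assumes the URL layout and `X-Auth-Token` header used by the OpenStack Object Storage API; the endpoint, account, container, object name and token are invented for illustration, and the request is built but never sent.

```python
from urllib.parse import quote
from urllib.request import Request

def swift_object_url(endpoint, account, container, obj):
    """Build <endpoint>/v1/<account>/<container>/<object>, percent-encoding each part."""
    parts = [
        quote(account, safe=""),
        quote(container, safe=""),
        quote(obj, safe="/"),  # object names may contain slashes
    ]
    return endpoint.rstrip("/") + "/v1/" + "/".join(parts)

def build_get_request(endpoint, account, container, obj, token):
    """Prepare (but do not send) an authenticated GET for one object."""
    return Request(
        swift_object_url(endpoint, account, container, obj),
        headers={"X-Auth-Token": token},
        method="GET",
    )

# All values below are hypothetical placeholders.
req = build_get_request(
    "https://storage.example.com",
    "AUTH_demo",
    "product-images",
    "items/123 front.jpg",
    "hypothetical-token",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the actual download. Because the caller needs only a URL and a token, the same code could run on a public cloud server or inside the private data center.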
While the advantages may be substantial, the road to get there has had its rough spots. For instance, documentation was so poor with the early releases of OpenStack that MercadoLibre’s project team had to dive into the code to develop a custom API to load balance the OpenStack server clusters.
Reox said the documentation has since improved, and an OpenStack community project is working to update it. Unfortunately, the improvements didn’t come soon enough for some users.
OpenStack proves challenging
Marc Staimer, president of Dragon Slayer Consulting in Beaverton, Ore., said he knows of a financial services firm that pulled the plug on OpenStack after four months, while another company, focused on media and entertainment, didn't like the file size limitation of OpenStack.
“They think, ‘We can do this for free.’ Then they get their hands on it and [become] very disillusioned,” Staimer said. “It’s very difficult to implement OpenStack. From everybody I’ve talked to, you need some pretty talented people to make it work effectively.”
MercadoLibre commenced its private cloud work with four former systems administrators/IT infrastructure staffers and currently has five working on the project. OpenStack held appeal for them, given their staunch support of, and contributions to, open source software.
“We love open source,” Reox said.
Alejandro Comisario, a senior infrastructure engineer at MercadoLibre, said storage administrators should prepare for significant change in their work, and potentially even pick up some programming skills, as they start to think about scaling storage in new ways.
“There’s a lot of work, but it’s actually very fun,” Comisario said. “And you’re going to feel that every block of storage is actually being used more efficiently and is going to be more broadly available to everyone. The payload at the end of the road is more than worth it.”
So far, MercadoLibre’s project team has implemented five components of the OpenStack software platform: “Nova” Compute, “Nova” Volume block storage, “Swift” Object Storage, “Glance” Image Service and “Keystone” Identity Service. (The names in quotation marks represent code names.)
In July 2011, the team started to work on the Nova Compute software to provision and manage the company’s VMs, which run on open source XenServer, as well as the Nova Volume software that enables persistent block storage of the VMs. Both became available to internal clients for self-provisioning in August, according to Reox.
OpenStack Object Storage -- which the project team made available to developers in December -- makes use of clusters of inexpensive, commodity servers to store petabytes of generally static data.
“MercadoLibre follows the eBay model and has a large number of customer-uploaded image files that are fairly ephemeral. That’s a perfect case for Swift. Object stores are designed for huge numbers of relatively small files,” said Beth Cohen, a senior cloud architect at Boston-based Cloud Technology Partners Inc. (cloudTP), which partners with Rackspace to help companies implement open source cloud solutions, like the OpenStack-based Rackspace Cloud: Private Edition.
The OpenStack Glance Image Service stores the VM images that MercadoLibre has defined. Developers review the available images and choose the most appropriate, such as a Red Hat Linux image for a MySQL database, or an Ubuntu image for an Apache Tomcat server.
MercadoLibre also spent a couple of days at the end of last year implementing the Keystone Identity Service, which handles authentication and user permissions for access to resources and services. For instance, a user might be permitted to access the object storage service but not create a virtual server instance.
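The permission example above can be pictured as a simple lookup from roles to allowed actions. The Python sketch below is a toy model only: the role names, action strings and `is_allowed` function are hypothetical and do not reflect Keystone's actual policy format or API.

```python
# Hypothetical role-to-permission table; not Keystone's real policy format.
ROLE_PERMISSIONS = {
    "storage-user": {"object-store:read", "object-store:write"},
    "compute-user": {"compute:create-server"},
}

def is_allowed(user_roles, action):
    """Return True if any of the user's roles grants the requested action."""
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)

# A user holding only the storage role can read objects but cannot create servers.
can_read = is_allowed(["storage-user"], "object-store:read")
can_boot = is_allowed(["storage-user"], "compute:create-server")
print(can_read, can_boot)
```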
Comisario said MercadoLibre has experienced no major outages since launching OpenStack. But he knows the company would need to react quickly if the identity service or proxy servers went down and cut off access to the storage.
“You know that it’s going to fail” at some point, he said. “You have to recover as fast as you can.”
The 2012 plan calls for MercadoLibre to go into full production with the Swift object store after two dozen Hewlett-Packard Co. servers arrive. The project team also plans to virtualize the network layer using the OpenStack Quantum Network Manager and Melange IP address management.
MercadoLibre currently uses Fibre Channel to connect its NetApp appliances to core switches, as well as its edge switches to core switches. It also has 10 Gigabit Ethernet between its database servers and switches and 2 Gbps elsewhere in dynamic link aggregation mode.
“Maybe at the end of 2013 we’re going to be exactly where we want to be with our private cloud. We're working really fast and hard to build it,” Comisario said. “But we’re really happy with the results today.”