Shutterstock VP on using OpenStack to build your own private cloud

You've probably heard about OpenStack, the open source project specifically designed to help organizations roll out their own massively scalable cloud infrastructures. To better understand how OpenStack can benefit enterprises, we approached Chris Fischer, the VP of technology operations at Shutterstock.

A well-recognized stock photography agency, Shutterstock uses OpenStack to run the infrastructure that hosts its millions of stock photos, videos and illustrations. Fischer outlines how enterprises can benefit from OpenStack, and shares his experience in deploying OpenStack and highly scalable systems in general. Below are his responses, which have been edited for length.

FCIO: Can you briefly outline what OpenStack is?

OpenStack is an open source cloud platform that enables you to build and manage your own private cloud installation. The advantages of running your own private cloud are numerous, ranging from substantial cost savings to tighter control over hardware resources and performance characteristics. If you use OpenStack to build your own infrastructure as a service (IaaS) platform, you can define disk, memory and CPU profiles that match your application's unique needs. That gives you predictable cost and performance, along with a level of fine-tuned control over your infrastructure that can be difficult to get from public cloud providers.
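To make the idea of disk, memory and CPU profiles concrete, here is a minimal Python sketch of choosing the smallest profile that satisfies an application's needs. OpenStack calls these profiles flavors; the `Flavor` class, the catalog entries and the `smallest_fit` helper are hypothetical names invented for this illustration, not part of OpenStack or of Shutterstock's setup.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Flavor:
    """A VM size profile (hypothetical model of a flavor's vCPU/RAM/disk)."""
    name: str
    vcpus: int
    ram_mb: int
    disk_gb: int


# Hypothetical catalog, for illustration only.
CATALOG = [
    Flavor("web.small", vcpus=2, ram_mb=4096, disk_gb=40),
    Flavor("db.large", vcpus=8, ram_mb=65536, disk_gb=500),
    Flavor("batch.cpu", vcpus=16, ram_mb=16384, disk_gb=80),
]


def smallest_fit(catalog: List[Flavor], vcpus: int, ram_mb: int,
                 disk_gb: int) -> Optional[Flavor]:
    """Return the smallest flavor meeting all three minimums, or None."""
    candidates = [
        f for f in catalog
        if f.vcpus >= vcpus and f.ram_mb >= ram_mb and f.disk_gb >= disk_gb
    ]
    # "Smallest" here means fewest vCPUs, then least RAM, then least disk.
    return min(candidates,
               key=lambda f: (f.vcpus, f.ram_mb, f.disk_gb),
               default=None)
```

In a private cloud you decide what goes in the catalog, so the profiles can be shaped around your own workloads rather than a provider's fixed menu.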

FCIO: Why should enterprises be interested in OpenStack?

As the Internet and enterprise ecosystems continually become more software defined, high-performing organizations should keep their eyes on technologies that provide either opportunities to streamline operations and minimize cost, or ones that enable faster product development and elastic access to resources. 

OpenStack offers a flexible API that developers can interact with directly to provision resources and nodes as needed. This reduces development time and moves your ops team into a role of "tool development" so they can focus their time on building systems that add value to your organization. In the early days, all these capabilities had to be developed in house; now OpenStack can do a lot of the heavy lifting.
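As a rough illustration of what provisioning through the API involves, the Python sketch below builds the JSON body for a "create server" request to the OpenStack Compute API. The field names (`imageRef`, `flavorRef`, `networks`) follow the Compute API as commonly documented, but check them against the API version your deployment runs. The helper name and the placeholder IDs are assumptions made for this example, and authentication and the HTTP call itself are left out.

```python
import json


def build_server_request(name, image_id, flavor_id, network_id):
    """Return the JSON body for a Compute API 'create server' call.

    The body would be sent as an authenticated POST to the compute
    endpoint's /servers resource; sending it is not shown here.
    """
    return {
        "server": {
            "name": name,
            "imageRef": image_id,    # ID of a standardized image
            "flavorRef": flavor_id,  # ID of the disk/memory/CPU profile
            "networks": [{"uuid": network_id}],
        }
    }


if __name__ == "__main__":
    # Placeholder IDs; real values come from your own cloud.
    body = build_server_request("qa-web-01", "IMAGE_ID", "FLAVOR_ID",
                                "NETWORK_ID")
    print(json.dumps(body, indent=2))
```

Because the request is just structured data, developers can generate it from their own tooling instead of filing a ticket with operations.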

From a pure capital management perspective, private clouds can be created that offer significant cost advantages in both OPEX and CAPEX relative to current public cloud offerings. Public clouds still come at a pretty steep premium. By managing your resource utilization effectively using just-in-time resources, you can create massive savings.

OpenStack also provides a unique view into your corporate resources using quotas. It's trivial to assign specific quotas to groups or business units inside your company, track their usage costs, and assign those costs to the appropriate vertical.
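The quota and cost-tracking idea can be sketched in a few lines of Python. Everything here is hypothetical: the hourly rates, the record layout (one record per running instance) and the function names are assumptions for illustration, not OpenStack's own accounting.

```python
from collections import defaultdict

# Hypothetical internal rates, for illustration only.
RATE_PER_VCPU_HOUR = 0.02
RATE_PER_GB_RAM_HOUR = 0.01


def chargeback(usage_records):
    """Sum the cost of each instance record per project (business unit)."""
    totals = defaultdict(float)
    for rec in usage_records:
        hourly = (rec["vcpus"] * RATE_PER_VCPU_HOUR
                  + rec["ram_gb"] * RATE_PER_GB_RAM_HOUR)
        totals[rec["project"]] += rec["hours"] * hourly
    return {project: round(total, 2) for project, total in totals.items()}


def over_quota(usage_records, vcpu_quotas):
    """List projects whose allocated vCPUs exceed their assigned quota."""
    allocated = defaultdict(int)
    for rec in usage_records:
        allocated[rec["project"]] += rec["vcpus"]
    return sorted(project for project, used in allocated.items()
                  if used > vcpu_quotas.get(project, 0))
```

Given per-instance records tagged with a project, `chargeback` produces a per-unit bill and `over_quota` flags units that have outgrown their allocation.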

FCIO: How can enterprises start deploying OpenStack in their organizations?

Deploying OpenStack is a business commitment. First and foremost, moving your enterprise operations to an IaaS platform is a decision that requires organization-wide buy-in and support. All resources in OpenStack will be highly virtualized, and development teams will need to be on board with a VM-based environment. I'd also recommend a strong emphasis on standardizing images and build types, as you can quickly create an environment that is tough to manage if your technology stack grows unchecked.

Managing the cluster has a learning curve. You'll want a team of engineers who are familiar with (or can learn) Nova, Swift and OpenStack cluster management in general. It also doesn't hurt if they can write some Python and contribute to the source code. You definitely want a team of folks who are familiar with open source development, know how to file bugs and use channels like forums and IRC to get support. The source code for OpenStack is available online and can be downloaded and installed on commodity hardware systems, and it comes in packaged formats for several well-supported operating systems.

One thing to note when choosing disk and network vendors is that some have offerings that integrate into the cluster, allowing their hardware to be software-defined. Arista and Juniper are good network choices, depending on your approach and networking needs. For disk, we use Coraid to provide all our shared storage, as they have some products that tie in directly with our OpenStack deployment, making persistent disk pretty transparent. Lastly, there are many OpenStack integrators out in the wild, and consulting services are available if needed. Look to companies like Piston or SwiftStack if you need specific domain expertise.

FCIO: How does Shutterstock leverage OpenStack to build its infrastructure?

Shutterstock's strategy with OpenStack has multiple phases. In phase one, OpenStack was deployed to replace a hybrid cloud and hardware environment for all development and QA resources. This enabled large levels of cost savings and flexibility, as well as a great interface into our development and QA resources. The most useful function was that developers could deploy their nodes and resources with little operations interaction, which is pretty key as we have hundreds of VMs in our DEV/QA stack. With developers able to take care of the basics, operations spent time building systems for configuration management and monitoring instead of setting up VMs. 

In the second (and ongoing) phase, OpenStack is being deployed more widely to replace our current virtualization platform. The current platform is KVM and libvirt with custom applications to manage provisioning and deployment of our nodes. Because OpenStack is replacing the virtualization layer, we are able to reuse all our configuration management code (Puppet), as well as our orchestration software. The transition involves deploying new VMs and moving services from the old stack to the new. Our production environment spans multiple data centers and thousands of nodes.

FCIO: Can you highlight some shortcomings in OpenStack as it stands today?

OpenStack isn't without flaws. In initial releases, integration with central authentication systems (LDAP) was clunky at best. In addition, the networking stack and integration with networking vendors were lacking, and only now is this starting to improve. In any cloud environment, load balancing and network services are key, and without a network your fleet won't do much. OpenStack's features in this area are lacking, but you can work around that by designing your system with this in mind from the beginning, using open source or appliance-based load balancing. Similarly, OpenStack doesn't have built-in PaaS solutions (like Amazon's Redshift, load balancing, or RDS), so for organizations that need a fully featured platform-based solution, OpenStack doesn't have all the answers right now.

FCIO: Any other comments about highly scalable systems in general?

Build for failure. Don't expect systems, nodes and hardware to live forever. If a system can't disappear without your application degrading, challenge that and change your code so that it provides a consistent user experience even when the infrastructure isn't operating at 100 percent.
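One common way to apply this advice in application code is to retry against other nodes and fall back to a degraded but still usable response when none of them answers. The Python sketch below shows that pattern under stated assumptions: `call_with_failover`, the `request` callable and the node names are hypothetical, and it is not a description of how Shutterstock implements this.

```python
def call_with_failover(nodes, request, fallback=None, attempts_per_node=2):
    """Try each node in turn; return the fallback if every node fails.

    `request` is any callable that takes a node and either returns a
    result or raises ConnectionError when that node is unavailable.
    """
    for node in nodes:
        for _ in range(attempts_per_node):
            try:
                return request(node)
            except ConnectionError:
                continue  # retry this node, then move on to the next one
    # Every node is down: degrade gracefully instead of raising.
    return fallback


if __name__ == "__main__":
    dead = {"node-a"}  # pretend this node has disappeared

    def fetch(node):
        if node in dead:
            raise ConnectionError(f"{node} is unreachable")
        return f"result from {node}"

    print(call_with_failover(["node-a", "node-b"], fetch,
                             fallback="cached result"))
```

The caller never sees a single node's failure; it gets either a live answer from another node or the fallback value.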

Always scale out if possible. Make simple design choices and try to minimize complexity. The most robust and elegant solutions are often the dead simple ones. If you can't explain a system and draw it on a whiteboard, you might be overcomplicating it. Focus on what's core to your business and make those systems bombproof. It's very easy to get lost focusing on edge cases or inappropriately classifying core work as technical debt. Hyper-focusing on what must work in your application is a great exercise. Always improve those areas of your system key to your business; it's not debt, it's engineering excellence.

Finally, remember that people are the most important part of building a successful technology system. Keep your people focused on improving your core capabilities, performance, product development, etc. Automate away everything else and make the best use of your scarcest resource, your engineers.
