Made popular by startups that wanted to reduce their upfront infrastructure costs, the public cloud offered an easy opex method to get development efforts off the ground. Public cloud providers now allow enterprises of all types to focus on the devops practices that thrive in that environment. But the public cloud is not worry-free, especially from a cost perspective.
Some factors, such as bandwidth, are well known and typically included in business models. But there continue to be stories about public cloud budget blowouts. If you’re in a group that needs to reduce the cost structure of your devops efforts, read on and see if you’re overlooking any of these five tips:
Turn off the VMs
The public cloud operates in principle on a pay-per-use model, but one reason cost overruns are common is a simple failure to shut down VMs. The question is when to turn off the lights.
Among devops scenarios, only staging environments, which replicate production, might need to run 24/7, and then only for limited periods. Developers may be scattered across time zones and work long, odd hours, but even development and testing environments are unlikely to need a full 168 hours a week.
A typical work week runs 40 to 50 hours, against the 168 hours in a calendar week. A simple, disciplined approach to turning off compute resources when they are not in use, ideally an automated one, could cut 60 percent or more of the compute hours you would otherwise be charged for.
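To see where a figure like 60 percent comes from, here is a rough back-of-the-envelope sketch. It assumes compute is billed strictly by the hour a VM is running, which is a simplification: storage attached to a stopped VM is usually still billed. The function name `shutdown_savings` and the sample hours are illustrative, not taken from any provider's tooling.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def shutdown_savings(active_hours_per_week):
    """Fraction of always-on compute hours avoided by running only while in use."""
    if not 0 <= active_hours_per_week <= HOURS_PER_WEEK:
        raise ValueError("active hours must be between 0 and 168")
    return 1 - active_hours_per_week / HOURS_PER_WEEK


# Sample schedules (assumed values): a 40-hour week, a 50-hour week,
# and a more generous 60 hours to cover odd hours and time zones.
for hours in (40, 50, 60):
    print(f"{hours} h/week in use -> {shutdown_savings(hours):.0%} of hours avoided")
```

Even a team that keeps its environments up 60 hours a week avoids well over half of the always-on hours.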
Scale automatically
Here’s another area, from the production world, where automation can help. Many businesses, and their applications, have peaks and troughs. A retailer may do 30 to 40 percent of its sales in the holiday season. Health insurance has its busy enrollment season. Payroll services see a twice-monthly traffic pattern. A ticket site will peak soon after concert sales go live.
Building your environment for those peaks, however, will result in a tremendous amount of wasted capacity and unnecessary costs. What autoscaling can do is accommodate these ebbs and flows, adding servers when you need them and turning them down when the busy cycle passes. You can program for known patterns or use triggers to activate scaling, and then keep scaling until the symptoms go away.
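The trigger approach can be sketched in a few lines. This is a simplified model, not any provider's autoscaling API: the thresholds, the server limits, and the assumption that load spreads evenly across servers are all placeholders chosen for illustration.

```python
def scale_step(servers, cpu_percent, scale_out_at=75, scale_in_at=30,
               min_servers=2, max_servers=20):
    """Return the server count after one trigger evaluation."""
    if cpu_percent > scale_out_at:
        return min(max_servers, servers + 1)  # busy: add a server
    if cpu_percent < scale_in_at:
        return max(min_servers, servers - 1)  # idle: remove a server
    return servers                            # within the comfortable band


def settle(servers, total_load):
    """Re-evaluate the trigger until the server count stops changing.

    total_load is expressed in "percent of one server", so 360 means the
    work would keep 3.6 servers fully busy. Assumes load is spread evenly.
    """
    while True:
        cpu_percent = total_load / servers
        next_count = scale_step(servers, cpu_percent)
        if next_count == servers:
            return servers
        servers = next_count


print(settle(4, 360))   # a peak arriving on a small fleet
print(settle(10, 200))  # a quiet period on an oversized fleet
```

The gap between the two thresholds matters: it keeps the fleet from flapping back and forth between two sizes.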
Mind your GETs
The key question about data storage is how you use it. If your workflow needs primary storage with high I/O requirements, alongside compute to keep apps running, you may not want it on the public cloud in the first place. For secondary and archival storage, the pricing on the public cloud is indeed low, but what can add up are the GET requests used to access your data.
Moving terabytes of postprocessed data into cold storage with PUT requests, and keeping it there for pennies per month, is the cheap part. The GETs that let you extract or download that data are billed on a cost-per-thousand basis, and they add up. If you are going to need regular downloads or anticipate moving your data somewhere else, you should expect to incur more costs. It’s best to answer these design questions upfront.
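A quick estimate shows how requests can come to dominate the bill. The rates below are made-up placeholders, not any provider's price list, and the sketch covers only per-request and per-gigabyte storage charges. Substitute the figures from your own provider's pricing page.

```python
def storage_cost(gigabytes, price_per_gb_month):
    """Monthly charge for keeping data at rest."""
    return gigabytes * price_per_gb_month


def request_cost(requests, price_per_thousand):
    """Monthly charge for requests billed per thousand."""
    return requests / 1000 * price_per_thousand


# Hypothetical rates, for illustration only.
COLD_STORAGE_PER_GB_MONTH = 0.004
GET_PER_THOUSAND = 0.01

at_rest = storage_cost(5_000, COLD_STORAGE_PER_GB_MONTH)  # 5 TB archived
retrieval = request_cost(10_000_000, GET_PER_THOUSAND)    # 10 million GETs a month

print(f"storage: ${at_rest:.2f} per month")
print(f"GETs:    ${retrieval:.2f} per month")
```

With these placeholder numbers the GETs cost five times what the storage does, which is the kind of ratio worth checking before you commit to a design.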
Curb sprawl
Sprawl is associated with shadow IT, which now includes public cloud VM infrastructures. The situation is easy to understand: Developers need more resources to get their work done, and adding VMs is quick and easy. But if there is no ongoing reporting or awareness, whoever gets the invoice may be in for a surprise, especially if there are twice as many VMs up and running as were budgeted. Worse, these resources may be forgotten and left running when they are no longer needed.
A lack of oversight can also lead to inefficient pricing, and when disparate technologies are involved, you can end up with complexity that is costly to manage. The solution is a system of checks and balances that minimizes speed bumps while maintaining control and governance.
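The reporting half of such a system can start small. The sketch below assumes you can export a list of running VMs tagged with an owning team. The field name `owner`, the team names, and the budget figures are hypothetical stand-ins for whatever your inventory tool provides.

```python
from collections import Counter


def over_budget(running_vms, budgeted):
    """Return {team: (running, budgeted)} for teams running more VMs than planned."""
    running = Counter(vm["owner"] for vm in running_vms)
    return {
        team: (count, budgeted.get(team, 0))
        for team, count in sorted(running.items())
        if count > budgeted.get(team, 0)
    }


# Hypothetical inventory export and budget.
inventory = (
    [{"name": f"web-{i}", "owner": "storefront"} for i in range(8)]
    + [{"name": f"ci-{i}", "owner": "platform"} for i in range(3)]
    + [{"name": "scratch-0", "owner": "unknown"}]
)
budget = {"storefront": 4, "platform": 5}

for team, (count, planned) in over_budget(inventory, budget).items():
    print(f"{team}: {count} running, {planned} budgeted")
```

Teams with no budget entry are treated as having a budget of zero, so untracked VMs surface in the report rather than hiding in it.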
Avoid security gaps
Organizations typically establish security rules and policies at two tiers:
The enterprise or corporate level, where policy is coded into technologies for consistent application across a wide base of resource users.
The departmental or specific application level, where access is governed on an HR or business-unit basis according to roles.
But how consistently are these policies enforced? When you create a virtual local area network (VLAN) to support a set of newly spun-up VMs, are you copying over the full set of applicable policies? If not, you’re exposing your enterprise to external rogue elements that can quickly identify network security gaps, breach your defenses, and increase the costs (in this case, indirect) of your environment.
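One way to make that question answerable is to compare what a new network segment actually has against what it should have. This is a minimal sketch: the policy names, and the idea that policies can be listed as simple strings, are assumptions for illustration. A real check would pull both sets from your own firewall or cloud tooling.

```python
def missing_policies(required, applied):
    """Policies that should be on a segment but are not."""
    return set(required) - set(applied)


# Hypothetical policy identifiers for the two tiers.
corporate = {"deny-inbound-default", "encrypt-at-rest", "log-all-admin-access"}
departmental = {"finance-role-access"}

# Policies actually applied to a newly created VLAN (assumed data).
new_vlan = {"deny-inbound-default", "finance-role-access"}

gaps = missing_policies(corporate | departmental, new_vlan)
for policy in sorted(gaps):
    print(f"not applied: {policy}")
```

Running a comparison like this whenever a VLAN is created turns a policy gap into a visible list instead of a silent omission.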
Summary: Don’t forget opex
The appeal of the public cloud is how it lets organizations shift from heavy upfront capex to less onerous opex and focus on value-added efforts, such as devops initiatives. But while public cloud providers handle the heavy lifting of running the infrastructure, opex doesn’t manage itself.
To contain devops costs, be realistic when budgeting. Beyond that, use automation to turn off VMs when they’re not in use and to scale when needed, choose the right kind of data storage, and find effective ways to manage cloud resource growth and enforce consistent security policies.