Continuous Cloud Cost Control with FinOps | Cloud industry forum

Continuous Cloud Cost Control with FinOps

By Richard Stinton, Senior Cloud Economics Consultant at Source Code Control Limited

The move to Public Cloud computing has necessitated changes in the way different parts of the business interact with each other, particularly around procurement, operations, and financial management.

Previously, in an on-premises environment, business units would request services from IT, usually as part of a larger demand/capacity management process. This process would, in turn, request budget and then involve procurement to acquire IT resources. These resources would typically last anything between three and six years and would be financially written off over that period as a capital expense (CapEx).

Nowadays, with Public Cloud, the DevOps teams, working on behalf of the business units, will deploy services on-demand in the Public Cloud, with little reference to procurement. Decisions about which services to deploy, how large or small they are, and how long they are needed for have been decentralised. The demand and capacity management phase has largely disappeared. These purchases will be regarded as operational expense (OpEx). This change in process has driven the need for new terminology around Financial Operations, now referred to as FinOps. In simple terms, FinOps brings financial accountability to a new model of variable spend.

Successful FinOps requires close collaboration between the teams shown.

What are the core principles of FinOps?

  • Teams need to collaborate closely on a daily basis as cloud services operate on a per-resource, per-second basis. Wrong purchase decisions can quickly add up.
    • Teams need to work together to constantly identify opportunities for innovation and efficiency
  • Everyone needs to be accountable for their use of Cloud services
    • Accountability is pushed out to the teams using the cloud services
    • Decision making about cloud usage and optimisation is democratised
  • Reporting of FinOps needs to be accessible, meaningful, and timely
    • Need daily dashboards and reports as soon as data is available
    • Fast feedback loops result in the most efficient behaviour
    • Present data so that it is meaningful to the business and not just a technical audience
  • A small, centralised team is needed to manage FinOps
    • Maximise use of automation to reduce duplication of effort
    • Take advantage of macro-level rate and discount optimisation
    • Identify and leverage best practices

When should you build a FinOps practice?

Over the last forty years or so, organisations large and small have managed their IT infrastructure based on IT Service Management principles such as ITIL. A subset of this was demand and capacity management: estimating the IT requirement within the business and then managing it on an ongoing basis.

The advent of Public Cloud, and the ability for developers to provision services using a credit card as payment, has led to the concept of ‘Shadow IT’, where the IT teams (and procurement) have little or no control over the purchasing process for cloud services. For a while this led to a ‘Wild West’ approach to cloud deployments. Organisations would find a massive growth in uncontrolled and often unexplained cost, with little visibility into where this cost was coming from or which aspects of the business were being impacted. There was also a lack of the predictability that was usually present in the old CapEx model.

FinOps allows organisations to gain control over their cloud implementations. However, FinOps is not an instant fix: it takes time to bed into an organisation, and it is a cyclical, continuous improvement process. Importantly, it is not ‘ITIL for Cloud’, and there are many aspects of ITIL that will not work with the Cloud methodology.

Any organisation working with Public Cloud in any meaningful way should be investigating the benefits of deploying FinOps.

Getting started with FinOps

As already discussed, FinOps is a continuous process, aiming to improve all the time.  For organisations consuming cloud services, it’s never too early to start adopting the process.

To develop a successful FinOps implementation, the following lifecycle is adopted.

 

The Inform-Optimise-Operate process is cyclical, starting small and getting larger in activity. This is referred to as Crawl, Walk, Run.

Where do you start?

Unlike the procurement of traditional on-premises infrastructure, the line items of a cloud invoice or bill can easily run into thousands, if not tens of thousands of lines. As cloud services can be consumed and charged for by the minute or even second, the bill becomes more like a telephone bill, where short-lived services are rated in the bill as many individual line items and can then be summarised at the end of each month.

It's important to understand whether the service you deploy is the correct size for what you need. For any given service, particularly IaaS virtual machines, there can be many, many choices. How many CPUs, and how much RAM and storage, do we need, and at what quality of service? Once deployed, are you making the best use of these resources? The same is true of PaaS services, where you might have the ‘Marshall amplifier’ set of knobs to define the service performance and quality. Do you always want everything set to 11? With PaaS services, it is often possible to ‘dial back’ your service quality when full power is not required, or even turn the service off when it is not needed at all.
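To put a rough number on turning a service off when it is not needed, the sketch below compares a week of always-on running with a week of running only during working hours. The hourly rate and the schedule are hypothetical figures chosen for illustration, not real provider prices, and the function names are made up for this example.

```python
from decimal import Decimal

HOURS_PER_WEEK = 24 * 7  # 168


def weekly_cost(hourly_rate: Decimal, hours_on: int) -> Decimal:
    """Cost of one week for a service billed per hour of runtime."""
    if not 0 <= hours_on <= HOURS_PER_WEEK:
        raise ValueError("hours_on must be between 0 and 168")
    return hourly_rate * hours_on


def schedule_saving(hourly_rate: Decimal, hours_on: int) -> Decimal:
    """Fraction of the always-on cost avoided by running only hours_on per week."""
    always_on = weekly_cost(hourly_rate, HOURS_PER_WEEK)
    scheduled = weekly_cost(hourly_rate, hours_on)
    return (always_on - scheduled) / always_on


# Hypothetical rate of 0.50 per hour; schedule of 12 hours a day, Monday to Friday.
rate = Decimal("0.50")
working_hours = 12 * 5
print("always on:", weekly_cost(rate, HOURS_PER_WEEK))
print("scheduled:", weekly_cost(rate, working_hours))
print("saving:", f"{schedule_saving(rate, working_hours):.0%}")
```

The saving depends only on the proportion of hours switched off, so the same calculation applies whatever the actual rate turns out to be.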

In the Inform phase, it is important to analyse usage from both a Top-Down (monthly bill) and Bottom-Up (hourly/daily consumption) perspective. Every month, your cloud provider will publish a bill of consumed services, usually both as a heavily summarised PDF file and as a very detailed, line-item bill (CSV) running to tens of thousands of lines. For larger consumers, these bills are too large to load into Excel, and so other solutions are required.
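One possible alternative to a spreadsheet is to stream the line-item CSV and keep only running totals in memory. The sketch below is a minimal example under assumptions: the column names (`service`, `region`, `cost`) are hypothetical, and real exports use provider-specific headers that would need to be substituted.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal
from typing import Dict, TextIO


def total_cost_by(column: str, bill: TextIO, cost_column: str = "cost") -> Dict[str, Decimal]:
    """Stream a line-item bill and sum the cost column per distinct value of `column`."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(bill):  # reads one row at a time, not the whole file
        totals[row[column]] += Decimal(row[cost_column])
    return dict(totals)


# Tiny in-memory stand-in for a real export (hypothetical data).
# With a real file, pass open(path, newline="") instead.
sample_bill = io.StringIO(
    "service,region,cost\n"
    "compute,eu-west,12.40\n"
    "storage,eu-west,3.10\n"
    "compute,us-east,7.60\n"
)
for service, cost in sorted(total_cost_by("service", sample_bill).items()):
    print(service, cost)
```

Because only one row is held at a time, the size of the bill affects run time rather than memory, and the same function can summarise by any column, such as region or account.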

Clearly, you could be waiting a month for each bill, and you would be blind to your usage during the preceding 30 days. Fortunately, most cloud providers will also publish a consumption API through which you can get a constant feed on what you are consuming on an hourly basis.

With this in mind, it is then possible to have a ‘bottom-up’ running total of what is going on throughout the month, which can then be reconciled with the ‘top-down’ view created by the official bill at the end of the month.
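The reconciliation step can be sketched as a comparison of two sets of per-service totals. This is an illustration only: the service names, figures and tolerance are hypothetical, and how the running totals are gathered from a provider's consumption API is outside the scope of the example.

```python
from decimal import Decimal
from typing import Dict


def reconcile(
    running_totals: Dict[str, Decimal],
    billed: Dict[str, Decimal],
    tolerance: Decimal = Decimal("0.01"),
) -> Dict[str, Decimal]:
    """Return per-service differences (billed minus running total) above the tolerance."""
    differences = {}
    for service in sorted(set(running_totals) | set(billed)):
        delta = billed.get(service, Decimal(0)) - running_totals.get(service, Decimal(0))
        if abs(delta) > tolerance:
            differences[service] = delta
    return differences


# Hypothetical month: bottom-up totals from hourly consumption vs. the official bill.
bottom_up = {"compute": Decimal("620.00"), "storage": Decimal("95.50")}
top_down = {
    "compute": Decimal("620.00"),
    "storage": Decimal("98.00"),
    "support": Decimal("50.00"),
}
for service, delta in reconcile(bottom_up, top_down).items():
    print(f"{service}: bill differs from running total by {delta}")
```

Services that appear on only one side are reported too, which helps surface charges that never showed up in the hourly feed.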

Over time, it will be possible to build up a good understanding of how the business is consuming cloud services and how different services are growing (or shrinking), and then to set and manage budgets.

While cloud services will be ordered into logical groups, such as region of deployment, service category, account or subscription, it is also important from a governance perspective to make full use of tagging and naming conventions. This enables organisations to add useful classifications to service usage such as:

  • Business Unit/Department
  • Criticality (Low, Medium, Business Critical)
  • Owner
  • Application name
  • Stage in SDLC (Test/Dev, Staging, Pre-Prod, Prod)
  • Data classification (Public, General, Confidential, Highly Confidential)
  • Support classification
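A tagging convention is only useful if it is checked. The sketch below shows one way such a check might look; the tag keys (`business-unit`, `sdlc-stage` and so on) are hypothetical names, while the allowed values mirror the classifications listed above.

```python
from typing import Dict, List

# Tag keys are illustrative; allowed values mirror the classifications listed above.
ALLOWED_VALUES = {
    "criticality": {"Low", "Medium", "Business Critical"},
    "sdlc-stage": {"Test/Dev", "Staging", "Pre-Prod", "Prod"},
    "data-classification": {"Public", "General", "Confidential", "Highly Confidential"},
}
REQUIRED_KEYS = {"business-unit", "owner", "application"} | set(ALLOWED_VALUES)


def tag_problems(tags: Dict[str, str]) -> List[str]:
    """List missing required tags and values outside the agreed classifications."""
    problems = [f"missing tag: {key}" for key in sorted(REQUIRED_KEYS - set(tags))]
    for key, allowed in sorted(ALLOWED_VALUES.items()):
        if key in tags and tags[key] not in allowed:
            problems.append(f"invalid value for {key}: {tags[key]!r}")
    return problems


# Hypothetical resource with one missing tag and one value outside the convention.
resource_tags = {
    "business-unit": "Finance",
    "owner": "a.person",
    "application": "ledger",
    "criticality": "High",
    "sdlc-stage": "Prod",
}
for problem in tag_problems(resource_tags):
    print(problem)
```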

As part of this, it is also important to identify shared resources that will need to be re-allocated in a proportional way.
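As a minimal sketch of proportional re-allocation, the example below splits a shared cost across business units according to each unit's direct spend. Direct spend is only one possible allocation key, chosen here as an assumption; the unit names and amounts are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict


def allocate_shared(shared_cost: Decimal, direct_spend: Dict[str, Decimal]) -> Dict[str, Decimal]:
    """Split shared_cost across units in proportion to each unit's direct spend."""
    total = sum(direct_spend.values(), Decimal(0))
    if total <= 0:
        raise ValueError("direct spend must be positive to allocate proportionally")
    cent = Decimal("0.01")
    shares = {
        unit: (shared_cost * spend / total).quantize(cent, rounding=ROUND_HALF_UP)
        for unit, spend in direct_spend.items()
    }
    # Give any rounding remainder to the largest spender so shares sum to shared_cost.
    largest = max(direct_spend, key=direct_spend.get)
    shares[largest] += shared_cost - sum(shares.values(), Decimal(0))
    return shares


# Hypothetical shared network cost split across three business units.
spend = {"sales": Decimal("6000"), "marketing": Decimal("3000"), "hr": Decimal("1000")}
for unit, share in allocate_shared(Decimal("1000.00"), spend).items():
    print(unit, share)
```

Assigning the rounding remainder to a single unit keeps the allocated amounts summing exactly to the shared cost, which makes the result easier to reconcile against the bill.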

What next?

Once you have a good handle on what you are running, what it is costing and which parts of the business it is benefitting, it is time to start optimising.

As discussed earlier, we now need to think about:

  • Are we using all the resources provisioned all the time?
  • Should we be changing the size/shape of a service occasionally or permanently?
  • Are there commercial offers for long term commitment to usage?
  • Are there wasted or unused resources?
  • Are there options to turn things off when not needed?

For most of these questions, it will be necessary to start collecting performance metrics, either via a cloud provider's API or via agents whose data can be surfaced through an API. It is also worth checking whether the cloud providers offer access to this information through their portals.
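Once metrics are available, a first pass over them can be very simple. The sketch below classifies resources from a list of CPU utilisation samples; the thresholds, resource names and sample values are hypothetical, and CPU alone is a deliberately narrow signal that would normally be combined with memory, storage and network metrics.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical thresholds (percent CPU); tune these to your own workloads.
IDLE_PEAK_PCT = 5.0
OVERSIZED_AVG_PCT = 20.0


def classify(cpu_samples: List[float]) -> str:
    """Label a resource as idle, oversized or ok from its CPU utilisation samples."""
    if not cpu_samples:
        return "no data"
    if max(cpu_samples) < IDLE_PEAK_PCT:
        return "idle"  # never busy: candidate for switching off or removal
    if mean(cpu_samples) < OVERSIZED_AVG_PCT:
        return "oversized"  # mostly quiet: candidate for a smaller size
    return "ok"


# Hypothetical samples per resource, as might be collected from a metrics API.
metrics: Dict[str, List[float]] = {
    "vm-reporting": [1.0, 2.5, 3.0],
    "vm-batch": [10.0, 15.0, 30.0],
    "vm-web": [40.0, 60.0, 80.0],
}
for name, samples in metrics.items():
    print(name, classify(samples))
```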

Alternatively, it may be worthwhile looking at the various ISV solutions that focus on cross-cloud cost optimisation, cross-cloud billing analysis and other areas appropriate to FinOps.

As with all software acquisitions, it can take time to become trained in all the features and functions of a solution, but this may prove more efficient than building your own solution. There are also several consultancies providing FinOps as a Service, which can be consumed on a month-by-month basis and enable organisations to benefit from staff who are already FinOps certified and have experience in suitable tooling solutions.

Summary

This article just scratches the surface of FinOps and highlights some of the quick wins to get a better grasp of an organisation’s cloud bill and the benefits of cloud computing to the business. The FinOps Foundation, a program of The Linux Foundation, provides more resources at https://www.finops.org/introduction/what-is-finops/

It is also highly recommended to read Cloud FinOps by J.R. Storment and Mike Fuller, published by O’Reilly.