A Primer on Commitment Discount Management


Whether you are a small startup or a large enterprise, you need to be able to manage commitment discounts at scale. Tools that automate this part of Rate Optimization exist, but they can be pricey, and they can be hard to trust because such a tool can easily spend a lot of money on your behalf.

This post will focus on AWS.

Several strategic decisions need to be discussed up front because they affect your ability to scale, how quickly you can act, and how successful you will be.

First things first: lay a foundation for success and for managing Commitment Discounts at scale. Reserved Instances and Compute Savings Plans are Commitment Discounts: you are making a commitment (contract) to spend money on specific Services or Resources for a period of time. I’ll use the term Commitment Discount when generically referring to both of these throughout this post.

Having strong cost allocation in place before heading down this path will help significantly, especially when figuring out which Engineering teams to engage with.

For the analysis with Engineering, I prefer to get direct input from the largest teams and workloads, following an 80/20 rule: 80% of the costs, and thus Commitment Discounts, are generally driven by 20% of the teams. So, target the largest workloads first but make sure to keep track of them.

Inventory

Whenever you make a purchase, log it in an Inventory. A simple spreadsheet is more than adequate, even at scale (you can then easily use a formula to generate the script commands that purchase Reserved Instances from it).

Always keep track of the metadata of the purchase; minimally purchase date, expiration date, term, purchasing account, reservation ARN, service, region, no/partial/all upfront, instance type, count, upfront cost, and WHO bought the Commitment Discount and WHY. It is crucial to record who made the purchase and why so that you can explain what it is/was for and easily follow up in the future when it expires. This data will be very useful when performing the FinOps Operational Reviews.
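As a minimal sketch of what one Inventory row could look like, the fields listed above can be written down as a small record type and exported to CSV for the spreadsheet. The class name, field names, and every example value below (including the ARN) are my own placeholders, not an AWS schema.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class CommitmentRecord:
    """One Inventory row. Field names are illustrative, not an AWS schema."""
    purchase_date: date
    expiration_date: date
    term_years: int
    purchasing_account: str
    reservation_arn: str
    service: str
    region: str
    payment_option: str   # "no", "partial", or "all" upfront
    instance_type: str
    count: int
    upfront_cost: float
    purchased_by: str     # WHO made the purchase
    reason: str           # WHY it was made


def to_csv(records):
    """Serialize records to CSV text with a single header row."""
    buffer = io.StringIO()
    names = [f.name for f in fields(CommitmentRecord)]
    writer = csv.DictWriter(buffer, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Every value below is made up for illustration.
example = CommitmentRecord(
    purchase_date=date(2025, 1, 15),
    expiration_date=date(2026, 1, 15),
    term_years=1,
    purchasing_account="111122223333",
    reservation_arn="arn:aws:ec2:us-east-1:111122223333:reserved-instances/EXAMPLE",
    service="EC2",
    region="us-east-1",
    payment_option="all",
    instance_type="r7g.large",
    count=4,
    upfront_cost=2500.0,
    purchased_by="finops@example.com",
    reason="Steady-state API fleet, confirmed with the owning team",
)
print(to_csv([example]))
```

Keeping the WHO and WHY as required fields means a row cannot be logged without them.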

Strategy: Buying Location

On which account do you make the purchase: the Payer (AWS Organization) account or a Linked account?

Most recommend the Payer account because a Commitment Discount bought on the Payer account can be applied to any Linked account. If the Commitment Discount goes unused on one Linked account, it will automatically be applied to another Linked account whose usage matches the Commitment Discount details (instance type, region, etc.).

Strategy: Cashflow

How much to buy and when? Buying all your Commitment Discounts at one time can have a significant impact on cashflow, both right now and when they expire.

  • Does your company have significant cash reserves, or should you consider spreading the purchases out over a period of time?
  • Do you have the budget to make the purchase? Allocating budget each month or quarter to purchase Commitment Discounts is recommended. Be proactive.
  • Do you have approval to make the purchase? Whether you have the budget or not, are you authorized to make the purchase by Finance, CFO, your leadership, etc.? Often you will need to confirm with Finance due to cashflow implications before purchasing.
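To make the cashflow question concrete, here is a rough sketch of spreading one large upfront purchase into equal monthly tranches, so that both the purchases and the later expirations are staggered. The function names, the dollar amount, and the four-tranche split are assumptions chosen for illustration.

```python
from datetime import date


def add_months(d, months):
    """Return the first day of the month that is `months` after d."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def ladder(total_upfront, start, tranches, term_months=12):
    """Split one upfront purchase into equal monthly tranches."""
    per_tranche = round(total_upfront / tranches, 2)
    schedule = []
    for i in range(tranches):
        buy = add_months(start, i)
        schedule.append({
            "buy": buy,
            "expires": add_months(buy, term_months),
            "upfront": per_tranche,
        })
    return schedule


# Hypothetical: $120,000 of 1-year commitments bought over four months.
for row in ladder(120_000, date(2025, 1, 1), tranches=4):
    print(row["buy"], row["expires"], row["upfront"])
```

The same schedule doubles as a renewal calendar: each tranche expires in a different month, so the cash needed for renewals is spread out too.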

Strategy: Target Coverage %

Think about what percentage of workloads you want to cover with Commitment Discounts.

100% can easily lead to being overcommitted, which wastes money. And what happens if there are optimizations that reduce service costs?

90% is fairly aggressive but works for mature organizations and leaves room for optimizations.

Consider aiming for 60-70% to start.

BUT all of this also depends on scale: if you are spending $50/month on a Service, does it make sense to spend your time and Engineering’s time to review, discuss, and purchase a commitment discount? Spending dollars chasing cents is not wise.
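A small sketch of the coverage arithmetic, assuming you already know the hourly spend (at on-demand rates) that is covered versus not covered for a Service. The function names and the $500/month cutoff for "too small to bother with" are assumptions; pick your own threshold.

```python
def coverage_pct(covered_hourly, on_demand_hourly):
    """Current coverage as a percentage of total hourly spend."""
    total = covered_hourly + on_demand_hourly
    return 100 * covered_hourly / total if total else 0.0


def coverage_gap(covered_hourly, on_demand_hourly, target_pct,
                 min_monthly_spend=500.0, hours_per_month=730):
    """Extra hourly spend to cover in order to reach the target coverage.

    Returns 0.0 when the Service is below the (assumed) minimum monthly
    spend, or when the target is already met.
    """
    total = covered_hourly + on_demand_hourly
    if total * hours_per_month < min_monthly_spend:
        return 0.0
    target = total * target_pct / 100
    return max(0.0, round(target - covered_hourly, 4))


# Hypothetical Service: $4/hour covered, $6/hour on-demand, 70% target.
print(coverage_pct(4.0, 6.0))
print(coverage_gap(4.0, 6.0, target_pct=70))
# A tiny Service is skipped no matter how low its coverage is.
print(coverage_gap(0.02, 0.03, target_pct=70))
```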

Strategy: Reserved Instances and/or Compute Savings Plans

Educate yourself on the differences between Reserved Instances and Compute Savings Plans and know that Reserved Instances are going away.

The flexibility of Compute Savings Plans can be significantly better for Engineering, even though Reserved Instances offer better discounts.

Which is more important: Flexibility or saving more money?

Strategy: Instance Class

Educate yourself on Standard and Convertible RIs.

Most companies leverage Standard, but there are use cases for Convertible (seeding throughout the year and buying when it makes sense to cover temporary workloads). Be aware that Convertibles add complexity to managing Commitment Discounts in your Inventory.

Does the flexibility of Convertibles outweigh the higher discount of Standard?

Strategy: Instance Family

A strategy of recommending a specific instance family can be promising because within EC2, RDS, and ElastiCache (NOT OpenSearch) an RI can automatically apply to different sizes within the same instance family (r7g.large <> r7g.medium).

Recommending instance families like M8g or R7g provides additional benefits in that these are Graviton-based instances, which are generally cheaper and more performant but require running your code on ARM rather than x86 (Intel/AMD). On services like RDS and ElastiCache, where you are not running compiled code but instead leveraging a platform service, it is strongly recommended to discuss (with Engineering) setting an Instance Family strategy.
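Size flexibility works through normalization factors that AWS publishes per instance size (small = 1, medium = 2, large = 4, and so on). The sketch below uses a subset of that table to estimate how many running instances of one size a reservation of another size can cover; the function names are mine. For EC2, my understanding is that this only applies to regional Linux/UNIX RIs with default tenancy, so treat the result as an estimate and check the eligibility rules for your Service.

```python
# Subset of the published normalization factors by instance size.
NORMALIZATION_FACTOR = {
    "nano": 0.25, "micro": 0.5, "small": 1, "medium": 2,
    "large": 4, "xlarge": 8, "2xlarge": 16, "4xlarge": 32,
}


def split_type(instance_type):
    """Split 'r7g.large' or 'db.r7g.large' into (family, size)."""
    family, _, size = instance_type.rpartition(".")
    return family, size


def instances_covered(ri_type, ri_count, running_type):
    """How many running instances the reservation covers (0.0 if families differ)."""
    ri_family, ri_size = split_type(ri_type)
    run_family, run_size = split_type(running_type)
    if ri_family != run_family:
        return 0.0
    units = NORMALIZATION_FACTOR[ri_size] * ri_count
    return units / NORMALIZATION_FACTOR[run_size]


# One r7g.large reservation applied to r7g.medium instances.
print(instances_covered("r7g.large", 1, "r7g.medium"))
# Two r7g.large reservations applied to an r7g.xlarge instance.
print(instances_covered("r7g.large", 2, "r7g.xlarge"))
# A different family gets no benefit.
print(instances_covered("r7g.large", 1, "m8g.large"))
```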

Ultimately: Does it make sense to restrict Engineering to specific instance families?

Strategy: Buy Based on History & Forecast Usage

Do you have hundreds of Engineering teams and/or tens of thousands of resources? Working through all of these can be challenging.

You can make a trade-off between positively confirming workload requirements with Engineering and purchasing Commitment Discounts based on historical usage and forecasts.

Note: This approach is less risky with Savings Plans due to their flexibility.
Note: With this approach you risk buying Commitment Discounts for workloads you have not confirmed with Engineering.

  • Review workload cost and usage history, then reach out to the Engineering teams (largest workloads first) for additional details and to understand their historical usage and future plans.
  • Discuss options with the Engineering teams that would align well with their forecasts.
  • Consider buying Commitment Discounts to cover some percentage of the workloads: more if you have a better forecast from Engineering (workloads will remain the same) and less if the forecast is less accurate OR Engineering expects to optimize and reduce usage.

If workloads are unpredictable or you lack sufficient Unit Economics details, do not buy large numbers of Commitment Discounts. Consider buying a minimal number of Commitment Discounts such that they are being used 100%.
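One rough way to size a history-based purchase is to look at hourly usage and commit only to a level the workload reached in (nearly) every hour. The sketch below assumes you have exported hourly usage as a list of numbers (instance counts or normalized units); the function names and the sample data are illustrative.

```python
def baseline_commitment(hourly_usage, percentile=0):
    """Usage level at the given percentile of the sorted history.

    percentile=0 returns the minimum, i.e. a commitment that would have
    been fully used in every hour of the history.
    """
    if not hourly_usage:
        raise ValueError("hourly_usage must not be empty")
    ordered = sorted(hourly_usage)
    index = min(len(ordered) - 1, int(len(ordered) * percentile / 100))
    return ordered[index]


def utilization(hourly_usage, commitment):
    """Fraction of the commitment that the historical usage would have consumed."""
    if commitment <= 0 or not hourly_usage:
        return 0.0
    used = sum(min(u, commitment) for u in hourly_usage)
    return used / (commitment * len(hourly_usage))


# Hypothetical hourly instance counts.
usage = [10, 12, 8, 15, 9, 11, 14, 8]
safe = baseline_commitment(usage)                       # conservative floor
aggressive = baseline_commitment(usage, percentile=50)  # covers more, wastes some
print(safe, utilization(usage, safe))
print(aggressive, round(utilization(usage, aggressive), 3))
```

Raising the percentile increases coverage at the cost of utilization, which is the same trade-off described above: commit higher only when the forecast from Engineering supports it.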

Return on Investment (ROI)

With each Commitment Discount there is a point in time where what the workload would have cost on-demand overtakes the total cost of the Commitment Discount: this is called the break even point (approximately month 7 for 1 year RIs and month 13 for 3 year RIs, but this changes depending on the Service, cost, and other details of the potential RI).

The graph below shows a simple example of the break even point concept using a $50/month service. The break even point is where the red (1 YR RI) or orange (3 YR RI) line crosses $0.

The cost you would otherwise have paid between the break even point and the expiration of the Commitment Discount is effectively free, so running workloads on a Commitment Discount until it expires is key to maximizing savings.

The break even point is also the point where you have hit your ROI and you could make a change to a new Commitment Discount without negative financial implications, except if your forecast includes the free period after the break even point.
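The break even arithmetic can be sketched in a few lines. The total you commit to is the on-demand cost over the term reduced by the discount, and the break even month is that total divided by the monthly on-demand cost. The 40% and 62% discount rates below are assumptions I picked because they land near the month 7 and month 13 figures above; real discounts vary by Service, term, and payment option.

```python
def break_even(monthly_on_demand, term_months, discount):
    """Break even month and total savings for one Commitment Discount.

    discount is a fraction of the on-demand price (0.40 means 40% off).
    """
    on_demand_total = monthly_on_demand * term_months
    commitment_total = on_demand_total * (1 - discount)
    return {
        # Months of on-demand spend needed to equal the commitment.
        "break_even_month": commitment_total / monthly_on_demand,
        "commitment_total": commitment_total,
        # What you keep if the workload runs to expiration.
        "savings_at_expiration": on_demand_total - commitment_total,
    }


# $50/month service, with assumed discount rates.
one_year = break_even(50, term_months=12, discount=0.40)
three_year = break_even(50, term_months=36, discount=0.62)
print(round(one_year["break_even_month"], 1), round(one_year["savings_at_expiration"], 2))
print(round(three_year["break_even_month"], 1), round(three_year["savings_at_expiration"], 2))
```

If Engineering expects to change the workload before the break even month, the commitment loses money; if it runs longer, everything after that month is savings.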

Expectations

When you purchase a commitment discount you are entering into a 1 year or 3 year contract. This, plus the other restrictions (account, instance type, region, number of instances, etc.), is key information to bring to Engineering.

If you buy a commitment discount by mistake, you can submit a billing case and request that it be canceled.

Gotchas

DynamoDB has some gotchas: Reservations require using Provisioned mode and do not apply to global tables.

Unit Economics

Do you have data on unit economics to support workload history and forecasts? For example, the number of units that causes the workloads to scale up and down.

Are these units trending up or down? Are they seasonal? Do they actually impact workload costs?

Data Analysis

Compare costs that are on-demand vs. reserved (covered by Commitment Discounts). Do this for each Service you are looking to buy Commitment Discounts for.

Visualize this within Cost Explorer using the Purchase Option dimension.

Review this routinely as part of your FinOps Operational Reviews to determine whether on-demand spend is growing (you may need a new commitment discount, or a commitment discount may have expired).
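The routine check can be as simple as tracking the on-demand share of a Service month over month and flagging jumps. This sketch assumes you have already exported monthly on-demand and covered costs (for example from the Cost Explorer view above); the function names, the 5-point threshold, and the numbers are placeholders.

```python
def on_demand_share(on_demand_cost, covered_cost):
    """Fraction of total cost that was billed at on-demand rates."""
    total = on_demand_cost + covered_cost
    return on_demand_cost / total if total else 0.0


def flag_growth(monthly, threshold=0.05):
    """Return labels of months whose on-demand share rose by more than threshold.

    monthly is a list of (label, on_demand_cost, covered_cost) tuples
    in chronological order.
    """
    flagged = []
    previous = None
    for label, on_demand, covered in monthly:
        share = on_demand_share(on_demand, covered)
        if previous is not None and share - previous > threshold:
            flagged.append(label)
        previous = share
    return flagged


# Hypothetical Service where a commitment expired at the end of February.
history = [
    ("2025-01", 300.0, 700.0),
    ("2025-02", 320.0, 700.0),
    ("2025-03", 600.0, 400.0),
]
print(flag_growth(history))
```

A flagged month is a prompt to check the Inventory for an expiration or to ask Engineering about a new workload.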

Alignment with Engineering

Aligning with Engineering will help drive ownership across Engineering!

Before engaging with Engineering, verify with Finance or the budget owner that you have firm or potential budget and approval to purchase your Commitment Discount recommendations.

Prepare the data and details:

  • Expectations and limitations on Reserved Instances and Compute Savings Plans: term, instance family/flexibility, region, costs, ROI, etc.
  • Ensure you explain that these are commitments and that infrastructure changes made before the break even point in the ROI calculation will waste the company’s money

Depending on your company’s culture, you may have to present a business case to the Finance/Budget approver, and it may make sense to have these folks in the same meeting with Engineering.

Engage in an open conversation with Engineering and discuss plans for the workload:

  • Review the cost and usage data together
  • Ask questions to gain an understanding of the workload’s purpose. Stress that you are here to learn and gain alignment with them before diving deeper:
    • Is the workload static or stable? Why or why not?
    • For how long will they keep this workload like this?
    • Do any external factors raise or lower costs, like number of customers?
    • Ask this again: “Realistically, how long will this workload be configured like this?” If the time period is greater than the break even point then it makes sense to buy the Commitment Discount.
  • Gain alignment on purchasing the Commitment Discount(s)
    • Be clear that if they are planning to change workloads before the break even point they need to engage with you to discuss options (will this be used by another linked account in the organization, will they need more because they are scaling up, etc.)
  • Always leave this meeting with a reminder that they should proactively engage with you to purchase Commitment Discounts to cover long term workloads so that you both can be good stewards of the company’s money.
  • Always log the details in the Inventory and confirm budget and approval before making the purchase
  • Communicate when the purchase has been made and show them the cost impact after you have updated data

FinOps Operational Reviews

Define the routine and use a consistent process for all Commitment Discounts:

  • Timeline: Quarterly? Monthly? Biweekly? This will depend on how much workloads change. Monthly is a good spot to start.
  • Review the Inventory for expiring Commitment Discounts
    • If your Engineering teams are well organized, you could be proactive and add Jira tickets to their future roadmap, 30 or 60 days prior to expiration, so that each ticket triggers a conversation with you.
  • Schedule weekly reports to catch anomalies and changes in coverage
    • Identify new on-demand workloads in this reporting
    • Engage with Engineering on new workloads
    • Purchase as needed
  • Review commitment discount utilization to ensure commitments are being used 100%; if they are not, reach out to the team(s) from the original purchase. View unused commitments via AWS Cost Explorer or CUDOS (especially if you have multiple AWS organizations). CUDOS (QuickSight) can be configured to email you a report.
  • Consider building a KPI like Effective Savings Rate to gauge how well you are doing with managing Commitment Discounts
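Two pieces of the review lend themselves to a quick script: finding Inventory rows that expire soon, and computing Effective Savings Rate. The sketch below assumes the Inventory is loaded as a list of dicts with an expiration_date field, and uses the common ESR definition of savings divided by the on-demand equivalent spend. The field names and numbers are illustrative.

```python
from datetime import date, timedelta


def expiring_within(inventory, today, days=60):
    """Inventory rows whose expiration falls within the next `days` days."""
    cutoff = today + timedelta(days=days)
    return [row for row in inventory
            if today <= row["expiration_date"] <= cutoff]


def effective_savings_rate(on_demand_equivalent, actual_spend):
    """Savings as a fraction of what the usage would have cost on-demand."""
    if on_demand_equivalent <= 0:
        return 0.0
    return (on_demand_equivalent - actual_spend) / on_demand_equivalent


# Hypothetical Inventory rows.
inventory = [
    {"service": "EC2", "owner": "team-api", "expiration_date": date(2025, 7, 15)},
    {"service": "RDS", "owner": "team-data", "expiration_date": date(2025, 12, 1)},
]
for row in expiring_within(inventory, today=date(2025, 6, 1), days=60):
    print(row["service"], row["owner"], row["expiration_date"])

# Hypothetical month: $100,000 on-demand equivalent, $72,000 actually paid.
print(round(effective_savings_rate(100_000, 72_000), 2))
```

The expiring rows, together with the recorded owner, tell you who to talk to before each renewal decision.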

Are there Engineering Operations Reviews, Architecture Review Boards, Change Management Boards, or other calls where you can be a fly on the wall to learn about upcoming workloads or changes in workloads?

