I recently reviewed a monthly AWS invoice for a 40-person fintech client that had jumped from $8,400 to $19,000 in ninety days. Nobody had shipped a major new feature. The customer base had not doubled. When I dug into Cost Explorer, the answer was almost embarrassing: three staging environments running 24/7, a database instance sized for a load test that never happened, and forty unattached EBS volumes nobody remembered creating.

This is the conversation I have with SMB and mid-market clients more than almost any other lately: the cloud bill went from "reasonable" to "please explain this" in a matter of months, and nobody is quite sure why. Cloud cost optimization — the discipline the industry now calls FinOps — is not about migrating off the cloud or negotiating with your provider's sales rep once a year. It is an ongoing operating practice, and most companies never build it.

Industry data backs up what I see in client environments. Flexera's 2026 State of the Cloud report puts wasted cloud spend at roughly 29% of total IaaS/PaaS budgets industry-wide, and that number went up in the last year instead of down, largely because AI workloads make cost forecasting harder. Organizations without a FinOps practice waste closer to a third of what they spend; the mature ones bring that down to 15-20%. On a $250,000 annual cloud budget, a third is about $83,000 wasted, while 15-20% is $37,500 to $50,000. That gap is real money: roughly $30,000 to $50,000 a year that could fund another engineer.

Here is how I actually fix it for clients, step by step.

Where the Money Actually Leaks

Before optimizing anything, I inventory where the waste concentrates. In almost every SMB environment I have audited, it clusters in the same five places:

  • Idle non-production environments. Staging, QA, and demo environments running 24/7 when they are only used during business hours.
  • Oversized instances. Compute provisioned for peak load "just in case," sitting at 8-12% CPU utilization most of the time.
  • Orphaned storage. Unattached volumes, forgotten snapshots, and old AMIs nobody has cleaned up since the last migration.
  • Unused reservations. Reserved capacity purchased for a workload that changed shape six months ago.
  • Data transfer surprises. Cross-region or cross-AZ traffic nobody accounted for in the original architecture.

None of these require heroic engineering to fix. They require someone actually looking.
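To make "actually looking" concrete, here is a minimal sketch of an orphaned-volume check. It assumes you already have volume records in the shape the EC2 `DescribeVolumes` API returns (`VolumeId`, `Size`, `State`, `CreateTime`); the function name and the 30-day age cutoff are my own illustrative choices, not a standard.

```python
from datetime import datetime, timezone


def find_unattached_volumes(volumes, now, min_age_days=30):
    """Return (volume_id, size_gib, age_days) for volumes nobody is using.

    `volumes` is a list of dicts shaped like EC2 DescribeVolumes entries.
    `min_age_days` is an assumed grace period so freshly created volumes
    are not flagged while someone is still setting them up.
    """
    orphans = []
    for vol in volumes:
        # EC2 reports "available" for a volume with no attachment
        # and "in-use" for an attached one.
        if vol["State"] != "available":
            continue
        age_days = (now - vol["CreateTime"]).days
        if age_days >= min_age_days:
            orphans.append((vol["VolumeId"], vol["Size"], age_days))
    # Largest volumes first, since they cost the most per month.
    return sorted(orphans, key=lambda o: o[1], reverse=True)


# Hypothetical sample data for illustration only.
sample_volumes = [
    {"VolumeId": "vol-a", "Size": 500, "State": "available",
     "CreateTime": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"VolumeId": "vol-b", "Size": 200, "State": "in-use",
     "CreateTime": datetime(2024, 6, 1, tzinfo=timezone.utc)},
    {"VolumeId": "vol-c", "Size": 100, "State": "available",
     "CreateTime": datetime(2025, 5, 25, tzinfo=timezone.utc)},
]

if __name__ == "__main__":
    today = datetime(2025, 6, 1, tzinfo=timezone.utc)
    for volume_id, size_gib, age in find_unattached_volumes(sample_volumes, today):
        print(f"{volume_id}: {size_gib} GiB, unattached, {age} days old")
```

In a real account the input would come from the provider's API or an inventory export; the point is that the filter itself is a few lines, not a project.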

Rightsizing: The Highest-Leverage Fix Most Teams Skip

Rightsizing is unglamorous, which is exactly why it gets skipped. It means comparing actual CPU, memory, and I/O utilization against what is provisioned, and matching the two.

My rule of thumb: any instance running below 40% average utilization over a two-week window is a rightsizing candidate. In practice, this single exercise typically recovers 20-30% of compute spend with zero architectural change. You are not rewriting anything — you are matching the size of the box to the size of the job.
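The rule of thumb above can be sketched as a simple filter. This assumes you have exported one average CPU utilization figure per day for each instance (the input shape and function name are hypothetical), and it looks only at CPU; a real review would check memory and I/O the same way before downsizing anything.

```python
from statistics import mean


def rightsizing_candidates(daily_cpu_by_instance, threshold_pct=40.0, min_days=14):
    """Flag instances whose average CPU stays under the threshold.

    `daily_cpu_by_instance` maps an instance name to a list of daily
    average CPU percentages. Instances with fewer than `min_days` of
    history are skipped rather than judged on thin data.
    """
    candidates = {}
    for instance, samples in daily_cpu_by_instance.items():
        if len(samples) < min_days:
            continue
        # Only the most recent two-week window counts.
        window_avg = mean(samples[-min_days:])
        if window_avg < threshold_pct:
            candidates[instance] = round(window_avg, 1)
    return candidates


# Hypothetical utilization data for illustration only.
usage = {
    "web-1": [10.0] * 14,   # idle most of the time
    "db-1": [55.0] * 14,    # reasonably busy
    "new-1": [5.0] * 3,     # too new to judge
}

if __name__ == "__main__":
    for name, avg in rightsizing_candidates(usage).items():
        print(f"{name}: {avg}% average CPU, review instance size")
```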

The catch is that rightsizing is not a one-time project. Traffic patterns shift, features get added, and teams provision new resources without revisiting old ones. It needs to be a recurring quarterly review, not a New Year's resolution.

Commit to What You Actually Use — Carefully

Once your baseline usage is accurate (only after rightsizing, not before), commitment-based discounts are the next lever. AWS Savings Plans offer up to 72% off on-demand pricing (the more flexible Compute Savings Plans top out around 66%), and Standard Reserved Instances reach a similar 72% ceiling for workloads with genuinely stable, predictable footprints. Azure Reserved VM Instances and Committed Use Discounts on Google Cloud work on the same logic.

The decision between the two is not about which discount percentage is bigger. It is about how much your infrastructure changes. If your team rightsizes, redeploys, or shifts instance families more than once a quarter, which is true for most growing SMBs, the flexibility of Compute Savings Plans across instance families and sizes usually beats the marginally deeper discount of Reserved Instances, because RIs lock you into specific instance attributes. I generally recommend covering 60-70% of a stable baseline with commitments and leaving the rest on-demand or spot for genuinely variable load.
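As a rough illustration of the 60-70% guideline, the sketch below estimates a baseline from hourly compute spend and sizes a commitment against it. Treating the 10th percentile of hourly spend as the "stable baseline" is my assumption for the example, not a provider rule, and the result is in on-demand-equivalent dollars per hour; converting it into an actual Savings Plan commitment means applying the plan's discounted rate.

```python
def recommended_hourly_commitment(hourly_spend, coverage=0.65, baseline_percentile=0.10):
    """Return (baseline, commitment) in on-demand dollars per hour.

    `hourly_spend` is a list of hourly compute costs over a lookback
    window. The baseline is a low percentile of that list (an assumed
    definition of "stable"), and the commitment covers only a fraction
    of it so the rest stays flexible.
    """
    if not hourly_spend:
        raise ValueError("need at least one hour of spend data")
    ordered = sorted(hourly_spend)
    index = int(baseline_percentile * (len(ordered) - 1))
    baseline = ordered[index]
    return baseline, round(baseline * coverage, 2)


if __name__ == "__main__":
    # Hypothetical day: $10/hour most of the time, $25/hour during a 4-hour peak.
    day = [10.0] * 20 + [25.0] * 4
    baseline, commitment = recommended_hourly_commitment(day)
    print(f"baseline ${baseline}/h, commit about ${commitment}/h")
```

Sizing against the low end of usage rather than the average is deliberate: a commitment you cannot fully use is just a different kind of waste.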

This is also where I see the aftermath of migrations that skipped a cost strategy from day one. If you are still working through the decisions I cover in my guide to leading a cloud migration, build the commitment strategy into year one instead of retrofitting it after the first surprising invoice.

Storage Is Not "Set and Forget"

Storage costs creep quietly because nobody treats them as urgent, until the bill shows a line item bigger than compute. Concrete tactics that consistently move the needle:

  • Lifecycle policies. Move infrequently accessed data from S3 Standard to Infrequent Access or Glacier automatically after 30-90 days. Savings of 40-60% on archived data are typical.
  • Snapshot hygiene. Automate deletion of EBS snapshots and RDS backups past your actual retention requirement, not "just in case."
  • Delete orphaned volumes. Unattached EBS volumes and old AMIs are pure waste; a scheduled monthly sweep catches them before they accumulate.
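For the lifecycle tactic, here is a minimal sketch that builds an S3 lifecycle configuration as a plain dictionary. The rule ID, the `backups/` prefix, and the helper name are hypothetical; the dictionary layout follows the structure accepted by the S3 `PutBucketLifecycleConfiguration` API, and the 30-day floor reflects S3's minimum age for transitions to Standard-IA as I understand it.

```python
def build_lifecycle_config(prefix="", ia_after_days=30, glacier_after_days=90):
    """Build a lifecycle configuration that tiers cold data down over time.

    Objects under `prefix` move to Standard-IA after `ia_after_days`
    and to Glacier after `glacier_after_days`.
    """
    if ia_after_days < 30:
        # S3 does not allow Standard-IA transitions before 30 days.
        raise ValueError("ia_after_days must be at least 30")
    if glacier_after_days <= ia_after_days:
        raise ValueError("glacier_after_days must be later than ia_after_days")
    return {
        "Rules": [
            {
                "ID": "tier-down-cold-data",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    import json
    print(json.dumps(build_lifecycle_config(prefix="backups/"), indent=2))
```

With boto3, a dictionary like this would be passed as the `LifecycleConfiguration` argument of the S3 client's `put_bucket_lifecycle_configuration` call. Review it against your retention requirements first, because retrieval from archive tiers is slower and carries its own charges.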

I worked with a client running a legacy .NET application whose database backups alone accounted for 18% of their AWS bill. The same modernization project I describe in migrating a .NET application to AWS Elastic Beanstalk included a storage lifecycle cleanup that cut that line item by more than half in the first month.

Kill Idle Compute With Scheduling and Autoscaling

Non-production environments are the easiest win on this entire list, and the most commonly ignored. If your staging environment only needs to run Monday through Friday, 8 a.m. to 8 p.m., scheduling it to shut down outside those hours cuts its cost by roughly 65%, with essentially no engineering effort — a Lambda function or a native scheduler action does the job.
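The arithmetic behind that figure, plus the decision a scheduler has to make, fits in a few lines. This is a sketch under two assumptions: cost scales with running hours (attached storage keeps billing while instances are stopped), and the timestamp passed in is already in the team's local time zone. The function names are mine.

```python
from datetime import datetime

HOURS_PER_WEEK = 7 * 24  # 168


def scheduled_savings_pct(days_per_week=5, hours_per_day=12):
    """Percentage of compute hours avoided by running only on a schedule."""
    running_hours = days_per_week * hours_per_day
    return round(100 * (1 - running_hours / HOURS_PER_WEEK), 1)


def should_be_running(now, start_hour=8, stop_hour=20):
    """True if `now` falls Monday-Friday between start_hour and stop_hour."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < stop_hour


if __name__ == "__main__":
    # 60 running hours out of 168 leaves about 64% of the week switched off.
    print(scheduled_savings_pct())
    print(should_be_running(datetime(2025, 6, 2, 9, 0)))   # a Monday morning
```

A scheduled function would call something like `should_be_running` every few minutes and start or stop the environment to match.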

For production workloads with variable demand, autoscaling groups tied to real utilization metrics, not fixed instance counts, let you pay for capacity that matches actual traffic instead of your busiest hour of the month. Combined with rightsizing, this is usually the next-largest source of recoverable spend after storage and idle non-production cleanup.
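To see why paying for the busiest hour is expensive, the sketch below compares instance-hours under fixed peak provisioning with instance-hours when capacity follows demand. It is a deliberately simplified model with hypothetical numbers: it ignores scale-out lag, cooldown periods, and headroom, all of which a real autoscaling policy needs.

```python
import math


def instance_hours(hourly_demand, capacity_per_instance, min_instances=1):
    """Return (fixed, scaled) instance-hours for a demand profile.

    `hourly_demand` is requests per hour (or any load unit) and
    `capacity_per_instance` is how much of that one instance handles.
    "fixed" provisions for the peak every hour; "scaled" provisions
    each hour for that hour's demand.
    """
    needed = [
        max(min_instances, math.ceil(demand / capacity_per_instance))
        for demand in hourly_demand
    ]
    fixed = max(needed) * len(needed)
    scaled = sum(needed)
    return fixed, scaled


if __name__ == "__main__":
    # Hypothetical day: quiet for 20 hours, nine times busier for 4 hours.
    demand = [100] * 20 + [900] * 4
    fixed, scaled = instance_hours(demand, capacity_per_instance=100)
    print(f"fixed: {fixed} instance-hours, autoscaled: {scaled} instance-hours")
```

The spikier the traffic, the wider the gap between the two numbers; for a flat workload they converge, and commitments matter more than autoscaling.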

Making FinOps a Practice, Not a Project

The optimization work above will save money once. What keeps it saved is treating cost as an ongoing engineering responsibility, not an annual finance exercise. Three things I insist on with every client:

  • Tag everything. Cost allocation tags by team, environment, and project turn "the AWS bill is high" into "the QA environment for Project X costs $4,000 a month" — a problem someone can actually own.
  • Run a monthly cost review. Thirty minutes, one dashboard, the same three questions: what changed, what is idle, what is oversized.
  • Set budget alerts before the invoice, not after. Threshold-based alerts at 50%, 80%, and 100% of forecasted spend catch runaway costs while they are still small.
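The tagging and alerting habits above can be sketched in a few lines. The line-item shape (`cost` plus a `tags` dictionary) is a hypothetical simplification of a billing export, and the function names are mine; the 50%, 80%, and 100% thresholds are the ones from the list.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key):
    """Sum cost per value of one tag; untagged spend gets its own bucket."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)


def crossed_thresholds(spend_to_date, forecast, thresholds=(0.5, 0.8, 1.0)):
    """Return the alert thresholds that month-to-date spend has reached."""
    return [t for t in thresholds if spend_to_date >= t * forecast]


if __name__ == "__main__":
    # Hypothetical billing rows for illustration only.
    items = [
        {"cost": 4000.0, "tags": {"environment": "qa", "project": "x"}},
        {"cost": 1200.0, "tags": {"environment": "prod"}},
        {"cost": 300.0, "tags": {}},
    ]
    print(cost_by_tag(items, "environment"))
    print(crossed_thresholds(spend_to_date=8500, forecast=10000))
```

A large "untagged" bucket is itself a finding: it is spend nobody can be asked about. In practice the provider's budgeting service handles the alerting; the sketch only shows the logic you are configuring.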

Where to Start This Week

If you only have time for one thing, run a rightsizing report this week. Every major cloud provider has a built-in tool — AWS Compute Optimizer, Azure Advisor, GCP Recommender — that will surface the worst offenders for free. It will not fix everything, but it usually pays for the rest of the FinOps work within the first month.

Cloud cost optimization is not a one-time cleanup. It is a discipline, and the SMBs that treat it that way consistently run leaner than competitors still paying for infrastructure they provisioned two product pivots ago.
