Why Skipping Automated Tests Is the Most Expensive Shortcut in Software Development

The $80,000 Typo

Last year, a client of mine — a fintech company processing recurring subscription billing — pushed a "small" update to their pricing logic on a Friday afternoon. A developer changed how the system rounded fractional cents during currency conversion. The change passed code review. It passed manual QA — someone clicked through the checkout flow once, saw a normal price, and approved it. It shipped.

By Monday morning, the company had overcharged roughly 1,400 customers by amounts ranging from a few cents to several dollars each, and undercharged a few hundred more by larger amounts. The cause was a sign error in an edge case nobody had manually tested: a mid-cycle plan downgrade combined with a prorated refund. Support tickets flooded in. The finance team spent two weeks reconciling the ledger. Two customers with visible bank statement discrepancies escalated to chargeback disputes. The total cost (refunds, support hours, reconciliation, and one very uncomfortable call with a payment processor) landed close to $80,000.

One integration test, covering the downgrade-plus-refund path with a handful of assertions, would have caught the sign error in about four seconds, every time it ran, for the rest of the system's life.
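
As a rough sketch of what such a test could look like, assume a hypothetical downgrade_adjustment function that returns the net amount owed when a customer changes plans mid-cycle: positive means charge, negative means refund. The function name, the plan prices, and the 30-day cycle are illustrative assumptions, not the client's actual code. A real integration test would run against the billing service, its database, and the payment processor; this in-memory version only shows the shape of the assertions.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def downgrade_adjustment(old_price, new_price, days_used, days_in_cycle):
    """Net amount owed after a mid-cycle plan change (hypothetical helper).

    Positive result: charge the customer. Negative result: refund them.
    """
    days_remaining = days_in_cycle - days_used
    # Prorate the price difference over the unused part of the cycle.
    net = (new_price - old_price) * Decimal(days_remaining) / Decimal(days_in_cycle)
    return net.quantize(CENT, rounding=ROUND_HALF_UP)


def test_downgrade_with_prorated_refund_is_a_refund():
    # $50 plan downgraded to $20 plan after 10 of 30 days.
    net = downgrade_adjustment(Decimal("50.00"), Decimal("20.00"), 10, 30)
    assert net < 0                      # the sign: money flows back to the customer
    assert net == Decimal("-20.00")     # the amount: $30 difference * 20/30 of the cycle


def test_upgrade_is_a_charge():
    net = downgrade_adjustment(Decimal("20.00"), Decimal("50.00"), 10, 30)
    assert net == Decimal("20.00")


def test_change_on_last_day_moves_no_money():
    net = downgrade_adjustment(Decimal("50.00"), Decimal("20.00"), 30, 30)
    assert net == 0
```

The first assertion is the one that matters for this story. A flipped sign turns a refund into a charge, and a check on direction fails immediately, regardless of how the amount is rounded.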

I open with this story because it is not unusual. I have seen versions of it — different industries, different bugs, same root cause — more times than I can count. Afterward, the business owner always asks the same question: "Wasn't this what QA was for?" The honest answer is that manual QA catches what a human thinks to click. Automated tests catch what nobody thinks to click, and they do it on every single deploy, forever, for a fraction of the cost of the incident they prevent.

The Math Most Founders Never Run

There is a widely cited claim — often traced to IBM's Systems Sciences Institute — that a bug costs roughly 100 times more to fix in production than during the requirements phase. That exact figure has been challenged over the years; the original study is hard to pin down, and the number has been repeated so often it has become folklore more than data. I do not treat it as gospel, and neither should you.

But strip away the specific multiplier, and the underlying pattern holds up in every engagement I have worked on: a defect caught by a unit test costs a few minutes of a developer's time. The same defect caught during manual testing costs an hour or two of QA time, plus a fix-and-retest cycle. Caught after release, it costs incident response, customer support, a rushed hotfix built under pressure, and — depending on the bug — reputational damage or compliance exposure that no amount of engineering time can undo after the fact. The direction of that curve is not in dispute, even when the exact multiplier is.

The cost of a testing shortcut does not disappear when you skip it. It moves downstream, attaches interest, and lands on a part of the business that has no ability to negotiate with it — your customers, your support team, or your finance department during month-end close.

Why Smart Business Owners Skip Testing Anyway

I rarely meet a founder who thinks testing is a bad idea in the abstract. What I meet is founders who experience it as a tax on shipping speed, paid upfront, for a benefit that stays invisible until the day it isn't.

A few patterns show up consistently:

Testing produces no visible feature. A new button demos well in a sprint review. A test suite demos as nothing — until it silently prevents an outage six months later.
The team is small, and every hour is precious. On a three-developer team, two hours spent writing tests instead of features feels like the more expensive option, even when it is the cheaper one over a 12-month horizon.
Nobody has been burned yet. In my experience, testing discipline gets adopted reactively far more often than proactively. Most companies I bring automated testing into have already had their $80,000 incident. Few adopt it purely on projected risk.
"We'll add tests later" quietly becomes permanent. This is the same dynamic I describe in my article on why cheap software development ends up costing more: deferred quality work rarely gets picked back up, because the deadline pressure that caused the deferral is simply replaced by the next deadline.

None of these reasons are irrational. They are just short-sighted in a specific, predictable, and expensive way.

The Testing Pyramid, Without the Jargon

Engineers talk about the "testing pyramid" as if it explains itself. For a business owner, here is the translation:

Unit tests (the base — most of your tests should live here). These check one small piece of logic in isolation — a pricing calculation, a discount rule, a date conversion. They run in milliseconds, cost the least to write and maintain, and catch the largest number of bugs per dollar spent.
Integration tests (the middle layer). These check that pieces work correctly together — your API talking to your database, your billing service talking to your payment processor. The subscription bug I described above lived here.
End-to-end tests (the top, and the smallest layer). These simulate a real user clicking through your actual application. They are the most realistic, the slowest to run, and the most expensive to maintain, which is why they should cover only your handful of most critical user journeys, not everything.
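
To make the base of the pyramid concrete, here is a minimal sketch of unit tests for one small piece of pricing logic: rounding a currency conversion to whole cents. The convert_price function and the half-up rounding policy are assumptions chosen for illustration; which rounding rule applies is a business decision, and the point of the test is to pin that decision down.

```python
from decimal import Decimal, ROUND_HALF_UP


def convert_price(amount, rate):
    """Convert an amount at the given rate, rounded half-up to whole cents.

    Hypothetical helper; Decimal is used so fractional cents are exact.
    """
    return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def test_exact_half_cent_rounds_up():
    # 10.05 * 0.50 = 5.025, which half-up rounding turns into 5.03
    assert convert_price(Decimal("10.05"), Decimal("0.50")) == Decimal("5.03")


def test_below_half_cent_rounds_down():
    # 10.01 * 0.333 = 3.33333, which rounds to 3.33
    assert convert_price(Decimal("10.01"), Decimal("0.333")) == Decimal("3.33")


def test_whole_cents_pass_through_unchanged():
    assert convert_price(Decimal("19.99"), Decimal("1")) == Decimal("19.99")
```

Tests like these need no database, network, or browser, which is why they run in milliseconds and can run on every change.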

The mistake I see most often is businesses inverting this pyramid entirely: skipping unit and integration tests and relying on a thin layer of manual clicking, or over-investing in brittle end-to-end tests that break every time the interface changes and get disabled within a year out of sheer frustration.

What "Enough" Testing Actually Looks Like for an SMB

I do not recommend 100% code coverage to any client, ever. It is expensive, it slows delivery without a proportional reduction in risk, and it often means testing trivial code just to make a coverage percentage look good on a dashboard. Instead, I recommend prioritizing by consequence:

Money movement first. Anything touching billing, payroll, invoicing, or payment processing gets automated test coverage before anything else. In fintech and B2B SaaS work, this is non-negotiable.
Anything with compliance or legal exposure. Data handling, permissions, and access-control logic — the kind of thing that turns into a breach notification if it fails silently.
Your core user journey. The three to five flows that, if broken, stop customers from doing the thing they pay you for.
Code that changes often. Stable, rarely-touched code is a lower priority than the module your team modifies every sprint, because every change is a fresh opportunity to introduce a regression.
Anything that has broken before. A bug that shipped once will ship again in a different form unless a test exists to prevent it. I write a regression test for every production incident I am called in to fix, no exceptions.
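
The ordering above can be sketched as a simple ranking of modules. This is an illustration only: the Module fields, the module names, and the numbers are hypothetical, and treating the list as a strict top-to-bottom ranking is a simplifying assumption rather than a formal method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    moves_money: bool = False
    compliance_exposure: bool = False
    core_journey: bool = False
    changes_per_quarter: int = 0
    past_incidents: int = 0


def coverage_priority(module):
    """Sort key mirroring the list order: money, compliance, core journey,
    change frequency, then incident history as the final tie-breaker."""
    return (
        module.moves_money,
        module.compliance_exposure,
        module.core_journey,
        module.changes_per_quarter,
        module.past_incidents,
    )


def rank_for_testing(modules):
    # Highest-consequence modules come first.
    return sorted(modules, key=coverage_priority, reverse=True)


# Hypothetical inventory for a small application.
modules = [
    Module("footer_links", changes_per_quarter=1),
    Module("report_export", changes_per_quarter=9),
    Module("onboarding", core_journey=True, changes_per_quarter=4),
    Module("permissions", compliance_exposure=True, changes_per_quarter=2),
    Module("legacy_import", changes_per_quarter=1, past_incidents=2),
    Module("billing", moves_money=True, changes_per_quarter=6, past_incidents=1),
]

ranked_names = [m.name for m in rank_for_testing(modules)]
print(ranked_names)
```

In practice the inputs come from conversations with the business, the version-control history, and the incident log. The value of writing the ranking down is that it gives the testing budget an explicit order to be spent in.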

For a typical SMB application, this usually means a solid base of unit tests covering business logic, integration tests around your three or four riskiest system boundaries, and a small number of end-to-end tests covering signup, checkout, and whatever transaction is core to your revenue. That is a realistic, fundable scope, not an open-ended engineering initiative.

Get Ahead of the Bug That Hasn't Happened Yet

The uncomfortable truth is that most companies do not decide to invest in automated testing. They get forced into it by an incident expensive enough to change the conversation. I would rather help a client make that decision on their own terms, on a Tuesday afternoon in a planning meeting, than have it made for them by a production outage on a Friday night.

This connects to a broader pattern I see across failed and struggling software projects — the same underestimation of "invisible" engineering discipline I cover in my article on common software project mistakes founders make. Testing is rarely the reason a project looks impressive in a demo. It is consistently the reason a project still works, unmodified in its core logic, three years later.

If you are not sure whether your current testing coverage matches your actual business risk, that is a conversation worth having before your next deploy, not after your next incident.

Let's talk through your situation.