AI ROI is the measurable business outcome attributable to an AI deployment, net of its total cost and divided by that total cost, over a defined period (typically 12 to 36 months). That sentence does most of the work, but every word in it carries weight. Measurable rules out vibes-based reporting. Attributable rules out coincidence. Total cost rules out the favorite trick of comparing API spend to revenue gain. Defined period rules out the indefinite optimism that lets failed projects keep their pulse for years.
This is the AI ROI framework we use to shape every Remolda engagement. It is built around four archetypes (each with a different formula), the hidden costs that quietly inflate the denominator, and the realistic deltas that survive contact with a CFO.
Why most AI ROI claims fall apart in audit
Before formulas: the failure modes. The claim "our AI deployment saved $2M last year" almost never survives an audit, and not because of fraud. It falls apart in three predictable ways.
No baseline. The team did not measure the relevant business KPI before deployment. There is no defensible "what would have happened anyway" counterfactual, so any post-deployment improvement is a story, not a number.
Cherry-picked window. The reporting period was the three months when the deployment performed well. The next three months — when the model drifted, the input distribution shifted, or the team stopped maintaining it — are not in the chart.
Cost denominator missing components. API spend is in. The 0.3 FTE of engineering maintenance is not. The 0.2 FTE of an analyst monitoring outputs is not. The retraining at month nine is not. The integration into legacy systems was capitalized as software, not allocated to AI cost.
Fix all three and the AI ROI number stops looking heroic. That is the intended result: heroic numbers are why most boards stopped trusting AI ROI claims by 2024.
The four archetypes of AI ROI
Every AI deployment fits into one (or a hybrid) of four ROI archetypes. The formula is different for each. Confusing them is one of the reasons published case studies don't match the buyer's experience.
Archetype 1: Cost reduction
The AI replaces or compresses work that humans were doing. The benefit is measured in labor cost saved or capacity reallocated.
Formula:
Annual benefit = (hours reclaimed per FTE × number of FTEs × fully-loaded hourly cost) − exception-handling cost
Annual cost = build cost (amortized over expected lifetime) + inference + maintenance + monitoring + governance
ROI % = (Annual benefit − Annual cost) / Annual cost × 100
Realistic delta: 15 to 40 percent of the targeted workflow's labor cost in year one for well-scoped projects. Lower in regulated workflows where exception-handling is costly. Higher in unstructured-text workflows that legacy automation could not touch.
Failure mode: Reclaimed hours don't translate to FTE reduction unless leadership actually reorganizes. "Time saved" that goes back into low-value work is a productivity claim, not a cost saving.
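The cost-reduction formula can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a finished calculator: the function name, the `realization_rate` parameter (the share of reclaimed hours that leadership actually converts into savings), and all example figures are hypothetical and were added here to make the failure mode concrete.

```python
def cost_reduction_roi(
    hours_reclaimed_per_fte: float,
    num_ftes: int,
    loaded_hourly_cost: float,
    exception_handling_cost: float,
    annual_cost: float,
    realization_rate: float = 1.0,
) -> dict:
    """Annual benefit and ROI % for the cost-reduction archetype.

    realization_rate is a hypothetical knob (0.0 to 1.0): the share of
    reclaimed hours that actually turns into cost savings.
    """
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    gross_savings = hours_reclaimed_per_fte * num_ftes * loaded_hourly_cost
    annual_benefit = gross_savings * realization_rate - exception_handling_cost
    roi_pct = (annual_benefit - annual_cost) / annual_cost * 100
    return {"annual_benefit": annual_benefit, "roi_pct": roi_pct}


# Illustrative inputs only: 50 FTEs each reclaiming 400 hours at $60/hour.
inputs = dict(
    hours_reclaimed_per_fte=400,
    num_ftes=50,
    loaded_hourly_cost=60.0,
    exception_handling_cost=100_000,
    annual_cost=500_000,
)
full = cost_reduction_roi(**inputs)
half = cost_reduction_roi(**inputs, realization_rate=0.5)
print(f"all reclaimed hours realized:  {full['roi_pct']:.0f}%")
print(f"half reclaimed hours realized: {half['roi_pct']:.0f}%")
```

With these made-up inputs, ROI falls from 120 percent to break-even when only half of the reclaimed hours become real savings, which is the gap between a productivity claim and a cost saving.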
Archetype 2: Revenue growth
The AI lets the organization sell more — through faster response, better personalization, broader coverage, or capabilities the competition lacks.
Formula:
Annual benefit = incremental revenue × gross margin %
Incremental revenue =
(lift in conversion rate × baseline volume × baseline average order value) +
(lift in average order value × post-deployment orders) +
(new revenue from capabilities that didn't exist before)
Annual cost = build cost (amortized) + inference + maintenance + monitoring
ROI % = (Annual benefit − Annual cost) / Annual cost × 100
Realistic delta: 3 to 12 percent conversion lift in the targeted segment. AOV lift is highly product-dependent. New-revenue claims are the most fragile — they require attribution discipline most companies don't have.
Failure mode: Revenue growth is multivariate. Without a controlled experiment (A/B test or geo holdout), attribution to the AI deployment is rarely defensible. Treat revenue ROI claims that lack experimental design as estimates, not measurements.
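A minimal Python sketch of the revenue-growth formula follows. It assumes the conversion lift is expressed in absolute points and applied to baseline traffic, and that the average-order-value lift is a dollar amount applied to every post-deployment order, so that each term comes out in dollars. The function name, parameter names, and example figures are hypothetical.

```python
def revenue_growth_roi(
    baseline_visitors: float,
    baseline_conversion: float,   # e.g. 0.02 for 2 percent
    conversion_lift: float,       # absolute lift, e.g. 0.002
    baseline_aov: float,          # average order value in dollars
    aov_lift: float,              # dollar lift per order
    new_capability_revenue: float,
    gross_margin: float,          # e.g. 0.5 for 50 percent
    annual_cost: float,
) -> dict:
    """Annual benefit and ROI % for the revenue-growth archetype."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    # Orders after deployment, including those won by the conversion lift.
    post_orders = baseline_visitors * (baseline_conversion + conversion_lift)
    revenue_from_conversion = conversion_lift * baseline_visitors * baseline_aov
    revenue_from_aov = aov_lift * post_orders
    incremental_revenue = (
        revenue_from_conversion + revenue_from_aov + new_capability_revenue
    )
    annual_benefit = incremental_revenue * gross_margin
    roi_pct = (annual_benefit - annual_cost) / annual_cost * 100
    return {
        "incremental_revenue": incremental_revenue,
        "annual_benefit": annual_benefit,
        "roi_pct": roi_pct,
    }


# Illustrative inputs only: a 10 percent relative conversion lift (2.0% -> 2.2%).
result = revenue_growth_roi(
    baseline_visitors=1_000_000,
    baseline_conversion=0.02,
    conversion_lift=0.002,
    baseline_aov=100.0,
    aov_lift=0.0,
    new_capability_revenue=0.0,
    gross_margin=0.5,
    annual_cost=60_000,
)
print(f"incremental revenue: ${result['incremental_revenue']:,.0f}")
print(f"ROI: {result['roi_pct']:.1f}%")
```

The lift inputs are only as good as the experiment behind them. Feeding this function lifts read off a before-and-after chart produces an estimate, not a measurement.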
Archetype 3: Risk reduction
The AI reduces the frequency or severity of bad outcomes — fraud losses, compliance violations, security incidents, churn, errors. The benefit is the avoided cost.
Formula:
Annual benefit = (baseline incident rate × severity per incident × volume) − (post-AI incident rate × severity × volume)
Annual cost = build cost (amortized) + inference + maintenance + monitoring + audit
ROI % = (Annual benefit − Annual cost) / Annual cost × 100
Realistic delta: 20 to 60 percent reduction in detectable incident classes. The classes the AI cannot see (novel fraud patterns, novel attack vectors) keep their rate.
Failure mode: Risk-reduction ROI is asymmetric. The benefit is bounded by historical incident rate, but a single false negative on a high-severity event (missed regulatory violation, missed clinical anomaly) can wipe out years of savings. Risk-reduction ROI should always be reported with the worst-case false-negative scenario.
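The risk-reduction formula and its worst-case companion can be sketched as below. This is an illustration under assumptions: the function name, the `worst_case_false_negative_loss` parameter, and the example figures are hypothetical, and the stress case simply subtracts one severe missed event from the annual benefit.

```python
def risk_reduction_roi(
    baseline_incident_rate: float,
    post_ai_incident_rate: float,
    severity_per_incident: float,
    volume: float,
    annual_cost: float,
    worst_case_false_negative_loss: float = 0.0,
) -> dict:
    """ROI % for the risk-reduction archetype, plus a stressed ROI %."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    baseline_loss = baseline_incident_rate * severity_per_incident * volume
    post_ai_loss = post_ai_incident_rate * severity_per_incident * volume
    annual_benefit = baseline_loss - post_ai_loss
    roi_pct = (annual_benefit - annual_cost) / annual_cost * 100
    # Stress case: one high-severity event the model misses in the year.
    stressed_benefit = annual_benefit - worst_case_false_negative_loss
    stressed_roi_pct = (stressed_benefit - annual_cost) / annual_cost * 100
    return {
        "annual_benefit": annual_benefit,
        "roi_pct": roi_pct,
        "stressed_roi_pct": stressed_roi_pct,
    }


# Illustrative inputs only: incident rate falls from 1.0% to 0.6% (a 40% cut).
risk = risk_reduction_roi(
    baseline_incident_rate=0.01,
    post_ai_incident_rate=0.006,
    severity_per_incident=2_000,
    volume=100_000,
    annual_cost=400_000,
    worst_case_false_negative_loss=1_000_000,
)
print(f"expected ROI: {risk['roi_pct']:.0f}%")
print(f"ROI with one worst-case miss: {risk['stressed_roi_pct']:.0f}%")
```

Reporting both numbers side by side makes the asymmetry visible: in this made-up case a 100 percent expected ROI becomes minus 150 percent after a single $1M miss.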
Archetype 4: Capability unlock
The AI enables an activity that was not feasible before — analyzing every customer call instead of a 2 percent sample, screening every contract instead of the high-value ones, monitoring every patient signal in real time. The benefit is the value of decisions made on full information rather than a sample.
Formula:
Annual benefit = (decision quality lift × decision frequency × decision value)
Decision quality lift =
fraction of decisions that change outcome × marginal outcome improvement (as a share of decision value)
Annual cost = build cost + inference + maintenance + monitoring + governance
ROI % = (Annual benefit − Annual cost) / Annual cost × 100
Realistic delta: This is the archetype where AI delivers genuinely large multiples — 5x to 50x — if the underlying decision is valuable enough. Capability unlock ROI is hardest to estimate before deployment because the team has never operated with full information.
Failure mode: Capability-unlock projections often anchor on the maximum theoretical benefit. The realistic benefit is gated by how much of the new information actually changes a decision. If the team would have done the same thing with 100 percent of the data as with 2 percent, the benefit is zero and the ROI is negative.
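A minimal sketch of the capability-unlock calculation follows. To keep the units in dollars, it collapses the quality-lift terms into two inputs: the fraction of decisions that actually change, and the dollar value of each changed decision. That simplification, the function name, and the example figures are assumptions made for illustration.

```python
def capability_unlock_roi(
    decisions_per_year: float,
    fraction_changed: float,            # share of decisions the new data changes
    value_per_changed_decision: float,  # dollars gained per changed decision
    annual_cost: float,
) -> dict:
    """Annual benefit and ROI % for the capability-unlock archetype."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    if not 0.0 <= fraction_changed <= 1.0:
        raise ValueError("fraction_changed must be between 0 and 1")
    annual_benefit = (
        decisions_per_year * fraction_changed * value_per_changed_decision
    )
    roi_pct = (annual_benefit - annual_cost) / annual_cost * 100
    return {"annual_benefit": annual_benefit, "roi_pct": roi_pct}


# Illustrative inputs only.
base = dict(
    decisions_per_year=10_000,
    value_per_changed_decision=4_000,
    annual_cost=250_000,
)
changes_some = capability_unlock_roi(fraction_changed=0.05, **base)
changes_none = capability_unlock_roi(fraction_changed=0.0, **base)
print(f"5% of decisions change: {changes_some['roi_pct']:.0f}%")
print(f"no decisions change:    {changes_none['roi_pct']:.0f}%")
```

When no decision changes, the benefit is zero and the formula returns minus 100 percent, because the cost is still incurred. The `fraction_changed` input is the one to challenge hardest in review.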
The hidden costs that wreck the denominator
Every formula above has a denominator: total cost. The line items most teams miss:
- Integration cost. The AI sits inside a system. Integration with the CRM, ERP, ticketing, document repo is rarely simple. Allocate 30 to 60 percent of the build cost as integration.
- Observability and monitoring. Logging input and output of every model call. Alerting on latency, error rate, and content drift. Costs 10 to 20 percent of the inference budget in year one.
- Retuning and retraining. Foundation models update; input distribution drifts; business definitions change. Budget 10 to 20 percent of initial implementation cost annually for retuning.
- Governance overhead. Privacy review, security review, model governance reviews, exception handling, audit prep. Roughly 10 percent of build cost annually for regulated industries.
- Change management. Training, communications, incentive alignment, role redesign. Often one of the largest line items, at 20 to 40 percent of build cost, and the one most likely to be omitted from the AI budget because it sits in HR or operations.
- Vendor lock-in tax. If the architecture binds to one foundation-model provider, the second-year inference price is whatever that vendor decides it is. Budget for portability up front.
The first time a team builds an AI ROI calculator, the denominator usually doubles or triples after these are added. ROI numbers that started out looking like 400 percent settle into the 30 to 80 percent range, which is where most defensible AI ROI lives.
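The effect of the hidden line items on the denominator can be sketched as follows. The default rates are midpoints of the ranges listed above, and the sketch assumes that integration and change management are one-time costs amortized with the build, that "initial implementation cost" equals build cost, and that the build and inference figures are merely illustrative. Vendor lock-in is left out because no rate is given for it. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HiddenCostRates:
    """Midpoints of the ranges in the list above (assumed defaults)."""
    integration: float = 0.45        # share of build cost, one-time
    change_management: float = 0.30  # share of build cost, one-time
    observability: float = 0.15      # share of annual inference spend
    retuning: float = 0.15           # share of build cost, every year
    governance: float = 0.10         # share of build cost, every year


def annual_cost_breakdown(
    build_cost: float,
    annual_inference: float,
    amortization_years: int = 3,
    rates: HiddenCostRates = HiddenCostRates(),
) -> dict:
    """Return the naive annual cost, the full annual cost, and line items."""
    items = {
        "build (amortized)": build_cost / amortization_years,
        "inference": annual_inference,
        "integration (amortized)":
            build_cost * rates.integration / amortization_years,
        "change management (amortized)":
            build_cost * rates.change_management / amortization_years,
        "observability": annual_inference * rates.observability,
        "retuning": build_cost * rates.retuning,
        "governance": build_cost * rates.governance,
    }
    naive = items["build (amortized)"] + items["inference"]
    full = sum(items.values())
    return {"items": items, "naive": naive, "full": full}


# Illustrative inputs only.
costs = annual_cost_breakdown(build_cost=1_500_000, annual_inference=180_000)
for name, amount in costs["items"].items():
    print(f"{name:32s} ${amount:>10,.0f}")
print(f"naive annual cost: ${costs['naive']:,.0f}")
print(f"full annual cost:  ${costs['full']:,.0f}")
print(f"multiplier: {costs['full'] / costs['naive']:.2f}x")
```

With midpoint rates the denominator roughly doubles, which matches the pattern described above. Swapping in the high end of each range pushes it further.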
A worked example: claims processing automation
A property and casualty insurer processes 50,000 claims per month. Average handle time is 45 minutes, fully-loaded analyst cost is $80 per hour. They deploy an AI workflow that auto-resolves 60 percent of straightforward claims and routes 40 percent to humans.
Annual benefit (cost reduction):
- Hours reclaimed: 50,000 × 12 × 0.6 × 0.75 hours = 270,000 hours
- Labor saved at $80 per hour = $21.6M
- Exception-handling cost (the 40 percent of claims routed to humans each take an extra 0.25 hours: $80 × 0.25 hours × 240,000 claims) = $4.8M
- Net annual benefit = $16.8M
Annual cost:
- Build cost amortized over 3 years: $1.5M / 3 = $500K
- Integration cost amortized: $750K / 3 = $250K
- Inference: $0.30 per claim × 600,000 claims = $180K
- Maintenance and monitoring: $200K
- Governance and audit: $150K
- Change management amortized: $400K / 3 = $133K
- Total: $1.4M
ROI = ($16.8M − $1.4M) / $1.4M × 100 = 1100%
Even if revised assumptions cut that ROI figure in half, it is still 550 percent; halving the net benefit instead gives ($8.4M − $1.4M) / $1.4M × 100 = 500 percent. This is what defensible AI ROI looks like in cost-reduction workflows where the labor base is large and the work is structured. Most AI deployments do not look like this. Most are smaller in scope, smaller in benefit, and have much tighter ROI ranges.
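The arithmetic in this example can be reproduced with a short script, which makes it easy to rerun under different assumptions. All inputs come from the example above; the variable names are ours, and the script keeps the unrounded cost total rather than rounding to $1.4M.

```python
# Inputs taken from the worked example.
claims_per_year = 50_000 * 12
auto_resolved_share = 0.6
routed_share = 0.4
handle_time_hours = 0.75          # 45 minutes
extra_hours_per_routed_claim = 0.25
loaded_hourly_cost = 80.0

# Benefit side.
hours_reclaimed = claims_per_year * auto_resolved_share * handle_time_hours
labor_saved = hours_reclaimed * loaded_hourly_cost
routed_claims = claims_per_year * routed_share
exception_cost = loaded_hourly_cost * extra_hours_per_routed_claim * routed_claims
net_benefit = labor_saved - exception_cost

# Cost side (annualized).
annual_costs = {
    "build (amortized over 3 years)": 1_500_000 / 3,
    "integration (amortized)": 750_000 / 3,
    "inference": 0.30 * claims_per_year,
    "maintenance and monitoring": 200_000,
    "governance and audit": 150_000,
    "change management (amortized)": 400_000 / 3,
}
total_cost = sum(annual_costs.values())


def roi_pct(benefit: float, cost: float) -> float:
    """ROI % = (benefit - cost) / cost * 100."""
    return (benefit - cost) / cost * 100


print(f"hours reclaimed: {hours_reclaimed:,.0f}")
print(f"net annual benefit: ${net_benefit:,.0f}")
print(f"total annual cost: ${total_cost:,.0f}")
print(f"ROI: {roi_pct(net_benefit, total_cost):,.0f}%")
print(f"ROI with net benefit halved: {roi_pct(net_benefit / 2, total_cost):,.0f}%")
```

Using the unrounded cost total of about $1.413M, the ROI comes out near 1,089 percent rather than the rounded 1,100 percent, and near 494 percent with the net benefit halved. The conclusion does not change, but the script shows how sensitive the headline figure is to rounding in the denominator.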
What honest AI ROI looks like at portfolio level
Across 30 to 50 deployed AI workflows over three to five years, an honest organization sees:
- 30 to 40 percent that hit the projected ROI range and are kept in production.
- 30 to 40 percent that underperform projections (50 to 80 percent of expected benefit) but still deliver positive ROI and are kept.
- 20 to 40 percent that fail to deliver and are sunset within 18 months.
The 20 to 40 percent failure rate is not a sign of a bad program. It is the cost of running an experimental portfolio. AI deployment is closer to drug development economics than to traditional IT — high failure rate at the trial stage, large multiples on the winners, and the discipline to kill projects that aren't working before they consume the budget for ones that would.
How to set up AI ROI tracking from day one
The single most valuable thing a team can do for future AI ROI claims is to measure the baseline before deploying anything. Specifically:
- Pick the targeted workflow.
- Identify the two or three KPIs that the AI will plausibly move (cycle time, error rate, conversion rate, churn rate, etc.).
- Capture six to twelve months of historical baseline.
- Set the deployment evaluation period (12 months minimum, 36 preferred).
- Define what "success" looks like as a number, not as a story.
- Pre-register the success criteria with the executive sponsor before deployment, in writing.
The teams that do this become impossible to dismiss when the AI works. The teams that don't end up arguing for budget every year, armed only with anecdotes against a finance team that wants numbers.
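One way to make the checklist above concrete is to record each success criterion as data before deployment. The sketch below is an assumption about how that might look, not a prescribed tool: the `SuccessCriterion` class, its field names, and the sample cycle-time figures are all hypothetical. It enforces the six-month baseline and 12-month evaluation minimums from the list and turns "success" into a single target number.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SuccessCriterion:
    """A pre-registered target for one KPI (hypothetical structure)."""
    kpi: str
    baseline_monthly: tuple      # historical monthly values, oldest first
    target_change_pct: float     # negative means the KPI should go down
    evaluation_months: int = 12

    def __post_init__(self):
        if len(self.baseline_monthly) < 6:
            raise ValueError("capture at least six months of baseline")
        if self.evaluation_months < 12:
            raise ValueError("evaluation period should be at least 12 months")

    @property
    def baseline(self) -> float:
        return mean(self.baseline_monthly)

    @property
    def target(self) -> float:
        return self.baseline * (1 + self.target_change_pct / 100)

    def met(self, observed: float) -> bool:
        """Compare the observed KPI with the pre-registered target."""
        if self.target_change_pct < 0:
            return observed <= self.target
        return observed >= self.target


# Illustrative figures only: average claim cycle time in minutes.
cycle_time = SuccessCriterion(
    kpi="claim cycle time (minutes)",
    baseline_monthly=(46, 44, 45, 47, 43, 45),
    target_change_pct=-20.0,
    evaluation_months=12,
)
print(f"baseline: {cycle_time.baseline}, target: {cycle_time.target}")
print("observed 35 meets target:", cycle_time.met(35.0))
print("observed 40 meets target:", cycle_time.met(40.0))
```

Freezing the dataclass is a deliberate choice: once the criterion is shared with the executive sponsor, the target should not be editable after the fact.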
What to do next
If you are about to start an AI deployment and have not built the ROI model: do it before signing the implementation contract. Use the formula matching your archetype, fill in real numbers from the workflow you are targeting, and get the model reviewed by someone in finance who is not on the AI team. The first review will reveal the gaps — costs you missed, benefits you overestimated, attribution you can't defend.
If you are running deployments today without ROI tracking: the second-best time to start is now. Capture the current state of the targeted workflow as your baseline, even if the true pre-deployment "before" was six months ago, and begin defensible measurement going forward. The ROI numbers you get will not be as clean as if you had started right, but they will be defensible, which is the difference between continuing to fund the program and quietly losing it in next year's budget.
If you want help building either kind of model, book a working session. The output of one ninety-minute working session is a populated ROI worksheet for your top one to three candidate workflows, with sources for every input — defensible the day you walk out with it.