Measure the ROI of AI across your company (without a data team)

A simple baseline-and-after method that turns "AI is helping, I think" into a defensible number, using a spreadsheet and a few weeks of honest measurement, no analytics team required.

What you'll have when you're done

A credible answer to "what is our AI spend actually returning?", built from before-and-after measurement on a few specific workflows, tracked in a spreadsheet. Not a perfect attribution model (you do not need one), but an honest, defensible read of where AI saved time or money and where it did not. Enough to decide what to scale, what to cut, and what to tell your board.

"It feels like it's helping" is not an answer your CFO accepts

At some point the AI line on your budget gets big enough that someone asks the obvious question: what did it return? Most CEOs have only a feeling. I have been in that seat, asked the question by my own CFO, and heard myself answer "it's definitely helping" while knowing that was not an answer. It was a vibe with a budget attached. The uncomfortable truth was that I had deployed AI everywhere and measured it nowhere, so I had no way to defend the spend beyond enthusiasm. Surveys back this up: many companies report using AI broadly, but only about a third say they have scaled it with real impact, largely because they never measured. Without a number, the AI spend is a faith-based line item, and faith-based line items get cut the moment money gets tight.

The good news: you do not need a data team or a dashboard project to measure this. You need a baseline (what a workflow cost before AI) and an after (what it costs now), on a few specific workflows. The one trap to avoid is the time-savings illusion: "we saved 8 hours a week" is worth zero unless those hours got reinvested into something that makes or saves money. Measure the realized value, not the theoretical hours.

What you need first

The specific workflows where you deployed AI (ideally the ones from your 30-day rollouts).
A spreadsheet. That is the whole tooling. Not a BI project.
Baseline data, which means measuring before you deploy, or reconstructing the pre-AI state honestly.
Usage data from your AI tool's admin console (which most business tiers provide).

Step-by-step

Step 1: Measure the baseline before you deploy (or reconstruct it)

For each target workflow, capture the pre-AI state: time per task, volume, error rate, and any direct cost. Measure for at least a few weeks so you are not fooled by a single odd week. If you already deployed without a baseline, reconstruct it honestly from what the work used to take: an estimate you would defend, not a flattering guess. A missing baseline is the single most common reason AI ROI cannot be proven.

Step 2: Measure the same metrics after

Once AI is in the workflow, measure the same things over a comparable period. Same metrics, same length, apples to apples. The delta between baseline and after is your raw signal.
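The delta can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `Measurement` fields and the helper names are hypothetical, the response times are taken from the support example later in this article, and the volume and error-rate figures are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One measurement period for a workflow (field names are illustrative)."""
    minutes_per_task: float
    tasks_per_week: float
    error_rate: float  # fraction between 0 and 1


def percent_change(before: float, after: float) -> float:
    """Relative change from the baseline, in percent (negative means it went down)."""
    if before == 0:
        raise ValueError("baseline value must be non-zero to compute a relative change")
    return (after - before) / before * 100.0


def raw_signal(baseline: Measurement, after: Measurement) -> dict:
    """Compare the same metrics across two comparable periods."""
    return {
        "minutes_per_task": percent_change(baseline.minutes_per_task, after.minutes_per_task),
        "tasks_per_week": percent_change(baseline.tasks_per_week, after.tasks_per_week),
        "error_rate": percent_change(baseline.error_rate, after.error_rate),
    }


# Response times come from the article's support example;
# volume and error rate are placeholder assumptions.
baseline = Measurement(minutes_per_task=6.0, tasks_per_week=400, error_rate=0.04)
after = Measurement(minutes_per_task=2.5, tasks_per_week=480, error_rate=0.04)
signal = raw_signal(baseline, after)
print(signal)
```

The same arithmetic works as three spreadsheet formulas. The point is only that both periods use identical metrics, so each delta compares like with like.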

Step 3: Read it in three tiers

Look at the result at three levels, from easy to meaningful:

Tier 1 - Adoption: are people actually using it? (usage data from the admin console)
Tier 2 - Efficiency: did the workflow get faster / higher-volume / lower-error?
         (your baseline vs after)
Tier 3 - Impact: did that translate to revenue made or cost saved?
         (the number your CFO cares about)

Most companies stop at Tier 1 ("look, usage is up!"), which proves nothing. Push to Tier 3 wherever you can.

Each tier is harder to measure than the last, which is exactly why people stop early. Tier 1 is free: the admin console hands you usage numbers. Tier 2 takes your baseline-versus-after discipline. Tier 3 takes an honest judgment about whether the efficiency turned into money. The trap is that Tier 1 feels like proof ("80% of the team is using it weekly!") while telling you nothing about value: a team can use a tool enthusiastically and produce zero P&L impact. Treat Tier 1 as a prerequisite, not an answer. If usage is low, nothing downstream matters, but high usage is the start of the question, not the end of it.
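One way to make the three-tier reading mechanical is a small function that reports the highest tier a workflow has actually reached. This is a sketch under stated assumptions: the function name is hypothetical, and the 50% adoption threshold is an arbitrary placeholder to tune for your own team.

```python
def highest_tier(weekly_active_share: float,
                 efficiency_gain_pct: float,
                 realized_dollars: float,
                 min_adoption: float = 0.5) -> int:
    """Return 0-3: the highest tier this workflow can honestly claim.

    min_adoption is an assumed threshold, not a figure from the article.
    """
    if weekly_active_share < min_adoption:
        return 0  # not even adopted; nothing downstream matters
    if efficiency_gain_pct <= 0:
        return 1  # used, but the workflow did not measurably improve
    if realized_dollars <= 0:
        return 2  # faster or better, but no money made or saved yet
    return 3      # efficiency turned into revenue or avoided cost


# 80% weekly usage alone is only Tier 1.
print(highest_tier(0.80, 0.0, 0.0))
# Usage plus a measured efficiency gain, but no dollars: Tier 2.
print(highest_tier(0.80, 58.0, 0.0))
# Usage, efficiency, and realized dollars: Tier 3.
print(highest_tier(0.80, 58.0, 55_000))
```

The checks run in order on purpose: a workflow cannot claim Tier 3 on dollars alone if nobody is using the tool.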

Step 4: Only count time-savings that got reinvested

This is the honesty step that makes your number real. If a workflow now takes 2 hours instead of 10, that 8 hours is only ROI if it went toward revenue or eliminated a cost (a hire you did not need to make, more output shipped). If it just evaporated into a less-busy week, the realized value is closer to zero. Count what was reinvested, not what was theoretically freed. A smaller honest number beats a big fake one.

Here is the full calculation as an illustrative example, continuing with the support team from a 30-day rollout:

Baseline: 4 agents, first responses at 6 minutes each, the team maxed out, and a 5th hire (~$55K loaded) on next quarter's plan to handle rising volume.

After: first responses at 2.5 minutes, same quality, and the team now absorbs the higher volume without the 5th hire.

Realized value: the avoided hire, ~$55K/year. That is the Tier-3 number, not "we saved 3.5 minutes a ticket."

AI cost: the business-tier seats, call it ~$4K/year.

Honest ROI: roughly $55K returned against ~$4K spent, or about 14x, on this one workflow.

Notice what made it real: the saved time did not vanish into shorter days; it absorbed growth that would otherwise have cost a headcount. Had the team simply clocked out a little earlier, the honest Tier-3 number would have been near zero, no matter how good the efficiency chart looked. That is the discipline the whole method turns on.
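The same arithmetic, written out as a small sketch. The dollar figures are the illustrative ones from the support example above. The function names and the `reinvested_share` parameter are hypothetical: a simple way to encode "count only what was reinvested" as a fraction between 0 and 1.

```python
def realized_value(theoretical_value: float, reinvested_share: float) -> float:
    """Scale the theoretical saving by the share that was actually reinvested."""
    if not 0.0 <= reinvested_share <= 1.0:
        raise ValueError("reinvested_share must be between 0 and 1")
    return theoretical_value * reinvested_share


def roi_multiple(realized: float, ai_cost: float) -> float:
    """Dollars returned per dollar spent on the AI tooling."""
    if ai_cost <= 0:
        raise ValueError("ai_cost must be positive")
    return realized / ai_cost


avoided_hire = 55_000  # ~$55K loaded cost of the 5th hire (illustrative)
ai_cost = 4_000        # ~$4K/year in business-tier seats (illustrative)

# Case 1: the saved time absorbed growth, so the avoided hire counts in full.
reinvested = realized_value(avoided_hire, 1.0)
print(roi_multiple(reinvested, ai_cost))  # 13.75
print(reinvested - ai_cost)               # 51000.0 net

# Case 2: the team clocked out earlier, so nothing was reinvested.
evaporated = realized_value(avoided_hire, 0.0)
print(roi_multiple(evaporated, ai_cost))  # 0.0
```

The efficiency gain is identical in both cases. Only the reinvested share changes, and it moves the result from roughly 14x to zero.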

Step 5: Track it in a sheet and revisit quarterly

Keep it in a simple spreadsheet: workflow, baseline, after, tier-3 impact, notes. Revisit quarterly. Do not chase a perfect attribution model. Directional and honest beats precise and gamed, and a number you can explain in one sentence is worth more than a dashboard no one trusts.
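If you would rather generate the sheet than type it, the five columns map directly onto a CSV file that any spreadsheet opens. A minimal sketch using Python's standard `csv` module: the column identifiers and the `write_tracker` helper are illustrative names, and the row restates the support example from this article.

```python
import csv
import io

# The five columns suggested above, as CSV-friendly identifiers.
COLUMNS = ["workflow", "baseline", "after", "tier3_impact", "notes"]


def write_tracker(rows, out) -> None:
    """Write one row per measured workflow to any file-like object."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {
        "workflow": "Support first response",
        "baseline": "6 min per response, 4 agents at capacity",
        "after": "2.5 min per response, same quality",
        "tier3_impact": "Avoided 5th hire (~$55K/year) vs ~$4K/year in seats",
        "notes": "Illustrative figures; revisit next quarter",
    },
]

# StringIO keeps the example self-contained; for a real file, pass
# open("ai_roi.csv", "w", newline="") instead (the filename is an assumption).
buffer = io.StringIO()
write_tracker(rows, buffer)
print(buffer.getvalue())
```

Free-text cells are deliberate. The sheet is meant to be read aloud in one sentence per row, not fed into a model.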

How you'll know it's working

You can answer the CFO's question with a real, defensible number and point to the workflows behind it. You make better scaling decisions: double down on the workflows with Tier-3 impact and cut the ones stuck at Tier 1. And your AI budget stops being a faith-based line item, because you can show what it returns.

When it breaks

You have no baseline. The fatal one. Measure before deploying, or reconstruct the pre-AI state honestly. Without it there is no ROI to prove.
You're celebrating Tier 1. Usage is not impact. Push to efficiency and dollars.
The time-savings don't show up in the P&L. Because they were not reinvested. Count realized value, not freed hours.
You're trying to build a perfect model. Stop. Directional, honest, and explainable beats precise and gamed. A spreadsheet is enough.
The team games the metric. Once "tickets per agent" is the tracked number, replies get faster and sloppier. Always pair an efficiency metric with a quality proxy (CSAT, edit rate, escalations) so you catch a speed gain that is really a quality loss in disguise.
You count the same saved hour in three places. The 5th hire you avoided is one number; do not also count "8 hours saved" and "more output" on top of it as if they were separate wins. Pick the single cleanest expression of the realized value and stop, or your ROI inflates into something you cannot defend.
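The metric-gaming failure above can be guarded against with a paired check: accept a speed gain only if the quality proxy held. A rough sketch, assuming the quality proxy is a share such as CSAT expressed between 0 and 1. The function name and the 2-point tolerance are hypothetical choices, not figures from this article.

```python
def is_real_gain(speed_gain_pct: float,
                 quality_before: float,
                 quality_after: float,
                 max_quality_drop: float = 0.02) -> bool:
    """True only if the workflow got faster AND quality did not fall
    by more than the tolerated amount (an assumed 2 points by default)."""
    if speed_gain_pct <= 0:
        return False  # no efficiency gain to claim in the first place
    return (quality_before - quality_after) <= max_quality_drop


# Faster with stable CSAT: a real gain.
print(is_real_gain(58.0, quality_before=0.90, quality_after=0.90))
# Faster but CSAT fell ten points: a quality loss in disguise.
print(is_real_gain(58.0, quality_before=0.90, quality_after=0.80))
```

Swap in edit rate or escalation rate as the proxy if CSAT is not available. For proxies where lower is better, flip the sign of the comparison.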

Make it yours. You do not need to measure every workflow. Measure the handful where the spend is real and the stakes are visible to your board. For a customer-facing workflow, the Tier-3 number is often revenue (faster response, higher conversion or retention); for a back-office one, it is usually cost (avoided or redeployed headcount). Frame the final number in the language your specific CFO and board already use. "This returned roughly 10x its cost on the two workflows we measured" lands harder than a dashboard, and it is the sentence that keeps the AI budget funded.

Where this fits in your harness

Measurement is what makes the whole AI program defensible and durable. It closes the loop on the 30-day rollout (the rollout creates the change; this proves it) and on AI office hours (which drive the adoption you are measuring). It is also the discipline that lets you confidently expand AI across the company, because every expansion is backed by a number, not a hope.
