The night before your internal quarterly business review, you dream you are already in the room. Not the board meeting. The internal one. The rehearsal before the rehearsal, where the slides are still editable and finance is deciding which line items deserve the red box.
In the dream, every slide is an invoice.
The consulting partner invoice is seven pages long. Page one says "strategic delivery acceleration." Page seven has change order number fourteen. Nobody flinches. The Scrum and agile coaching layer floats to the front with words like predictability, alignment, maturity, operating cadence, and continuous improvement. Nobody asks what it returned to EBITDA.
Four consultants from four different firms are standing by the whiteboard again. One for each year you hired someone to finally figure out the economics of software delivery. They taught your leadership team the basics of measuring software return on investment: value-stream maps, cost of delay, the laminated one-page model, and the breakout exercise where every table connected delivery work to business value.
Everyone nodded in the dream exactly the way they nodded in the real workshop. Your leadership team never caught on.
That is why you are doing this at all.
Not because tokens are magic. Because the Total Cost of Ownership of creating software in your organization is still unknown. You spent four years and four consulting firms trying to get that number. They did not deliver it. The irony should bother everyone in the room.
The delivery-management layer gets called connective tissue. The release train gets called coordination. Quarterly planning gets called necessary. The Jira hygiene initiative gets called discipline. The transformation office gets called governance. The offshore pod gets called capacity. The staff augmentation contract gets called flexibility. Every slide passes.
Then the token invoice appears. One line in the cloud report: one hundred eighteen thousand dollars.
The room wakes up inside the dream. Suddenly everyone has fiscal responsibility.
That is the part that stays with you when you actually wake up. Not the number. The selectivity. The company can spend seven figures on consulting partners, offshore capacity, staff augmentation, agile coaching, delivery management, and planning ceremonies without producing one sentence the Chief Financial Officer could use about earnings before interest, taxes, depreciation, and amortization.
But tokens get the emergency meeting.
Apparently fiscal responsibility has a trigger word, and the word is tokens.
That is the sentence a middle-layer director wants to say out loud:
"So it was okay to waste tons of money with bad consulting partners, but tokens are too much money?"
Do not say it that way in the meeting. It will feel good for four seconds, and then the Chief Financial Officer will ask for the numbers.
Bring them.
Bring every capacity invoice and one denominator finance can recognize: accepted production outcomes.
The ask is not permission to spend recklessly on tokens. The ask is permission to measure the value stream, find the true cost of creating software, and connect that cost back to earnings before interest, taxes, depreciation, and amortization. If software delivery is supposed to increase revenue, reduce expense, protect margin, or lower risk, the production system that creates software needs an economic model.
Right now most companies have invoices, headcount plans, ceremonies, and vibes. That is not a model.
Replacing consulting and coaching dollars with token dollars is not the point by itself. Replacing unmeasured dollars with measured dollars is the point.
The Chief Financial Officer is right to circle the token line.
Finance sees a new variable cost growing from forty-two thousand dollars a month to one hundred eighteen thousand dollars a month, and finance asks whether that becomes two hundred fifty thousand dollars by the fourth quarter. That is governance doing its job.
The mistake is pretending the token line is the only place engineering capacity gets bought. Before this invoice existed, the company already bought extra capacity through offshore teams, staff augmentation, systems integrators, vendor professional services, coaching layers, delivery-management layers, release trains, quarterly planning, maturity assessments, and tooling nobody opens until the Monday before the steering committee.
Those were token bills too. They just arrived with nicer nouns.
Nobody asks the agile coach to tie their retainer to accepted production outcomes. Nobody asks the quarterly planning summit to defend its earnings before interest, taxes, depreciation, and amortization contribution. Nobody asks the delivery-management layer how much decision latency it removed last month. Nobody asks the product operating model workshop why the roadmap still ships at the same speed six months later.
The invoice has the right cultural costume, so it passes.
Tokens do not have the costume yet.
Run the simpler test.
If a company walked in tomorrow with an integrated development environment that cost twelve thousand dollars per engineer per year and made your engineering organization forty percent faster, you would buy it.
You would not ask each engineer to justify every save, autocomplete, refactor, test run, or compile. You would not put a daily cap on how many times they could use the debugger. You would not make a senior developer explain whether this particular code search deserved the premium tier.
You would change governance to exploit it.
If a tool makes software creation materially faster, the correct response is not to meter the tool until it behaves like last year's integrated development environment budget. The correct response is to change the production system around the new speed: review policy, tests, release gates, security checks, architecture approval, product intake, budgeting, and measurement.
If the integrated development environment made the team forty percent faster and your governance still made every change wait twelve days for review, the integrated development environment did not fail.
Your operating model did.
That is what is happening with tokens. The spend is being evaluated like a seat license while the capability is changing the economics of software creation.
So do not justify tokens against zero. Zero was never the baseline. Justify tokens against the capacity market your company already used.
This is the replacement-cost view I would bring to finance.
Offshore delivery pod: it promised cheaper capacity. Measure accepted work per month, rework rate, cycle time, and internal review hours.
Staff augmentation: it promised more hands quickly. Measure time to productive contribution, supervision load, and defect escape rate.
Systems integrator: it promised faster program delivery. Measure what actually shipped, change-order cost, and knowledge retained.
Vendor professional services: it promised product-specific speed. Measure implementation time, post-launch support load, and the dependency it created.
Scrum and agile coaching layer: it promised predictability and continuous improvement. Measure ceremony cost, management load, cycle time, accepted outcomes, and decision latency.
AI tokens and agents: they promise more output from people who already know the system. Measure cycle-time change, accepted outcomes, escaped defects, and cost of delay avoided.
The token invoice is an input cost. So were the offshore invoice, the consulting partner invoice, the agile coaching spend, the delivery-management layer, and the "temporary" staff augmentation contract that stayed for nineteen months because nobody wanted to admit the project still needed the people.
The question is not which input looked smallest when procurement approved it. The question is which input produced accepted work in production at the lowest total cost. The uncomfortable part is that you probably have the token number but no accepted-outcome number for any of these inputs.
That is why the token bill feels expensive. It is visible. The other waste got promoted into process.
Let me do the math in the ugly way, because this is the math a director can run before Thursday.
Take an offshore pod. Six engineers through a vendor at a blended eighty-five dollars an hour. At one hundred sixty hours per engineer per month, that pod costs eighty-one thousand six hundred dollars a month before internal management load.
Now count outcomes, not hours. In the last ninety days, the pod completed twenty-four tickets. Fourteen were accepted without major rework. Six came back for material changes. Four were closed or superseded because requirements moved before the work landed.
That gives you a first-pass acceptance rate of fifty-eight percent: fourteen accepted out of twenty-four completed. The capacity you bought was not nine hundred sixty clean engineering hours a month. It was nine hundred sixty nominal hours multiplied by the rate at which those hours turned into accepted work.
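For anyone who wants to rerun that arithmetic with their own vendor numbers, here is a minimal Python sketch. The inputs are the figures from the example above. The variable names, and the "effective hourly rate" on the last line, are my own illustration of the idea rather than a standard finance metric.

```python
# Inputs taken from the offshore pod example in the text.
ENGINEERS = 6
HOURS_PER_ENGINEER_PER_MONTH = 160
BLENDED_RATE_USD = 85

# Ticket outcomes over the last ninety days.
tickets = {"accepted": 14, "reworked": 6, "closed_or_superseded": 4}

nominal_hours = ENGINEERS * HOURS_PER_ENGINEER_PER_MONTH  # 960
vendor_invoice = nominal_hours * BLENDED_RATE_USD          # 81,600

completed = sum(tickets.values())                          # 24
acceptance_rate = tickets["accepted"] / completed          # about 0.58

# Nominal hours discounted by the share that became accepted work.
effective_hours = nominal_hours * acceptance_rate

# Illustrative only: what an hour costs once unaccepted work is excluded.
effective_hourly_rate = vendor_invoice / effective_hours

print(f"Vendor invoice:         ${vendor_invoice:,.0f}")
print(f"First-pass acceptance:  {acceptance_rate:.0%}")
print(f"Effective hours:        {effective_hours:.0f} of {nominal_hours}")
print(f"Effective hourly rate:  ${effective_hourly_rate:,.2f}")
```

The point of the last line is only that the eighty-five dollar rate describes hours bought, not hours that landed.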
Now add the cost nobody put on the vendor invoice. One senior internal engineer spent eight hours a week reviewing, explaining context, rewriting specifications, and cleaning up integration issues. At two hundred eighty thousand dollars fully loaded, that engineer costs about one hundred thirty-five dollars an hour, so eight hours a week comes to roughly forty-seven hundred dollars a month. A product manager spent four hours a week clarifying tickets across time zones. The pod did not cost eighty-one thousand six hundred dollars. It cost roughly eighty-eight thousand dollars before delay, rework drag, support tail, and the meetings everyone pretended were normal.
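A rough sketch of that loaded-cost calculation follows. Two inputs are assumptions of mine, not figures from the example: a 2,080-hour work year (52 weeks of 40 hours) for converting salary to an hourly rate, and a hypothetical one hundred dollars an hour for the product manager, since the text gives no rate for that role. Swap in your own values.

```python
# Assumption: 52 weeks x 40 hours. Not stated in the text.
WORK_HOURS_PER_YEAR = 2080
WEEKS_PER_MONTH = 52 / 12

# Assumption: hypothetical rate. The text gives no product manager cost.
PM_HOURLY_RATE_ASSUMED = 100.0


def hourly_rate(fully_loaded_annual_cost: float) -> float:
    """Convert a fully loaded annual cost into an hourly rate."""
    return fully_loaded_annual_cost / WORK_HOURS_PER_YEAR


def monthly_overhead(hours_per_week: float, rate: float) -> float:
    """Monthly cost of internal time spent supporting the vendor pod."""
    return hours_per_week * WEEKS_PER_MONTH * rate


vendor_invoice = 6 * 160 * 85                       # from the text: 81,600
senior_rate = hourly_rate(280_000)                  # about 135 an hour
senior_cost = monthly_overhead(8, senior_rate)      # review and cleanup time
pm_cost = monthly_overhead(4, PM_HOURLY_RATE_ASSUMED)

loaded_monthly_cost = vendor_invoice + senior_cost + pm_cost

print(f"Senior engineer rate:  ${senior_rate:,.0f}/hour")
print(f"Senior engineer load:  ${senior_cost:,.0f}/month")
print(f"Product manager load:  ${pm_cost:,.0f}/month (assumed rate)")
print(f"Loaded pod cost:       ${loaded_monthly_cost:,.0f}/month")
```

Under those assumptions the total lands at about eighty-eight thousand dollars. The number still excludes delay, rework drag, and support tail, exactly as the paragraph says.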
If that pod shipped two accepted production outcomes a month, you paid about forty-four thousand dollars per accepted outcome. If it shipped four, you paid twenty-two thousand dollars. If it shipped one and created a support tail, you paid far more than the hourly rate ever admitted.
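The same division works for any capacity source, which is the whole argument. A small sketch, using the roughly eighty-eight thousand dollar loaded cost from above; the function name is hypothetical.

```python
def cost_per_accepted_outcome(monthly_cost: float, accepted_outcomes: int) -> float:
    """Loaded monthly cost divided by outcomes accepted in production."""
    if accepted_outcomes <= 0:
        # No accepted outcomes means the unit cost is undefined, not zero.
        raise ValueError("accepted_outcomes must be at least 1")
    return monthly_cost / accepted_outcomes


LOADED_POD_COST = 88_000  # rounded monthly figure from the text

for outcomes in (1, 2, 4):
    unit_cost = cost_per_accepted_outcome(LOADED_POD_COST, outcomes)
    print(f"{outcomes} accepted outcome(s): ${unit_cost:,.0f} each")
```

Raising an error at zero is a deliberate choice here: a month with no accepted outcomes should show up as a problem, not as a blank cell.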
Offshore is not cheap because the hourly rate is cheap. Offshore is cheap only when accepted outcomes are cheap.
Now put the token line next to it.
Your best internal team is six engineers. They already know the system. You were already paying them before the token bill appeared. The question is whether the AI spend changes output enough to justify the new variable cost.
In March, that team spent twenty-two thousand dollars on AI tools and inference. In April, they spent thirty-one thousand dollars. Finance sees a forty-one percent increase and starts circling. Good. Circle it. Then put the output next to it.
Before agents were part of the workflow, the team averaged three accepted production outcomes a month on this part of the roadmap. After the workflow changed, they averaged five. The extra two were accepted by product, deployed behind flags, monitored for thirty days, and not rolled back.
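Putting the two movements side by side takes one helper. This sketch uses only the figures above; the function name is my own, and it says nothing about causation, only about how the two lines moved.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, as a fraction."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before


# Figures from the text.
spend_march, spend_april = 22_000, 31_000
outcomes_before, outcomes_after = 3, 5

spend_growth = pct_change(spend_march, spend_april)
outcome_growth = pct_change(outcomes_before, outcomes_after)

print(f"AI spend change:          {spend_growth:+.0%}")
print(f"Accepted outcomes change: {outcome_growth:+.0%}")
```

The spend line is the one finance circles. The outcome line is the one that has to sit next to it.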
If the incremental AI spend is thirty-one thousand dollars and the team produced two additional accepted outcomes, the gross incremental