{"schema_version":"1.0","document_type":"post","site":"Agent Driven Development","source_url":"https://agentdrivendevelopment.com/its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money/","agent_urls":{"jsonl":"https://agentdrivendevelopment.com/its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money/?agent=jsonl","markdown":"https://agentdrivendevelopment.com/its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money/?agent=markdown","json":"https://agentdrivendevelopment.com/its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money/?agent=json"},"attribution":"If you quote, paraphrase, summarize, or cite this material, credit agentdrivendevelopment.com and link to the source URL.","post":{"id":2095,"slug":"its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money","title":"It’s Okay to Waste Tons of Money with Bad Consulting Partners, but Tokens Are Too Much Money?","excerpt":"Before finance audits token spend, audit the expenses that somehow escaped fiscal responsibility: offshore teams, staff augmentation, systems integrators, consulting partners, Scrum layers, agile coaches, planning ceremonies, and delivery management. How much did they actually cost, and what success rate did they produce?","dates":{"published":"2026-05-12T10:57:07-05:00","modified":"2026-05-12T16:23:01-05:00"},"published":"2026-05-12T10:57:07-05:00","modified":"2026-05-12T16:23:01-05:00","author":"Norman","permalink":"https://agentdrivendevelopment.com/its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money/","categories":["Agent-Driven Development","CxO","Economics & ROI","Engineering Leadership"],"tags":[],"word_count":2601,"content_markdown":"The night before your internal QBR, you dream you are already in the room. Not the board meeting. The internal one. 
The rehearsal before the rehearsal, where the slides are still editable and finance is deciding which line items deserve the red box.\n\nIn the dream, every slide is an invoice.\n\nThe consulting partner invoice is seven pages long. Page one says “strategic delivery acceleration.” Page seven has change order number fourteen. Nobody flinches. The Scrum and agile coaching layer floats to the front with words like predictability, alignment, maturity, operating cadence, and continuous improvement. Nobody asks what it returned to EBITDA.\n\nFour consultants from four different firms are standing by the whiteboard again. One for each year you hired someone to finally figure out the economics of software delivery. They taught your leadership team the basics of measuring software ROI: value-stream maps, cost of delay, the laminated one-page model, and the breakout exercise where every table connected delivery work to business value.\n\nEveryone nodded in the dream exactly the way they nodded in the real workshop. Your leadership team never caught on.\n\nThat is why you are doing this at all.\n\nNot because tokens are magic. Because the Total Cost of Ownership (TCO) of creating software in your organization is still unknown. You spent four years and four consulting firms trying to get that number. They did not deliver it. The irony should bother everyone in the room.\n\nThe delivery-management layer gets called connective tissue. The release train gets called coordination. Quarterly planning gets called necessary. The Jira hygiene initiative gets called discipline. The transformation office gets called governance. The offshore pod gets called capacity. The staff augmentation contract gets called flexibility. Every slide passes.\n\nThen the token invoice appears. One line in the cloud report: $118,000.\n\nThe room wakes up inside the dream. Suddenly everyone has fiscal responsibility.\n\nThat is the part that stays with you when you actually wake up. Not the number. 
The selectivity. The company can spend seven figures on consulting partners, offshore capacity, staff augmentation, agile coaching, delivery management, and planning ceremonies without one CFO-ready sentence about EBITDA.\n\nBut tokens get the emergency meeting.\n\nApparently fiscal responsibility has a trigger word, and the word is tokens.\n\nThat is the sentence a middle-layer director wants to say out loud:\n\n“So it was okay to waste tons of money with bad consulting partners, but tokens are too much money?”\n\nDo not say it that way in the meeting. It will feel good for four seconds, and then the CFO will ask for the numbers.\n\nBring them.\n\nBring every capacity invoice and one denominator finance can recognize: accepted production outcomes.\n\nThe ask is not permission to spend recklessly on tokens. The ask is permission to measure the value stream, find the true cost of creating software, and connect that cost back to EBITDA. If software delivery is supposed to increase revenue, reduce expense, protect margin, or lower risk, the production system that creates software needs an economic model.\n\nRight now most companies have invoices, headcount plans, ceremonies, and vibes. That is not a model.\n\nReplacing consulting and coaching dollars with token dollars is not the point by itself. Replacing unmeasured dollars with measured dollars is the point.\n\nThe CFO is right to circle the token line.\n\nFinance sees a new variable cost growing from $42,000 a month to $118,000 a month, and finance asks whether that becomes $250,000 by Q4. That is governance doing its job.\n\nThe mistake is pretending the token line is the only place engineering capacity gets bought. 
Before this invoice existed, the company already bought extra capacity through offshore teams, staff augmentation, systems integrators, vendor professional services, coaching layers, delivery-management layers, release trains, quarterly planning, maturity assessments, and tooling nobody opens until the Monday before the steering committee.\n\nThose were token bills too. They just arrived with nicer nouns.\n\nNobody asks the agile coach to tie their retainer to accepted production outcomes. Nobody asks the quarterly planning summit to defend its EBITDA contribution. Nobody asks the delivery-management layer how much decision latency it removed last month. Nobody asks the product operating model workshop why the roadmap still ships at the same speed six months later.\n\nThe invoice has the right cultural costume, so it passes.\n\nTokens do not have the costume yet.\n\nRun the simpler test.\n\nIf a company walked in tomorrow with an IDE that cost $12,000 per engineer per year and made your engineering organization 40% faster, you would buy it.\n\nYou would not ask each engineer to justify every save, autocomplete, refactor, test run, or compile. You would not put a daily cap on how many times they could use the debugger. You would not make a senior developer explain whether this particular code search deserved the premium tier.\n\nYou would change governance to exploit it.\n\nIf a tool makes software creation materially faster, the correct response is not to meter the tool until it behaves like last year’s IDE budget. The correct response is to change the production system around the new speed: review policy, tests, release gates, security checks, architecture approval, product intake, budgeting, and measurement.\n\nIf the IDE made the team 40% faster and your governance still made every change wait twelve days for review, the IDE did not fail.\n\nYour operating model did.\n\nThat is what is happening with tokens. 
The spend is being evaluated like a seat license while the capability is changing the economics of software creation.\n\nSo do not justify tokens against zero. Zero was never the baseline. Justify tokens against the capacity market your company already used.\n\nThis is the replacement-cost view I would bring to finance.\n\nOffshore delivery pod: it promised cheaper capacity. Measure accepted work per month, rework rate, cycle time, and internal review hours.\n\nStaff augmentation: it promised more hands quickly. Measure time to productive contribution, supervision load, and defect escape rate.\n\nSystems integrator: it promised faster program delivery. Measure what actually shipped, change-order cost, and knowledge retained.\n\nVendor professional services: it promised product-specific speed. Measure implementation time, post-launch support load, and the dependency it created.\n\nScrum and agile coaching layer: it promised predictability and continuous improvement. Measure ceremony cost, management load, cycle time, accepted outcomes, and decision latency.\n\nAI tokens and agents: they promise more output from people who already know the system. Measure cycle-time change, accepted outcomes, escaped defects, and cost of delay avoided.\n\nThe token invoice is an input cost. So were the offshore invoice, the consulting partner invoice, the agile coaching spend, the delivery-management layer, and the “temporary” staff augmentation contract that stayed for nineteen months because nobody wanted to admit the project still needed the people.\n\nThe question is not which input looked smallest when procurement approved it. The question is which input produced accepted work in production at the lowest total cost. The uncomfortable part is that you probably have the token number but not the accepted-outcome number for any of them.\n\nThat is why the token bill feels expensive. It is visible. 
The other waste got promoted into process.\n\nLet me do the math in the ugly way, because this is the math a director can run before Thursday.\n\nTake an offshore pod. Six engineers through a vendor at a blended $85 an hour. At 160 hours a month, that pod costs $81,600 a month before internal management load.\n\nNow count outcomes, not hours. In the last ninety days, the pod completed twenty-four tickets. Fourteen were accepted without major rework. Six came back for material changes. Four were closed or superseded because requirements moved before the work landed.\n\nThat gives you a first-pass acceptance rate of 58%. The capacity you bought was not 960 clean engineering hours a month. It was 960 nominal hours multiplied by the rate at which those hours turned into accepted work.\n\nNow add the cost nobody put on the vendor invoice. One senior internal engineer spent eight hours a week reviewing, explaining context, rewriting specs, and cleaning up integration issues. At $280,000 fully loaded, that engineer costs about $135 an hour. A product manager spent four hours a week clarifying tickets across time zones. The pod did not cost $81,600. It cost roughly $88,000 before delay, rework drag, support tail, and the meetings everyone pretended were normal.\n\nIf that pod shipped two accepted production outcomes a month, you paid about $44,000 per accepted outcome. If it shipped four, you paid $22,000. If it shipped one and created a support tail, you paid far more than the hourly rate ever admitted.\n\nOffshore is not cheap because the hourly rate is cheap. Offshore is cheap only when accepted outcomes are cheap.\n\nNow put the token line next to it.\n\nYour best internal team is six engineers. They already know the system. You were already paying them before the token bill appeared. The question is whether the AI spend changes output enough to justify the new variable cost.\n\nIn March, that team spent $22,000 on AI tools and inference. In April, they spent $31,000. 
Finance sees a 41% increase and starts circling. Good. Circle it. Then put the output next to it.\n\nBefore agents were part of the workflow, the team averaged three accepted production outcomes a month on this part of the roadmap. After the workflow changed, they averaged five. The extra two were accepted by product, deployed behind flags, monitored for thirty days, and not rolled back.\n\nIf the incremental AI spend is $31,000 and the team produced two additional accepted outcomes, the gross incremental cost is $15,500 per additional outcome.\n\nThat number is not automatically good. It becomes good or bad when you compare it to the alternatives. If the offshore pod was effectively costing $22,000 to $44,000 per accepted outcome, and the internal AI-enabled team is producing additional accepted outcomes at $15,500 of incremental spend, the token bill is not the expensive line. It is the cheaper capacity channel.\n\nThat is before cost of delay. If one of those two additional outcomes is a pricing workflow worth $900 a day once live, and it lands twenty-one days earlier than it would have in the old system, that is $18,900 of value captured early.\n\nThe CFO does not need poetry. The CFO needs the denominator.\n\nThis is where middle-layer directors have an advantage. You are close enough to the work to know which tickets were fake progress, which vendor milestone was accepted because the steering committee was tired, which offshore team is good but buried under bad requirements, and which internal team quietly became faster because the senior engineer stopped hand-writing scaffolding and started reviewing generated changes against behavior.\n\nThe CFO sees invoices. You see the conversion rate.\n\nDo not say, “AI makes developers 40% faster.” That dies in finance because it sounds like a vendor slide.\n\nSay this:\n\n“In the last ninety days, our offshore pod cost $264,000 including internal review load and produced seven accepted production outcomes. 
That is about $37,700 per accepted outcome, before cost of delay. In the same period, the internal team spent $74,000 on AI tools and produced five additional accepted outcomes over baseline. That is $14,800 of incremental AI spend per additional accepted outcome. Quality did not degrade. Escaped defects were flat. Cycle time improved from twelve days to seven. I want to expand the envelope for one more quarter and keep measuring the same denominator.”\n\nThat is a finance conversation.\n\nThe success-rate question matters more than the cost question.\n\nFor offshore capacity, what percentage of the work became accepted production change without major rework?\n\nFor staff augmentation, how many weeks passed before the person reduced load instead of creating it?\n\nFor systems integrators and vendor services, what shipped before the change orders started, and how much knowledge stayed inside the company?\n\nFor Scrum Masters, agile coaches, release train engineers, delivery managers, and program managers, what changed in queue time, rework rate, decision latency, accepted outcomes, and EBITDA?\n\nFor AI tokens, what changed in cycle time, first-pass acceptance, escaped defects, and cost of delay?\n\nThose questions force every capacity model into the same room without turning the conversation into a procurement spreadsheet.\n\nA lot of companies have been polite about outsourcing math for twenty years. They compare internal salaries to offshore hourly rates and stop there, because the next part gets socially expensive. 
The next part asks whether the cheap hours became working software, whether internal supervision ate half the capacity gain, and whether the vendor success story survived contact with production support.\n\nAI does not get to skip those questions.\n\nNeither should everyone else.\n\nThe trap is letting finance turn token governance into a rationing exercise before anyone has done substitution economics.\n\nIf the CFO says, “This token line is growing too fast,” do not respond with vibes. The CFO is doing their job. New variable spend needs a budget envelope, a forecast, and a control mechanism.\n\nGive them one, but make the control mechanism outcome-based.\n\nSet a quarterly AI capacity envelope at the portfolio level. Attach it to accepted outcomes, cycle time, quality, and cost of delay. Compare it against the external-capacity channels the portfolio would otherwise use. Expand the envelope when teams produce cheaper accepted outcomes than the alternatives. Contract it when they do not.\n\nDo not set individual token caps unless you want your best engineers managing usage instead of work. Do not make every engineer explain why a task deserved the frontier model. That permission tax will cost more than the model.\n\nBudget the raw material. Measure the output. Compare it to the capacity market you were already buying from.\n\nThat is control.\n\nThere is a harder version of this conversation, and it is the one a good CFO will eventually ask.\n\n“If the AI-enabled internal team is cheaper per accepted outcome than offshore, why are we still using the offshore pod?”\n\nDo not dodge it. Sometimes the answer is maintenance work, coverage, support hours, regional knowledge, or a stable backlog where the economics still work. Sometimes the vendor relationship is strategically useful. Sometimes the internal team cannot absorb the work without dropping something more valuable.\n\nThose are real answers. “Because offshore is cheaper” is not. 
Cheaper per hour, or cheaper per accepted production outcome? Cheaper before rework, or after? Cheaper before internal review load, or after? Cheaper before cost of delay, or after the feature misses the quarter?\n\nThis is the useful pressure AI puts on the old model. It does not only make engineering faster. It exposes how lazy some of the old accounting was.\n\nI am not arguing that tokens are always worth it. If your team is burning inference on vague prompts, rewriting the same generated code three times, accepting low-quality changes, and shipping no faster, finance should challenge you. If spend goes up while cycle time stays flat, escaped defects rise, and review load increases, you do not have an investment. You have a new way to create waste.\n\nThe discipline is not “spend less.” The discipline is “show me what the spend replaced, what it produced, and whether the substitution improved the economics of delivery.”\n\nSo when token prices rise, or usage rises, or the invoice finally gets large enough that finance notices, do not walk into the CFO’s office with a defense of tokens.\n\nWalk in with the old invoices, the AI spend, and the conversion metrics: accepted outcomes, rework rate, internal review hours, cycle time, escaped defects, and cost of delay. 
Ask for permission to measure the value stream and tie the economics of software creation back to EBITDA.\n\nHow much did it actually cost?\n\nWhat success rate did it actually have?\n\nAnd if the token bill is expensive, what exactly are we buying back when we cut it, besides the comforting illusion that the old waste was free?"},"companion_artifacts":[{"type":"executive_brief","label":"Executive brief","url":"https://agentdrivendevelopment.com/executive-brief/its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money/"},{"type":"executive_deck","label":"Executive deck","url":"https://agentdrivendevelopment.com/wp-content/uploads/2026/05/its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money.html"},{"type":"podcast_audio","label":"Podcast audio","url":"https://agentdrivendevelopment.com/wp-content/uploads/audio/posts/its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money.mp3"},{"type":"podcast_transcript","label":"Podcast transcript","url":"https://agentdrivendevelopment.com/transcript/its-okay-to-waste-tons-of-money-with-bad-consulting-partners-but-tokens-are-too-much-money/"}]}
