The $14 Million Sentence
Three budget meetings, one distribution problem.
This is a story about a sentence nobody could write. It cost one firm fourteen million dollars to learn the sentence existed.
The firm is financial services. Twenty-two thousand employees, nine decades old, profitable in a way that stopped requiring explanation sometime in the last century.
There are three budget meetings in this story. Between them, an economics lesson the firm paid full retail for.
Meeting one
November 2024. The CIO has the room and a deck titled “Our AI Journey.”
Eighteen thousand seats. Copilots in every workflow. Fourteen million over two years. Slide nine is a robot hand shaking a human hand.
The CFO asks one question. “How will we know it worked?”
“Adoption,” the CIO says. “Usage. Engagement.”
Note what just happened, because it is the whole story in miniature. The CFO asked about outcomes. The CIO answered with inputs. Nobody registered the substitution, because the substitution is invisible when outcomes have no definition. Value requires a prior statement of what done looks like, and no such statement existed. So the firm instrumented what it could see: tokens, seats, logins.
The CFO writes her question in the margin. The commitment signs that afternoon.
The distribution
The platform goes live in February. Treat the next twelve months as a natural experiment. One model. Eighteen thousand users. Identical capability per token. The independent variable is what each token got pointed at.
An analyst in treasury fed it a quarter of hedging reports and asked what looked wrong. It found a duration mismatch the desk had carried for two years. Fixing it was worth $2.1 million. She received a shoutout in the team channel. Eleven thumbs-up. One rocket ship.
A VP in marketing ran one memo through eleven revisions. The eleventh was worse than the second. The memo announced a meeting.
An operations team automated its weekly close package. Nine hours per week, recovered, per person. The hours were never seen again. Output held flat. Calendars stayed full.
Plot the year and you get a power law. A handful of sessions produced nearly all the value. The vast middle produced reformatting. This is not a defect of the model. It is a property of the users, and it is exactly what incentive theory predicts. At the firm, compensation attaches to tenure and perceived usefulness. It detached from measured output decades ago. An employee handed a machine that does her work faces a clean choice: reinvest the surplus in output nobody measures, or consume it as slack. Slack pays. The treasury analyst is the exception that proves the rule. Her payoff was an emoji.
You cannot prompt your way out of a compensation plan.
Now the pricing problem. The meter charged the $2.1 million catch and the eleventh memo identically, because input pricing prices the mean. But value is not distributed around the mean. It follows a power law, which means the typical token is worth far less than the average token. The firm paid for the average and experienced the typical. The gap between those two numbers is variance, and under input pricing, one hundred percent of it sits on the buyer.
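The mean-versus-typical gap is easy to see in a toy simulation. What follows is a minimal sketch, not the firm's data: it assumes session value follows a Pareto distribution, and the shape parameter `alpha=1.1`, the session count, and the function names are all illustrative choices rather than anything measured.

```python
import random
import statistics


def simulate_session_values(n_sessions, alpha, seed=0):
    """Draw hypothetical per-session values from a Pareto distribution.

    alpha is an assumed shape parameter; smaller alpha means a heavier tail.
    """
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n_sessions)]


def summarize(values, top_fraction=0.01):
    """Compare the average session, the typical session, and the top slice."""
    ordered = sorted(values, reverse=True)
    top_n = max(1, int(len(ordered) * top_fraction))
    total = sum(ordered)
    return {
        "mean": total / len(ordered),              # what a flat meter implicitly prices
        "median": statistics.median(ordered),      # what the typical session delivers
        "top_share": sum(ordered[:top_n]) / total, # value share of the top 1% of sessions
    }


summary = summarize(simulate_session_values(100_000, alpha=1.1))
print(summary)
```

Under these assumptions the mean lands well above the median, and a small slice of sessions carries a large share of total value. A flat per-token price cannot tell the slice from the rest.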
Meeting two
October 2025. The CFO’s turn.
“Fourteen million committed. What did we get?”
The CIO has numbers. Consumption up six hundred percent. Adoption at 81 percent.
“Adoption of what?”
Someone tells the treasury story. It is a good story. It is also the only one, it is eight months old, and everyone present has heard it twice.
So the CFO runs a measurement exercise. One sentence per function head: the work the platform now performs, the definition of done, the dollar value of done. Not a paragraph. A sentence.
Zero sentences are produced. Not from concealment. The sentence does not exist anywhere in the firm, because writing it stopped being anyone’s job around the same time pay detached from output. The firm had instrumented everything except the thing.
The renewal is cut to a third. The board’s conclusion: AI is overhyped. The board forms a task force.
The data supports a narrower conclusion. The firm bought an input and graded it like an outcome. That is a measurement error, and measurement errors are fixable. The task force is not assigned to it.
Meeting three
October 2026. A new line item. It sits in the COO’s budget, where contractors live, not the CIO’s.
The vendor is unremarkable. Twelve people. A one-page website. Zero robot handshakes. The founder previously worked at a firm like this one. Her last performance review listed “urgency” as a development area. Organizations at scale select against variance. She was variance. The selection worked.
The contract is the interesting artifact. No seats. No tokens. Capacity, priced per FTE-equivalent, against the labor budget. This scope, this quality bar, this date, this price. The slip clause points at her. Procurement spent two weeks deciding whether she was software or staffing. The answer was yes.
Read the contract as what it is: a variance transfer running the opposite direction from the platform’s. Outcome pricing moves the risk of a worthless token onto the party that controls where tokens point. That is the oldest move in contract theory. Allocate the risk to whoever can manage it.
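The transfer can be written down as a small ledger. This is a sketch under stated assumptions: the `Process` type, the flat cost of 100 per process, and the two pricing labels are hypothetical, chosen only to show which party carries the cost of work that misses the definition of done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    name: str
    cost: float       # hypothetical cost of attempting the work
    delivered: bool   # did it meet the definition of done?


def failure_cost(processes, pricing):
    """Return who absorbs the cost of failed processes under a pricing model."""
    wasted = sum(p.cost for p in processes if not p.delivered)
    if pricing == "input":
        # Seats and tokens are billed whether or not the work lands.
        return {"buyer": wasted, "vendor": 0.0}
    if pricing == "outcome":
        # Payment is tied to delivery, so a miss is the vendor's cost.
        return {"buyer": 0.0, "vendor": wasted}
    raise ValueError(f"unknown pricing model: {pricing}")


# Ten illustrative processes, two of which fail outright.
processes = [
    Process(f"process-{i}", cost=100.0, delivered=(i > 2)) for i in range(1, 11)
]

print(failure_cost(processes, "input"))
print(failure_cost(processes, "outcome"))
```

The wasted amount is identical in both cases. Only the name on the invoice changes.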
Her first deliverable is not software. It is a list of sentences. One per process: the work, the definition of done, the dollar value of done. Extraction takes three weeks and is the hardest part of the engagement. Harder than the integrations. Harder than the model. The firm is being asked to state what it wants, in writing, with a number attached, for the first time in living memory.
It is not smooth. By spring, two of the first ten processes fail outright. Under the platform, those failures would have been invisible, billed, and renewed. Under this contract they are her cost, itemized. The CFO finds this clarifying. For the first time, failure has an address.
By year end the budget has not shrunk. It has moved. IT line smaller, labor line larger. The margin note from 2024 finally has an answer, and the answer is a number.
Coda
The firm is not real. Every meeting in it is. I run a company on the vendor’s side of the third meeting. Discount accordingly.
The mechanics underneath are general.
Value per token is power-law distributed, and the exponent is set by incentives, not capability. Same model, different payoff functions, different distributions.
Input pricing charges the mean and delivers the typical. The spread is variance, and it sits entirely on the buyer until a contract moves it.
Enterprises cannot manufacture aligned incentives internally. Compensation detached from output is a structural feature, not a bug to patch. But they have purchased outcome definition from contractors for a century. Outsourcing is incentive alignment, bought retail.
So the spend does not cap when the audit comes. It re-prices. From inputs to outcomes. From IT budgets to labor budgets. From the party that cannot control variance to the party that can.
I made the casino version of this argument in “Stop Buying Chips.” This is the labor version.
The intelligence was never scarce. The definition of done was. The money flows to whoever writes the sentence.

Did it HAVE to be the marketing person that did the dumbest thing? But seriously, great post on AI maturity. It’s like the early days of electricity — when they electrified factories it didn’t change productivity because everything still ran off a central shaft. They needed to re-engineer the way the factory worked in order to derive benefit from electricity (spoiler alert — each machine got its own motor, not possible under steam power).