Forty Years. Three Metrics. None of Them Work.

A new word is making the rounds in Silicon Valley: tokenmaxxing.

It describes the practice of maximizing your AI token consumption – to prove you’re working. A token is the basic unit of inference: roughly a word or a fragment of one, and what an agent consumes whenever it reads, acts, or writes. For two years, it was an engineering detail. Today, it’s an HR metric.

The concrete examples aren’t hard to find. At Meta, in early April, an employee published an internal ranking of all 85,000 employees by token consumption. The top user alone burned 281 billion tokens – at least $1.4 million worth. Jensen Huang, Nvidia’s CEO, framed the logic plainly: “If a $500,000 engineer isn’t consuming at least $250,000 a year in tokens, I would be deeply worried.”
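To put those figures on a common scale, here is a back-of-the-envelope sketch. It uses only the numbers quoted above and assumes, purely for illustration, that the $1.4 million lower bound reflects a single blended price per token and that the same price would apply to Huang's $250,000. Real pricing varies by model and by input versus output tokens, so treat the results as orders of magnitude, not facts.

```python
# Figures quoted in the text above.
TOP_USER_TOKENS = 281_000_000_000
TOP_USER_COST_USD = 1_400_000      # quoted as "at least", so a lower bound
HUANG_TOKEN_BUDGET_USD = 250_000


def implied_price_per_million(tokens: int, cost_usd: float) -> float:
    """Blended price in USD per million tokens implied by a total bill."""
    return cost_usd / tokens * 1_000_000


def tokens_for_budget(budget_usd: float, price_per_million: float) -> float:
    """Tokens a budget buys at a given blended price (an assumption, not a quote)."""
    return budget_usd / price_per_million * 1_000_000


price = implied_price_per_million(TOP_USER_TOKENS, TOP_USER_COST_USD)
annual_tokens = tokens_for_budget(HUANG_TOKEN_BUDGET_USD, price)

print(f"Implied blended price: ${price:.2f} per million tokens")
print(f"$250,000 a year buys about {annual_tokens / 1e9:.1f} billion tokens at that rate")
```

Under those assumptions, the quoted bill works out to roughly $5 per million tokens, and Huang's threshold to something on the order of 50 billion tokens per engineer per year. Neither number says anything about what those tokens produced.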

When I read that, I recognized something.

Not an innovation. A trap. A trap I’ve watched close three times in forty years.

In the 1980s, we counted lines of code. The good developer was the one who produced the most. We quickly discovered that the best engineers often wrote less – because they solved problems with precision rather than volume.

In the 2000s, we counted man-days and day rates. Consulting firms and IT services companies thrived on that logic: bill time, not outcomes. The longer a project ran, the more it paid. The metric rewarded inefficiency.

Today, we count tokens.

Same mistake. Same mechanics. Same blind faith in the easy number.

Each time, the metric spreads fast because it’s measurable, not because it’s right. And each time, it ends up serving whoever sells it: yesterday the IT services firms billing time, today Nvidia and the AI labs billing tokens.

That’s not a coincidence. The metrics that dominate are rarely the ones that best measure value. They’re the ones that align with the economic interests of the most powerful actor in the chain.

Jensen Huang isn’t doing bad math. He’s asking the wrong question – or rather, he’s asking exactly the right question for Nvidia.

The technical problem is real, and it has a name: context rot.

An overloaded context window doesn’t produce better results. It produces a model that loses the thread. Too many files, autonomous sessions that drift, tools loaded unnecessarily – all of it degrades the coherence of the reasoning. The heavier the window, the more the AI unravels.

An agent consuming massive amounts of tokens can mean two very different things: dense, well-targeted work, or a drifting session that a human will spend hours cleaning up. Raw consumption doesn’t tell you which.
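A minimal sketch makes the gap visible. Everything in it is hypothetical: the two sessions, the token price, the hourly rate, and the choice of "accepted changes" as a stand-in for outcomes are illustrative assumptions, not measurements from any real team.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed prices, for illustration only.
PRICE_PER_MILLION_TOKENS_USD = 5.0
HUMAN_HOURLY_RATE_USD = 100.0


@dataclass
class Session:
    name: str
    tokens: int             # tokens the agent consumed
    accepted_changes: int   # outputs a reviewer actually kept (hypothetical outcome proxy)
    cleanup_hours: float    # human time spent repairing the result


def total_cost_usd(s: Session) -> float:
    """Token bill plus the human cleanup the session caused."""
    token_cost = s.tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD
    return token_cost + s.cleanup_hours * HUMAN_HOURLY_RATE_USD


def cost_per_accepted_change(s: Session) -> Optional[float]:
    """Cost per kept outcome; None when nothing was accepted."""
    if s.accepted_changes == 0:
        return None
    return total_cost_usd(s) / s.accepted_changes


# Two made-up sessions with identical raw consumption.
targeted = Session("targeted", tokens=20_000_000, accepted_changes=10, cleanup_hours=0.5)
drifting = Session("drifting", tokens=20_000_000, accepted_changes=2, cleanup_hours=6.0)

for s in (targeted, drifting):
    print(f"{s.name}: {s.tokens:,} tokens, ${cost_per_accepted_change(s):.2f} per accepted change")
```

On a token leaderboard, these two sessions tie. Once outcomes and cleanup time enter the calculation, the drifting one costs more than twenty times as much per result.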

Reid Hoffman, LinkedIn’s co-founder and a tokenmaxxing advocate, conceded as much himself: “it’s not a perfect example of productivity.”

Not a perfect example. A false one.

A manager who wants to measure productivity in the age of agents first needs to understand what a context is – how it’s built, how it degrades, and what separates a productive session from one that’s spinning in circles.

Without that understanding, they won’t measure what they think they’re measuring. They’ll pay for volume. And they’ll call it performance.

Forty years. Three metrics. None of them work.

The real question hasn’t changed: what was actually accomplished, and at what cost? Everything else is noise – billable noise.