GitHub just put a meter on Copilot.
Usage beyond a monthly threshold now triggers per-request charges. The announcement is specific to one product, but the signal is industry-wide: the flat AI subscription model and the real economics of inference are starting to collide.
Other vendors will follow. The only question is timing.
For a while, AI felt like SaaS. A predictable monthly line item, a known cost per seat, a subscription you could budget without thinking too hard. That framing was always wrong – it was just temporarily subsidized.
AI is not software. Software has near-zero marginal cost once built. You can add a million users to a SaaS product and the infrastructure cost barely moves.
AI is compute. Every prompt consumes it. Every agentic workflow consumes more. Every long context window, every repository scan, every multi-step reasoning task has a real infrastructure cost behind it – GPUs spinning, energy flowing, inference running. That cost doesn’t flatten with scale. In many cases, it accelerates.
What vendors sold as subscriptions were really inference budgets with a ceiling nobody disclosed. When usage grew past what the pricing model assumed, the ceiling had to move. That’s what metered pricing is: the moment the subsidy ends.
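To make the ceiling concrete, here is a minimal sketch of how a metered plan turns usage into a bill. The seat price, included allowance, and per-request rate are hypothetical numbers chosen for illustration, not any vendor's actual pricing.

```python
def monthly_bill_cents(requests_used, seat_cents, included_requests, overage_cents):
    """Flat seat price plus a per-request charge beyond the included allowance."""
    billable = max(0, requests_used - included_requests)
    return seat_cents + billable * overage_cents

# Hypothetical plan: $20 seat, 300 included requests, $0.05 per extra request.
PLAN = dict(seat_cents=2_000, included_requests=300, overage_cents=5)

light = monthly_bill_cents(200, **PLAN)    # stays under the ceiling
heavy = monthly_bill_cents(1_500, **PLAN)  # 1,200 requests over the ceiling

print(f"light user: ${light / 100:.2f}")
print(f"heavy user: ${heavy / 100:.2f}")
```

Under these assumed numbers the light user still pays the flat price, while the heavy user's bill is four times larger. Amounts are kept in integer cents to avoid floating-point rounding.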
This has happened before, in every infrastructure category that went through a similar transition.
Cloud computing started with the promise of cheap, simple pricing. Then came reserved instances, spot pricing, egress fees, tiered storage classes, per-request charges on managed services. The economics of compute always reassert themselves eventually. The initial pricing model exists to drive adoption. The sustainable model reflects actual costs.
AI is following the same arc, just faster.
The difference with AI is the variance. One user browsing a SaaS dashboard costs the vendor roughly the same as any other. A developer running an agent that scans an entire codebase, generates tests, reviews pull requests, and iterates on architecture can cost orders of magnitude more to serve than one who fires off a few completions. Flat pricing can’t survive that variance at scale.
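A rough sketch of that variance, seen from the vendor's side. Every figure here is an assumption for illustration: the token rate, the request counts, and the tokens per request are invented, and real serving costs depend on the model and hardware.

```python
def serving_cost_cents(requests, tokens_per_request, cents_per_million_tokens):
    """Approximate inference cost: total tokens times a per-token rate."""
    total_tokens = requests * tokens_per_request
    return total_tokens * cents_per_million_tokens // 1_000_000

SEAT_CENTS = 2_000  # hypothetical flat price: $20 per seat per month
RATE = 200          # hypothetical serving cost: $2 per million tokens

# A few short completions versus an agent working over a whole repository.
completions_user = serving_cost_cents(200, 2_000, RATE)
agent_user = serving_cost_cents(5_000, 40_000, RATE)

for label, cost in [("completions", completions_user), ("agent", agent_user)]:
    margin = SEAT_CENTS - cost
    print(f"{label}: cost ${cost / 100:.2f}, margin ${margin / 100:.2f}")
```

With these assumed inputs, the same seat price yields a small profit on one user and a loss of several hundred dollars on the other.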
So the strategic question shifts.
It’s no longer “which AI tool should we buy.” That question assumes AI is a product. It isn’t – it’s infrastructure with a usage curve.
The real question becomes: which workloads should run on external platforms, and which ones should we own?
That distinction matters because not all workloads are equal. Some are exploratory, low-volume, variable – good candidates for metered external services where you pay for what you use and carry no fixed cost. Others are repetitive, high-volume, sensitive, or operationally critical. For those, every inference you send to an external platform is a unit economics decision you’re making by default, usually without realizing it.
This is where private AI stops being a slogan and becomes an architecture decision.
Not because everything should run locally. On-premise AI has real costs: hardware, maintenance, model management, infrastructure expertise. Those aren’t trivial.
But for the right workloads, private AI converts AI cost from a variable Opex exposure – a bill that grows with usage and is controlled by someone else’s pricing team – into a controllable Capex decision. You buy the infrastructure. You set the capacity. You own the unit economics.
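The trade-off can be sketched as a break-even calculation. The hardware price, amortization period, operating cost, and metered rate below are all hypothetical, and the sketch assumes the marginal cost of a request on owned hardware is negligible, which will not hold for every deployment.

```python
import math

def owned_monthly_cents(hardware_cents, amortization_months, ops_cents_per_month):
    """Fixed monthly cost of owned capacity: amortized hardware plus operations."""
    return hardware_cents // amortization_months + ops_cents_per_month

def breakeven_requests(owned_cents_per_month, metered_cents_per_request):
    """Monthly request volume at which the metered bill reaches the owned cost."""
    return math.ceil(owned_cents_per_month / metered_cents_per_request)

# Hypothetical inputs: $108,000 of hardware over 36 months, $2,000/month to run,
# against an external service charging $0.01 per request.
owned = owned_monthly_cents(10_800_000, 36, 200_000)
threshold = breakeven_requests(owned, 1)

print(f"owned capacity: ${owned / 100:,.2f} per month")
print(f"break-even volume: {threshold:,} requests per month")
```

Below the break-even volume the metered service is cheaper. Above it, every additional request widens the gap in favor of owned capacity, as long as that capacity is not exhausted.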
You also own the data. For sensitive workloads, that’s not a privacy checkbox. It’s a compounding strategic asset. Every inference that stays inside your walls leaves its prompts, outputs, and corrections where your own systems can learn from them. That context doesn’t transfer to your competitors. It doesn’t appear in someone else’s training data. It accumulates where you can use it.
The companies that figure this out early won’t pick a side between cloud AI and private AI. That’s the wrong frame. They’ll build a tiered architecture: external platforms for exploratory, variable, or commodity workloads; owned infrastructure for the workloads where volume, sensitivity, or criticality make ownership the rational choice.
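One way to picture the tiering rule is as a small placement function. This is a sketch under stated assumptions: the `Workload` fields, the example workloads, and the volume threshold are all hypothetical, and a real decision would weigh cost, latency, and compliance in more detail.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Workload:
    name: str
    monthly_requests: int
    sensitive: bool = False  # handles data that should not leave the organization
    critical: bool = False   # operations depend on it being available

def placement(workload, volume_threshold=500_000):
    """Return 'owned' when volume, sensitivity, or criticality justify ownership."""
    if workload.sensitive or workload.critical:
        return "owned"
    if workload.monthly_requests >= volume_threshold:
        return "owned"
    return "external"

# Hypothetical portfolio of workloads.
portfolio = [
    Workload("prototype chatbot", 20_000),
    Workload("contract review", 50_000, sensitive=True),
    Workload("nightly code scan", 2_000_000),
]

for w in portfolio:
    print(f"{w.name}: {placement(w)}")
```

The threshold is the one number worth deriving rather than guessing. It should come from a break-even estimate for the organization's own costs.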
That’s not an exotic decision. It’s the same build-versus-buy logic that has governed enterprise infrastructure for fifty years. AI just makes the stakes higher and the variance larger.
The ones who don’t make this call deliberately will still make it – by default, at quarter end, when the invoice arrives and someone has to explain why the AI budget is three times what was forecast.
GitHub’s meter is a small technical change to one product’s billing model.
What it signals is that the inference economy is maturing, and the comfortable pricing assumptions of the adoption phase are ending.
Build the architecture before the bill does it for you.