Usage-Based Billing for AI Tokens: The Drug Dealer Playbook – Targeting Enterprises

I’ve been saying this shift was inevitable for months. As AI moved from simple chat completions to full agentic workflows—long-running sessions, repo-wide reasoning, multi-step coding, massive contexts—the old flat subscription models became unsustainable for the providers. They got us hooked on the productivity gains at low, predictable prices. Now the meter is running, and the real cost is landing on users and businesses.

This isn’t just “aligning pricing with usage.” It’s classic enshittification: subsidize aggressively to drive adoption and dependency, then flip to usage-based billing (UBB) once the habit is formed. The result? Tokens under pure UBB or Enterprise plans are dramatically more expensive for serious use than the personal subscription tiers. It’s price gouging dressed up as “flexibility,” and it gatekeeps advanced AI behind corporate budgets. That stifles exactly the broad experimentation and innovation that made these tools explosive in the first place.

GitHub Copilot’s June 1, 2026 UBB Transition: Many Businesses Are Now Feeling It

GitHub announced in late April that all Copilot plans would move to usage-based billing effective June 1, 2026. Premium Request Units (PRUs) are gone. In their place: GitHub AI Credits (1 credit = $0.01).

  • Copilot Pro: $10/month includes 1,000 credits
  • Copilot Pro+: $39/month includes 3,900 credits
  • Business: $19/user/month includes 1,900 credits (pooled)
  • Enterprise: $39/user/month includes 3,900 credits (pooled)

Base seat prices stayed the same. Code completions and Next Edit Suggestions remain unlimited and free. Everything else—chat, agentic features, reviews—now consumes credits based on actual tokens (input + output + cached) at rates that track the underlying model APIs (e.g., Claude Sonnet-class output often ~1,500 credits per million tokens / $15).
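To make the credit arithmetic concrete, here is a minimal sketch of the token-to-credit conversion. It assumes the Sonnet-class rates quoted in this article (about $3 and $15 per million input and output tokens) and ignores cached tokens, since I don't have a rate for those. The function name and the example token counts are hypothetical, and GitHub's real metering may differ.

```python
CREDIT_USD = 0.01  # 1 GitHub AI Credit = $0.01

def credits_for_request(input_tokens, output_tokens,
                        input_usd_per_m=3.0, output_usd_per_m=15.0):
    """Estimate credits burned by one request (hypothetical helper).

    Rates are USD per million tokens; the defaults are the Sonnet-class
    figures assumed in this article, not an official GitHub rate card.
    """
    usd = (input_tokens / 1e6) * input_usd_per_m \
        + (output_tokens / 1e6) * output_usd_per_m
    return usd / CREDIT_USD

# Hypothetical agent turn: 200k tokens of context in, 8k tokens out.
print(round(credits_for_request(200_000, 8_000), 1))
```

Under these assumptions the large input context does most of the damage on a turn like this, not the short answer that comes back.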

The “before” vs “now” reality for many organizations: Previously, you paid a flat per-seat fee and got a quota of requests that felt reasonably generous for mixed use. Heavy but not insane agentic work often stayed within limits or had softer fallbacks. Now the included credits are a small bundle (often just a few days of serious use for power users or teams), after which you pay full freight or hit admin-set budgets/throttles.

Developers and admins are reporting exactly what I expected: normal interactions burning 10–100+ credits, complex agent sessions or long-context work torching hundreds to thousands in one go. One widely shared example showed a single meaningful request consuming 822 credits. Reddit megathreads and Hacker News threads are full of “this feels like a price increase in disguise,” “you get less for the same money,” “credits vanish in a day of real work,” and people canceling or moving to direct API + tools like Cursor to escape the middleman markup and unpredictability.

Many businesses that were happily on flat Business/Enterprise seats are now effectively on UBB—either buying extra credits or restricting usage. The “sustainable for all users” framing from GitHub doesn’t change the math for heavy/agentic workloads that are becoming the norm.
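A rough way to see how quickly an included bundle runs out is to divide it by a typical request cost. The sketch below uses the plan allowances listed above and treats each seat's credits as if they were per-user. That is a simplification, since Business and Enterprise credits are pooled. The helper names are mine, not GitHub's.

```python
# Included monthly credits per plan, as listed in this article.
INCLUDED_CREDITS = {"Pro": 1_000, "Pro+": 3_900, "Business": 1_900, "Enterprise": 3_900}
CREDITS_PER_USD = 100  # 1 credit = $0.01

def requests_covered(plan, credits_per_request):
    """How many requests of a given size fit inside the included bundle."""
    return INCLUDED_CREDITS[plan] // credits_per_request

def overage_usd(plan, credits_used):
    """Dollars owed beyond the bundle, assuming overage is billed at face value."""
    extra = max(0, credits_used - INCLUDED_CREDITS[plan])
    return extra / CREDITS_PER_USD

# The widely shared 822-credit request, measured against a Business seat.
print(requests_covered("Business", 822))
print(overage_usd("Business", 5_000))
```

At 822 credits a pop, a Business seat's bundle covers two such requests before the meter starts charging.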

Claude: Personal Max Subscriptions Are Way Cheaper Than Enterprise UBB — Here’s the Math

This is the clearest illustration of the problem.

Personal plans (claude.ai / apps):

  • Pro: ~$20/month ($17 annual)
  • Max 5x: $100/month
  • Max 20x: $200/month

These deliver high (or very high) usage limits, priority access, advanced features, and Claude Code. On the platform itself you also get favorable caching treatment in many cases. Cost is capped and predictable even if you push it hard. Real users (including heavy devs) report that Max 5x often covers aggressive daily/weekly use without hitting walls, and the effective per-token cost is subsidized by the subscription model. I am a Max user, and this is what I use for my personal projects. It is predictable and fair, but I will openly admit it feels VERY VERY cheap for what I am getting. The value is insane.

Enterprise / Team plans:

  • Seat fee (roughly $20–30+/user/month or custom, often annual) plus every token metered at full Anthropic API rates on top.
  • No big included usage bundle in most cases. You pay the API rates (plus any cache write costs) for chat, Claude Code, agents, etc.

API rates (Sonnet-class, current): ~$3 / $15 per million input/output tokens (Opus higher at $5 / $25; Haiku much cheaper). Cache writes add extra. Prompt caching and batch help, but the meter still runs on serious volume.

Realistic comparison (Sonnet-class equivalent, approximate June 2026):

| Usage Scenario | Est. Monthly Tokens (in / out) | Personal Max Cost (flat) | Enterprise Est. Cost (seat + tokens) | Personal Is ~X Cheaper |
| --- | --- | --- | --- | --- |
| Light (casual/pro) | ~5M / 1M | $20 (Pro) or $100 | ~$25 + ~$30 = ~$55 | 2x+ |
| Medium (power user) | ~30M / 8M | $100 (Max 5x) | ~$25 + ~$210 = ~$235 | ~2.3x |
| Heavy (agentic/dev work) | ~150M / 40M | $200 (Max 20x) | ~$25 + ~$1,050 = ~$1,075 | ~5x |
| Extreme (high-volume/team) | 500M+ / 100M+ | $200 cap | ~$25 + $3,500+ | 15–20x+ |

These aren’t theoretical. Users have posted concrete examples of burst/heavy months where API-equivalent cost hit $6k+ while their Max subscription stayed in the low hundreds. The personal Max tiers effectively cap your exposure. Enterprise UBB does the opposite—it scales linearly with every token the agents and long contexts consume.

“Before” context: Early agentic experiments and lighter workloads could sometimes ride on more generous included usage or experimental pricing. As capabilities (and token burn) exploded, providers standardized on metered models that maximize revenue from the heaviest users.

(The chart above is illustrative of the divergence. Personal Max stays relatively flat thanks to high limits and the subscription model. Enterprise UBB rises sharply with volume because every token is charged at full API rates. Real numbers vary with caching, model choice, prompt efficiency, and exact seat pricing.)
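The Medium and Heavy rows of the comparison can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as the table: a $25 seat, Sonnet-class rates of $3 and $15 per million input and output tokens, and no caching or batch discounts. The function name is hypothetical.

```python
def metered_cost(input_m, output_m, seat_usd=25.0, in_rate=3.0, out_rate=15.0):
    """Seat fee plus tokens at API rates. Token volumes are in millions."""
    return seat_usd + input_m * in_rate + output_m * out_rate

# (input tokens in millions, output tokens in millions, flat personal price in USD)
scenarios = {
    "Medium": (30, 8, 100),
    "Heavy": (150, 40, 200),
}

for name, (inp, out, flat) in scenarios.items():
    cost = metered_cost(inp, out)
    print(f"{name}: metered ${cost:,.0f} vs flat ${flat} ({cost / flat:.2f}x)")
```

This gives $235 and $1,075 for the two scenarios. The flat price never moves, while the metered side grows in a straight line with every extra million tokens.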

This Is Gatekeeping Innovation

Yes, inference hardware and energy have real costs. But efficiency is improving fast, and the pricing shift isn’t primarily about passing through savings—it’s about capturing more of the value now that users are locked in and workloads are exploding with agents.

The practical effect: serious, iterative, agentic AI work becomes reliably affordable only for companies with big budgets who can either negotiate custom deals or absorb surprise bills. Individuals, indie hackers, small teams, researchers, and startups get priced out or forced into constant cost anxiety. That directly reduces the surface area for experimentation and unexpected breakthroughs. The same tools that were democratizing coding and reasoning a year ago are now quietly recentralizing power.

It’s the worst kind of rent-seeking: hook the ecosystem on accessible tools, then flip the switch once dependency is high.

Louis Rossmann Called It (and So Did I)

YouTuber Louis Rossmann recently lost it on Anthropic in a video titled “Anthropic is worse than my high school’s drug dealer.” He highlighted sneaky billing gotchas (extra charges triggered by certain files without clear notice), testing users’ willingness to pay more by restricting previously included capabilities on base plans, confusing docs/UI, and rapid destruction of goodwill. His analogies landed exactly on the drug-dealer / predatory-ISP vibe: get people comfortable and productive, then start metering and testing how much more you can extract. I watched that one and was yelling at the screen because it was exactly what my article back on May 5 was saying.

The Copilot move follows the same pattern. Widespread community reaction (Reddit, HN, etc.) echoes it: “Silicon Valley playbook—subsidize to dominate, then raise prices once everyone’s dependent.” “Bait and switch.” “Unpredictable costs that kill the joy of just building.”

I’m not shocked. I’m annoyed because it was predictable, and because the execution prioritizes short-term extraction over long-term ecosystem health.

Bottom Line & What You Can Actually Do

Usage-based billing for frontier AI tokens is here for real workloads on the major platforms. Personal subscription tiers (especially Claude Max) remain the best value for heavy individual or small-team use because they cap the damage, at least for now. I promise you changes are coming. Once you’re pushed into pure Enterprise UBB or post-included-credit Copilot territory, the economics flip hard against you.

Practical steps:

  • Audit your actual usage now (GitHub has preview tools; Claude usage dashboards exist).
  • Optimize ruthlessly: right-size models (Haiku for simple stuff), aggressive caching, shorter focused contexts, prompt discipline. Avoid Opus for coding!
  • For individuals/power users: Max-tier personal subs or direct API + lean tools are often cheaper than Enterprise.
  • For teams: Set hard budgets/spend alerts immediately. Negotiate. Consider hybrid (personal subs where possible + governed Enterprise).
  • Long-term: Watch open/local models and alternative platforms. The current model creates a massive incentive for better options.
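For the budget and spend-alert step, here is a minimal sketch of a guard you could put in front of your own API calls. It is not a feature of GitHub or Anthropic. The class name, thresholds, and dollar amounts are all hypothetical, and you would feed it cost estimates from your own usage logs.

```python
from dataclasses import dataclass, field

@dataclass
class BudgetGuard:
    """Tracks monthly spend, fires each alert threshold once, enforces a hard cap."""
    monthly_budget_usd: float
    alert_fractions: tuple = (0.5, 0.8, 1.0)
    spent_usd: float = 0.0
    fired: set = field(default_factory=set)

    def record(self, usd):
        """Add spend and return the alert fractions newly crossed by it."""
        self.spent_usd += usd
        crossed = [f for f in self.alert_fractions
                   if f not in self.fired
                   and self.spent_usd >= f * self.monthly_budget_usd]
        self.fired.update(crossed)
        return crossed

    def allow(self, estimated_usd):
        """Hard cap: refuse a request that would push spend over budget."""
        return self.spent_usd + estimated_usd <= self.monthly_budget_usd

guard = BudgetGuard(monthly_budget_usd=500.0)
print(guard.record(260.0))   # crosses the 50% threshold
print(guard.allow(300.0))    # 260 + 300 would exceed the 500 budget
```

Firing each threshold only once keeps the alerts from turning into noise. The `allow` check is the part that actually stops a runaway agent session.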

I saw this coming when agentic usage started eating compute. The “drug dealer” phase—get everyone hooked on the cheap high—worked exactly as designed. Now we’re in the “pay up or slow down” phase.

This approach might maximize revenue for a few quarters. It will not maximize innovation across the board. And that’s the part that actually pisses me off.

If you’re running real workloads on these tools, track your numbers closely this month. The difference between “still feels like a good deal” and “what the hell just happened to my bill” is smaller than most people realize—until it isn’t.

I just don’t want the fun to be over.
