Anthropic's Fast Mode: New AI Pricing Strategies Explained

Welcome back to Good Better Best.

Each week, we break down real pricing, packaging, and product moves from SaaS and AI leaders and share the ideas worth stealing.

This week, we’re breaking down Anthropic’s new Fast Mode feature, which will almost certainly have ripple effects throughout top SaaS and AI players in the coming months.

Before we get there, two hiring opportunities across the PricingSaaS ecosystem:

Teneo is actively hiring for its Software Monetization team across all levels in NYC, Boston, Chicago, Washington DC, and San Francisco.
Willingness to Pay is actively hiring for a Senior Advisor based in the US to support American client work.

If you’re interested in either role, shoot me a note at [email protected], and I’ll put you in touch.

🔌 PricingSaaS Partners power the next era of SaaS pricing

This Week in Pricing, Packaging, and Product

This week we observed 150+ changes. The highlights:

Aftershoot launched Galleries with 100GB free storage | Diff Link
Amplitude launched Zoning Insights with AI recommendations | Diff Link
Linear expanded Triage with Diffs and beta Agent limits | Diff Link
Squarespace slashed prices up to 30% across all subscription tiers | Diff Link
Anthropic API launches Opus 4.8 with new fast mode | Diff Link
LogRocket added a Surveys feature | Diff Link
Hugging Face upgraded free GPUs | Diff Link
Veed raised the Pro plan price to $24/month | Diff Link
Similarweb rebranded pricing tiers from Packages to Plans | Diff Link
GitHub cut 50 monthly chat requests from the Copilot free tier | Diff Link

Check out more updates on PricingSaaS →

PricingSaaS Pulse Intelligence

Here’s what was top of mind in Pulse this week:

🔥 Hot Companies

Asana — 35 lookups
Notion — 35 lookups
Hootsuite — 29 lookups
Linear — 26 lookups
Airtable — 25 lookups

🚨 Hot Topics

How to choose between per-seat and usage-based pricing for AI features
Validate price with customer conversations — common mistakes
How to ask customers what they're willing to pay (interview techniques)
Mid-market tier pricing strategy for SMB
How to decide whether to bundle AI features into existing plans or price separately as an add-on

Start Using PricingSaaS Pulse for free

Fast Mode: Anthropic’s New Pricing Lever

When Anthropic launched Opus 4.8 last week, the model was the main headline. But the interesting story isn't the model — it's that they used the launch to quietly cut Fast Mode pricing by two-thirds and start migrating customers off the original version.

The updated pricing and new real estate on the pricing page suggest they’ve worked out the kinks, and are ready to bring Fast Mode to a wider audience.

What Fast Mode actually is

Fast Mode is the first time Claude has decoupled speed from intelligence as a separately priced product. It's the same Claude model, same intelligence, same answers — the model just produces its response 2.5x faster. You turn it on by flipping a setting when you make a request. The speed improvement is specifically in how quickly Claude writes out its answer, not in how long it takes to start replying.

That distinction matters. Fast Mode is built for things where Claude is doing a lot of writing — code generation, agents that take multiple steps, long analytical answers. Anthropic's own example in the docs is "Refactor this module to use dependency injection." They're telling you exactly who this is for: AI coding tools and anyone running agents that work for a long time.

How Fast Mode is Priced

This is where it gets interesting. Fast Mode is priced as a straight per-model multiplier on standard rates:

Model	Standard Input/Output	Fast Mode Input/Output	Multiplier
Claude Opus 4.8	$5 / $25 per MTok	$10 / $50 per MTok	2x
Claude Opus 4.7	$5 / $25 per MTok	$30 / $150 per MTok	6x
Claude Opus 4.6	$5 / $25 per MTok	$30 / $150 per MTok	6x (deprecated)

Two things jump off this table.

Fast Mode is dramatically cheaper on the newest model. If you were already paying the 6x premium on Opus 4.7, moving to 4.8 cuts your speed bill by two-thirds. Fast Mode on Opus 4.6 is being shut down entirely in about 30 days — at which point those requests will quietly drop back to normal speed. It’s clear they’re using Fast Mode to drive adoption to the newest model.
Paying 2x for 2.5x faster is actually a good deal if speed is your bottleneck. You're getting 25% more output per dollar of spend. That math only works if your business is throughput-constrained — which is exactly the customer Anthropic is going after.

Fast mode also stacks with other multipliers, a detail that’s easy to miss. Anthropic has been quietly building a multi-layered pricing system, and Fast Mode is the newest layer. There are a few others:

Prompt caching has its own multipliers on the input side.
Data residency adds another 1.1x to everything.
Tool use has fixed per-call token overheads (290 to 804 tokens depending on the model and tool choice).
Batch API cuts everything in half — but is explicitly not compatible with Fast Mode.

So a "real" production cost on Opus 4.8 isn't $5/$25. It's whatever you get after you multiply through your speed choice, your cache strategy, your residency requirement, and your tool overhead.

Example: A 1-hour cache write input on Opus 4.8 with Fast Mode and US-only residency is $5 × 2 (fast) × 2 (1hr cache write) × 1.1 (US) = $22/MTok input (versus $5 on the base sticker). That's a 4.4x effective multiplier, and it's all conditional on choices the customer makes per request.

Anthropic has essentially built a configurable pricing surface where the buyer co-designs their own SKU at runtime. That's a lot of optionality, and a lot of room for surprise bills.

Why Anthropic launched this now

Three reasons, all of which feel obvious in hindsight.

They need a way to charge more without raising the headline price. Claude's top-tier model has gotten a lot cheaper over the past year. Opus 4.1 used to cost $15 in and $75 out per million tokens. Opus 4.5, 4.6, 4.7, and now 4.8 are all sitting at $5 and $25 — a 3x price cut while the model has gotten better. Standard pricing is in a race to the bottom. Fast Mode lets Anthropic charge more from the customers who will pay for it (mostly coding tools and agent companies) without backing off their cheap headline number.

They're defending against speed-first competitors. Specialty chip companies like Cerebras and Groq have been winning on raw speed benchmarks for a while, running open-source models at jaw-dropping throughput. Anthropic doesn't want to lose the customers who care about both intelligence and speed. Fast Mode is their way of offering both.

It's a pricing experiment in disguise. Fast Mode is still in research preview. You don't just toggle it on — you either contact your account manager or join the waitlist. And it's only available if you go directly through Anthropic. The Research Preview tag lets Anthropic ship a real, billable product without committing to long-term pricing or service guarantees. It also lets them control supply, see who's willing to pay, and figure out the actual margins before they open the gates.

What this means for SaaS companies building on Claude

If you're a SaaS company that resells Claude inference, Fast Mode just added a new strategic decision to your stack:

1. It's a new tier you can charge for. "Fast" is the cleanest possible upsell — it's a feature your end user can feel without you having to explain anything technical. Expect to see "Fast" or "Turbo" toggles show up across the agentic coding category within the quarter.

2. It's a margin compression risk if you don't pass it through. Any SaaS company that bundles Claude into a flat monthly subscription just inherited a new pricing problem. A heavy user who burns Fast Mode all day will cost you twice as much to serve as a normal-speed user. You either need to cap usage, give the user a button they consciously have to flip on, or — most likely — charge extra for it.

3. It deepens lock-in to Anthropic's first-party API. Fast Mode doesn't work on AWS Bedrock or Google's Vertex. If you were routing through one of those clouds for compliance or contract reasons, Fast Mode is the first feature that gives you a real reason to bypass them. That's deliberate on Anthropic's part — but it's a real headache for enterprise customers who picked their cloud vendor for reasons that had nothing to do with Claude.

The takeaway

Fast Mode is a small change for now, with big implications downstream. Anthropic just turned speed into a billable axis — gated by an account manager today, but obviously headed for general availability.

For SaaS companies building on Claude, the next product decision isn't which model you use. It's which combination of model, speed, cache, residency, and tooling makes sense per workload — and how much of that decision you let your customer make.

Share the newsletter

Thanks for reading! If you’re working on AI monetization and want to learn more about how we help, book time here.

Until next time,

Rob

Fast Mode: Anthropic’s New Pricing Lever

🔌 PricingSaaS Partners power the next era of SaaS pricing

This Week in Pricing, Packaging, and Product

PricingSaaS Pulse Intelligence

Fast Mode: Anthropic’s New Pricing Lever

What Fast Mode actually is

How Fast Mode is Priced

Why Anthropic launched this now

What this means for SaaS companies building on Claude

The takeaway

Reply

Keep Reading

GoodBetterBest