Opus 4.8 vs Sonnet 5 vs Haiku 4.5: Which Claude Model to Use (2026)

A practical guide to choosing between Claude Opus 4.8, Sonnet 5, and Haiku 4.5 for professional work in 2026, plus where Fable 5 fits, with profession-specific recommendations for the most common workflows.

May 20, 2026 · 10 min read

TL;DR. Sonnet 5 (launched June 30, 2026) is now the default Claude model and the right choice for most professional work — it matches Opus 4.8 on most knowledge-work tasks at lower cost. Pick Opus 4.8 for the hardest code and long agentic chains, Haiku 4.5 for speed and scale, and Fable 5 (restored July 1) when you want the most capable generally available model and your plan's Fable usage covers it.

Anthropic's generally available lineup as of July 2026 is Sonnet 5 (the new default), Opus 4.8, Haiku 4.5, and Fable 5, a Mythos-class model that sits above the Opus tier and was restored on July 1 after a 19-day government-ordered suspension (see our Claude Fable 5 explainer for the restoration story and its credits-based billing). The three workhorse tiers share a common feature surface (text and image input, vision, tool use, Skills, Projects) but differ meaningfully on capability, speed, cost, and, in Haiku's case, context window. Picking the right one for the right task is still the highest-leverage tooling decision most professionals make in 2026, but Sonnet 5 just made the default answer much better.

This guide translates those differences into task-by-task recommendations. Specs cited come from Anthropic's model documentation as of July 2026.

The three workhorse models at a glance

| | Opus 4.8 | Sonnet 5 | Haiku 4.5 |
| --- | --- | --- | --- |
| Position | Strongest on hard code + agentic chains | The default; matches Opus on most knowledge work | Fastest; near-frontier intelligence |
| API ID | `claude-opus-4-8` | `claude-sonnet-5` | `claude-haiku-4-5-20251001` |
| Context window | 1M tokens | 1M tokens | 200K tokens |
| Max output | 128K tokens | 128K tokens | 64K tokens |
| Pricing (API) | $5/M input, $25/M output | $2/$10 intro through Aug 31, 2026; then $3/$15 | $1/M input, $5/M output |
| Latency | Moderate | Fast | Fastest |
| Thinking | Adaptive, effort-controlled | Adaptive, effort-controlled (no manual extended-thinking mode) | Extended thinking (manual) |
| Reliable knowledge cutoff | Jan 2026 | Jan 2026 | Feb 2025 |

The big structural change with Sonnet 5: it adopts the same effort-controlled adaptive thinking as Opus 4.8 (low / medium / high / xhigh / max, defaulting to high). The old Sonnet 4.6-era "Extended vs Adaptive thinking" distinction is gone — manual extended thinking isn't supported on Sonnet 5 at all.

The single most important question: are you on claude.ai or the API?

If you're on claude.ai (the consumer/team chat product), Sonnet 5 has been the default for Free and Pro users since July 1, 2026 — and for the first time, the default is genuinely the right model for most professional work. Pro ($17/mo annual or $20/mo monthly) and above get the full picker. Switch freely as the task changes — there's no per-token cost to you, just usage limits per plan.

If you're using the API (or Claude Code, which uses the API tier you're on), each request is billed by token. Sonnet 5's introductory pricing ($2/$10 through August 31, 2026) makes it the obvious default there too, with one caveat: its updated tokenizer produces 1.0–1.35x as many tokens for the same content as Sonnet 4.6 did, so real-world costs can run up to 35% higher than the headline rate suggests.
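To make the tokenizer caveat concrete, here is a minimal sketch of the arithmetic. The prices, the August 31 cutoff, and the 1.0–1.35x range come from this guide; the function name and the idea of measuring a workload in "Sonnet 4.6 tokens" are illustrative assumptions, not an Anthropic tool.

```python
from datetime import date

# Sonnet 5 API prices in USD per million tokens, as quoted in this guide.
INTRO_PRICE = {"input": 2.00, "output": 10.00}      # through Aug 31, 2026
STANDARD_PRICE = {"input": 3.00, "output": 15.00}   # from Sep 1, 2026
INTRO_ENDS = date(2026, 8, 31)


def sonnet5_cost(input_tokens_46, output_tokens_46, on, tokenizer_factor=1.35):
    """Estimate Sonnet 5 cost in USD for a workload measured in Sonnet 4.6 tokens.

    tokenizer_factor is how many Sonnet 5 tokens the same content becomes
    (1.0 to 1.35 per this guide); the default 1.35 is the pessimistic end.
    """
    if not 1.0 <= tokenizer_factor <= 1.35:
        raise ValueError("tokenizer_factor outside the 1.0-1.35 range quoted here")
    price = INTRO_PRICE if on <= INTRO_ENDS else STANDARD_PRICE
    scaled_in = input_tokens_46 * tokenizer_factor
    scaled_out = output_tokens_46 * tokenizer_factor
    return (scaled_in * price["input"] + scaled_out * price["output"]) / 1_000_000
```

Under these assumptions, a job that was 1M input and 200K output tokens on Sonnet 4.6 costs $4.00 at the intro rate if the token count is unchanged and $5.40 at the 1.35x end of the range.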

If you just want one default: leave it on Sonnet 5. The rest of this guide is about the exceptions.

Pick Sonnet 5 for: almost everything (it's the default for a reason)

Sonnet 5 collapsed most of the old "when to pay for Opus" calculus. On real-world knowledge-work benchmarks it now edges out Opus 4.8 (GDPval-AA v2: 1,618 vs 1,615), and it leads on computer-use tasks (OSWorld-Verified: 81.2%). In practice that means:

Client / borrower / customer communication — the daily back-and-forth, at speed
Long-form structured documents — memos, disclosure drafts, PRDs, case analyses. This used to be Opus territory; Sonnet 5 handles most of it at or above Opus quality
Long-document analysis — full case files, medical histories, board packages, policy documents. 1M context, same as Opus
Multi-step workflows — Sonnet 5's agentic improvements are the headline of the release; most chained workflows no longer need Opus
Iteration and polishing — fast enough to think alongside you

Profession-specific Sonnet 5 wins

Loan officers running the four-audience pipeline update workflow daily across 8+ active loans
Real estate agents generating listing descriptions, client emails, CMAs, market updates
Attorneys and paralegals drafting memos and summaries from full source documents — verify the hardest analyses on Opus if the stakes demand it
Copywriters and community managers doing iteration-heavy, voice-sensitive work
Most healthcare clinicians drafting SOAP notes, treatment plans, patient education
Management consultants synthesizing meeting notes into strategy memos
AI product managers structuring specs and rollout plans

Pick Opus 4.8 for: the hardest code and the longest agentic chains

Opus 4.8 is no longer the automatic "quality matters" choice; Sonnet 5 took most of that ground. Where Opus still clearly earns its premium (2.5x Sonnet 5's intro rate, about 1.7x its standard rate):

Complex software engineering — Opus 4.8 leads Sonnet 5 meaningfully on SWE-bench Pro (69.2% vs 63.2%). For hard multi-file work in Claude Code, Opus is still the model
The longest multi-step agentic workflows — compliance automation, data-extraction pipelines, anything that chains dozens of steps where a single early error compounds
Second opinions on high-stakes deliverables — when the document goes to a regulator, court, or board and you want the strongest classic-tier model to draft or check it

Two practical notes. First, Fast mode costs $10/M input and $50/M output — much cheaper than it was on Opus 4.6/4.7 — so Opus depth with low latency carries a smaller premium. Second, the effort parameter defaults to high; set it lower for faster, cheaper responses on simpler work.

Profession-specific Opus 4.8 wins

Data scientists and technical operators running hard multi-step pipelines in Claude Code
Healthcare compliance officers running QSR gap audits across full DHF documents where miss-cost is extreme
AI compliance officers producing pre-legal regulatory screens across EU AI Act tiers + Annex III + US state overlays
ESG sustainability analysts mapping KPIs to multiple frameworks from a 200-page sustainability report

Pick Haiku 4.5 for: speed and scale

Haiku 4.5 is the fastest current Claude model and the cheapest. It's also more capable than people expect: "near-frontier intelligence" per Anthropic's framing. Where it earns its place:

High-volume classification or extraction tasks — running over thousands of inputs per day
Real-time chat surfaces — when the response needs to feel instantaneous (in-product chatbots, customer support assistants)
Background AI features inside production applications — the AI that runs invisibly behind a feature, where latency directly affects user experience
Cost-sensitive workflows where Sonnet's depth isn't worth the price — at $1/$5, Haiku is the cheapest current model even against Sonnet 5's intro pricing

Haiku's constraints to know:

200K context window, not 1M. Long documents need chunking
Manual extended thinking only — no effort-controlled adaptive thinking
Reliable knowledge cutoff is Feb 2025, older than Sonnet 5 and Opus (both Jan 2026). For questions about events after February 2025, Haiku may have stale or missing knowledge
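The chunking that the 200K window forces can be sketched roughly as follows. This is an illustration under stated assumptions, not a production splitter: the 4-characters-per-token ratio is a crude heuristic rather than the real tokenizer, and `chunk_for_haiku` and its `reserved_tokens` default are hypothetical names and values.

```python
HAIKU_CONTEXT_TOKENS = 200_000   # context window quoted in this guide
CHARS_PER_TOKEN = 4              # ASSUMPTION: crude heuristic, not the real tokenizer


def chunk_for_haiku(text, reserved_tokens=20_000, context_tokens=HAIKU_CONTEXT_TOKENS):
    """Split text on blank lines into chunks that fit a character budget.

    reserved_tokens (hypothetical default) leaves room for instructions and
    the model's reply. A paragraph longer than the budget is hard-split.
    """
    budget_chars = (context_tokens - reserved_tokens) * CHARS_PER_TOKEN
    if budget_chars <= 0:
        raise ValueError("reserved_tokens leaves no room for content")
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # Hard-split any paragraph that cannot fit in one chunk by itself.
        pieces = [para[i:i + budget_chars]
                  for i in range(0, len(para), budget_chars)] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= budget_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries keeps related sentences together; for real workloads you would count tokens with the provider's tooling instead of estimating from characters.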

Profession-specific Haiku 4.5 wins

Customer support functions building in-product AI chat where latency matters
Recruiters / HR running high-volume resume classification or initial screening (with appropriate human-in-the-loop and EEOC-aware guardrails — see our recruiter audit guide)
Sales teams generating high-volume personalized outreach where the template is the value-add and depth matters less
Internal tooling — Slack bots, internal knowledge search, the AI behind ops dashboards
Real-time content moderation in community management workflows

Where does Fable 5 fit?

Fable 5 — restored globally on July 1, 2026 after a 19-day suspension — is a Mythos-class model that sits above the Opus tier entirely (SWE-bench Pro: 80.3% vs Opus 4.8's 69.2%). It is the most capable Claude model generally available. The catch is billing: through July 7 it's included for up to 50% of weekly usage limits on Pro, Max, Team, and select Enterprise plans; after that it moves to credits-based billing, and API use requires accepting a 30-day data-retention term for safety monitoring.

Practical rule: treat Fable 5 as the escalation above Opus 4.8 — the model for the sessions where capability is worth paying for separately. For the full restoration story, billing tiers, and the data-retention caveat, see our Claude Fable 5 explainer.

The decision tree

If you only remember one decision rule, use this:

Is it everyday professional work — drafting, analysis, client comms, documents, most workflows? → Sonnet 5. The default is now the right answer
Is it genuinely hard code, or a long agentic chain where early errors compound? → Opus 4.8
Is latency the user-facing experience, or is this running at thousands-per-day volume? → Haiku 4.5
Is this the rare session where you want the most capable model available and your plan's Fable usage (or credits) covers it? → Fable 5
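For teams that route requests programmatically, the four questions above can be sketched as a small function. The `Task` fields are hypothetical labels for those questions, and the precedence among the exceptions (Fable 5, then Opus 4.8, then Haiku 4.5) is an assumption this guide does not spell out. It returns display names because no API ID for Fable 5 is given here.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical flags, one per question in the decision tree.
    hard_code_or_long_agentic_chain: bool = False
    latency_critical_or_high_volume: bool = False
    wants_max_capability: bool = False
    fable_usage_available: bool = False


def pick_model(task: Task) -> str:
    """Return the model name the decision tree points to for this task."""
    # ASSUMPTION: exceptions are checked in this order; Sonnet 5 is the fallback.
    if task.wants_max_capability and task.fable_usage_available:
        return "Fable 5"
    if task.hard_code_or_long_agentic_chain:
        return "Opus 4.8"
    if task.latency_critical_or_high_volume:
        return "Haiku 4.5"
    return "Sonnet 5"
```

A task with no flags set falls through to Sonnet 5, which mirrors the guide's "default unless there is a reason" framing.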

When to switch mid-conversation

On claude.ai you can switch models mid-conversation. The pattern that works for serious work in the Sonnet 5 era:

Stay in Sonnet 5 for thinking, drafting, iterating — and, unlike the Sonnet 4.6 days, for most final deliverables too
Switch to Opus 4.8 when you hit genuinely hard code or a long multi-step agentic task
Switch to Fable 5 for the occasional session where maximum capability is worth it and your plan allows

The old discipline — "draft in Sonnet, finish in Opus" — is mostly obsolete. Sonnet 5's knowledge-work quality means the finish-in-Opus step now only pays for itself on the hardest technical work.

Effort levels: how Opus 4.8 and Sonnet 5 tune depth

Both Opus 4.8 and Sonnet 5 take an effort parameter that controls how much the model deliberates before answering — low, medium, high (the default), xhigh, and max. Lowering it trades depth for speed and cost; xhigh suits long-running agentic and coding sessions.

The question that used to come up constantly — "Opus on low effort vs Sonnet on high?" — has a cleaner answer now:

For knowledge work, Sonnet 5 at default effort is simply the right tool. It benchmarks at Opus level and costs less.
For hard code and long agentic chains, Opus 4.8 wins on base capability — use it at high (or drop to medium when you want most of the quality at lower latency).
Rule of thumb: pick the model by the task's type (knowledge work → Sonnet 5; hard technical work → Opus 4.8), then use effort to tune speed and cost within that model.

(Haiku 4.5 is the exception: it exposes manual Extended Thinking, which you invoke explicitly, rather than effort-controlled adaptive thinking.)
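One way to keep the effort rules straight in your own code is a small validation helper. This is a sketch only: the model IDs and effort levels come from this guide, but the returned dictionary keys (`effort`, `thinking`) are illustrative and should not be read as Anthropic's actual request schema.

```python
EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")
DEFAULT_EFFORT = "high"
EFFORT_MODELS = {"claude-opus-4-8", "claude-sonnet-5"}
MANUAL_THINKING_MODELS = {"claude-haiku-4-5-20251001"}


def model_settings(model, effort=None):
    """Return a plain settings dict for a model.

    The keys are hypothetical, not an official request format.
    """
    if model in EFFORT_MODELS:
        level = effort or DEFAULT_EFFORT
        if level not in EFFORT_LEVELS:
            raise ValueError(f"unknown effort level: {level}")
        return {"model": model, "effort": level}
    if model in MANUAL_THINKING_MODELS:
        # Haiku 4.5 uses manual extended thinking, so an effort level is an error.
        if effort is not None:
            raise ValueError("Haiku 4.5 does not take an effort level")
        return {"model": model, "thinking": "manual"}
    raise KeyError(f"unknown model: {model}")
```

Failing loudly when an effort level is passed to Haiku 4.5 catches the most likely mistake when switching a workload between tiers.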

What about the legacy models?

Anthropic still publishes documentation for Sonnet 4.6, Opus 4.7, Opus 4.6, Sonnet 4.5, Opus 4.5, Opus 4.1, and the original Claude 4 models. Claude Sonnet 4 (claude-sonnet-4-20250514) and Claude Opus 4 (claude-opus-4-20250514) are deprecated and retired as of June 15, 2026.

Sonnet 4.6 is now a legacy model. If your integration pins claude-sonnet-4-6, update it to claude-sonnet-5: you get a better model and, through August 31, a lower rate, subject to the tokenizer caveat above. The other legacy models remain available but are not recommended for new integrations. If you're on Opus 4.7, moving to Opus 4.8 is a same-price upgrade ($5/M input, $25/M output).

How API model versioning works

One subtle change starting with Claude 4.6: model IDs are pinned snapshots, not evergreen pointers. claude-sonnet-5 won't auto-upgrade when Anthropic ships the next Sonnet — it stays on this snapshot, exactly as claude-sonnet-4-6 stayed put when Sonnet 5 launched. This is a real change from earlier versioning patterns where some aliases moved.

For production integrations, this is the right behavior: you don't want your model silently changing under you. For staying current, it means actively migrating model strings when new versions ship — as everyone pinned to claude-sonnet-4-6 should be doing right now.
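Because pinned IDs never move on their own, a periodic audit of your configuration helps. The sketch below uses only model IDs that appear in this guide; the function names and the shape of the config dictionary are hypothetical, and you would extend the lookup sets for your own fleet.

```python
# Only IDs named in this guide are listed here.
CURRENT = {"claude-opus-4-8", "claude-sonnet-5", "claude-haiku-4-5-20251001"}
REPLACEMENTS = {"claude-sonnet-4-6": "claude-sonnet-5"}
RETIRED = {"claude-sonnet-4-20250514", "claude-opus-4-20250514"}


def check_model_id(model_id):
    """Classify a pinned model string; returns (status, suggested_replacement)."""
    if model_id in CURRENT:
        return ("current", None)
    if model_id in REPLACEMENTS:
        return ("legacy", REPLACEMENTS[model_id])
    if model_id in RETIRED:
        return ("retired", None)
    return ("unknown", None)


def audit(config):
    """Return the config entries whose pinned model ID needs attention."""
    findings = {}
    for name, model_id in config.items():
        status, suggestion = check_model_id(model_id)
        if status != "current":
            findings[name] = (status, suggestion)
    return findings
```

For example, `audit({"summarizer": "claude-sonnet-4-6", "router": "claude-haiku-4-5-20251001"})` flags only the summarizer, with `claude-sonnet-5` as the suggested replacement.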

Pricing context

Per-token pricing as of July 2026 from Anthropic's documentation:

Opus 4.8: $5/M input, $25/M output (Fast mode: $10/$50)
Sonnet 5: introductory $2/M input, $10/M output through August 31, 2026; standard $3/$15 from September 1. Updated tokenizer produces 1.0–1.35x as many tokens for the same content as Sonnet 4.6
Haiku 4.5: $1/M input, $5/M output

For most professionals working through claude.ai rather than the API, per-token pricing is academic: usage limits are per-plan, not per-token. But if you're using Claude Code (which uses your API tier), or you're integrating Claude into your own product, per-token pricing is what drives the cost difference between models.
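To compare tiers for a given monthly volume, the rates above can be dropped into a short script. This is a rough sketch: it applies the same token counts to every model, which ignores the Sonnet 5 tokenizer difference noted above, and `monthly_cost` is an illustrative name.

```python
# USD per million tokens (input, output), as listed in this guide.
PRICES = {
    "Opus 4.8": (5.00, 25.00),
    "Opus 4.8 (Fast mode)": (10.00, 50.00),
    "Sonnet 5 (intro)": (2.00, 10.00),
    "Sonnet 5 (standard)": (3.00, 15.00),
    "Haiku 4.5": (1.00, 5.00),
}


def monthly_cost(input_tokens, output_tokens):
    """Return {model: USD cost} for one month's token volume, cheapest first."""
    costs = {
        name: (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
        for name, (in_rate, out_rate) in PRICES.items()
    }
    return dict(sorted(costs.items(), key=lambda item: item[1]))
```

At 50M input and 10M output tokens a month, this gives $100 for Haiku 4.5, $200 for Sonnet 5 at the intro rate ($300 at the standard rate), $500 for Opus 4.8, and $1,000 for Opus 4.8 in Fast mode.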

Anthropic's pricing page has the current consumer plan details. Verify before committing to a plan.

Bottom line

Leave claude.ai on Sonnet 5 — the default is finally the right answer for most professional work. Reach for Opus 4.8 when the task is genuinely hard code or a long agentic chain, Haiku 4.5 when speed or volume is the point, and Fable 5 for the rare session where maximum capability justifies its separate billing.

The era of "just use Opus for everything" is over: Sonnet 5 matches it on most knowledge work at 40% of the per-token price during the intro window and 60% after. Pick the model for the task, not the task for the model.

For the deeper dive on the top of the lineup, see our Claude Fable 5 explainer and the Claude Sonnet 5 launch write-up. For Claude vs ChatGPT, see our profession-specific comparison hub.

This article cites model specifications as published in Anthropic's model documentation and pricing as published at claude.com/pricing as of July 2026. Anthropic updates model availability, capabilities, and pricing frequently. Verify current state before procurement or integration decisions.

Frequently asked questions

Is Opus 4.8 better than Sonnet 5?

It depends on the task — and as of Sonnet 5 (June 30, 2026), the honest answer is "less often than you'd think." Sonnet 5 matches or beats Opus 4.8 on real-world knowledge-work benchmarks (GDPval-AA v2: 1,618 vs 1,615) while costing less per token. Opus 4.8 still leads on complex software engineering (SWE-bench Pro: 69.2% vs 63.2%) and the hardest multi-step agentic work. Default to Sonnet 5; reach for Opus 4.8 when the task is genuinely hard code or a long agentic chain.

What is the difference between Opus 4.8, Sonnet 5, and Haiku 4.5?

Opus 4.8 is the most capable of the classic line (1M-token context, best for the hardest code and agentic work). Sonnet 5 is the new default workhorse (1M context, 128K output, matches Opus on most knowledge work, cheaper). Haiku 4.5 is the fastest and cheapest (200K context, best for high-volume and real-time tasks). API pricing per million input/output tokens: Opus $5/$25, Sonnet 5 $2/$10 intro through Aug 31 2026 (then $3/$15), Haiku $1/$5.

Do I still need to switch models on claude.ai?

Much less than before. Since July 1, 2026, Sonnet 5 is the default for Free and Pro users, and it handles most professional work — including a lot of what used to need Opus. Switch to Opus 4.8 for the hardest coding and long agentic sessions, or to Fable 5 (restored July 1) when you want the most capable generally available model and your plan's Fable usage allows it.

Which Claude model should I use on claude.ai?

Leave the model on Sonnet 5 — it's the default for a reason. Switch to Opus 4.8 for genuinely hard code or long multi-step agentic work. Pro ($20/month, or $17/month billed annually) and above unlock the full picker, and there is no per-token cost on claude.ai — just per-plan usage limits — so switching costs you nothing when the task warrants it.

How much does each Claude model cost?

API pricing as of July 2026: Opus 4.8 is $5/M input, $25/M output (Fast mode $10/$50); Sonnet 5 is an introductory $2/M input, $10/M output through August 31, 2026, then $3/$15 (note: Sonnet 5's updated tokenizer produces 1.0–1.35x as many tokens for the same content); Haiku 4.5 is $1/M input, $5/M output. On claude.ai you pay per plan, not per token: Free (Sonnet 5 plus limited Haiku), Pro $20/month ($17 annual), Max $100–$200/month, Team $25–$125/seat, Enterprise custom.

What happened to Claude Fable 5, and can I use it now?

Yes. Fable 5 launched June 9, 2026, was suspended June 12 under a US government export-control directive, and was restored globally on July 1. Through July 7 it's included for up to 50% of weekly usage limits on Pro, Max, Team, and select Enterprise plans; after July 7 it moves to credits-based billing. It's Anthropic's most capable generally available model — a Mythos-class tier above Opus. See our Fable 5 explainer for the full story and billing details.

By Alex Lowe · Reviewed by Alex Lowe · Published May 20, 2026 · Updated July 3, 2026