Opus 4.8 vs Sonnet 5 vs Haiku 4.5: Which Claude Model to Use (2026)

A practical guide to choosing between Claude Opus 4.8, Sonnet 5, and Haiku 4.5 for professional work in 2026 — plus where Fable 5 fits. With profession-specific recommendations for the most common workflows.

TL;DR. Sonnet 5 (launched June 30, 2026) is now the default Claude model and the right choice for most professional work — it matches Opus 4.8 on most knowledge-work tasks at lower cost. Pick Opus 4.8 for the hardest code and long agentic chains, Haiku 4.5 for speed and scale, and Fable 5 (restored July 1) when you want the most capable generally available model and your plan's Fable usage covers it.

Anthropic's generally available lineup as of July 2026 is Sonnet 5 (the new default), Opus 4.8, Haiku 4.5, and Fable 5, a Mythos-class model that sits above the Opus tier and was restored on July 1 after a 19-day government-ordered suspension (see our Claude Fable 5 explainer for the restoration story and its credits-based billing). The three workhorse tiers share a common feature surface (text and image input, vision, tool use, Skills, Projects) but differ meaningfully on capability, speed, cost, and context window. Picking the right model for each task is still the highest-leverage tooling decision most professionals make in 2026, and Sonnet 5 just made the default answer much better.

This guide translates those spec differences into practical recommendations. Specs cited come from Anthropic's model documentation as of July 2026.

The three workhorse models at a glance

| | Opus 4.8 | Sonnet 5 | Haiku 4.5 |
|---|---|---|---|
| Position | Strongest on hard code + agentic chains | The default; matches Opus on most knowledge work | Fastest; near-frontier intelligence |
| API ID | claude-opus-4-8 | claude-sonnet-5 | claude-haiku-4-5-20251001 |
| Context window | 1M tokens | 1M tokens | 200K tokens |
| Max output | 128K tokens | 128K tokens | 64K tokens |
| Pricing (API) | $5/M input, $25/M output | $2/$10 intro through Aug 31, 2026, then $3/$15 | $1/M input, $5/M output |
| Latency | Moderate | Fast | Fastest |
| Thinking | Adaptive, effort-controlled | Adaptive, effort-controlled (no manual extended-thinking mode) | Extended thinking (manual) |
| Reliable knowledge cutoff | Jan 2026 | Jan 2026 | Feb 2025 |

The big structural change with Sonnet 5: it adopts the same effort-controlled adaptive thinking as Opus 4.8 (low / medium / high / xhigh / max, defaulting to high). The old Sonnet 4.6-era "Extended vs Adaptive thinking" distinction is gone — manual extended thinking isn't supported on Sonnet 5 at all.

The single most important question: are you on claude.ai or the API?

If you're on claude.ai (the consumer/team chat product), Sonnet 5 has been the default for Free and Pro users since July 1, 2026 — and for the first time, the default is genuinely the right model for most professional work. Pro ($17/mo annual or $20/mo monthly) and above get the full picker. Switch freely as the task changes — there's no per-token cost to you, just usage limits per plan.

If you're using the API (or Claude Code, which uses the API tier you're on), each request is billed by token. Sonnet 5's introductory pricing ($2/$10 through August 31, 2026) makes it the obvious default there too, with one caveat: its updated tokenizer produces 1.0–1.35x as many tokens for the same content as Sonnet 4.6 did, so real-world costs can run up to 35% higher than the headline rate suggests.
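To make the tokenizer caveat concrete, here is a minimal Python sketch of the arithmetic. The function name and the example workload are illustrative assumptions; the rates and the 1.0–1.35x range are the figures quoted above. It also assumes the multiplier applies equally to input and output tokens, which is a simplification.

```python
# Rates are USD per million tokens, taken from the figures quoted in this guide.
SONNET_5_INTRO = {"input": 2.00, "output": 10.00}

# Range of the tokenizer multiplier relative to Sonnet 4.6 token counts.
TOKENIZER_MULTIPLIER_RANGE = (1.0, 1.35)


def sonnet5_cost_range(input_tokens_46, output_tokens_46, rates=SONNET_5_INTRO):
    """Return (low, high) USD cost for a workload measured in Sonnet 4.6 tokens.

    Assumption: the tokenizer multiplier applies equally to input and output.
    """
    base = (input_tokens_46 * rates["input"]
            + output_tokens_46 * rates["output"]) / 1_000_000
    low_mult, high_mult = TOKENIZER_MULTIPLIER_RANGE
    return round(base * low_mult, 2), round(base * high_mult, 2)


# Hypothetical workload: 50M input and 10M output tokens, as counted by Sonnet 4.6.
print(sonnet5_cost_range(50_000_000, 10_000_000))  # prints (200.0, 270.0)
```

Where your content lands inside that range depends on the text itself, so measuring token counts on a sample of your real traffic is more reliable than assuming the worst case.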

If you just want one default: leave it on Sonnet 5. The rest of this guide is about the exceptions.

Pick Sonnet 5 for: almost everything (it's the default for a reason)

Sonnet 5 collapsed most of the old "when to pay for Opus" calculus. On real-world knowledge-work benchmarks it now edges out Opus 4.8 (GDPval-AA v2: 1,618 vs 1,615), and it leads on computer-use tasks (OSWorld-Verified: 81.2%). In practice that means:

  • Client / borrower / customer communication — the daily back-and-forth, at speed
  • Long-form structured documents — memos, disclosure drafts, PRDs, case analyses. This used to be Opus territory; Sonnet 5 handles most of it at or above Opus quality
  • Long-document analysis — full case files, medical histories, board packages, policy documents. 1M context, same as Opus
  • Multi-step workflows — Sonnet 5's agentic improvements are the headline of the release; most chained workflows no longer need Opus
  • Iteration and polishing — fast enough to think alongside you

Profession-specific Sonnet 5 wins

  • Loan officers running the four-audience pipeline update workflow daily across 8+ active loans
  • Real estate agents generating listing descriptions, client emails, CMAs, market updates
  • Attorneys and paralegals drafting memos and summaries from full source documents — verify the hardest analyses on Opus if the stakes demand it
  • Copywriters and community managers doing iteration-heavy, voice-sensitive work
  • Most healthcare clinicians drafting SOAP notes, treatment plans, patient education
  • Management consultants synthesizing meeting notes into strategy memos
  • AI product managers structuring specs and rollout plans

Pick Opus 4.8 for: the hardest code and the longest agentic chains

Opus 4.8 is no longer the automatic "quality matters" choice; Sonnet 5 took most of that ground. Where Opus still clearly earns its price premium (2.5x Sonnet 5's introductory rate, about 1.7x the standard rate that starts September 1):

  • Complex software engineering — Opus 4.8 leads Sonnet 5 meaningfully on SWE-bench Pro (69.2% vs 63.2%). For hard multi-file work in Claude Code, Opus is still the model
  • The longest multi-step agentic workflows — compliance automation, data-extraction pipelines, anything that chains dozens of steps where a single early error compounds
  • Second opinions on high-stakes deliverables — when the document goes to a regulator, court, or board and you want the strongest classic-tier model to draft or check it

Two practical notes. First, Fast mode costs $10/M input and $50/M output — much cheaper than it was on Opus 4.6/4.7 — so Opus depth with low latency carries a smaller premium. Second, the effort parameter defaults to high; set it lower for faster, cheaper responses on simpler work.

Profession-specific Opus 4.8 wins

  • Data scientists and technical operators running hard multi-step pipelines in Claude Code
  • Healthcare compliance officers running QSR gap audits across full DHF documents, where the cost of a miss is extreme
  • AI compliance officers producing pre-legal regulatory screens across EU AI Act tiers + Annex III + US state overlays
  • ESG sustainability analysts mapping KPIs to multiple frameworks from a 200-page sustainability report

Pick Haiku 4.5 for: speed and scale

Haiku 4.5 is the fastest current Claude model and the cheapest. It's also more capable than people expect: "near-frontier intelligence" per Anthropic's framing. Where it earns its place:

  • High-volume classification or extraction tasks — running over thousands of inputs per day
  • Real-time chat surfaces — when the response needs to feel instantaneous (in-product chatbots, customer support assistants)
  • Background AI features inside production applications — the AI that runs invisibly behind a feature, where latency directly affects user experience
  • Cost-sensitive workflows where Sonnet's depth isn't worth the price — at $1/$5, Haiku is the cheapest current model even against Sonnet 5's intro pricing

Haiku's constraints to know:

  • 200K context window, not 1M. Long documents need chunking
  • Manual extended thinking only — no effort-controlled adaptive thinking
  • Reliable knowledge cutoff is Feb 2025, older than Sonnet 5 and Opus 4.8 (both Jan 2026). For questions about events after February 2025, Haiku may have stale or missing knowledge
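The chunking constraint can be sketched as follows. This is a rough illustration rather than a production splitter: the 4-characters-per-token ratio is a common rule of thumb, not a measured value for Haiku's tokenizer, and `chunk_text`, the token budget, and the overlap are hypothetical choices. For exact counts, use a real token counter.

```python
def chunk_text(text, max_tokens=150_000, chars_per_token=4, overlap_tokens=500):
    """Split text into overlapping chunks that fit under a token budget.

    Assumptions: roughly 4 characters per token (a heuristic), and a budget
    well under the 200K window to leave room for the prompt and the reply.
    """
    max_chars = max_tokens * chars_per_token
    overlap_chars = overlap_tokens * chars_per_token
    if overlap_chars >= max_chars:
        raise ValueError("overlap must be smaller than the chunk size")

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back so neighbouring chunks share some context.
        start = end - overlap_chars
    return chunks
```

Splitting on character offsets can cut a sentence or table in half; splitting on paragraph or section boundaries usually gives better results when the document structure allows it.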

Profession-specific Haiku 4.5 wins

  • Customer support functions building in-product AI chat where latency matters
  • Recruiters / HR running high-volume resume classification or initial screening (with appropriate human-in-the-loop and EEOC-aware guardrails — see our recruiter audit guide)
  • Sales teams generating high-volume personalized outreach where the template is the value-add and depth matters less
  • Internal tooling — Slack bots, internal knowledge search, the AI behind ops dashboards
  • Real-time content moderation in community management workflows

Where does Fable 5 fit?

Fable 5 — restored globally on July 1, 2026 after a 19-day suspension — is a Mythos-class model that sits above the Opus tier entirely (SWE-bench Pro: 80.3% vs Opus 4.8's 69.2%). It is the most capable Claude model generally available. The catch is billing: through July 7 it's included for up to 50% of weekly usage limits on Pro, Max, Team, and select Enterprise plans; after that it moves to credits-based billing, and API use requires accepting a 30-day data-retention term for safety monitoring.

Practical rule: treat Fable 5 as the escalation above Opus 4.8 — the model for the sessions where capability is worth paying for separately. For the full restoration story, billing tiers, and the data-retention caveat, see our Claude Fable 5 explainer.

The decision tree

If you only remember one decision rule, use this:

  1. Is it everyday professional work (drafting, analysis, client comms, documents, most workflows)? Sonnet 5. The default is now the right answer.
  2. Is it genuinely hard code, or a long agentic chain where early errors compound? Opus 4.8.
  3. Is latency the user-facing experience, or is this running at thousands-per-day volume? Haiku 4.5.
  4. Is this the rare session where you want the most capable model available and your plan's Fable usage (or credits) covers it? Fable 5.
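The rule above can be written as a small function. This is only a sketch: the parameter names are hypothetical, and the order in which the exceptions are checked (Fable 5 first, then Opus 4.8, then Haiku 4.5) is an assumption, since the list does not say what to do when more than one condition holds.

```python
def pick_model(hard_code_or_long_chain=False,
               latency_or_high_volume=False,
               want_max_capability=False,
               fable_usage_covered=False):
    """Return a model name following the four-step rule in this guide.

    Sonnet 5 is the default; the other models are exceptions.
    The precedence among exceptions is an assumption of this sketch.
    """
    if want_max_capability and fable_usage_covered:
        return "Fable 5"
    if hard_code_or_long_chain:
        return "Opus 4.8"
    if latency_or_high_volume:
        return "Haiku 4.5"
    return "Sonnet 5"


print(pick_model())                              # Sonnet 5
print(pick_model(hard_code_or_long_chain=True))  # Opus 4.8
print(pick_model(latency_or_high_volume=True))   # Haiku 4.5
```

Wanting maximum capability without Fable coverage falls through to the remaining checks, so that case still resolves to Opus 4.8 or Sonnet 5.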

When to switch mid-conversation

On claude.ai you can switch models mid-conversation. The pattern that works for serious work in the Sonnet 5 era:

  • Stay in Sonnet 5 for thinking, drafting, iterating — and, unlike the Sonnet 4.6 days, for most final deliverables too
  • Switch to Opus 4.8 when you hit genuinely hard code or a long multi-step agentic task
  • Switch to Fable 5 for the occasional session where maximum capability is worth it and your plan allows

The old discipline — "draft in Sonnet, finish in Opus" — is mostly obsolete. Sonnet 5's knowledge-work quality means the finish-in-Opus step now only pays for itself on the hardest technical work.

Effort levels: how Opus 4.8 and Sonnet 5 tune depth

Both Opus 4.8 and Sonnet 5 take an effort parameter that controls how much the model deliberates before answering — low, medium, high (the default), xhigh, and max. Lowering it trades depth for speed and cost; xhigh suits long-running agentic and coding sessions.

The question that used to come up constantly — "Opus on low effort vs Sonnet on high?" — has a cleaner answer now:

  • For knowledge work, Sonnet 5 at default effort is simply the right tool. It benchmarks at Opus level and costs less.
  • For hard code and long agentic chains, Opus 4.8 wins on base capability — use it at high (or drop to medium when you want most of the quality at lower latency).
  • Rule of thumb: pick the model by the task's type (knowledge work → Sonnet 5; hard technical work → Opus 4.8), then use effort to tune speed and cost within that model.

(Haiku 4.5 is the exception: it exposes manual Extended Thinking, which you invoke explicitly, rather than effort-controlled adaptive thinking.)
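As an illustration of the "model first, effort second" rule, the sketch below maps a few task labels to effort levels. The task labels, the `effort_for` helper, and the choice of `low` for simple work are hypothetical; the guide only says to set effort lower for simpler work. The sketch deliberately does not show how effort is passed in an API request, so check Anthropic's documentation for the actual request field.

```python
EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")
DEFAULT_EFFORT = "high"

# Models that take an effort setting, per this guide. Haiku 4.5 does not.
EFFORT_MODELS = {"claude-opus-4-8", "claude-sonnet-5"}

# Hypothetical task labels mapped to the guidance in this section.
EFFORT_BY_TASK = {
    "simple": "low",                    # faster, cheaper responses
    "latency_sensitive": "medium",      # most of the quality, lower latency
    "long_agentic_or_coding": "xhigh",  # long-running agentic and coding sessions
}


def effort_for(model_id, task_kind):
    """Return an effort level, or None if the model has no effort setting."""
    if model_id not in EFFORT_MODELS:
        return None
    effort = EFFORT_BY_TASK.get(task_kind, DEFAULT_EFFORT)
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort}")
    return effort


print(effort_for("claude-sonnet-5", "drafting"))               # high
print(effort_for("claude-opus-4-8", "long_agentic_or_coding"))  # xhigh
print(effort_for("claude-haiku-4-5-20251001", "simple"))        # None
```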

What about the legacy models?

Anthropic still publishes documentation for Sonnet 4.6, Opus 4.7, Opus 4.6, Sonnet 4.5, Opus 4.5, Opus 4.1, and the original Claude 4 models. Claude Sonnet 4 (claude-sonnet-4-20250514) and Claude Opus 4 (claude-opus-4-20250514) are deprecated and retired as of June 15, 2026.

Sonnet 4.6 is now a legacy model. If your integration pins claude-sonnet-4-6, update it to claude-sonnet-5: you get a better model and, through August 31, a lower rate, subject to the tokenizer caveat above. The other legacy models remain available but are not recommended for new integrations. If you're on Opus 4.7, moving to Opus 4.8 is a same-price upgrade ($5/M input, $25/M output).

How API model versioning works

One subtle change starting with Claude 4.6: model IDs are pinned snapshots, not evergreen pointers. claude-sonnet-5 won't auto-upgrade when Anthropic ships the next Sonnet — it stays on this snapshot, exactly as claude-sonnet-4-6 stayed put when Sonnet 5 launched. This is a real change from earlier versioning patterns where some aliases moved.

For production integrations, this is the right behavior: you don't want your model silently changing under you. For staying current, it means actively migrating model strings when new versions ship — as everyone pinned to claude-sonnet-4-6 should be doing right now.
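Because pinned IDs never move on their own, a periodic audit of the model strings in your configuration is one way to catch stale ones. The sketch below is an assumption about how such a check might look: `audit_model_ids` and the status labels are hypothetical, and the tables contain only the IDs named in this guide.

```python
# Legacy ID -> suggested replacement, using only IDs named in this guide.
MIGRATIONS = {"claude-sonnet-4-6": "claude-sonnet-5"}

# IDs this guide lists as retired as of June 15, 2026.
RETIRED = {"claude-sonnet-4-20250514", "claude-opus-4-20250514"}


def audit_model_ids(model_ids):
    """Return (model_id, status, suggestion) tuples for each configured ID.

    status is "retired", "legacy", or "no_action"; suggestion is a
    replacement ID when this sketch knows one, otherwise None.
    """
    report = []
    for model_id in model_ids:
        if model_id in RETIRED:
            report.append((model_id, "retired", None))
        elif model_id in MIGRATIONS:
            report.append((model_id, "legacy", MIGRATIONS[model_id]))
        else:
            report.append((model_id, "no_action", None))
    return report


for row in audit_model_ids(["claude-sonnet-4-6", "claude-opus-4-8",
                            "claude-opus-4-20250514"]):
    print(row)
```

An ID reported as `no_action` is simply one this table does not know about, so extend the tables as new versions ship.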

Pricing context

Per-token pricing as of July 2026 from Anthropic's documentation:

  • Opus 4.8: $5/M input, $25/M output (Fast mode: $10/$50)
  • Sonnet 5: introductory $2/M input, $10/M output through August 31, 2026; standard $3/$15 from September 1. The updated tokenizer produces 1.0–1.35x as many tokens for the same content as Sonnet 4.6
  • Haiku 4.5: $1/M input, $5/M output

For most professionals working through claude.ai rather than the API, per-token pricing is academic — usage limits are per-plan, not per-token. But if you're using Claude Code (which uses your API tier), or you're integrating Claude into your own product, the per-token pricing is where the per-model cost calculus lives.
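For API users, a quick way to compare the three models is to price the same workload on each. The sketch below uses the rates listed above and switches Sonnet 5 to its standard rate after August 31, 2026. The function names and the workload are illustrative assumptions, and the comparison ignores tokenizer differences between models, so treat the results as approximate.

```python
from datetime import date

# USD per million tokens (input, output), from the list above.
FIXED_RATES = {
    "Opus 4.8": (5.00, 25.00),
    "Haiku 4.5": (1.00, 5.00),
}
SONNET_5_INTRO = (2.00, 10.00)
SONNET_5_STANDARD = (3.00, 15.00)
INTRO_ENDS = date(2026, 8, 31)  # last day of introductory pricing


def rates(model, on):
    """Return (input_rate, output_rate) for a model on a given date."""
    if model == "Sonnet 5":
        return SONNET_5_INTRO if on <= INTRO_ENDS else SONNET_5_STANDARD
    return FIXED_RATES[model]


def cost(model, input_tokens, output_tokens, on):
    """Return the USD cost of a workload, ignoring tokenizer differences."""
    input_rate, output_rate = rates(model, on)
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical monthly workload: 20M input tokens, 4M output tokens.
for model in ("Opus 4.8", "Sonnet 5", "Haiku 4.5"):
    print(model, cost(model, 20_000_000, 4_000_000, date(2026, 9, 1)))
```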

Anthropic's pricing page has the current consumer plan details. Verify before committing to a plan.

Bottom line

Leave claude.ai on Sonnet 5 — the default is finally the right answer for most professional work. Reach for Opus 4.8 when the task is genuinely hard code or a long agentic chain, Haiku 4.5 when speed or volume is the point, and Fable 5 for the rare session where maximum capability justifies its separate billing.

The era of "just use Opus for everything" is over. Sonnet 5 matches it on most knowledge work at 40% of Opus 4.8's per-token price during the introductory period and 60% afterward. Pick the model for the task, not the task for the model.

For the deeper dive on the top of the lineup, see our Claude Fable 5 explainer and the Claude Sonnet 5 launch write-up. For Claude vs ChatGPT, see our profession-specific comparison hub.


This article cites model specifications as published in Anthropic's model documentation and pricing as published at claude.com/pricing as of July 2026. Anthropic updates model availability, capabilities, and pricing frequently. Verify current state before procurement or integration decisions.

Frequently asked questions

Is Opus 4.8 better than Sonnet 5?

It depends on the task — and as of Sonnet 5 (June 30, 2026), the honest answer is "less often than you'd think." Sonnet 5 matches or beats Opus 4.8 on real-world knowledge-work benchmarks (GDPval-AA v2: 1,618 vs 1,615) while costing less per token. Opus 4.8 still leads on complex software engineering (SWE-bench Pro: 69.2% vs 63.2%) and the hardest multi-step agentic work. Default to Sonnet 5; reach for Opus 4.8 when the task is genuinely hard code or a long agentic chain.

What is the difference between Opus 4.8, Sonnet 5, and Haiku 4.5?

Opus 4.8 is the most capable of the classic line (1M-token context, best for the hardest code and agentic work). Sonnet 5 is the new default workhorse (1M context, 128K output, matches Opus on most knowledge work, cheaper). Haiku 4.5 is the fastest and cheapest (200K context, best for high-volume and real-time tasks). API pricing per million input/output tokens: Opus $5/$25, Sonnet 5 $2/$10 intro through Aug 31 2026 (then $3/$15), Haiku $1/$5.

Do I still need to switch models on claude.ai?

Much less than before. Since July 1, 2026, Sonnet 5 is the default for Free and Pro users, and it handles most professional work — including a lot of what used to need Opus. Switch to Opus 4.8 for the hardest coding and long agentic sessions, or to Fable 5 (restored July 1) when you want the most capable generally available model and your plan's Fable usage allows it.

Which Claude model should I use on claude.ai?

Leave the model on Sonnet 5 — it's the default for a reason. Switch to Opus 4.8 for genuinely hard code or long multi-step agentic work. Pro ($20/month, or $17/month billed annually) and above unlock the full picker, and there is no per-token cost on claude.ai — just per-plan usage limits — so switching costs you nothing when the task warrants it.

How much does each Claude model cost?

API pricing as of July 2026: Opus 4.8 is $5/M input, $25/M output (Fast mode $10/$50); Sonnet 5 is an introductory $2/M input, $10/M output through August 31, 2026, then $3/$15 (note: Sonnet 5's updated tokenizer produces 1.0–1.35x as many tokens for the same content); Haiku 4.5 is $1/M input, $5/M output. On claude.ai you pay per plan, not per token: Free (Sonnet 5 plus limited Haiku), Pro $20/month ($17/month billed annually), Max $100–$200/month, Team $25–$125/seat, Enterprise custom.

What happened to Claude Fable 5, and can I use it now?

Yes. Fable 5 launched June 9, 2026, was suspended June 12 under a US government export-control directive, and was restored globally on July 1. Through July 7 it's included for up to 50% of weekly usage limits on Pro, Max, Team, and select Enterprise plans; after July 7 it moves to credits-based billing. It's Anthropic's most capable generally available model — a Mythos-class tier above Opus. See our Fable 5 explainer for the full story and billing details.

Topics: Claude
Reviewed by Alex Lowe · Published May 20, 2026 · Updated July 3, 2026
