Which Claude Model Should You Use? Opus 4.7 vs Sonnet 4.6 vs Haiku 4.5
A practical guide to choosing between Claude Opus 4.7, Sonnet 4.6, and Haiku 4.5 for professional work in 2026, with profession-specific recommendations for the most common workflows.
Anthropic ships three generally-available Claude models as of May 2026: Opus 4.7, Sonnet 4.6, and Haiku 4.5. They share a common feature surface (text + image input, vision, tool use, Skills, Projects) but differ meaningfully on speed, depth, cost, and context window. Picking the right one for the right task is the highest-leverage tooling decision most professionals make in 2026 — and the documentation, while thorough, doesn't translate directly into "which model for which job."
This guide does the translation. Specs cited come from Anthropic's model documentation as of May 2026.
The three models at a glance
| | Opus 4.7 | Sonnet 4.6 | Haiku 4.5 |
|---|---|---|---|
| Position | Most capable; deliberative work | Best speed-intelligence combo | Fastest; near-frontier intelligence |
| API ID | claude-opus-4-7 | claude-sonnet-4-6 | claude-haiku-4-5-20251001 |
| Context window | 1M tokens | 1M tokens | 200K tokens |
| Max output | 128K tokens | 64K tokens | 64K tokens |
| Pricing (API) | $5/M input, $25/M output | $3/M input, $15/M output | $1/M input, $5/M output |
| Latency | Moderate | Fast | Fastest |
| Extended thinking | No | Yes | Yes |
| Adaptive thinking | Yes | Yes | No |
| Reliable knowledge cutoff | Jan 2026 | Aug 2025 | Feb 2025 |
The single most important question: are you on claude.ai or the API?
If you're on claude.ai (the consumer/team chat product), the model picker handles model selection per-message. Pro ($17/mo annual or $20/mo monthly) and above get access to all three. Switch freely as the task changes — there's no per-token cost to you, just usage limits per plan.
If you're using the API (or Claude Code, which uses the API tier you're on), each request costs by token. Opus is 5x the input cost and 5x the output cost of Haiku. Picking the right model per task isn't optional at scale; it's where the cost discipline lives.
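To make the per-token arithmetic concrete, here is a minimal sketch of a per-request cost estimate. The prices are the ones listed in this guide; the helper name `estimate_cost` and the example request size (20,000 input tokens, 2,000 output tokens) are illustrative assumptions, and the sketch ignores any discounts or surcharges that may apply to a real bill.

```python
# USD per million tokens (input, output), copied from this guide's table (May 2026).
PRICES_PER_MILLION = {
    "claude-opus-4-7": (5.00, 25.00),
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-haiku-4-5-20251001": (1.00, 5.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper, list prices only)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Assumed example request: 20,000 tokens in, 2,000 tokens out.
    for model in PRICES_PER_MILLION:
        print(f"{model}: ${estimate_cost(model, 20_000, 2_000):.4f}")
```

For that example request, the estimate works out to $0.15 on Opus, $0.09 on Sonnet, and $0.03 on Haiku. Multiply by your daily request volume to see where model choice starts to matter.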
Most of this guide is written for the API-aware professional and the claude.ai user who cares about output quality enough to switch models per task. If you're a claude.ai user who just wants one default, pick Sonnet 4.6 unless the task is clearly deliberative or clearly volume-driven — Sonnet is the best workhorse default for most professional work.
Pick Opus 4.7 for: deliberative work where output quality matters more than latency
The classic Opus use cases for professionals:
- Long-form structured documents — full memos, full disclosure drafts, multi-section PRDs, full case analyses. Anywhere you need coherence across 10+ pages
- Long-document analysis — full case files, full medical histories, full board packages, full policy documents. The 1M context combined with Opus's reasoning depth makes the document-as-context pattern work well
- Multi-step agentic workflows in Claude Code — Opus 4.7's step-change agentic improvement matters most when tasks chain across multiple steps. If you're using Claude Code for compliance automation, data extraction pipelines, or anything multi-step, Opus is the model
- Anything where you'd otherwise re-prompt twice — Opus's adaptive thinking on complex requests typically eliminates the second prompt
Opus has moderate latency by Anthropic's classification. For most professional drafting work, that's the right tradeoff: you're not chatting interactively; you're producing a deliberative artifact. The 5-15 seconds Opus takes for a thoughtful response is invisible compared to the time you'd spend re-prompting Sonnet to fix what it missed.
Profession-specific Opus 4.7 wins
- Attorneys drafting long-form memos, motions, or contract risk summaries from full source documents
- Healthcare compliance officers running QSR gap audits across full DHF documents
- AI compliance officers producing pre-legal regulatory screens across EU AI Act tiers + Annex III + US state overlays
- ESG sustainability analysts mapping KPIs to multiple frameworks from a 200-page sustainability report
- Management consultants synthesizing a week of meeting notes into a single strategy memo
- Data scientists producing Model Cards or stakeholder briefs from full experimental records
Pick Sonnet 4.6 for: the day-to-day workhorse
If Opus is the deliberative-drafting model, Sonnet is the everything-else model. Sonnet 4.6 is the right default for most professional work most of the time:
- Borrower / client / customer communication — pipeline updates, status emails, scheduling, the daily back-and-forth where speed matters more than depth
- Short-form analysis — quick gut-checks, methodology lookups, the fast question-and-answer that helps you think through a problem
- Iteration on drafts produced by Opus — when you're polishing rather than producing
- Anything where you want both Extended Thinking AND Adaptive Thinking available — Sonnet 4.6 supports both modes; Opus 4.7 supports only adaptive. For some workflows the explicit Extended Thinking mode is the right tool
- Document analysis that fits in 200K-1M tokens — Sonnet's 1M context handles full case files, full reports, full transcripts
Sonnet at $3/$15 per million tokens is 40% cheaper input and 40% cheaper output than Opus. At scale, that compounds.
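The relative savings quoted in this guide can be checked with a few lines of arithmetic. This is a small sketch using the list prices above; `percent_cheaper` is a hypothetical helper name, not part of any SDK.

```python
def percent_cheaper(price: float, reference_price: float) -> float:
    """Return how much cheaper `price` is than `reference_price`, in percent."""
    return (1 - price / reference_price) * 100


# Input prices in USD per million tokens, from this guide.
OPUS_INPUT, SONNET_INPUT, HAIKU_INPUT = 5.0, 3.0, 1.0

if __name__ == "__main__":
    print(round(percent_cheaper(SONNET_INPUT, OPUS_INPUT), 1))   # Sonnet vs Opus
    print(round(percent_cheaper(HAIKU_INPUT, SONNET_INPUT), 1))  # Haiku vs Sonnet
    print(round(percent_cheaper(HAIKU_INPUT, OPUS_INPUT), 1))    # Haiku vs Opus
```

Because output prices scale by the same ratios ($25, $15, $5), the percentages are identical for output tokens: Sonnet is 40% cheaper than Opus, Haiku is about 67% cheaper than Sonnet, and Haiku is 80% cheaper than Opus.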
Profession-specific Sonnet 4.6 wins
- Loan officers running the four-audience pipeline update workflow daily across 8+ active loans
- Real estate agents generating listing descriptions, client emails, market updates — all short-to-medium-form, all volume-heavy
- Copywriters drafting article sections, sales-page sections, email sequence emails — iteration-heavy work
- Community managers drafting moderation responses, onboarding messages, sentiment-report summaries
- AI product managers structuring feature specs and rollout plans; Sonnet handles the structure well at 60% of Opus's per-token price ($3/$15 vs $5/$25)
- Prompt engineers generating synthetic test cases and skill audits — high-volume eval work
- Most healthcare clinicians drafting SOAP notes, treatment plans, patient education — short-to-medium form, frequency-heavy
Pick Haiku 4.5 for: speed and scale
Haiku 4.5 is the fastest current Claude model and the cheapest. It's also more capable than people expect — "near-frontier intelligence" per Anthropic's framing. The places it earns its place:
- High-volume classification or extraction tasks — running over thousands of inputs per day
- Real-time chat surfaces — when the response needs to feel instantaneous (in-product chatbots, customer support assistants)
- Background AI features inside production applications — the AI that runs invisibly behind a feature, where latency directly affects user experience
- Cost-sensitive workflows where Sonnet's depth isn't worth the price: at $1/$5 vs $3/$15 per million tokens, Haiku is about 67% cheaper than Sonnet
Haiku's constraints to know:
- 200K context window, not 1M. Long documents need chunking
- No adaptive thinking. Haiku has Extended Thinking only, which you have to invoke explicitly
- Reliable knowledge cutoff is Feb 2025 — older than Sonnet (Aug 2025) and Opus (Jan 2026). For questions about events in 2025 or 2026, Haiku may have stale knowledge
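The chunking that Haiku's 200K window requires can be sketched roughly as follows. This is an illustration under stated assumptions, not a production splitter: the 4-characters-per-token ratio is a crude heuristic rather than a real tokenizer, the 150,000-token default is an assumed budget that leaves headroom for the prompt and the response, and `chunk_text` is a hypothetical helper.

```python
def chunk_text(text: str, max_tokens: int = 150_000,
               chars_per_token: float = 4.0) -> list[str]:
    """Split text on paragraph breaks into chunks under an estimated token budget.

    chars_per_token is a rough assumption; use a real token count
    if you need accurate sizing.
    """
    max_chars = int(max_tokens * chars_per_token)
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        # A single paragraph larger than the budget is hard-split.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) > max_chars:
            # Adding this paragraph would overflow, so start a new chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries keeps each chunk readable on its own. For documents where cross-section context matters, a model with the 1M window avoids the chunking step entirely.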
Profession-specific Haiku 4.5 wins
- Customer support functions building in-product AI chat where latency matters
- Recruiters / HR running high-volume resume classification or initial screening (with appropriate human-in-the-loop and EEOC-aware guardrails — see our recruiter audit guide)
- Sales teams generating high-volume personalized outreach where the template is the value-add and depth matters less
- Internal tooling — Slack bots, internal knowledge search, the AI behind ops dashboards
- Real-time content moderation in community management workflows
The decision tree
If you only remember one decision rule, use this:
- Is the output going to a regulator, an auditor, a court, or a board? → Opus 4.7. Pay for the depth
- Is the output going through a multi-step agentic workflow? → Opus 4.7. Pay for the coherence
- Is the task short-form, iterative, or high-volume but still requiring real intelligence? → Sonnet 4.6. The workhorse
- Is latency the user-facing experience, or is this running at thousands-per-day volume? → Haiku 4.5. The fast option
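For API users who route requests programmatically, the decision rule above can be sketched as a small function. The `Task` fields and the `pick_model` name are hypothetical, the rules are checked in the order listed, and the Sonnet fallback reflects this guide's stated default rather than anything in Anthropic's documentation.

```python
from dataclasses import dataclass

OPUS = "claude-opus-4-7"
SONNET = "claude-sonnet-4-6"
HAIKU = "claude-haiku-4-5-20251001"


@dataclass
class Task:
    """Illustrative task description; every field is an assumption of this sketch."""
    high_stakes_audience: bool = False     # regulator, auditor, court, or board
    multi_step_agentic: bool = False       # chained, multi-step workflow
    needs_real_intelligence: bool = False  # short-form or volume work that still needs depth
    latency_is_user_facing: bool = False   # response time is the user experience
    thousands_per_day: bool = False        # high-volume batch or background work


def pick_model(task: Task) -> str:
    """Apply the guide's four rules in order and return a model ID."""
    if task.high_stakes_audience:
        return OPUS
    if task.multi_step_agentic:
        return OPUS
    if task.needs_real_intelligence:
        return SONNET
    if task.latency_is_user_facing or task.thousands_per_day:
        return HAIKU
    return SONNET  # the guide's workhorse default


if __name__ == "__main__":
    print(pick_model(Task(thousands_per_day=True)))
```

Because the rules run in order, a high-volume task that also goes to a regulator still routes to Opus, and a high-volume task flagged as needing real intelligence routes to Sonnet rather than Haiku.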
When to switch mid-conversation
On claude.ai you can switch models mid-conversation. The pattern that works for serious work:
- Start in Sonnet 4.6 to think through what you're trying to produce, get a quick first draft, and iterate on the structure
- Switch to Opus 4.7 when you're ready to produce the substantive deliverable — the document that's actually going somewhere
- Switch back to Sonnet 4.6 for polishing, follow-up questions, and quick iterations
This pattern uses Opus where its depth pays for itself (the deliberative work) and Sonnet where its speed pays for itself (everything else). It's the discipline most professional claude.ai users converge on after a few weeks.
What about the legacy models?
Anthropic still publishes documentation for Opus 4.6, Sonnet 4.5, Opus 4.5, Opus 4.1, and the original Claude 4 models. Claude Sonnet 4 (claude-sonnet-4-20250514) and Claude Opus 4 (claude-opus-4-20250514) are deprecated and retire on June 15, 2026. If your integration is on one of those, migrate before then.
The other legacy models (4.1, 4.5, 4.6) remain available but are not recommended for new integrations. They're around for users who pinned to specific snapshots and need time to retest.
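If you maintain an integration, a quick check like the one below can flag model IDs that are approaching retirement. It is a sketch only: the table contains just the two retirement dates stated in this guide, and `days_until_retirement` is a hypothetical helper, so confirm dates against Anthropic's current deprecation notices before relying on it.

```python
from datetime import date
from typing import Optional

# Retirement dates as stated in this guide; no other dates are given here.
RETIREMENT_DATES = {
    "claude-sonnet-4-20250514": date(2026, 6, 15),
    "claude-opus-4-20250514": date(2026, 6, 15),
}


def days_until_retirement(model_id: str, today: date) -> Optional[int]:
    """Return days left before retirement, or None if no date is listed."""
    retires_on = RETIREMENT_DATES.get(model_id)
    if retires_on is None:
        return None
    return (retires_on - today).days


if __name__ == "__main__":
    remaining = days_until_retirement("claude-opus-4-20250514", date.today())
    print(f"Days until retirement: {remaining}")
```

A negative result means the listed retirement date has already passed. A result of `None` only means the ID is not in this small table, not that the model is guaranteed to stay available.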
How API model versioning works
One subtle change starting with Claude 4.6: model IDs are pinned snapshots, not evergreen pointers. claude-opus-4-7 doesn't auto-upgrade when Anthropic releases Opus 4.8 — it stays Opus 4.7. This is a real change from earlier versioning patterns where some aliases moved.
For production integrations, this is the right behavior: you don't want your model silently changing under you. For staying current, it means actively migrating model strings when new versions ship.
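One way to make those migrations cheap is to keep model strings in a single place and allow an override per environment. The sketch below is one possible layout, not an Anthropic convention: the role names, the `CLAUDE_MODEL_<ROLE>` environment variable pattern, and the `model_for` helper are all assumptions made for illustration.

```python
import os
from typing import Mapping

# Pinned model IDs from this guide, keyed by hypothetical role names.
DEFAULT_MODELS = {
    "deep": "claude-opus-4-7",
    "default": "claude-sonnet-4-6",
    "fast": "claude-haiku-4-5-20251001",
}


def model_for(role: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the pinned model ID for a role, honoring an optional override.

    The override variable name (CLAUDE_MODEL_<ROLE>) is an assumption
    of this sketch, not something any SDK reads on its own.
    """
    override = env.get(f"CLAUDE_MODEL_{role.upper()}")
    return override or DEFAULT_MODELS[role]


if __name__ == "__main__":
    print(model_for("deep"))
```

With this layout, moving to a new snapshot is a one-line change to `DEFAULT_MODELS`, and a staging environment can trial a newer ID through the override variable before production switches.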
Pricing context
Per-token pricing as of May 2026 from Anthropic's documentation:
- Opus 4.7: $5/M input, $25/M output
- Sonnet 4.6: $3/M input, $15/M output
- Haiku 4.5: $1/M input, $5/M output
For most professionals working through claude.ai rather than the API, per-token pricing is academic — usage limits are per-plan, not per-token. But if you're using Claude Code (which uses your API tier), or you're integrating Claude into your own product, the per-token pricing is where the per-model cost calculus lives.
Anthropic's pricing page has the current consumer plan details. Verify before committing to a plan.
Bottom line
Most professionals working in claude.ai should default to Sonnet 4.6 and switch to Opus 4.7 for the deliverables that genuinely matter. Most API users should pick per-task (Opus for depth, Sonnet for volume, Haiku for speed and scale).
The era of "just use Opus for everything" is over: Sonnet 4.6 is good enough for most work and 40% cheaper. The era of "just use Sonnet for everything" is over too: Opus 4.7's reasoning depth across its 1M context and its agentic improvements make it the right choice for the work where output quality actually matters.
Pick the model for the task, not the task for the model.
For more on what changed in Opus 4.7 specifically, see our Claude Opus 4.7 for Professionals deep dive. For Claude vs ChatGPT, see our profession-specific comparison hub.
This article cites model specifications as published in Anthropic's model documentation and pricing as published at claude.com/pricing as of May 2026. Anthropic updates model availability, capabilities, and pricing frequently. Verify current state before procurement or integration decisions.