Claude Opus 4.6: What Early Business Results Look Like

Anthropic released Claude Opus 4.6 on February 5, 2026. The coverage focused on benchmarks and capability comparisons — Opus 4.6 leads on finance agent tasks, legal reasoning, knowledge work, and agentic search. That's useful context, but it doesn't tell you what the model actually does for a business that deploys it.

Here's what early adoption looks like across industries where Opus 4.6 is being put to serious use.

What's Actually New in Opus 4.6

Three changes matter for business users:

Agent teams. You can now orchestrate multiple Claude instances working in parallel on different parts of a complex task. One agent researches, one drafts, one reviews. Output quality on complex multi-step work tends to improve when the task is split this way rather than handed to a single prompt.

1 million token context window. It is in beta, but available. That's enough to load an entire year of customer records, a complete legal case file, or a full product specification into a single session. The model can reason across all of it in one pass, with no chunking and no stitching outputs together manually.

Adaptive thinking. The model adjusts how much compute it applies based on task complexity. Simple questions get fast answers. Complex analysis gets deeper reasoning. This matters for cost management — you're not paying for heavy computation on tasks that don't need it.

Finance: Screening a $2.2 Trillion Portfolio

The most striking early deployment: Norway's Government Pension Fund Global — a $2.2 trillion sovereign wealth fund — began using Claude Opus 4.6 in February to screen its portfolio for ESG risks.

The fund monitors thousands of companies for issues including forced labour, corruption, environmental violations, and governance failures. Previously this required a team of analysts working across fragmented data sources. The Claude deployment processes company filings, news, NGO reports, and regulatory disclosures together — surfacing risk flags that enable earlier divestments and more consistent monitoring at scale.

This isn't theoretical. The fund's assets exceed the annual GDP of most countries. The fact that they're trusting Opus 4.6 with this analysis is a signal worth paying attention to.

Legal: From Research to First Draft in Hours

Law firms using Opus 4.6 for research and drafting report two patterns. First, the model's performance on legal reasoning benchmarks (where it leads the field) translates to noticeably better first drafts: fewer hallucinated citations, more coherent argument structure, and better handling of jurisdictional nuance.

Second, the 1M token context window is genuinely useful for contract analysis. A typical commercial lease or M&A agreement runs 40,000–80,000 words of dense legal text. Feeding the full document produces better analysis than working in chunks — the model can cross-reference clauses, identify internal inconsistencies, and flag where definitions in one section affect obligations in another.
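To make the sizing concrete, here is a rough back-of-the-envelope check of how much of the window a document of that length would use. It is only a sketch: the 1.3 tokens-per-word ratio is an assumed rule of thumb for English prose, not a published figure for any particular tokenizer, and the 50,000-token reserve for instructions and the reply is a hypothetical allowance. Real counts will vary with the text.

```python
# Rough context-budget check.
# TOKENS_PER_WORD is an assumed rule of thumb, not a measured value.
TOKENS_PER_WORD = 1.3
CONTEXT_WINDOW_TOKENS = 1_000_000  # the 1M-token window described above


def estimate_tokens(word_count: int, tokens_per_word: float = TOKENS_PER_WORD) -> int:
    """Estimate a document's token count from its word count."""
    return round(word_count * tokens_per_word)


def fits_in_context(word_counts: list[int], reserve_tokens: int = 50_000) -> bool:
    """Return True if all documents fit in a single session.

    reserve_tokens is a hypothetical allowance for the prompt
    instructions and the model's reply.
    """
    total = sum(estimate_tokens(w) for w in word_counts)
    return total + reserve_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # One 80,000-word agreement, then a bundle of ten of them.
    print(estimate_tokens(80_000))
    print(fits_in_context([80_000]))
    print(fits_in_context([80_000] * 10))
```

Under these assumptions a single long agreement uses only a fraction of the window, which leaves room to load related documents (side letters, amendments, prior versions) alongside it.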

For smaller firms, this doesn't replace legal expertise. It compresses research and first-draft time from days to hours — letting lawyers spend their time on judgment calls rather than document review.

Product and Operations: The Agent Teams Use Case

The agent teams feature is showing the most creative early use cases. One pattern emerging: a research agent, a drafting agent, and a review agent working as a team on content production.

The research agent reads source material and builds a structured summary. The drafting agent takes that summary and produces a first draft. The review agent checks the draft against the source material for accuracy and flags anything that needs clarification. The total time from input to reviewed draft drops dramatically compared to single-prompt approaches — and quality is more consistent.
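The hand-off described above can be sketched in a few lines. This is an illustration of the workflow shape only, not Anthropic's agent teams API: the `Agent` type, `run_pipeline`, and `ReviewedDraft` are hypothetical names, and each agent is modelled as a plain function from prompt text to response text so the example runs without model access.

```python
from dataclasses import dataclass
from typing import Callable

# An "agent" here is any function from prompt text to response text.
# In a real deployment each would wrap a model call; that wiring is
# left out because it depends on your setup.
Agent = Callable[[str], str]


@dataclass
class ReviewedDraft:
    summary: str
    draft: str
    review_notes: str


def run_pipeline(source: str, research: Agent, draft: Agent, review: Agent) -> ReviewedDraft:
    """Run research -> draft -> review, passing each output forward."""
    summary = research(f"Summarise the key facts in this material:\n{source}")
    first_draft = draft(f"Write a first draft from this summary:\n{summary}")
    # The reviewer gets the draft AND the original source, so it checks
    # claims against the material itself rather than against the summary.
    notes = review(
        "Check the draft against the source and flag anything unsupported.\n"
        f"SOURCE:\n{source}\n\nDRAFT:\n{first_draft}"
    )
    return ReviewedDraft(summary=summary, draft=first_draft, review_notes=notes)


if __name__ == "__main__":
    # Stub agents stand in for model calls.
    result = run_pipeline(
        "Q3 revenue rose 12%.",
        research=lambda prompt: "Revenue up 12% in Q3.",
        draft=lambda prompt: "Our Q3 revenue grew 12%.",
        review=lambda prompt: "No issues found.",
    )
    print(result.review_notes)
```

One design choice worth noting: giving the review step the original source, not just the summary, is what lets it catch errors introduced during research as well as during drafting.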

This workflow is being used for proposal writing, market research reports, product specification documents, and competitive analysis. The common thread is tasks that previously required a person to work through multiple sequential steps over hours.

The "Vibe Working" Shift

Anthropic used the phrase "vibe working" around the Opus 4.6 launch — the idea that the model is capable enough that you can describe what you want in natural terms rather than carefully engineered prompts. You don't need to structure your request like a software spec; you can describe the outcome you're after and iterate conversationally.

Seventy-five percent of Anthropic's enterprise customers are now using Claude models in production. The profile of who's using it has expanded to product managers, financial analysts, and operations leads, not just developers. That shift is partly driven by Opus 4.6 being capable enough to handle ambiguous, real-world tasks without needing a prompt engineer to mediate.

What This Means for SMBs

You don't need to be a sovereign wealth fund to benefit. The capabilities that matter for small and mid-size businesses:

Long document analysis — contracts, supplier agreements, compliance documents
Research to draft workflows — proposals, reports, marketing content
Financial analysis — reading and summarising reports, flagging anomalies, building summaries for decisions

If you've tried earlier Claude versions and found them useful but inconsistent on complex tasks, Opus 4.6 is worth a fresh test. The jump in reasoning quality on multi-step work is measurable — not marginal.

The best starting point: take the most time-consuming analytical task your team does regularly, and give Opus 4.6 a real attempt at it with full context. The 1M token window means you don't have to simplify the input. Give it the whole thing and see what comes back.
