Last month, Anthropic released Claude Opus 4, and I've spent the last four weeks testing it extensively. After 30 years in technology, I've learned to be skeptical of "most capable" claims. But this one is real.

This is my honest assessment: what makes Opus 4 special, where the trade-offs are, and whether it changes how your firm should approach AI.

What Changed from Sonnet to Opus 4

For context: Claude Sonnet 4 (released May 2025) is the fast, practical model. Opus 4 is the deep-thinking model. The difference shows up in three areas:

Reasoning Depth

Opus 4 handles multi-step analysis that would otherwise take several Claude Sonnet 4 calls. I also tested it against GPT-4, and Opus 4 consistently produced more logically coherent outputs on complex scenarios.

The difference is structural. Opus 4 reasons through problems differently than Sonnet. It doesn't just retrieve and organize; it works through the logic.

Context Window and Consistency

Opus 4 handles 200,000-token context windows (about 150,000 words). More importantly, it stays consistent throughout. I loaded entire contracts, financial statements, and regulation sets, and Opus 4 maintained reasoning quality across all of it.

In my tests, Sonnet started to lose coherence on very long documents. Opus 4 didn't.
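
Before loading a full contract or regulation set, it helps to check whether it will fit at all. Here is a minimal sketch that uses only the ratio quoted above (200,000 tokens for about 150,000 words). The function names and the 8,000-token output reserve are my own assumptions for illustration, and a real tokenizer will give different counts, so treat the result as a rough estimate.

```python
# Figures taken from the article: 200,000 tokens is roughly 150,000 words.
CONTEXT_WINDOW_TOKENS = 200_000
WORDS_PER_TOKEN = 150_000 / 200_000  # 0.75, a rough average for English prose


def estimate_tokens(text: str) -> int:
    """Rough token estimate based on a whitespace word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 8_000) -> bool:
    """True if the estimated prompt size leaves room for the reply.

    reserved_for_output is an assumed figure, not a documented limit.
    """
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS
```

A 120,000-word document set comes out at about 160,000 estimated tokens, which fits. At 150,000 words you are at the limit with no room left for the answer.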

Code and Technical Reasoning

If your firm uses AI for technical work (data analysis, automation, system design), Opus 4 is noticeably better. It understands edge cases and suggests better architecture.

The Trade-Off: Speed and Cost

Opus 4 is slower and more expensive than Sonnet, and at scale the economics matter. You're not replacing Sonnet with Opus 4. You're using each for what it's good at.

When to Use Opus 4

Based on four weeks of testing, I recommend Opus 4 for:

High-Stakes Strategic Work

Analysis that directly impacts client decisions. Mergers, reorganizations, major strategy shifts. The extra reasoning depth reduces errors and catches edge cases. Sonnet can do this, but Opus 4 is safer.

Cost: One Opus 4 call might cost $2–$5. If it prevents one bad recommendation, it has paid for itself 100 times over.
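
To make that arithmetic concrete, here is a small break-even sketch. It reads "100x" as an avoided loss worth 100 times the call cost, which is my interpretation rather than a measured figure. The function names and the $500 example are illustrative assumptions.

```python
def break_even_probability(call_cost: float, avoided_loss: float) -> float:
    """Minimum chance of catching a bad recommendation at which the call pays for itself."""
    return call_cost / avoided_loss


def expected_net_value(call_cost: float, avoided_loss: float, p_catch: float) -> float:
    """Expected savings from one call, minus what the call costs."""
    return p_catch * avoided_loss - call_cost


# Upper end of the $2-$5 range, with an assumed avoided loss of 100x the call cost.
call_cost = 5.0
avoided_loss = 100 * call_cost  # $500, illustrative only

print(break_even_probability(call_cost, avoided_loss))
print(expected_net_value(call_cost, avoided_loss, p_catch=0.05))
```

Under those assumptions, the call breaks even if it catches a problem on just 1 in 100 engagements. For real client decisions the avoided loss is usually far larger than $500, so the threshold is lower still.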

Complex Compliance and Regulatory Analysis

Parsing regulations, identifying gaps in controls, building compliance frameworks. Opus 4 is exceptional at this because it actually reasons through logical constraints.

Code and System Design

If you're using AI for technical work, Opus 4 produces better code and suggests better architecture. Fewer bugs, cleaner design.

Long-Document Analysis

When you need to analyze full contracts, regulatory sets, or financial statements end-to-end, Opus 4's context window and consistency matter.
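
The four categories above can be turned into a simple routing rule. This is a sketch under stated assumptions: the Task fields, the 50,000-token "long document" threshold, and the "opus-4" / "sonnet-4" labels are placeholders I made up for illustration, not API model identifiers.

```python
from dataclasses import dataclass

LONG_DOCUMENT_TOKENS = 50_000  # assumed threshold; tune it for your own workloads


@dataclass
class Task:
    high_stakes: bool = False            # mergers, reorganizations, major strategy shifts
    compliance_analysis: bool = False    # regulations, control gaps, compliance frameworks
    code_or_system_design: bool = False  # technical work
    estimated_tokens: int = 0            # size of the documents being analyzed


def choose_model(task: Task) -> str:
    """Return a placeholder model label: Opus 4 for deep work, Sonnet otherwise."""
    needs_depth = (
        task.high_stakes
        or task.compliance_analysis
        or task.code_or_system_design
        or task.estimated_tokens >= LONG_DOCUMENT_TOKENS
    )
    return "opus-4" if needs_depth else "sonnet-4"
```

With this rule, a merger analysis goes to Opus 4 and a routine summary goes to Sonnet. The point is to make the decision explicit instead of leaving it to whoever writes the prompt.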

When NOT to Use Opus 4

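Don't use it for routine, high-volume work where speed and cost matter more than reasoning depth. That is what Sonnet is for.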

What This Means for Your AI Stack

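If you're building an AI capability in 2025, this is my recommendation: run both models. Keep Sonnet as the fast, practical default and reserve Opus 4 for the work where reasoning depth pays for itself.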

The Enterprise Implication

As of June 2025, we're reaching the point where enterprise AI comes down to judgment, not capability. The models are good enough for almost any professional services use case. The question is where you apply them, and how.

Opus 4 represents that shift. It's not revolutionary. It's evolutionary. But evolution matters when you're building something at scale.

My Recommendation

If your firm is serious about AI, get access to Opus 4. Try it on your hardest problems. You'll find 2–3 workflows where it makes a real difference. Those become your Opus 4 use cases.

In 2025, that's how you compete: not by having AI, but by using the right AI for the right problem.

Want to discuss AI strategy for your firm?

Book a free 30-minute assessment — no pitch, just practical insights.
