Anthropic released Claude 3.5 Sonnet last month, and I've spent the last three weeks testing it against Claude 3 Opus, GPT-4o, and Gemini on the workflows I actually implement for professional services firms. The conclusion is unambiguous: for most firms, Sonnet is the right choice. Not for all tasks, but for the vast majority.

Here's why I'm recommending it to everyone and how to deploy it.

The Benchmark Numbers

On standard benchmarks, Claude 3.5 Sonnet outperforms Claude 3 Opus on most metrics. That's unusual: normally you have to move up to the flagship model to get better performance. This time, better engineering means the mid-tier model wins.

The specific numbers matter less than the direction: Sonnet is faster than Opus, cheaper than Opus, and comparable or better in quality on most professional services tasks.

What Changed

Speed: In my testing, Sonnet is noticeably faster than both Opus and GPT-4o. That matters most in workflows where latency counts (real-time client interaction, bulk processing).

Quality on practical tasks: On document analysis, routine legal reasoning, and research synthesis, Sonnet matches Opus, and on code generation it is ahead. There are edge cases where Opus is better, but they're rare in professional services workflows.

Cost: Cheaper per token than Opus. For high-volume workflows, this matters.

Consistency: In my testing, fewer hallucinations than earlier models and more predictable output from run to run, which is what production systems need.

Where Sonnet Wins

Document processing: Contract analysis, proposal generation, intake summaries. Sonnet is excellent. I've tested it on hundreds of contracts. Quality is reliably high.

Code generation: If you're building internal tools or automating workflows with code, Sonnet outperforms Opus on code quality and speed.

Research and analysis: Summarizing documents, synthesizing information from multiple sources, identifying patterns. Sonnet handles this well.

Email and communication drafting: Client emails, internal memos, status updates. Sonnet produces output that requires minimal editing.

Where You Still Need Opus

Complex legal reasoning: If your task requires understanding edge cases, identifying subtle risks, or reasoning about unusual contract language, Opus is better. Not dramatically, but measurably.

Strategic analysis: Open-ended strategic questions where you need deep reasoning and nuance. Opus is more reliable.

Edge cases: If you hit something Sonnet can't handle, escalate to Opus. Budget for 5-10% of your volume, the most complex workflows, going to Opus.
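A minimal sketch of that escalation rule, assuming a setup where every request carries a task label. The task labels, the `call_model` function, and the `needs_escalation` check are all hypothetical placeholders for whatever your system uses. The model identifiers are the ones Anthropic published for these two models, but verify them before relying on them.

```python
from dataclasses import dataclass
from typing import Callable

# Model identifiers as published by Anthropic; verify before use.
SONNET = "claude-3-5-sonnet-20240620"
OPUS = "claude-3-opus-20240229"

# Hypothetical task labels mirroring the categories named above.
OPUS_TASKS = {"complex_legal_reasoning", "strategic_analysis"}


@dataclass
class RoutedResult:
    model: str       # model that produced the final output
    output: str
    escalated: bool  # True only if Sonnet ran first and was rejected


def route(
    task_type: str,
    prompt: str,
    call_model: Callable[[str, str], str],
    needs_escalation: Callable[[str], bool],
) -> RoutedResult:
    """Default to Sonnet; send known-hard tasks or rejected outputs to Opus."""
    if task_type in OPUS_TASKS:
        return RoutedResult(OPUS, call_model(OPUS, prompt), escalated=False)

    output = call_model(SONNET, prompt)
    if needs_escalation(output):
        # Sonnet's answer failed the check, so retry the same prompt on Opus.
        return RoutedResult(OPUS, call_model(OPUS, prompt), escalated=True)
    return RoutedResult(SONNET, output, escalated=False)
```

In practice `needs_escalation` might be a reviewer flag, a validation step, or a confidence heuristic. The sketch only fixes where that decision plugs in.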

The Practical Recommendation

Here's what I'm implementing for every client now:

The Cost Picture

For a typical professional services firm processing 5-10M tokens per month across all workflows:

All Claude 3 Opus: $500-1,000/month

All Claude 3.5 Sonnet: $150-300/month

Blended (90% Sonnet, 10% Opus): $200-400/month

Quality difference: negligible. Cost difference: 3-5x.
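To check the blended figure against your own volumes, the arithmetic can be sketched as below. It uses the monthly estimates above as inputs, not list prices, and assumes the Opus share is measured as a fraction of total spend-weighted volume. The function name and parameters are illustrative.

```python
def blended_cost(sonnet_cost: float, opus_cost: float, opus_share: float = 0.10) -> float:
    """Monthly cost if `opus_share` of volume runs on Opus and the rest on Sonnet.

    `sonnet_cost` and `opus_cost` are what the full volume would cost
    on each model alone.
    """
    return (1 - opus_share) * sonnet_cost + opus_share * opus_cost


# Low and high ends of the monthly estimates above, in dollars.
low = blended_cost(sonnet_cost=150, opus_cost=500)
high = blended_cost(sonnet_cost=300, opus_cost=1000)

print(f"Blended: ${low:.0f}-{high:.0f}/month")
print(f"All-Opus vs all-Sonnet: {500 / 150:.1f}x")
```

On these inputs the blend works out to roughly $185-370 per month, which the $200-400 figure rounds up, and the all-Opus to all-Sonnet ratio is about 3.3x.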

The Risk

My only concern: what if a specific workflow needs Opus and you don't find out until you're in production? That's manageable. Your system should log which model handled each request and which outputs required escalation. After two weeks in production, you'll know which tasks genuinely need Opus. Route those to Opus going forward. The rest stay on Sonnet.
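One way to turn that log into a routing decision is sketched below. The record fields and the 25% threshold are assumptions for illustration; pick a cutoff that fits your own tolerance for rework.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class CallRecord:
    workflow: str    # e.g. "contract_review" (hypothetical label)
    model: str       # model that produced the final output
    escalated: bool  # True if the Sonnet output had to be redone on Opus


def escalation_rates(records: Iterable[CallRecord]) -> Dict[str, float]:
    """Fraction of calls per workflow that ended up escalated."""
    totals: Dict[str, int] = defaultdict(int)
    escalations: Dict[str, int] = defaultdict(int)
    for record in records:
        totals[record.workflow] += 1
        if record.escalated:
            escalations[record.workflow] += 1
    return {w: escalations[w] / totals[w] for w in totals}


def workflows_needing_opus(
    records: Iterable[CallRecord], threshold: float = 0.25
) -> List[str]:
    """Workflows whose escalation rate meets the (assumed) threshold."""
    rates = escalation_rates(records)
    return sorted(w for w, rate in rates.items() if rate >= threshold)
```

Run this over two weeks of records and send the workflows it returns straight to Opus. Everything else stays on the Sonnet default.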

What to Do This Week

If you're already running workflows on Claude 3 Opus or GPT-4o, take one major workflow and run it on Sonnet for a week. Compare quality. Compare latency. Compare cost. My prediction: you'll switch to Sonnet for that workflow, and you won't look back.
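For the latency half of that comparison, a small harness along these lines is enough. It assumes you wrap each model behind a function that takes a prompt and returns text. The `benchmark` name and the wrapper are hypothetical, and quality still needs a human reading the outputs side by side.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def benchmark(prompts: List[str], call_model: Callable[[str], str]) -> Dict[str, object]:
    """Run every prompt through one model and record wall-clock latency."""
    latencies: List[float] = []
    outputs: List[str] = []
    for prompt in prompts:
        start = time.perf_counter()
        outputs.append(call_model(prompt))
        latencies.append(time.perf_counter() - start)
    return {
        "mean_latency_s": mean(latencies),
        "max_latency_s": max(latencies),
        "outputs": outputs,  # keep these for side-by-side quality review
    }
```

Run it once per model on the same list of real prompts from the workflow, then compare the two result dictionaries alongside the invoices for the week.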
