Meta released Llama 2 this week with openly downloadable weights and a license that permits commercial use. If you work in professional services and you're thinking about AI adoption, this changes your options in meaningful ways. Not because Llama 2 is suddenly better than proprietary models (it isn't), but because the economics shift entirely when you can run AI on your own servers.
The Open Source Advantage
Let me be direct: right now, most professional services firms pay for AI through hosted services, either as a flat subscription or per token through a cloud API. You type into ChatGPT or Claude, the prompt goes to OpenAI's or Anthropic's servers, and you get back a response. Through the API, each prompt is billed by the token.
That model has two problems for professional services. First, your work product—client communications, strategic documents, sensitive information—is traveling across the internet to third-party servers. Some firms have compliance or confidentiality concerns with that arrangement.
Second, the economics don't scale well. At high volume, you're paying OpenAI to solve problems you could solve more cost-effectively on your own infrastructure.
Open source models like Llama 2 change this. You can download the weights, run the model on your own servers, and process as many prompts as your hardware can handle for the cost of the hardware, the electricity, and the people who keep it running. No API fees. No data leaving your network.
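To make the volume argument concrete, here is a minimal break-even sketch. Every number in it is a placeholder I made up for illustration, not a quoted price from any vendor, and the function names are hypothetical. Plug in your own API rate, hardware quote, and staffing cost.

```python
def api_cost_per_month(tokens_per_month: float, usd_per_1k_tokens: float) -> float:
    """Usage-based bill: grows linearly with token volume."""
    return tokens_per_month / 1000 * usd_per_1k_tokens


def self_hosted_cost_per_month(
    hardware_usd: float,
    amortization_months: int,
    power_and_hosting_usd: float,
    staff_usd: float,
) -> float:
    """Roughly fixed bill: amortized hardware plus running costs."""
    return hardware_usd / amortization_months + power_and_hosting_usd + staff_usd


def break_even_tokens_per_month(fixed_usd_per_month: float, usd_per_1k_tokens: float) -> float:
    """Monthly token volume at which the API bill equals the fixed self-hosted bill."""
    return fixed_usd_per_month / usd_per_1k_tokens * 1000


if __name__ == "__main__":
    # All values below are illustrative assumptions, not real quotes.
    fixed = self_hosted_cost_per_month(
        hardware_usd=36_000,
        amortization_months=36,
        power_and_hosting_usd=500,
        staff_usd=2_000,
    )
    rate = 0.002  # assumed USD per 1,000 tokens
    print(f"Self-hosted fixed cost: ${fixed:,.0f}/month")
    print(f"Break-even volume: {break_even_tokens_per_month(fixed, rate):,.0f} tokens/month")
```

Below the break-even volume the API is cheaper; above it, your own hardware wins, provided that hardware can actually serve the volume. The sketch ignores throughput limits and quality differences between models, so treat it as a first pass.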
The Practical Reality Today
But—and this is important—Llama 2 isn't at GPT-4 quality levels. It's competitive with earlier versions of GPT-3.5. For some tasks, that's more than enough. For others, you still need the more capable proprietary models.
If you're doing simple email triage, meeting summaries, or initial intake automation, Llama 2 is excellent and you could run it on-premise right now. The cost difference is dramatic: essentially just your infrastructure costs instead of OpenAI's per-token charges.
If you're doing complex contract analysis or nuanced legal research, you still need Claude or GPT-4. Those capabilities matter more than cost savings.
What This Enables
The real opportunity for professional services firms is a hybrid approach: use Llama 2 on-premise for the high-volume, routine work that doesn't require maximum capability, and use proprietary models through APIs for the work that does.
This looks different from firm to firm. A legal practice might run Llama 2 to summarize client intake questionnaires and draft initial engagement letters, then use Claude for complex document analysis. An accounting firm might use Llama 2 to categorize and summarize client communications, then use GPT-4 for technical research.
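In code, the hybrid approach comes down to a routing decision made before any model is called. The sketch below is one way to express it. The task names, the `Task` type, and the "needs_review" rule for confidential material are all my assumptions for illustration; your firm's policy will differ.

```python
from dataclasses import dataclass

# Hypothetical task labels for high-volume, routine work that a local model can handle.
ROUTINE_TASKS = {
    "email_triage",
    "meeting_summary",
    "intake_summary",
    "engagement_letter_draft",
}


@dataclass(frozen=True)
class Task:
    kind: str
    client_confidential: bool = False


def choose_backend(task: Task, allow_confidential_to_api: bool = False) -> str:
    """Return 'local', 'api', or 'needs_review' for a task.

    Routine work stays on the on-premise model. Everything else goes to a
    proprietary API, unless it is confidential and firm policy (assumed here)
    forbids sending that material off the network.
    """
    if task.kind in ROUTINE_TASKS:
        return "local"
    if task.client_confidential and not allow_confidential_to_api:
        return "needs_review"
    return "api"


if __name__ == "__main__":
    print(choose_backend(Task("meeting_summary", client_confidential=True)))
    print(choose_backend(Task("contract_analysis")))
    print(choose_backend(Task("contract_analysis", client_confidential=True)))
```

Keeping the decision in one small function means the policy can be reviewed by partners and compliance staff without reading the rest of the system.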
You get cost control, data privacy, and the ability to customize the model for your specific workflows. Llama 2's weights are open, which means you can fine-tune it on your firm's historical data if you want to build something truly proprietary.
The Barriers Are Real
I need to be honest about the friction: running Llama 2 yourself requires infrastructure knowledge that most professional services firms don't have in-house. You need to provision servers, manage GPU capacity, handle updates, monitor uptime. It's not trivial.
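A quick way to size the GPU question is to estimate how much memory the model weights alone need: parameter count times bytes per parameter. Llama 2 ships in 7B, 13B, and 70B parameter sizes. The sketch below is a rough lower bound under that simple assumption; real deployments also need memory for the attention cache and activations, which it leaves out, and the function name is hypothetical.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Llama 2 model sizes, in billions of parameters.
LLAMA2_SIZES = {"7B": 7, "13B": 13, "70B": 70}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return params_billions * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for name, size in LLAMA2_SIZES.items():
        row = ", ".join(
            f"{precision}: {weight_memory_gb(size, precision):.1f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"Llama 2 {name} -> {row}")
```

Comparing these figures with the memory on the GPUs you can actually buy or rent tells you quickly whether a given model size is a single-card job or a multi-card project.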
There are managed Llama 2 services starting to appear (cloud providers are adding it to their AI services), which reduce the friction. But you're still making a technical decision that goes beyond the "buy a subscription" model.
For some firms, the complexity won't be worth it. For others—especially larger firms with IT teams and security concerns—it's a significant shift.
The Timing
Open source AI is still early. Llama 2 is among the first openly available models that is genuinely usable for commercial work at scale. The question for your firm isn't "should we switch to Llama 2 today" but "should we plan a strategy that includes open source models as a key piece of our AI infrastructure?"
I think the answer is yes, especially if confidentiality or compliance is a concern for your client work. The technology is ready for routine tasks. The economics are compelling at volume. What you need is a clear use case and the willingness to invest in a slightly more complex setup.
The firms that get ahead of this will have cost and privacy advantages that their competitors won't match for another two years.