The Confusing World of LLMs

Andy Abramson
Sep 29, 2024


Ladies and gentlemen, fellow explorers of technology, I stand before you today with a simple confession: I’m confused! The world of technology has always been vast, but today, it feels like we’re not just navigating a sea of change — we’re drowning in it. Everywhere you look, there’s a new acronym, a new promise, a new frontier, and no space is more emblematic of this whirlwind than the universe of large language models — LLMs.

Anthropic has Claude. OpenAI has ChatGPT. Google has Gemini. Meta has Llama 3.1. And, like you, I’m wondering: “Which one do I choose?” If you’re feeling lost in this vast array of options, know that you’re not alone. Even the great Steve Jobs would be scratching his head on this one.

Let’s talk about Claude 3.5 from Anthropic. Now, this model has an impressive ability to dive deep into context. It’s like speaking with someone who not only remembers what you said last week but understands how it connects to what you’re saying now. For multilingual tasks — writing across languages, translating on the fly — it’s an absolute gem. But, and this is a big but, it’s missing something. It doesn’t handle images, videos, or audio with the same ease some of its competitors do. So if you’re looking for raw, thoughtful analysis or that crucial contextual insight, Claude is your friend. But if your task is multimedia-heavy, you might feel like you’re missing a few pieces.

Now, OpenAI’s ChatGPT, especially its latest version, GPT-4, is what I would call the multitasker’s best companion. It’s a Swiss Army knife — it codes, it writes, it speaks, and it even interprets images. With recent upgrades, it can now process voice and visuals. That’s huge. You’re no longer limited to text; you can engage with it in ways that are tactile, that feel human. But, as versatile as ChatGPT is, it sometimes struggles with more nuanced, context-heavy tasks. If you need a deep-dive analysis — well, Claude might still have the edge.

Then there’s Google’s Gemini 1.5 Pro. Multimodal processing — it’s a fancy way of saying it sees, hears, and understands. Google has always been brilliant at this. Gemini not only processes text but also images, video, and sound seamlessly. If you live within Google’s ecosystem — think Workspace, Pixel devices, and all things Android — Gemini is like that assistant who’s always one step ahead of you. But if you’re someone who prefers to mix platforms and stay agnostic in your tools, this integration might feel a bit too restrictive. You’ll love it… if you’re all-in with Google.

Now, Meta’s Llama 3.1 is a different beast. It’s open-source, powerful, and made for those who want to build something their way. It can process enormous documents, and I mean huge: we’re talking up to 128,000 tokens. That’s like handing your AI an entire full-length novel, on the order of 300 pages, and asking it to summarize. But, and this is crucial, it’s not as user-friendly as the others. It’s not the slick, out-of-the-box experience you’d get with ChatGPT or Gemini. It’s more complex, more technical, but if you want something deeply customized, there’s nothing like it.
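To put 128,000 tokens in perspective, here is a rough back-of-the-envelope sketch. The conversion factors are assumptions, not properties of Llama’s tokenizer: about 0.75 English words per token is a common rule of thumb, and 300 words per page is a typical paperback estimate. The function names are made up for illustration.

```python
# Rule-of-thumb constants (assumptions; real ratios vary by tokenizer and text).
WORDS_PER_TOKEN = 0.75   # common approximation for English prose
WORDS_PER_PAGE = 300     # rough paperback page


def tokens_to_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * words_per_token)


def tokens_to_pages(tokens: int, words_per_page: int = WORDS_PER_PAGE) -> int:
    """Estimate how many printed pages that many tokens would fill."""
    return round(tokens_to_words(tokens) / words_per_page)


context_window = 128_000  # the figure quoted for Llama 3.1
print(tokens_to_words(context_window))  # 96000
print(tokens_to_pages(context_window))  # 320
```

Under those assumptions, the window holds somewhere around 96,000 words, or a few hundred pages of text in a single request.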

So, here we are: four incredible tools, each with its own strengths and weaknesses. But I think the real question isn’t *which* is the best; it’s what do you need?

- Do you need a nuanced analysis, something that digs deep into language and meaning? Go with Claude.
- Do you need versatility, the ability to write, code, and analyze all in one platform? Go with ChatGPT.
- Do you live in a world of images and video, where multimodal input is key? Gemini is for you.
- Or, are you building something unique, something customized and vast? That’s where Llama 3.1 shines.

This isn’t a competition; it’s a toolbox. Each model is a tool, and the art lies in picking the right one for the job. As I’ve always believed, simplicity is the ultimate sophistication. And yet, even in simplicity, you have to choose the right components, the right partners, and the right tools to create something beautiful.

In the end, the best technology is the one that helps you tell your story and build your vision. These LLMs are just that: tools. So, before you get lost in the technical jargon, ask yourself: what are you trying to achieve? Define your goals, then pick the tool that gets you there fastest, simplest, and with the least friction.

That’s where true innovation lies. It’s not in the machine. It’s in what *you* do with it.
