AI Update #19 - How Meta is transforming the AI Landscape
Meta’s open AI approach is fostering competition and challenging OpenAI and Google.
Over the weekend, during dinner, I was asked why Meta releases Llama for free instead of charging a lot of money, as OpenAI and Anthropic do for ChatGPT and Claude.
To be the agent of chaos I strive to be, I gave them a conspiracy answer and moved on to the next bottle of wine.
Llama is a Trojan horse: Meta’s team has been secretly aligning the model to push users to spend more time on Meta’s apps and generate more ad revenue. And since neural networks are not explainable, they maintain perfect plausible deniability.
Judging by an “it makes so much sense!” text I got hours later, I guess the conspiracy theory answer worked.
Unlike my answer, the question is genuinely valid. I touched on this topic after Llama 3 was released, but a lot has changed since then. Let’s unravel the events together and see how this choice has played out for both Meta and the rest of the AI world.
Meta’s Contribution to OSS
Meta is not new to Open Source Software (OSS). React is the world’s most used frontend framework; if you’re reading this post on the web or in the app, you’re using it right now. PyTorch is the most popular machine learning library; pretty much every LLM and ML algorithm has been built on top of it.
What do these two libraries have in common? They’re both OSS that came out of Meta’s research. And alongside them sit another 130+ projects, ranging from website-building tools like Docusaurus—best name ever—to infrastructure software and standards like the Open Compute Project. Meta’s contribution to OSS has been transformational.
Such contributions didn’t happen out of the goodness of Meta’s heart. The reasons can, in fact, be broken down into two parts: cost saving and competition control, so that’s where we will focus.
Cost Saving
In April, Mark Zuckerberg did an interview with Dwarkesh Patel where he was asked:
DP: “Would you open-source a $10B model?”
MZ: “As long as it’s helping us, yeah.”
How does open-sourcing AI help Meta? In the same way the Open Compute Project helped them massively reduce datacenter costs. Once their designs were standardized, the supply chain became more efficient, components became cheaper, and, as an added bonus, everything fit by default in Meta’s infrastructure.
In other words, they spurred competition outside their core business that directly benefited them, and pretty much everyone else with a datacenter. This, in turn, eliminated Meta’s reliance on a few select manufacturers while reducing costs.
By open-sourcing compute architectures, they commoditized them.
We discussed how AGI is not exactly around the corner, and one major reason is that AI research is very expensive. In a completely closed ecosystem with no collaboration, every major finding must be independently rediscovered by each lab, with no guarantee that it will be. That’s where an open1 LLM helps.
Fundamental innovations like LoRA/QLoRA and model quantization were researched and refined for at-scale deployment largely thanks to Llama. The same can be said about llama.cpp—a library for running LLMs on consumer hardware that is today the foundation upon which virtually every open LLM runs—as well as Alpaca, Vicuna, LLaVA, and countless others. An open Llama was also an enabling factor for research on ethical AI, bias, discrimination, and safety. The AI ecosystem created by Meta is massive.
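As a concrete illustration of the kind of optimization this ecosystem drove, here is a minimal sketch of symmetric int8 weight quantization in NumPy. It shows only the basic idea of shrinking a model’s memory footprint; real quantizers (like those in llama.cpp) are far more sophisticated, with per-block scales and lower bit widths.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127]."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# A toy "layer" of fp32 weights: 4096x4096, as in mid-size transformer blocks.
w = np.random.randn(4096, 4096).astype(np.float32)
q, scale = quantize_int8(w)

# int8 storage is 4x smaller than fp32; rounding error is bounded by scale/2.
error = np.abs(w - dequantize(q, scale)).max()
print(f"memory: {w.nbytes / 2**20:.0f} MiB -> {q.nbytes / 2**20:.0f} MiB, "
      f"max error {error:.4f}")
```

The 4x memory reduction (and the cheaper integer arithmetic that comes with it) is exactly the kind of serving-cost lever the community refined on top of open weights.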
All this research "outsourced" to the community directly translates to cost savings for Meta. Imagine the impact of reducing the cost of running meta.ai by just a few percentage points for a project that runs at such a massive scale.
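To make the scale argument concrete, a back-of-envelope calculation. The dollar figures below are made-up placeholders purely to illustrate the effect, not Meta’s actual numbers.

```python
# Hypothetical annual cost of serving a product like meta.ai at Meta's scale.
annual_inference_cost = 2_000_000_000  # $2B/yr, an assumed placeholder

# A modest 3% efficiency gain from community-driven optimizations
# (better kernels, quantization, serving tricks).
efficiency_gain = 0.03

savings = annual_inference_cost * efficiency_gain
print(f"${savings:,.0f} saved per year")
```

Even single-digit percentage gains, “outsourced” to the open community, translate into tens of millions of dollars a year at that scale.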
A natural question at this point: isn’t Meta commoditizing itself? My take is that it isn’t. Meta is not Llama. Llama is a service offered on top, but the data it has access to when served, for instance, on Instagram remains accessible only to Meta. Moreover, Meta, unlike OpenAI and Anthropic, doesn’t make money by selling LLMs or API access to them.
Impact on Semiconductors
The impact of Meta’s strategy can be seen at very different layers. I spent the past months talking to companies developing custom chips/ASICs for inference and edge AI. I’ve already discussed how, in search of a sustainable business model for AI, a lot of research is going into inference optimization, and we also ran the numbers to understand the cost companies sustain when serving LLMs at scale. So it’s not surprising to see the focus on developing innovative inference solutions rather than training solutions. The general perception is that training happens once while inference is a continuous process, and to an extent, this is true.
I’ll get back (again) to the inference market in a future analysis, but for now the important point is that every company designing its own inference chips supports one particular LLM out of the box. Guess which? Of course, it’s our favourite camelid: Llama.
These chips are not specifically designed for or around Llama; in most cases they can run any Transformer quite efficiently. But chips require a software stack, and that stack is complex and costly to develop, so it’s no surprise that the first thing chip manufacturers package them with is the most popular LLM out there.
Going back to the main point: Meta has created a strong incentive for all newly designed chips to immediately support their architecture. Not only does this create a hardware+software platform of enormous value, but it also shields Meta from supply constraints. Meta already designs its own chips, like most other large companies and hyperscalers, but what if they need to scale beyond what their supply chain can fulfill? Having a number of new manufacturers that are capable of running your own architecture efficiently is a great way to mitigate future supply issues and ensure diversification.
Had no one released an open and capable LLM to the world, the inference market would look very different today; there would be little reason to design chips to run LLMs that are in the hands of two companies.
Competition Control
Meta is not just saving money and getting “free” R&D; it’s creating a platform. Platforms are powerful if you have the user base, and commoditizing the platform is a way to defang those who keep LLMs segregated, OpenAI and Anthropic first among them.
In a hypothetical world where all LLMs are closed and only accessible via API, the success of a platform would depend heavily on the capabilities, cost, and transparency of each LLM provider. Yes, transparency. You can’t build a reliable product on a black box that changes every few weeks without warning as to what and why. OpenAI demonstrated this concept clearly. I’d like to figure out exactly how much money OpenAI makes from the GPT Store, but I suspect it’s not their major revenue driver.
Once the core of your platform, the LLM, becomes a commodity, the community can start to build on it. Do we all need access to GPT-4 or Claude every single time? Most definitely not, so why shouldn’t we differentiate? Enterprises can rely on a private, possibly local (and cheaper) model for most of their tasks and defer to the premium ones only when needed.
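That local-first, escalate-when-needed pattern can be sketched as a toy router. Everything here is a hypothetical stand-in: the two model functions are stubs, and real routers use classifiers or confidence scores rather than a length check.

```python
def local_model(prompt: str) -> str:
    # Stand-in for a local open-weights model (e.g. served via llama.cpp).
    return f"[local] {prompt}"

def premium_model(prompt: str) -> str:
    # Stand-in for a paid API call to a frontier model.
    return f"[premium] {prompt}"

def needs_premium(prompt: str) -> bool:
    # Toy heuristic: escalate long or explicitly hard requests.
    return len(prompt) > 200 or "prove" in prompt.lower()

def route(prompt: str) -> str:
    """Send cheap/easy prompts to the local model, hard ones to the premium one."""
    if needs_premium(prompt):
        return premium_model(prompt)
    return local_model(prompt)

print(route("Summarize this meeting note."))
print(route("Prove this theorem about convex sets."))
```

The economics follow directly: the commodity model absorbs the bulk of the traffic, and the premium provider only sees the (smaller) residual.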
Mind you, this strategy only works if you don’t make money directly from the platform. OpenAI makes money from applications built on top of ChatGPT, and the same goes for Anthropic. We already mentioned how Meta does not make money on Llama2, but it does make sure that competitors in the space become less profitable. Your face might look like this (below), but let me explain.
Meta is encouraging competition at the application level. The more players in this space, the less profitable it becomes. At some point margins shrink, and those providing the “commodity” at a premium start to see a ceiling on their profits that won’t go away. This move also prevents the emergence of a monopoly (think of Intel funding RISC-V to mitigate the threat from ARM). In the end, the competition stays far from Meta’s core business and helps ensure that other providers won’t become (too) dominant. It’s also a win for both app builders and consumers: “my LLM, my rules,” or in other words, no walled gardens, no vendor lock-in, no unwanted, sudden, or unexplained changes.
I digress here, but this strategy won’t work if you actually profit from the commodity you’re creating, or in a word: StabilityAI. What Emad did—releasing StableDiffusion to the world for free—created a community almost as large as the one around Llama, but it didn’t benefit the company in the slightest. Despite that, thank you, Emad, for letting the rest of us get into image generation, something that would otherwise never have happened.
OpenAI, Anthropic, Mistral, and others will keep competing for users, as consumers are sensitive to quality. GPT-4o fixed GPT-4’s issues in the worst possible way, and when Claude 3.5 (Sonnet, not even Opus) was released, users moved there en masse, to the point that they brought Claude down (okay, this last part is anecdotal!).
OpenAI has reportedly been doing very well, although now that I’ve moved my $20 from OpenAI to Anthropic, I am absolutely sure they will have a hard time meeting their revenue targets. But Claude 3.5 (again, Sonnet!) feels like such a major upgrade over ChatGPT that the “user migration” I’ve been reading about might really be happening, at least at the consumer level. We’ll know more in Q4. The fact remains that most of these users will keep bouncing from one provider to the other as capabilities evolve, and in the process Meta, through meta.ai embedded in IG, WhatsApp, and Facebook, will capture a good chunk of the “non-power-user” base.
Meta Is Not Alone
Meta might have been the first, but the company is certainly not alone anymore. The concept of commoditizing your complement was hardly foreign to Google or Microsoft.
Google just released Gemma 2. While it’s too early for a verdict, the majority of people who have managed to run it using the aforementioned llama.cpp report very good performance. Some say—and take this with a grain of salt—that the 27B version outperforms Llama 3 70B. Results on Chatbot Arena, though very preliminary, place Gemma 2 within Llama 3’s confidence interval, so at worst they may be on par. Shockingly for Google, and according to the same rumors, the model appears to be much less censored than both Llama 3 and the previous Gemma.
Censorship and “safety”3 are another plague of the AI world, or, to put it in Zuckerberg’s words, “a more mundane threat”. We will discuss this in a future analysis as well.
Microsoft released Phi-3 as part of its small language model series, also with excellent performance for its size. DeepSeek released DeepSeek-V2, and developers are finally taking notice of its great performance in code writing (and not only there).
Open models are not “good for PR”; if anything, they might even represent a reputational risk (hence the existence of overly censored models). But strategically, they’re the only way to prevent Meta from being too successful. If you cannot beat them, fragment them.
A notable exception is OpenAI, which went in the opposite direction: it clammed up, releasing nothing at all, and moved on the diplomatic front, with Altman actively lobbying for AI regulation and calling for increased scrutiny and government oversight. As OpenAI considers a shift to a for-profit structure, one has to wonder whether Altman is genuinely concerned for humanity’s future or is positioning himself to be in the room when regulations against his own company are discussed. With OpenAI as closed as it has become, I wouldn’t be surprised if his advice to members of Congress and senators steered away from an open world of AI. And Sam is very good at pushing his doomsday agenda: apparently, every second we are on the brink of total annihilation from one of his inventions that won’t be released because it’s too dangerous for humankind.
The odd one out is Mistral, which treads a fine line. Its playbook is very close to what Stability is doing today (with the difference that Stability runs the strategy as a way to attract a new user base): offer smaller, high-quality models for free while providing larger, high-performance premium models via API for a price.
That’s the theory, at least, but lately Mistral has been lagging behind other LLMs, scoring within the top 10 in Chatbot Arena only on French prompts. There might be local political reasons to bail them out, should it come to that, but the team has proven its worth, so I hope they’ll remain competitive.
The main difference between Mistral and Llama is that Llama is now its own brand, with vast resources and immediate reach to millions of developers, while Mistral is still the underdog, offering reasonable-quality LLMs to a much smaller niche of users.
Closing Lines
Meta needed a strategy to mitigate OpenAI’s momentum and to reduce its own internal costs, both in infrastructure and in human capital for R&D. The choice of going open is paying off: Llama has enabled countless people and companies to develop their own solutions around a powerful LLM while remaining in full control of their own products.
Instead of playing the game of the conqueror, trying to capture as much market as possible, they decided to create competition for those companies attempting this exact strategy. After all, Meta can already count on a user base that is the majority of humanity, so their struggle is not user acquisition as much as it is monetization.
Investments in both open models and infrastructure show that Meta believes GenAI is the future and wants to be part of it. While figuring out the business side, they’re taking the lead in refining both the technology and the cost of serving it, an essential step toward offering these services at a profit.
OpenAI and Anthropic are left to fight and compete for the best tech, and chances are the lead will shift from one to the other for a while. But, as mentioned, not everyone needs the best LLM every single time, so companies will be more than happy to cut their expenses—and reduce their data exposure—by paying an inference provider and switching to the major LLMs only on a per-need basis.
See you next time!
1. I refer to Llama as an “open model”, but there’s a finer distinction to be made: open source vs. open weights vs. closed source. An open-source model comes with its source code, the model itself, the weights, and sometimes the training dataset; StableDiffusion is an example. Open-weight models come with the model and weights but no source code and generally no dataset; Llama and Mistral fall into this category. Closed-source models don’t provide access to anything except via an API; examples are GPT-4 and Claude.
2. Technically they could: Meta requires a license if Llama is offered commercially by a company with over 700M monthly active users. You can be sure this is not very profitable for them; it’s mostly intended to prevent companies like Amazon from profiting off Meta’s GPU time without paying a share.
3. “Safety” is in quotes because I mean all the alignment companies force on models to ensure prompts like “help me create an atomic bomb” and “how to hotwire a car” don’t receive an answer. Another example is the forced-diversity issue with Gemini. I do agree on the mundanity of these concerns, as a quick Google search immediately provides a wikiHow answer that is no more dangerous than any LLM response could ever be.