WTF?! Few companies seem less likely to run out of AI capacity than Google and Meta, but even the industry's biggest names can hit a token wall. The search giant has reportedly limited the Facebook owner's use of Gemini after demand for AI capacity grew beyond what it could supply.

According to the Financial Times, Google warned Meta around March that it could not provide all the capacity the company wanted, a shortfall that disrupted and delayed internal AI projects at Meta.

The restrictions are still said to be in place. Meta has reportedly told employees to be more careful with AI tokens, the units used to measure model input, output, and usage. That's quite the change of tone for a company that has spent the past year pushing – and in some cases forcing – staff to use AI as much as possible.

Meta has spent billions building its own Llama family of open models, while Mark Zuckerberg has been pitching AI as the company's next defining platform, one that Meta will hope does not go the same way as its metaverse bet.

But people familiar with the arrangement told the FT that Meta had been using Google's Gemini models for customer service, advertiser chatbots, coding, harmful content takedowns, and scam detection. Gemini was reportedly chosen because it performed better than Meta's own models. Anthropic's Claude is also said to be in the mix.

The shortage didn't hit only Meta; other Google customers were also affected, though less severely. Meta appears to have been the outlier because of the amount of Gemini capacity it wanted to buy.

The reliance is not entirely surprising. Unlike Google, Microsoft, or Amazon, Meta doesn't operate a cloud business of its own, leaving it to balance its internal AI systems with outside capacity bought from the same companies it competes against.

Google has been spending heavily on data centers and AI hardware, but demand is still arriving faster than capacity can be built. Google Cloud revenue passed $20 billion in Alphabet's most recent quarter, while backlog nearly doubled to more than $460 billion. The company also said its first-party models were processing more than 16 billion tokens per minute through direct API use, up 60% from the previous quarter.

Meta is trying to solve the same problem, but from the other direction. It's been expanding data centers and working with Broadcom on custom MTIA accelerators as it looks to rely less on rivals.