The Nvidia tax has Silicon Valley hunting for more AI chips

A version of this article originally appeared in Quartz’s members-only Weekend Brief newsletter.

Sam Altman has been complaining for months about Nvidia chip shortages slowing ChatGPT releases. OpenAI’s new $10 billion partnership with Broadcom for custom chips shows the company is finally doing something about it.

The partnership, revealed earlier this month after Broadcom’s earnings call, underscores a growing reality: Companies are desperate to escape what the industry calls the “Nvidia tax” — the chip giant’s approximately 60% gross margins on processors that have become essential for AI development.

Altman has been vocal about the problem, writing on X that the company was “out of GPUs” and needed to add “tens of thousands” more to roll out new features. But OpenAI isn’t alone in its frustration. Across Silicon Valley and beyond, a revolution is brewing: new chips designed to break free from Nvidia’s stranglehold. This is especially true for inference, the stage at which a trained AI model actually answers questions or generates content for users.

While Nvidia chips will continue to dominate AI training, new inference chips could save companies tens of billions of dollars and significantly cut energy consumption. The math is compelling: Inference happens every time someone asks ChatGPT a question or generates an image, making it a recurring cost that dwarfs the one-time expense of training.
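For a sense of scale, here is a rough back-of-envelope sketch in Python. Every number in it — the training cost, the per-query serving cost, and the query volume — is an illustrative assumption, not a figure reported in this article:

```python
# Back-of-envelope comparison of one-time training cost vs. recurring
# inference cost. Every number below is an illustrative assumption.

TRAINING_COST = 1e8        # hypothetical one-time training run: $100M
COST_PER_QUERY = 0.003     # hypothetical GPU serving cost per query
QUERIES_PER_DAY = 1e9      # hypothetical query volume at chatbot scale

yearly_inference = COST_PER_QUERY * QUERIES_PER_DAY * 365
print(f"One-time training run:      ${TRAINING_COST / 1e9:.2f}B")
print(f"One year of GPU inference:  ${yearly_inference / 1e9:.2f}B")

# Days until the recurring serving bill overtakes the training run.
days_to_overtake = TRAINING_COST / (COST_PER_QUERY * QUERIES_PER_DAY)
print(f"Inference overtakes training after ~{days_to_overtake:.0f} days")

# A chip with 2-3x better performance per dollar (the kind of claim
# vendors like Positron make) cuts that recurring bill proportionally.
for speedup in (2, 3):
    saved = yearly_inference * (1 - 1 / speedup)
    print(f"{speedup}x perf-per-dollar chip saves ~${saved / 1e9:.2f}B/yr")
```

Under these made-up numbers, the serving bill overtakes the training run within weeks, and scaling the arithmetic across every major AI deployment is how the industry-wide savings reach into the tens of billions.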

The Broadcom chip isn’t designed to challenge Nvidia directly, according to the Wall Street Journal, but rather to “plug the gaps” in OpenAI’s hardware needs. This hybrid approach reflects the broader industry strategy: not necessarily replacing Nvidia entirely, but reducing dependency through specialized alternatives.

Companies like Positron claim their chips can deliver two to three times better performance per dollar and three to six times better energy efficiency than Nvidia’s next-generation systems. Groq, founded by Google’s former AI chip development head, claims its specialized chips can make ChatGPT run more than 13 times faster.

The big cloud providers aren’t waiting for startups to solve their Nvidia dependency. Google, Amazon, and Microsoft are all developing inference-focused chips for their internal AI tools and cloud services. These multi-year, well-funded efforts represent a direct challenge to Nvidia’s dominance in the inference market.

Even Qualcomm is returning to data center products after abandoning the server market in 2018. CEO Cristiano Amon recently teased plans focused on “clusters of inference that is about high performance at very low power.”
