Cerebras books a $20bn, 750MW OpenAI inference deal

Summary

OpenAI signed a multi-year deal with wafer-scale chipmaker Cerebras for 750MW of low-latency inference compute valued above $20bn, disclosed in Cerebras's first quarterly report as a public company. Cerebras posted $193.4m in Q1 revenue, up 94% from a year earlier, and added a parallel partnership with Amazon's AWS to distribute its CS-3 systems globally. The company raised $6.4bn in what it calls the largest semiconductor IPO ever. The deal diversifies OpenAI's compute away from Nvidia GPUs toward custom silicon built for inference speed.

Why it matters

OpenAI is spreading its compute book across vendors as inference, not training, becomes the cost center. Cerebras's single-chip design targets the latency that conventional GPU clusters struggle with. That gives OpenAI faster real-time responses and gives a second-source supplier real scale against Nvidia.

What to watch

Whether the 750MW deploys on schedule given grid and power constraints.
How much of OpenAI's inference shifts off Nvidia hardware.
Whether Cerebras's revenue ramp justifies its post-IPO valuation.