Cerebras CEO Andrew Feldman explains world’s largest AI chip
The company behind the Wafer-Scale Engine went from silicon moonshot to the biggest semiconductor IPO on record.
Most chipmakers cut silicon wafers into hundreds of individual chips. Andrew Feldman looked at that process and essentially asked: what if we just… didn’t?
That question became Cerebras Systems, the Sunnyvale, California-based company that builds the world’s largest AI chip. The Wafer-Scale Engine 3, or WSE-3, uses an entire silicon wafer as a single processor instead of dicing it into smaller pieces. It’s the kind of idea that sounds like it shouldn’t work, which is probably why nobody tried it seriously until Feldman’s team did.
The chip that ate the wafer
Here’s the thing about conventional AI chips, including Nvidia’s industry-standard GPUs. They’re powerful, but they’re small. A typical high-end GPU die might be the size of a postage stamp. The WSE-3, by contrast, is a single square of silicon cut from a full 300mm wafer, the same dinner-plate-sized disc that foundries normally slice into dozens or hundreds of separate processors.
The result is a chip that integrates roughly 900,000 AI-optimized cores on a single piece of silicon. In plain English: instead of networking hundreds of small chips together and dealing with the communication overhead between them, Cerebras puts everything on one surface where data can move between cores with extremely low latency.
That architectural choice delivers real performance advantages. Because the WSE-3 keeps its memory on the wafer as SRAM sitting next to the cores, rather than in separate memory chips, it offers significantly higher memory bandwidth than leading GPUs, which matters enormously for the memory-hungry workloads that define modern AI. Cerebras claims performance improvements of up to 15 times compared to traditional GPU setups for certain AI tasks.
The company packages the WSE-3 into the CS-3, its AI supercomputer system, and also offers cloud services for customers who would rather rent the horsepower than buy the hardware. Those cloud services now handle trillions of inference tokens per month, a figure that speaks to just how much demand exists for fast AI processing at scale.
From startup to record-breaking IPO
Cerebras didn’t just build a technically impressive chip. It built a business around it, and that business trajectory has been steep.
In September 2025, the company closed a $1.1B Series G funding round that valued it at $8.1B. That’s serious money, but it was just the warm-up act.
On May 13, 2026, Cerebras completed its IPO at $185 per share, raising $5.55B. That made it the largest US tech IPO since Uber and the biggest semiconductor debut on record. Shortly after trading commenced, the company’s market valuation reached approximately $95B.
To put that in perspective, Cerebras went from an $8.1B private valuation to roughly $95B in public markets in about eight months. Investors were clearly betting that the wafer-scale approach isn’t just a novelty but a genuine architectural advantage in the AI infrastructure race.
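For readers who want to check the math, here is a back-of-envelope sketch in Python using only the figures quoted above. The implied share count is not a reported number: it assumes all $5.55B in proceeds came from shares sold at the $185 IPO price, and the function names are purely illustrative.

```python
# Figures quoted in the article; everything derived from them is approximate.
IPO_PRICE_USD = 185
IPO_PROCEEDS_USD = 5_550_000_000
SERIES_G_VALUATION_USD = 8_100_000_000
POST_IPO_VALUATION_USD = 95_000_000_000  # "approximately $95B"


def implied_shares_sold(proceeds: float, price: float) -> float:
    """Shares sold, ASSUMING every dollar raised came from shares at the IPO price."""
    return proceeds / price


def valuation_multiple(later: float, earlier: float) -> float:
    """How many times larger the later valuation is than the earlier one."""
    return later / earlier


shares = implied_shares_sold(IPO_PROCEEDS_USD, IPO_PRICE_USD)
multiple = valuation_multiple(POST_IPO_VALUATION_USD, SERIES_G_VALUATION_USD)

print(f"Implied shares sold: {shares / 1e6:.1f}M")
print(f"Valuation multiple since Series G: {multiple:.1f}x")
```

Under those assumptions, the raise works out to about 30 million shares, and the jump from the Series G to the post-IPO valuation is a bit under 12 times.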
The IPO also positioned Cerebras as one of the few pure-play AI chip companies trading publicly, alongside more established names in the semiconductor space. For a company challenging the conventional wisdom of how chips should be manufactured, that’s a remarkable validation from the capital markets.
Why this matters beyond the chip industry
Feldman has consistently framed Cerebras as an inference and training platform, not just a hardware company. The distinction matters because the AI industry is rapidly shifting from a training-dominated phase, where the biggest models get built, to an inference-dominated phase, where those models get deployed at scale to millions of users.
Inference is where the economics get brutal. Every chatbot response, every AI-generated image, every automated code suggestion requires inference compute. The companies that can deliver inference tokens fastest and cheapest will capture an outsized share of that market. By positioning itself as a fast inference platform that already handles trillions of tokens per month, Cerebras is making a direct play for that opportunity.
The company’s customer base spans multiple sectors, including traditional finance and Web3. For crypto-adjacent applications, fast inference matters for everything from on-chain analytics to AI agents that interact with decentralized protocols. The infrastructure layer that powers AI is increasingly relevant to how blockchain-based systems evolve.
Look, Nvidia still dominates AI compute with a market share that would make any monopolist jealous. The CUDA software ecosystem alone creates switching costs that keep customers locked in. Cerebras is betting that raw architectural advantages, particularly in memory bandwidth and latency, can peel away workloads where GPUs aren’t the optimal solution.
That’s not a trivial bet. But a $95B market cap suggests the public markets think it’s at least a plausible one. For investors watching the AI infrastructure space, Cerebras represents the clearest test case for whether wafer-scale computing can move from engineering curiosity to industry standard. The trillions-of-tokens-per-month figure will be the metric to watch. If it keeps climbing, Feldman’s contrarian approach to chip design might end up looking less like a moonshot and more like the obvious move everyone else missed.