If you thought a chip like AMD's MI300A was big at 146 billion transistors, you ain't seen nothing yet. AI company Cerebras announced its third-generation AI chip, CS-3, a "wafer-scale" silicon monstrosity designed for AI training. It's basically an entire TSMC 5nm wafer sold as a single chip. The CS-3 is 56 times the size of Nvidia's H100 and features over 4 trillion transistors, making it the biggest "chip" in the world by a wide margin. The company says it is the world's fastest AI chip, breaking the record set by its predecessor, the CS-2.
Cerebras takes a different approach to AI training chips than Intel, AMD, and Nvidia. Instead of having TSMC make a wafer and then cutting it up into individual dies, Cerebras keeps the whole wafer intact and sells it as a single wafer-scale chip, which offers 900,000 AI cores, 44GB of on-chip SRAM, and the ability to attach up to 1.2 petabytes of external memory. Cerebras says a single CS-3 delivers 125 petaFLOPS of peak AI compute, and a cluster of up to 2,048 CS-3 systems can scale to 256 exaFLOPS. That much compute is theoretically capable of training AI models up to 10X larger than GPT-4 and Google's Gemini, and when scaled to the max, it can train a 70-billion-parameter model like Llama 70B in a single day, says Cerebras.
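To get a feel for whether the single-day claim is plausible, here's a rough back-of-envelope check (our sketch, not Cerebras's math) using the common ~6 × parameters × tokens estimate of training compute; the token count is an assumption (Llama 2 70B was trained on roughly 2 trillion tokens), and the FLOPS figures are Cerebras's claimed peaks, not measurements.

```python
# Hedged sanity check of the "Llama 70B in a single day" claim, using the
# common ~6 * parameters * tokens rule of thumb for training FLOPs.
# Token count is an assumption; peak figures are vendor claims.

params = 70e9                              # Llama 70B parameters
tokens = 2e12                              # assumed training tokens (Llama 2 70B used ~2T)
train_flops = 6 * params * tokens          # ~8.4e23 FLOPs

per_system_peak = 125e15                   # claimed peak per CS-3, FLOP/s
cluster_peak = 2048 * per_system_peak      # ~2.56e20 FLOP/s (256 exaFLOPS)

sustained_needed = train_flops / (24 * 3600)   # throughput needed to finish in one day
print(f"cluster peak:       {cluster_peak:.3g} FLOP/s")
print(f"needed for one day: {sustained_needed:.3g} FLOP/s "
      f"({sustained_needed / cluster_peak:.1%} of peak)")
```

At these assumptions, finishing in a day only requires sustaining a few percent of the cluster's claimed peak, so the arithmetic behind the claim checks out comfortably.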
According to ServeTheHome, the Cerebras wafer-scale chip is the only successful AI chip from a startup so far, and this third generation represents a node shrink for the company from TSMC 7nm to 5nm. As a result, the CS-3 delivers higher density than the previous version at the same power requirements. The chip measures a whopping 46,225 mm², dwarfing every other chip on the market by a sizable margin. Keeping the wafer intact rather than slicing it into dies also saves significant power, because it avoids much of the power-hungry interconnect traffic required when thousands of discrete GPUs work in parallel. Cerebras says that while rivals like Nvidia double power requirements each generation, its wafer-scale design increases density while keeping power usage the same.
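As a quick illustration of what the 7nm-to-5nm shrink buys at a fixed wafer area, here's a small back-of-envelope density comparison; the prior-generation transistor count (about 2.6 trillion) and the H100 die area (about 814 mm²) are prior public figures used purely for context.

```python
# Back-of-envelope density comparison at constant wafer-scale die area.
# Prior-generation and H100 figures are public specs, used only for context.

WAFER_AREA_MM2 = 46_225        # CS-3 / CS-2 wafer-scale die area

cs3_transistors = 4.0e12       # CS-3 (TSMC 5nm)
cs2_transistors = 2.6e12       # previous generation (TSMC 7nm)
h100_die_mm2 = 814             # Nvidia H100 die area

print(f"CS-3 density: {cs3_transistors / WAFER_AREA_MM2 / 1e6:.0f}M transistors/mm^2")
print(f"CS-2 density: {cs2_transistors / WAFER_AREA_MM2 / 1e6:.0f}M transistors/mm^2")
# ~57x, in line with the roughly 56x size advantage quoted earlier
print(f"Die area vs H100: {WAFER_AREA_MM2 / h100_die_mm2:.0f}x")
```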
The first CS-3 cluster will be operational in Dallas, Texas, thanks to a partnership between Cerebras and G42. The two companies have announced a new supercomputer named Condor Galaxy 3 that will sport 64 CS-3 systems with up to 58 million AI cores, delivering 8 exaFLOPS in early 2024 and then tens of exaFLOPS by the end of the year as the system comes online. The two companies already operate the Condor Galaxy 1 and 2 AI supercomputers, and they will connect them with the third machine throughout 2024 as the CS-3-based systems come online. Cerebras says its wafer-scale chips are already backordered, so clearly, sufficient demand exists for such a novel design in the AI world.
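The Condor Galaxy 3 figures line up with the per-system numbers quoted earlier; a quick check (treating the 125 petaFLOPS per CS-3 as a claimed peak rather than a measurement):

```python
# Quick arithmetic behind the Condor Galaxy 3 figures, from the per-system
# specs quoted above (claimed values, not measurements).

systems = 64
cores_per_system = 900_000
peak_per_system_flops = 125e15    # claimed peak per CS-3, FLOP/s

print(f"AI cores: {systems * cores_per_system / 1e6:.1f} million")        # ~57.6M, i.e. "up to 58 million"
print(f"Peak: {systems * peak_per_system_flops / 1e18:.0f} exaFLOPS")     # 8 exaFLOPS
```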