#17 Tech & Top Specs: Cerebras Systems

What if your AI computer wasn’t the size of a shoebox, but closer to a family-sized pizza? That’s the vibe at Cerebras Systems, a bold California startup that decided traditional chips were holding AI back—and built the world’s largest computer chip to prove it. With its flagship Wafer Scale Engine, Cerebras is slicing training times, turbocharging inference, and redefining what “scale” actually means. While the world obsesses over Nvidia and Google TPUs, Cerebras is quietly building AI supercomputers that could fit (almost) in your backpack. They’re not chasing AGI—they’re building the machinery that will make it possible. In this post, we dive into the silicon revolution you didn’t see coming… but definitely should.

Think Big—No, Bigger

Founded in 2016 by Andrew Feldman and Gary Lauterbach, Cerebras Systems was born from one deliciously rebellious idea: GPUs are great and all, but what if we stopped trying to scale up with more chips and just built one enormous chip instead?

Enter the Wafer Scale Engine (WSE): the biggest computer chip ever made, by several country miles. We’re talking roughly 46,000 square millimeters of silicon and 850,000 AI-optimized cores. Next to it, Nvidia’s H100, at around 814 square millimeters, looks downright dainty.

Backed by investors like Benchmark, Altimeter, and the UAE’s G42, Cerebras has raised over $720 million in funding and is now valued north of $4 billion. Not bad for a company with fewer than 500 employees and a chip that literally breaks manufacturing norms.

But Cerebras isn’t just going big for the sake of flexing—they’re trying to solve a problem AI researchers constantly complain about: training large models takes forever and burns a ton of compute. Cerebras wants to fix that by shrinking time, space, and power needs—all at once.

What Cerebras Actually Does (Hint: It's Not Just "Big Chips")

At the heart of Cerebras’ product line is the WSE, now in its second generation: the WSE-2, which is packed into a server-sized AI system called the CS-2.

Let’s break it down:

Wafer Scale Engine (WSE-2):

  • 850,000 cores.
  • 2.6 trillion transistors.
  • 40 GB of ultra-fast on-chip memory.
  • Fabricated as one single, uncut wafer (which is insane, because wafers are normally diced into dozens or even hundreds of individual chips).

CS-2 System:

  • A self-contained AI supercomputer.
  • Replaces entire racks of GPU-powered servers.
  • Handles model training and inference with dramatically lower latency and power draw.

What makes this wild contraption work is data locality. Traditional GPU clusters have to shuttle information between separate chips over comparatively slow links, and that movement causes slowdowns. Cerebras’ unified chip architecture keeps everything in the same brain, so models train faster and more efficiently.
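To make that concrete, here’s a minimal back-of-envelope sketch in Python. The payload size is invented, and the bandwidth figures (an NVLink-class inter-chip link at roughly 900 GB/s, and Cerebras’ quoted ~20 PB/s of on-wafer memory bandwidth) are ballpark assumptions, not benchmarks:

```python
# Back-of-envelope: why data locality matters.
# All numbers are illustrative assumptions, not measurements.

def transfer_time_ms(bytes_moved: float, bandwidth_bytes_per_s: float) -> float:
    """Time to move a payload at a given sustained bandwidth, in milliseconds."""
    return bytes_moved / bandwidth_bytes_per_s * 1e3

activations = 10e9  # pretend one training step shuttles 10 GB of activations

# Inter-chip link between GPUs (NVLink-class, ~900 GB/s, assumed)
inter_chip = transfer_time_ms(activations, 900e9)

# On-wafer fabric (Cerebras quotes ~20 PB/s of on-chip memory bandwidth)
on_wafer = transfer_time_ms(activations, 20e15)

print(f"inter-chip: {inter_chip:.2f} ms, on-wafer: {on_wafer:.5f} ms")
# inter-chip: 11.11 ms, on-wafer: 0.00050 ms
```

Even if these numbers are off by an order of magnitude, the gap is the point: keeping traffic on the wafer turns a communication bottleneck into a rounding error.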

They’ve also built tools like the Cerebras Software Platform (CSoft) and Weight Streaming, making it easy to port existing PyTorch/TensorFlow models without completely rethinking your code.
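To show what “no rethinking your code” looks like in practice, here’s a sketch: a completely ordinary PyTorch model, the kind of standard definition CSoft is designed to ingest. The model below is our own toy example (all names and sizes are made up), and we’ve deliberately left out the Cerebras-specific compile/launch step rather than guess at its API:

```python
# A plain PyTorch model -- the kind of standard code CSoft targets.
# Toy example for illustration; not Cerebras sample code.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """A deliberately small transformer-style block, standard PyTorch only."""
    def __init__(self, vocab=32000, d_model=512, n_heads=8):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.head = nn.Linear(d_model, vocab)

    def forward(self, tokens):
        x = self.embed(tokens)
        attn_out, _ = self.attn(x, x, x)  # self-attention
        x = x + attn_out                  # residual connection
        x = x + self.ff(x)                # feed-forward + residual
        return self.head(x)

model = TinyLM()
logits = model(torch.randint(0, 32000, (2, 16)))  # batch of 2, 16 tokens each
print(logits.shape)  # torch.Size([2, 16, 32000])
```

The pitch is that this file doesn’t change: the same model definition that runs on your laptop gets compiled for the wafer, with Weight Streaming handling how parameters flow to the cores.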

Why Cerebras Is Such a Game-Changer: David with a Giant Chip vs. Goliaths

Cerebras isn’t trying to be Nvidia 2.0—they’re carving out their own niche in the AI ecosystem by focusing on radical simplicity through extreme scale. And it’s working.

1. Speed

Training large language models (LLMs) on a CS-2 takes a fraction of the time required on traditional GPU clusters. The chip keeps everything “on-die,” which sidesteps the inter-chip communication latency that drags down multi-chip systems.

2. Energy Efficiency

Less data movement = less wasted energy. Cerebras machines are more power-efficient, which, in today’s energy-hungry AI landscape, is chef’s kiss.
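Some rough arithmetic shows why. The per-word energy costs below are classic ballpark estimates for off-chip DRAM versus local on-chip SRAM access (in the spirit of Mark Horowitz’s widely cited figures), not Cerebras measurements, and the traffic volume is invented:

```python
# Back-of-envelope: the energy cost of moving data.
# Ballpark per-access estimates, for illustration only.

PJ_OFF_CHIP_DRAM = 640.0  # ~picojoules per 32-bit word from off-chip DRAM
PJ_ON_CHIP_SRAM = 5.0     # ~picojoules per 32-bit word from local on-chip SRAM

words_moved = 1e12  # pretend a training step moves a trillion 32-bit words

def joules(words: float, pj_per_word: float) -> float:
    return words * pj_per_word * 1e-12

off_chip = joules(words_moved, PJ_OFF_CHIP_DRAM)
on_chip = joules(words_moved, PJ_ON_CHIP_SRAM)
print(f"off-chip: {off_chip:.0f} J, on-chip: {on_chip:.0f} J "
      f"(~{off_chip / on_chip:.0f}x difference)")
# off-chip: 640 J, on-chip: 5 J (~128x difference)
```

Multiply that difference across millions of training steps and the power bill starts to explain itself.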

3. Plug-and-Play for Supercomputers

Cerebras has partnered with the likes of Argonne National Lab, G42, and the Leukemia & Lymphoma Society to run AI workloads in medicine, genomics, and high-performance scientific computing. Basically: they’re not just chasing chatbots—they’re saving lives and solving real-world problems.

4. Built for LLMs from Day One

Unlike legacy systems retrofitted for AI, Cerebras built their chips specifically for modern neural networks. That gives them a unique edge in the era of foundation models.

Real-World Impact: AI with a Side of Global Reach

In 2023, Cerebras signed a mega-deal with G42 in the UAE to deliver nine AI supercomputers, forming one of the world’s most powerful AI cloud infrastructures. And in late 2024, they announced a partnership to build a full-scale LLM training cloud in the Middle East, independent of traditional GPU supply chains.

They’re also working with pharmaceutical companies to accelerate drug discovery, using AI models to predict protein folding and simulate chemical interactions. You know, just casually reshaping medicine.

In short: Cerebras is what happens when you think big enough to change where and how AI happens.

Final Thoughts

While the AI world fixates on GPUs, APIs, and parameter counts, Cerebras is quietly building the infrastructure for the next generation of intelligence. They’re not just making chips; they’re making it possible to train LLMs in days, not weeks—and doing it in less space, with less energy, and less hassle.

Their bet? That hardware is the real bottleneck in AI—and that reimagining it could accelerate every other part of the field. So the next time you ask ChatGPT a question and wonder how it got so smart so fast… it just might be thanks to a chip the size of a dinner plate.

Stay curious, stay informed, and let’s keep exploring the fascinating world of AI together.

This post was written with the help of different AI tools.

Visit Cerebras Systems

Check out previous posts for more exciting insights!