Cerebras Systems

Cerebras builds wafer-scale AI processors enabling faster, more efficient machine learning.
Series H · $3.77B total raised · Founded 2015 · Sunnyvale, California · 788 employees
Cerebras Systems designs and manufactures wafer-scale AI processors (WSE-3) that are 57x larger than leading GPUs, enabling faster AI training and inference with dramatically reduced power consumption. The company offers both on-premise hardware systems and cloud-based AI services through its own data centers. Built by a team of veterans from SeaMicro and AMD, Cerebras addresses the fundamental data-movement problem in AI workloads by keeping computation and memory on a single piece of silicon, delivering measurable speed and efficiency advantages for large-scale machine learning.
Problem solved
Data movement between the processor and off-chip memory is the bottleneck in AI training and inference, consuming power and adding latency; enterprises need hardware designed specifically for AI workloads rather than general-purpose GPUs adapted to them.
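To make that bottleneck concrete, here is a minimal back-of-the-envelope sketch (all hardware numbers are illustrative assumptions, not Cerebras or NVIDIA specifications) showing how a matrix multiply becomes memory-bound when its arithmetic intensity falls below a device's compute-to-bandwidth ratio:

```python
# Back-of-the-envelope roofline check: is a matmul compute- or memory-bound?
# All hardware numbers here are illustrative assumptions, not vendor specs.

def arithmetic_intensity(m: int, n: int, k: int, bytes_per_elem: int = 2) -> float:
    """FLOPs per byte moved for C[m,n] = A[m,k] @ B[k,n], fp16, each tensor touched once."""
    flops = 2 * m * n * k                                   # one multiply + one add per MAC
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)  # read A and B, write C
    return flops / bytes_moved

# Hypothetical accelerator: 100 TFLOP/s of compute, 2 TB/s of off-chip bandwidth.
PEAK_FLOPS = 100e12
PEAK_BANDWIDTH = 2e12
balance = PEAK_FLOPS / PEAK_BANDWIDTH  # FLOPs/byte needed to stay compute-bound (50 here)

for m, n, k in [(8, 4096, 4096), (4096, 4096, 4096)]:
    ai = arithmetic_intensity(m, n, k)
    verdict = "compute-bound" if ai >= balance else "memory-bound: starved by data movement"
    print(f"{m}x{k} @ {k}x{n}: {ai:.0f} FLOPs/byte -> {verdict}")
```

At small batch sizes (the first case) the multiply is bandwidth-starved no matter how much raw compute the chip has; that is the regime large on-chip memory targets.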
Target customer
Enterprise organizations running large-scale AI training and inference workloads in pharma, life sciences, and energy; teams training large-scale models; and cloud customers seeking cost-effective inference services.
Founders
Andrew Feldman
CEO & Co-Founder
Co-founder and CEO of SeaMicro (acquired by AMD for $334M); Stanford BA in Economics and Political Science, Stanford MBA.
Gary Lauterbach
CTO & Co-Founder
Co-founder of SeaMicro with Feldman; pioneered low-power server technology.
Sean Lie
Chief Hardware Architect & Co-Founder
BS and MS in Electrical Engineering and Computer Science from MIT; spent five years on AMD's advanced architecture team, then served as lead hardware architect at SeaMicro.
Michael James
Chief Software Architect & Co-Founder
Lead Software Architect at SeaMicro before co-founding Cerebras.
Jean-Philippe Fricker
Co-Founder
Senior Hardware Architect at DSSD (acquired by EMC), Lead System Architect at SeaMicro; holds 30 patents in computing systems.
Funding history
Series A $27M May 2016 Led by Foundation Capital, Eclipse Ventures
Series B $25M December 2016 Led by Benchmark Capital
Series C Undisclosed January 2017 Led by VY Capital
Series D $80M November 2018 Led by Coatue, VY Capital · Benchmark, Altimeter, Foundation Capital
Series E $272M Late 2019 Lead investor undisclosed
Series F $250M November 2021 Led by Alpha Wave Ventures, Abu Dhabi Growth Fund
Series G $1.1B September 2025 Led by Fidelity Management & Research Company, Atreides Management · Tiger Global, Valor Equity Partners, 1789 Capital, Altimeter, Alpha Wave Global, Benchmark
Series H $1B February 2026 Led by Tiger Global · Benchmark, Fidelity, Atreides Management, Alpha Wave Global, Altimeter, AMD, Coatue, 1789 Capital
Total raised: $3.77B
Pricing
Hardware systems start at $1.5M+ per system; chips estimated at $2-3M. Cloud inference pricing: 10 cents per million tokens for Llama 3.1 8B, 60 cents per million tokens for Llama 3.1 70B.
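As a quick sanity check on those rates, a minimal sketch converting the quoted per-million-token prices into monthly spend (the 5-billion-token volume is a hypothetical example, not a published figure):

```python
# Convert Cerebras Cloud's quoted per-million-token rates into monthly cost.
# Rates come from the pricing above; the monthly token volume is made up.
PRICE_PER_M_TOKENS = {
    "llama-3.1-8b": 0.10,   # $0.10 per million tokens
    "llama-3.1-70b": 0.60,  # $0.60 per million tokens
}

def monthly_cost(model: str, tokens_per_month: float) -> float:
    """Dollar cost for the given monthly token volume."""
    return PRICE_PER_M_TOKENS[model] * tokens_per_month / 1e6

# Example: 5 billion tokens per month.
for model in PRICE_PER_M_TOKENS:
    print(f"{model}: ${monthly_cost(model, 5e9):,.2f}/month")
# llama-3.1-8b: $500.00/month
# llama-3.1-70b: $3,000.00/month
```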
Notable customers
GlaxoSmithKline (GSK)
Integrations
PyTorch, Llama 3.1, cloud deployment via Cerebras Cloud, on-premise hardware systems
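For cloud deployment, a minimal sketch of calling Cerebras inference through an OpenAI-compatible client; the endpoint URL and the `llama3.1-8b` model identifier are assumptions to verify against current Cerebras documentation:

```python
# Minimal sketch: calling Cerebras Cloud inference through an OpenAI-compatible
# client. Endpoint and model name are assumptions -- verify against current docs.
import os
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",   # assumed OpenAI-compatible endpoint
    api_key=os.environ["CEREBRAS_API_KEY"],  # set in your environment
)

resp = client.chat.completions.create(
    model="llama3.1-8b",                     # assumed model identifier
    messages=[{"role": "user", "content": "Summarize wafer-scale integration."}],
    max_tokens=200,
)
print(resp.choices[0].message.content)
```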
Tech stack
React (JavaScript frameworks) · Next.js (Web servers) · dc.js (JavaScript graphics) · Radix UI (UI frameworks) · Webpack · Vercel Analytics (Analytics) · LinkedIn Insight Tag (Analytics) · Google Analytics (Analytics) · HSTS (Security) · Microsoft 365 (Email) · Cloudflare (CDN) · cdnjs (CDN) · DoubleClick Floodlight (Advertising) · Stripe (Payment processors) · Google Tag Manager (Tag managers) · Vercel (PaaS) · CookieYes (Cookie compliance) · GoDaddy (Hosting)
Competitors
NVIDIA
NVIDIA dominates with general-purpose GPUs optimized across use cases; Cerebras is purpose-built for large-scale AI training with greater on-chip memory and bandwidth but narrower market reach.
Intel
Intel focuses on CPUs and broader semiconductor categories; Cerebras specializes exclusively in wafer-scale AI processors designed from first principles for machine learning workloads.
Graphcore
Graphcore builds IPUs (Intelligence Processing Units) for AI; Cerebras differentiates via wafer-scale design, with 52x more cores and 880x more on-chip memory than leading GPUs.
Google TPU
Google TPUs are proprietary and available primarily through Google Cloud; Cerebras offers both on-premise and cloud options with greater flexibility and independence.
Why this matters: Cerebras has cracked a decades-old chip-design problem by building commercially viable wafer-scale processors, backed by $3.77B in funding and a founding team of proven hardware engineers from SeaMicro and AMD. The company sits at the intersection of exploding AI infrastructure demand and the limits of GPU-based approaches, making it a critical player in the future of AI hardware.
Best for: Enterprise organizations training large foundational models or running inference at scale, and pharma/life sciences companies accelerating research workflows where compute efficiency and speed deliver measurable ROI.
Use cases
Accelerated Drug Discovery
GlaxoSmithKline used the Cerebras CS-1 to accelerate genetic and genomic research, training neural networks that shorten drug-discovery timelines. The larger on-chip memory let GSK researchers increase model complexity without the data-movement bottleneck that constrains GPU-based approaches.
Large Language Model Training
Companies training 100B+ parameter models can keep more intermediate computations on Cerebras' 44GB on-chip memory, eliminating costly memory bandwidth bottlenecks. This translates to measurably faster training iterations and lower power consumption than distributed GPU clusters.
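For intuition on how model size relates to that 44GB figure, a rough weights-only sizing sketch (it ignores activations, gradients, and Cerebras' weight-streaming execution mode, which is how models larger than on-chip memory are actually handled):

```python
# Rough sizing: fp16 weight footprint vs. the WSE-3's 44 GB of on-chip SRAM.
# Weights only -- activations, gradients, and optimizer state are ignored, and
# Cerebras' weight-streaming execution changes the real picture for training.
SRAM_GB = 44
BYTES_PER_PARAM = 2  # fp16

for params_b in [8, 70, 405]:
    weights_gb = params_b * 1e9 * BYTES_PER_PARAM / 1e9
    fits = "fits in SRAM" if weights_gb <= SRAM_GB else "needs streaming/external memory"
    print(f"{params_b}B params: {weights_gb:.0f} GB of fp16 weights -> {fits}")
# 8B params: 16 GB of fp16 weights -> fits in SRAM
# 70B params: 140 GB of fp16 weights -> needs streaming/external memory
# 405B params: 810 GB of fp16 weights -> needs streaming/external memory
```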
Cost-Effective Cloud Inference
Cerebras Cloud offers inference at a fraction of GPU-based cloud pricing (10¢ per million tokens for 8B models). Enterprises running high-volume inference workloads can reduce per-token costs while maintaining low latency.
Alternatives
NVIDIA GPU clusters Most widely adopted, with the most mature software ecosystem, but they introduce distributed-training complexity and off-chip memory-bandwidth costs that wafer-scale integration avoids.
Graphcore IPU Smaller form factor with different architectural tradeoffs; choose when distributed training flexibility matters more than single-system memory density.
Google Cloud TPU Proprietary, cloud-only; choose only if already locked into Google Cloud ecosystem and don't need on-premise flexibility.
FAQ
What does Cerebras do?
Cerebras designs and manufactures wafer-scale AI processors (WSE-3) specifically optimized for machine learning training and inference. The company also operates cloud AI services and sells on-premise hardware systems. Its key innovation is solving the data-movement bottleneck in AI by fitting far more compute and memory on a single chip than competing solutions.
How much does Cerebras cost?
On-premise systems start at $1.5M+ per unit, with individual chips estimated at $2-3M. Cloud inference pricing is 10 cents per million tokens for Llama 3.1 8B and 60 cents per million tokens for Llama 3.1 70B. Enterprise customers should contact sales for custom pricing.
What are alternatives to Cerebras?
NVIDIA GPU clusters (most widely adopted, distributed training), Graphcore IPUs (smaller form factor, different architectural approach), and Google Cloud TPUs (proprietary, cloud-only). NVIDIA dominates the market but Cerebras is purpose-built for large-scale AI with greater on-chip memory density.
Who uses Cerebras?
Enterprise customers running large-scale AI training and inference, particularly in pharma and life sciences. GlaxoSmithKline publicly uses Cerebras for drug discovery acceleration. Cloud customers seeking cost-effective inference also use Cerebras Cloud services.
How does Cerebras compare to NVIDIA?
NVIDIA GPUs are general-purpose and widely adopted with mature software ecosystems; Cerebras is purpose-built for AI with a 57x larger chip, 52x more cores, and 880x more on-chip memory. Cerebras excels at large models and training but has narrower software compatibility and smaller market adoption. Choose Cerebras for maximum performance and efficiency on big training jobs; choose NVIDIA for flexibility and ecosystem breadth.
Tags
AI hardware · wafer-scale processors · machine learning · GPU alternative · AI training · inference acceleration · semiconductor