Key Highlights
- Cerebras’s Wafer-Scale Engine processors will power AI inference workloads across AWS infrastructure.
- The partnership spans multiple years, though specific financial details remain confidential.
- According to Cerebras, its chips run the decode phase of inference up to 25x faster than Nvidia’s GPUs.
- A separate $10+ billion agreement between OpenAI and Cerebras was finalized in January 2026.
- February 2026 saw Cerebras secure $1 billion in funding, pushing company valuation to approximately $23 billion.
Amazon Web Services has formalized a long-term collaboration with semiconductor innovator Cerebras Systems, bringing Wafer-Scale Engine technology into AWS cloud infrastructure. These advanced processors will power AI inference operations—the computational stage where trained models generate responses to user prompts.
As the dominant force in global cloud computing, AWS has traditionally relied on proprietary silicon solutions. The company’s Trainium chips, developed through its Annapurna Labs semiconductor division, have formed the backbone of its AI infrastructure. This latest agreement will see AWS integrate Cerebras technology alongside Trainium to create enhanced inference capabilities.
According to Cerebras, its Wafer-Scale Engine technology delivers superior performance during the decode phase—where AI models construct their actual outputs—achieving speeds up to 25 times greater than Nvidia’s GPU offerings.
AWS will market this capability as a premium-tier service. Cerebras CEO Andrew Feldman explained the positioning: “If you want slow inference, there will be cheaper ways to go.” AWS representatives confirmed that budget-conscious customers can still access more economical inference options powered exclusively by Trainium processors.
Cerebras Gains Momentum in AI Market
The AWS announcement follows closely behind another landmark agreement with OpenAI, signed in January 2026. That partnership, valued at over $10 billion, will supply processing power for ChatGPT operations using Cerebras hardware. OpenAI’s deployment plans include scaling up to 750 megawatts of computational capacity.
Cerebras strengthened its financial position through a February 2026 funding round that brought in $1 billion from investors. The round elevated total capital raised to $2.6 billion and assigned the company a valuation near $23 billion. Major institutional backers include Fidelity Management, Benchmark, Tiger Global, and Coatue.
The company previously pursued a public listing, filing IPO paperwork in September 2024, but withdrew that filing about a year later.
Competitive Dynamics in AI Processing
The collaboration between AWS and Cerebras adds to the competitive pressure on Nvidia in the inference computing segment. Industry focus has been shifting from training workloads, where Nvidia maintains market leadership, toward inference applications that prioritize response speed.
Nvidia has responded with strategic moves of its own. It announced a $20 billion licensing partnership with chip designer Groq in December 2025. Nvidia has also signaled that it intends to launch a new processing architecture incorporating Groq’s technology.
Nafea Bshara, Annapurna Labs co-founder and AWS executive, emphasized the partnership’s objectives around performance and economics: “Our job is to push the speed and lower the price.”
Shares of Amazon (AMZN) were down 0.44% at the time of publication.

