
AWS and Anthropic launch ‘Project Rainier’, a new AI superpower

Amazon Web Services, in collaboration with AI firm Anthropic, is building an enormous supercomputing cluster known as "Project Rainier." The venture, backed by Amazon's $8 billion investment in Anthropic, aims to provide unparalleled computational power for training the next generation of AI models. The system is already being brought online this year across several US locations, marking a significant escalation in the race for AI dominance.

Unprecedented Scale and Power

The sheer magnitude of Project Rainier is staggering. One site alone, located in Indiana, will eventually consist of thirty 200,000-square-foot data centers, which together will draw a massive 2.2 gigawatts of power. This sprawling, multi-site network is engineered to operate as a single, unified environment for training Anthropic’s most advanced AI models.

Anthropic is already using a portion of this new infrastructure to accelerate its model development.


A Strategic Shift to Custom Silicon

Project Rainier distinguishes itself by moving away from the industry’s reliance on GPUs. The entire supercluster will be powered by Amazon’s own custom-designed Trainium2 AI accelerators, representing the largest-ever deployment of this proprietary technology.

Gadi Hutt, an engineering director at Amazon's Annapurna Labs, noted that the focus extends beyond raw chip speed to achieving optimal "goodput": the real-world, effective throughput of the entire system once reliability and uptime are factored in.
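As a rough illustration of the "goodput" idea, the sketch below models effective throughput as peak compute discounted by hardware utilization and productive uptime. The formula and every number in it are illustrative assumptions for this article, not AWS figures.

```python
# Illustrative sketch only: "goodput" as the share of peak training
# throughput actually delivered once utilization and failures/restarts
# are factored in. All numbers are made-up placeholders.

def goodput(peak_pflops: float, utilization: float, uptime: float) -> float:
    """Effective throughput = peak compute x hardware utilization x the
    fraction of wall-clock time the job is making forward progress."""
    return peak_pflops * utilization * uptime

# Example: 83.2 PFLOPS peak, 40% utilization, 95% productive uptime.
effective = goodput(83.2, 0.40, 0.95)
print(f"{effective:.2f} petaFLOPS effective")  # prints "31.62 petaFLOPS effective"
```

The point of the metric is that a slightly slower chip in a more reliable, better-utilized system can deliver more useful training work than a faster chip that sits idle during failures.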

The Trainium2 chip is a complex piece of hardware, combining two 5nm compute dies with high-bandwidth memory. It is specifically designed to handle both the training and inference stages of AI development, a vital capability for sophisticated methods like reinforcement learning.

While a single Trainium2 may not exceed the performance of a top-tier Nvidia B200 on every individual metric, Amazon’s strategy centers on the integrated system’s overall efficiency and cost-effectiveness.

An Architecture Built for the Future

The fundamental building block of Project Rainier is the Trn2 instance, each containing 16 Trainium2 accelerators. These instances offer a compelling alternative to competitor systems, especially for training tasks that involve sparse data.

                Trn2                           DGX B200
CPUs:           2x 48C Intel Sapphire Rapids   2x 56C Intel Emerald Rapids
System Mem:     2TB DDR5                       Up to 4TB
Accelerators:   16x Trainium2                  8x B200 GPUs
HBM:            1536GB                         1440GB
Memory BW:      46.4TB/s                       64TB/s
Dense FP8:      20.8 petaFLOPS                 36 petaFLOPS
Sparse FP8:     83.2 petaFLOPS                 72 petaFLOPS

Within each Trn2 instance, the accelerators are connected in a 4×4 2D torus via AWS’s high-speed NeuronLink v3. This switchless interconnect design minimizes latency and power usage, enabling the systems to be air-cooled.


To achieve even greater scale, four Trn2 instances are combined into an “UltraServer,” creating a 64-chip domain with a 3D torus interconnect.
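The torus layouts described above can be sketched in a few lines. Assuming standard wrap-around torus addressing (the article does not detail NeuronLink's actual routing), each chip in a 4x4 2D torus has four direct neighbours, and each chip in a 4x4x4 3D torus has six, with no central switch required:

```python
# Hypothetical sketch of torus addressing. Shapes match the article:
# a 4x4 2D torus (16 chips per Trn2 instance) and a 4x4x4 3D torus
# (64-chip UltraServer). Edges wrap around, so every chip is equivalent.
from itertools import product

def torus_neighbors(coord, shape):
    """Return the neighbours of `coord` in a torus of the given `shape`,
    stepping +/-1 along each axis with wrap-around at the edges."""
    nbrs = []
    for axis, size in enumerate(shape):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % size
            nbrs.append(tuple(n))
    return nbrs

# Every chip in the 4x4 2D torus has exactly 4 distinct links...
assert all(len(set(torus_neighbors(c, (4, 4)))) == 4
           for c in product(range(4), repeat=2))
# ...and every chip in the 4x4x4 3D torus has exactly 6.
assert all(len(set(torus_neighbors(c, (4, 4, 4)))) == 6
           for c in product(range(4), repeat=3))
```

Because traffic hops directly between neighbouring chips instead of through a switch fabric, this topology trades some path length for lower latency, power, and cost per link, which is what makes air cooling feasible.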

                   Trn2 UltraServer   DGX GB200 NVL72
Accelerators:      64x Trainium2      72x Blackwell GPUs
HBM:               6.1TB              13.4TB
Memory BW:         186TB/s            576TB/s
Interconnect BW:   68TB/s             130TB/s
Dense FP8:         83.2 petaFLOPS     360 petaFLOPS
Sparse FP8:        332.8 petaFLOPS    720 petaFLOPS
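The table's peak numbers line up with straightforward scaling: an UltraServer is four 16-chip Trn2 instances, so its FP8 figures are exactly 4x the per-instance values, while the HBM and bandwidth rows appear lightly rounded (4 x 1536GB = 6144GB, shown as 6.1TB; 4 x 46.4TB/s = 185.6TB/s, shown as 186TB/s). A quick arithmetic check:

```python
# Sanity-check the spec tables: an UltraServer = 4 Trn2 instances
# (4 x 16 = 64 Trainium2 chips), so peak FLOPS should scale by 4x.
TRN2 = {"chips": 16, "dense_fp8_pflops": 20.8, "sparse_fp8_pflops": 83.2}
ULTRA = {"chips": 64, "dense_fp8_pflops": 83.2, "sparse_fp8_pflops": 332.8}

scale = ULTRA["chips"] / TRN2["chips"]  # 4.0
assert scale == 4.0
assert ULTRA["dense_fp8_pflops"] == scale * TRN2["dense_fp8_pflops"]    # 83.2
assert ULTRA["sparse_fp8_pflops"] == scale * TRN2["sparse_fp8_pflops"]  # 332.8
```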

The complete “UltraCluster” will be formed by linking tens of thousands of these UltraServers with Amazon’s custom EFAv3 network. Looking forward, Amazon has already hinted at its next-generation Trainium3 chips, which promise a fourfold performance boost, suggesting that Project Rainier’s immense capabilities are set to grow even further.

Luna Awomi

Luna Awomi is a seasoned news writer with over five years of journalism experience. Driven by her passion for storytelling, she is currently pursuing a Master's in Journalism and Digital Media to further enhance her expertise.