Amazon Cloud Launches the Latest and Most Powerful GPU Instances Based on NVIDIA Blackwell

Amazon Cloud introduces the new EC2 P6e-GB200 UltraServer with NVIDIA Blackwell GPUs, supporting large-scale AI models and accelerating AI development and deployment at unprecedented scale.

To accelerate the development of inference models, Agentic AI systems, and other emerging generative AI technologies, Amazon Cloud announced this week that the Amazon EC2 P6e-GB200 UltraServer, powered by NVIDIA Grace Blackwell Superchips, is now officially available. The P6e-GB200 UltraServer is designed for training and deploying the largest and most complex AI models.

Earlier this year, Amazon Cloud launched the Amazon EC2 P6-B200 instance based on NVIDIA Blackwell GPUs, supporting diverse AI and high-performance computing workloads.

Building on extensive experience with large-scale, secure GPU infrastructure, the Amazon EC2 P6e-GB200 UltraServer and Amazon EC2 P6-B200 instances help customers push the boundaries of AI technology.

Meeting Growing Computing Demands for AI Workloads

The Amazon EC2 P6e-GB200 UltraServer is Amazon’s most powerful GPU product to date, equipped with up to 72 NVIDIA Blackwell GPUs interconnected via fifth-generation NVIDIA NVLink and operating as a single compute unit. Each UltraServer offers up to 360 petaflops of dense FP8 compute and 13.4TB of high-bandwidth HBM3e memory—over 20 times the compute capability and 11 times the memory capacity of the P5en instance. It also supports up to 28.8 Tbps of network bandwidth via the fourth-generation Elastic Fabric Adapter (EFAv4).

Amazon states that the Amazon EC2 P6-B200 instance provides flexible options for a wide range of AI scenarios. Each instance features 8 NVIDIA Blackwell GPUs interconnected via NVLink, with 1.4TB high-bandwidth memory and up to 3.2 Tbps EFAv4 network bandwidth, powered by the latest Intel Xeon Scalable processors. Compared to the P5en, the P6-B200 offers 2.25 times the GPU compute power, 1.27 times the memory capacity, and 1.6 times the memory bandwidth.

Choosing between the P6e-GB200 and P6-B200 depends on specific workload needs and architecture preferences:

  • The P6e-GB200 UltraServer is ideal for compute- and memory-intensive AI workloads, such as training and deploying trillion-parameter models. Its NVIDIA GB200 NVL72 architecture excels at large-scale, coordinated multi-GPU operations, reducing communication overhead and boosting distributed training efficiency. It can host trillion-parameter models within a single NVLink domain, enabling faster, more stable responses in large-scale scenarios, especially when combined with NVIDIA Dynamo inference optimizations.
  • The P6-B200 supports a broad range of AI workloads, suitable for medium to large training and inference tasks. Its familiar 8-GPU configuration minimizes code modifications and simplifies migration from previous instances. Built on x86 architecture with Intel Xeon processors, it is well-suited for workloads optimized for x86 environments.
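To make the choice concrete, a minimal sketch of assembling an EC2 launch request for a P6-B200 instance with boto3 might look like the following. The instance type string, AMI ID, and subnet ID here are illustrative assumptions, not values confirmed by this article, and the actual API call is left commented out:

```python
# Sketch: building an EC2 RunInstances request for a P6-B200 instance.
# The instance type name, AMI ID, and subnet ID below are illustrative
# assumptions for this example.
def build_p6_request(ami_id: str, subnet_id: str, count: int = 1) -> dict:
    """Assemble keyword arguments for ec2.run_instances()."""
    return {
        "ImageId": ami_id,
        "InstanceType": "p6-b200.48xlarge",  # assumed instance type name
        "MinCount": count,
        "MaxCount": count,
        # EFA networking is attached via a network interface specification:
        "NetworkInterfaces": [{
            "DeviceIndex": 0,
            "SubnetId": subnet_id,
            "InterfaceType": "efa",
        }],
    }

request = build_p6_request("ami-0123456789abcdef0", "subnet-0abc")
# An actual launch would then be:
#   import boto3
#   boto3.client("ec2").run_instances(**request)
```

Because the P6-B200 keeps the familiar 8-GPU layout, a request like this is typically the only infrastructure change needed when migrating from earlier P-series instances.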

Continuous Innovation Based on Amazon Cloud’s Core Strengths

Introducing NVIDIA Blackwell into Amazon Cloud is not just a technological breakthrough but a comprehensive infrastructure innovation. Leveraging years of expertise in computing, networking, operations, and managed services, Amazon Cloud integrates all features of NVIDIA Blackwell while ensuring high reliability and performance.

Instance Security and Stability

Customer confidence in deploying GPU workloads on Amazon Cloud stems from its focus on security and stability. The Amazon Nitro system’s dedicated hardware, software, and firmware enforce strict isolation, preventing access to sensitive AI workloads and data—even by Amazon staff.

Beyond security, Nitro revolutionizes infrastructure maintenance by enabling real-time updates—handling network, storage, and other I/O functions without downtime. This capability is critical for high-availability AI production environments where interruptions can cause significant disruptions.

Both the P6e-GB200 UltraServer and P6-B200 instances feature the sixth-generation Nitro system. Since 2017, Nitro’s innovative architecture has continuously protected and optimized Amazon EC2 workloads.

Reliable Performance in Large-Scale Environments

The P6e-GB200 UltraServer is deployed in the third-generation Amazon EC2 UltraCluster, covering Amazon’s largest data centers. This architecture reduces power consumption by up to 40% and wiring needs by over 80%, significantly improving energy efficiency and reducing potential failure points.

To ensure consistent performance at scale, Amazon Cloud uses Elastic Fabric Adapter (EFA) and its Scalable Reliable Datagram protocol, which intelligently routes traffic across multiple network paths, maintaining stability even during congestion or failures. The fourth-generation EFA (EFAv4) in the P6e-GB200 and P6-B200 instances can deliver up to 18% higher collective communication speeds in distributed training compared to EFAv3 on the P5en instances.
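In practice, distributed-training stacks pick up EFA through its Libfabric provider. A minimal sketch of the environment settings commonly used to route NCCL collective traffic over EFA follows; the exact variables depend on the installed software stack (e.g. the aws-ofi-nccl plugin), so treat these as illustrative rather than authoritative:

```python
import os

# Sketch: environment variables commonly set before launching an
# NCCL-based distributed training job so that collectives run over EFA.
# These names come from Libfabric/NCCL conventions and are illustrative;
# verify them against your installed stack.
efa_env = {
    "FI_PROVIDER": "efa",           # select the EFA Libfabric provider
    "FI_EFA_USE_DEVICE_RDMA": "1",  # enable GPU-direct RDMA where supported
    "NCCL_DEBUG": "INFO",           # log which transport NCCL actually chose
}
os.environ.update(efa_env)
print(os.environ["FI_PROVIDER"])
```

A launcher script would export these variables before invoking the training process, so every rank inherits the same transport configuration.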

Infrastructure Efficiency

The P6-B200 instance uses proven air-cooled architecture, while the P6e-GB200 UltraServer employs liquid cooling, enabling higher compute density and overall system performance. Its innovative liquid cooling design supports flexible chip-level cooling in new and existing data centers, reducing costs while boosting performance.

Enabling NVIDIA Blackwell on Amazon Cloud

Amazon Cloud simplifies deployment of the P6e-GB200 UltraServer and P6-B200 instances, allowing customers to quickly start using Blackwell GPUs while maintaining existing operations.

Amazon SageMaker HyperPod

For accelerated AI development with reduced infrastructure management, SageMaker HyperPod offers managed, reliable large GPU clusters that automatically configure and manage resources. It supports P6e-GB200 and P6-B200 instances, keeping workloads within the same NVLink domain for maximum performance. It also features a multi-layer recovery system, automatically replacing failed nodes with preconfigured backups, with comprehensive dashboards for resource monitoring.
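As a hedged illustration, creating a HyperPod cluster goes through the SageMaker CreateCluster API; the sketch below only assembles the request. The instance type name, S3 lifecycle path, and role ARN are assumptions for illustration:

```python
# Sketch: assembling a SageMaker HyperPod CreateCluster request with boto3.
# The instance type, S3 lifecycle-script location, and role ARN are
# hypothetical placeholders, not confirmed values.
def build_hyperpod_request(cluster_name: str, role_arn: str) -> dict:
    """Build keyword arguments for sagemaker.create_cluster()."""
    return {
        "ClusterName": cluster_name,
        "InstanceGroups": [{
            "InstanceGroupName": "gb200-workers",
            "InstanceType": "ml.p6e-gb200.36xlarge",  # assumed type name
            "InstanceCount": 2,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://my-bucket/lifecycle/",  # hypothetical
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": role_arn,
        }],
    }

req = build_hyperpod_request(
    "blackwell-pod", "arn:aws:iam::123456789012:role/HyperPodRole"
)
# Actual call: boto3.client("sagemaker").create_cluster(**req)
```

HyperPod then takes over node provisioning and the automatic replacement of failed instances described above, so the request itself is most of the infrastructure code a team needs to write.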

Amazon EKS for Large-Scale AI Workloads

For users preferring Kubernetes, Amazon Elastic Kubernetes Service (EKS) offers features like EKS Hybrid Nodes, supporting local and EC2 GPU management within the same cluster. EKS manages the lifecycle of P6e-GB200 and P6-B200 instances, with topology-aware scheduling and automatic node configuration, providing high flexibility and reliability for AI workloads.
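A minimal sketch of requesting Blackwell capacity through an EKS managed node group is shown below. The instance type string, AMI type, role ARN, and subnet are assumptions for illustration; the actual API call is left commented out:

```python
# Sketch: a boto3 EKS CreateNodegroup request targeting Blackwell-based
# instances. The instance type name, AMI type, role ARN, and subnet ID
# are illustrative assumptions.
def build_nodegroup_request(cluster: str, node_role_arn: str,
                            subnets: list) -> dict:
    """Build keyword arguments for eks.create_nodegroup()."""
    return {
        "clusterName": cluster,
        "nodegroupName": "p6-b200-workers",
        "nodeRole": node_role_arn,
        "subnets": subnets,
        "instanceTypes": ["p6-b200.48xlarge"],  # assumed type name
        "scalingConfig": {"minSize": 0, "desiredSize": 2, "maxSize": 4},
        "amiType": "AL2023_x86_64_NVIDIA",  # GPU-enabled AMI family (assumed)
    }

req = build_nodegroup_request(
    "ai-cluster",
    "arn:aws:iam::123456789012:role/NodeRole",
    ["subnet-0abc"],
)
# Actual call: boto3.client("eks").create_nodegroup(**req)
```

Once the node group is up, standard Kubernetes GPU scheduling (resource requests for `nvidia.com/gpu`) places pods on these nodes, which is what keeps migration friction low for existing clusters.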

NVIDIA DGX Cloud on Amazon

The P6e-GB200 UltraServer is also available via NVIDIA DGX Cloud, a unified AI platform optimized for multi-node training and inference, integrated with NVIDIA’s full AI software stack. It offers flexible service options and expert support to accelerate AI projects.

This launch marks a significant milestone, but it’s just the beginning. As AI capabilities evolve rapidly, the infrastructure must support current needs and future possibilities. Amazon EC2 P6e-GB200 UltraServer and P6-B200 instances are fully prepared to meet these challenges, driven by continuous innovation across computing, networking, operations, and managed services. We look forward to seeing what customers will build next.

