Microsoft Azure Deploys World's First NVIDIA GB300 Supercomputer for OpenAI

The new cluster, featuring over 4,600 NVIDIA Blackwell Ultra GPUs, is designed to power the next generation of large-scale AI models and significantly reduce training times.


REDMOND, Wash. – Microsoft Azure announced today the launch of the world’s first supercomputing-scale production cluster using NVIDIA’s GB300 NVL72 systems. The new state-of-the-art platform, purpose-built for its partner OpenAI, is designed to handle the most demanding AI workloads and accelerate the development of next-generation AI models.

Server blade from a rack featuring NVIDIA GB300 NVL72 in Azure AI infrastructure.

The massive cluster features over 4,600 NVIDIA Blackwell Ultra GPUs interconnected by NVIDIA’s Quantum-X800 InfiniBand networking. Microsoft stated this is the first of many such deployments, with plans to scale to hundreds of thousands of Blackwell GPUs across its global data centers. The company estimates this new infrastructure will enable AI model training in weeks instead of months.

“Delivering the industry’s first at-scale NVIDIA GB300 NVL72 production cluster for frontier AI is an achievement that goes beyond powerful silicon — it reflects Microsoft Azure and NVIDIA’s shared commitment to optimize all parts of the modern AI data center,” said Nidhi Chappell, corporate vice president of Microsoft Azure AI Infrastructure.


A New Standard in AI Performance

At the heart of the new cluster is the liquid-cooled, rack-scale NVIDIA GB300 NVL72 system. Each unit integrates 72 Blackwell Ultra GPUs and 36 NVIDIA Grace CPUs, creating a single, cohesive accelerator.

Close up of Azure server featuring NVIDIA GB300 NVL72, with Blackwell Ultra GPUs.

Key specifications for the new NDv6 GB300 VM series include:

  • Massive Memory: 37 terabytes (TB) of fast memory.
  • Extreme Performance: Up to 1.44 exaflops of FP4 Tensor Core performance.
  • High-Speed Networking: 800 gigabits per second (Gb/s) per GPU for cross-rack communication and 130 terabytes per second (TB/s) of bandwidth within each rack via NVIDIA NVLink.

This unified memory and compute architecture is essential for developing and deploying complex reasoning models, agentic AI systems, and multimodal generative AI.
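To put the per-rack figures above in cluster-wide perspective, here is a hedged back-of-envelope calculation. It assumes the quoted specs (37 TB fast memory, 1.44 exaflops FP4) are per GB300 NVL72 rack, and that "over 4,600 GPUs" corresponds to 64 full racks (64 × 72 = 4,608); the exact rack count was not disclosed, so the derived totals are illustrative only.

```python
import math

# Assumed figures (per the announcement; rack count is an estimate)
GPUS_PER_RACK = 72            # Blackwell Ultra GPUs per GB300 NVL72 system
CPUS_PER_RACK = 36            # Grace CPUs per system
FP4_EXAFLOPS_PER_RACK = 1.44  # FP4 Tensor Core performance per rack
FAST_MEMORY_TB_PER_RACK = 37  # fast memory per rack

# Assumption: "over 4,600 GPUs" ≈ 64 full racks (64 * 72 = 4,608)
RACKS = math.ceil(4600 / GPUS_PER_RACK)  # -> 64

total_gpus = RACKS * GPUS_PER_RACK
total_cpus = RACKS * CPUS_PER_RACK
total_fp4_exaflops = RACKS * FP4_EXAFLOPS_PER_RACK
total_fast_memory_tb = RACKS * FAST_MEMORY_TB_PER_RACK

print(f"Racks:            {RACKS}")                      # 64
print(f"GPUs:             {total_gpus}")                 # 4608
print(f"Grace CPUs:       {total_cpus}")                 # 2304
print(f"Aggregate FP4:    {total_fp4_exaflops:.2f} EF")  # 92.16
print(f"Fast memory:      {total_fast_memory_tb} TB")    # 2368
```

Under these assumptions, the cluster would deliver roughly 92 FP4 exaflops and over 2.3 petabytes of aggregate fast memory, which illustrates why each rack being "a single, cohesive accelerator" matters at this scale.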


Reengineering the Data Center for AI

Connecting over 4,600 GPUs into a single supercomputer required a fundamental redesign of the data center infrastructure. Microsoft detailed a years-long collaboration with NVIDIA to co-engineer every layer of the stack, from custom liquid cooling and power distribution systems to a reengineered software stack for storage and orchestration.

The cluster’s networking relies on the NVIDIA Quantum-X800 InfiniBand platform to scale communication across all GPUs, ensuring minimal overhead for large-scale training.

Server blade from a rack featuring NVIDIA GB300 NVL72 in Azure AI infrastructure.

“This co-engineered system delivers the world’s first at-scale GB300 production cluster, providing the supercomputing engine needed for OpenAI to serve multitrillion-parameter models,” said Ian Buck, vice president of hyperscale and high-performance computing at NVIDIA. “This sets the definitive new standard for accelerated computing.”

This milestone represents a significant step in building the infrastructure necessary to unlock the future of AI, solidifying the platform for OpenAI and other customers to push the boundaries of artificial intelligence.