NVIDIA Unveils Vera Rubin Platform to Power Next-Gen AI

Photo by: NVIDIA Newsroom
By Diego Valverde | Journalist & Industry Analyst - Tue, 01/06/2026 - 11:40

NVIDIA launched the Vera Rubin platform at CES 2026, introducing six new chips designed to power the next generation of AI supercomputers. The architecture aims to reduce training times and inference token costs for complex AI models.

The transition toward agentic reasoning requires a shift in data center architecture to manage increasing computational demands. "Rubin arrives at exactly the right moment, as AI computing demand for both training and inference is going through the roof," says Jensen Huang, Founder, CEO, and President, NVIDIA.

The global market for data center infrastructure is undergoing a fundamental transformation as research and development budgets shift from classical computing to AI. According to research by McKinsey & Company, organizations are expected to invest nearly US$7 trillion in data center infrastructure globally by 2030. This investment surge is driven in part by the industry’s reliance on NVIDIA technology to support increasingly demanding AI models.

Modern AI applications are evolving from simple chatbots to sophisticated agents capable of multi-step reasoning. These workloads require infrastructure that can handle massive token volumes and long sequences of data. Consequently, the industry bottleneck is shifting from raw compute power to context management and storage efficiency. The Vera Rubin platform addresses these challenges by integrating advanced networking, processing, and memory technologies into a unified architecture, says NVIDIA.

Rubin Design

The Vera Rubin platform uses extreme co-design across six specialized components: the NVIDIA Vera CPU, Rubin GPU, NVLink 6 switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet switch. According to NVIDIA, this integration delivers up to a 10-fold reduction in inference token costs. Additionally, the platform requires four times fewer GPUs to train mixture-of-experts (MoE) models compared to the previous Blackwell platform.

At the core of the rack-scale solution is the NVIDIA Vera Rubin NVL72. This system combines 36 Vera CPUs and 72 Rubin GPUs through the sixth-generation NVLink interconnect. Each GPU provides 3.6 TB/s of bandwidth, while the full NVL72 rack delivers 260 TB/s. This interconnectivity is essential for seamless communication in massive MoE models. The Vera CPU, built with 88 custom Olympus cores, provides the energy efficiency required for large-scale AI factories and is fully compatible with Armv9.2 technology.
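
The rack-level bandwidth figure follows directly from the per-GPU number; a quick back-of-the-envelope check in Python (illustrative arithmetic only, not an NVIDIA tool):

    # Sanity check of the NVL72 bandwidth figures quoted above.
    GPUS_PER_RACK = 72            # Rubin GPUs in one Vera Rubin NVL72 rack
    NVLINK_BW_PER_GPU_TBS = 3.6   # per-GPU NVLink 6 bandwidth, in TB/s

    aggregate_tbs = GPUS_PER_RACK * NVLINK_BW_PER_GPU_TBS
    print(f"Aggregate NVLink bandwidth: {aggregate_tbs:.1f} TB/s")
    # Prints 259.2 TB/s, which rounds to the quoted 260 TB/s rack total.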

NVIDIA introduced the Inference Context Memory Storage Platform to scale inference context to gigascale. This AI-native storage infrastructure is powered by the BlueField-4 storage processor, which enables the efficient sharing and reuse of key-value (KV) cache data across the infrastructure. This approach improves throughput for agentic reasoning. BlueField-4 also features the Advanced Secure Trusted Resource Architecture, or ASTRA, providing a single control point to secure and isolate large-scale AI environments.
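
NVIDIA has not published a programming interface for this storage layer, but the underlying idea, reusing KV cache entries computed for a shared prompt prefix rather than recomputing them for every request, can be sketched in a few lines of Python. Every name below (KVCacheStore, fake_prefill, and so on) is hypothetical and exists only for illustration:

    import hashlib

    # Toy illustration of KV cache sharing across inference requests.
    # These names are invented for this sketch; they are not NVIDIA APIs.

    class KVCacheStore:
        """Maps a prompt prefix to its previously computed (key, value) pairs."""
        def __init__(self):
            self._store = {}

        @staticmethod
        def _fingerprint(tokens):
            return hashlib.sha256(" ".join(tokens).encode()).hexdigest()

        def get_or_compute(self, tokens, compute_kv):
            fp = self._fingerprint(tokens)
            if fp not in self._store:           # miss: pay the prefill cost once
                self._store[fp] = compute_kv(tokens)
            return self._store[fp]              # hit: reuse, skip recomputation

    def fake_prefill(tokens):
        """Stand-in for the expensive attention prefill pass."""
        print(f"prefill over {len(tokens)} tokens")
        return [(t, t.upper()) for t in tokens]  # placeholder KV pairs

    store = KVCacheStore()
    system_prompt = ["you", "are", "a", "helpful", "agent"]

    # The first agent step computes the prefix KV; later steps reuse it.
    store.get_or_compute(system_prompt, fake_prefill)  # prints: prefill over 5 tokens
    store.get_or_compute(system_prompt, fake_prefill)  # silent: served from cache

The platform's pitch is essentially this pattern at rack and data center scale, with the shared store living in BlueField-4-managed storage rather than in each GPU's memory or a Python dictionary.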

Networking advances include the Spectrum-6 Ethernet architecture, which is designed for AI factories requiring high resilience. The Spectrum-X Ethernet Photonics systems use co-packaged optics to achieve five times better power efficiency and 10 times greater reliability than traditional methods. Furthermore, the Spectrum-XGS technology allows facilities separated by hundreds of kilometers to function as a single, synchronized AI environment.

Major cloud service providers have confirmed plans to deploy Rubin-based instances in 2H26. Satya Nadella, Executive Chairman and CEO, Microsoft, says his company will deploy Vera Rubin NVL72 systems as part of its Fairwater AI superfactories. Other early adopters include Amazon Web Services (AWS), Google Cloud, and Oracle Cloud Infrastructure. Additionally, CoreWeave, Lambda, Nebius, and Nscale will offer Rubin-based instances to support frontier model training.

Infrastructure partners such as Dell Technologies, HPE, Lenovo, Cisco, and Supermicro will deliver a range of servers based on these new products. To support the software stack, Red Hat expanded its collaboration with NVIDIA to offer Red Hat Enterprise Linux and OpenShift optimized for the Rubin platform. These tools are used by most Fortune Global 500 companies to manage hybrid cloud environments.

The Rubin platform represents the third generation of rack-scale architecture from NVIDIA. By providing a complete stack that includes silicon, software, and networking, the company aims to establish a standard for the next decade of AI development. Production is currently underway, and the first commercial products are expected to reach the market between July and December 2026.
