Nvidia Unveils Vera Rubin, Boosting AI Power 4x
Nvidia has launched its Vera Rubin platform, featuring new Grok 3 LPUs, to significantly boost AI capabilities. The system offers up to a 35x improvement in throughput per megawatt and nearly 4x the AI performance in a single rack compared to previous generations. This advancement is set to accelerate the development of intelligent AI agents.
Nvidia Powers Next AI Era with Vera Rubin Systems
Nvidia has launched its next-generation AI hardware, the Vera Rubin platform, featuring new Grok 3 LPUs. This advanced system promises a significant leap in artificial intelligence capabilities, aiming to enhance training, fine-tuning, and inference for the world’s largest AI models.
Grok 3 LPUs Drive Inference Performance
The new Nvidia Grok 3 LPU is designed as a rack-scale inference accelerator for the Vera Rubin platform. It works alongside NVL72 systems to boost performance, offering up to a 35x improvement in throughput per megawatt. The result is faster processing and more intelligent output on complex AI tasks, including trillion-parameter models and large input volumes.
Stuart Pittz, Senior Manager of Accelerated Computing Products at Nvidia, explained that the Grok chip creates a new category for AI inference. He believes this opens up a 10x revenue opportunity for companies offering premium AI services.
Vera Rubin Platform: A Modular Design Leap
The Vera Rubin platform introduces a more streamlined and efficient design compared to previous generations. A single compute tray now houses eight Grok 3 LPUs, an FPGA, a host CPU, and a BlueField-4 DPU. When scaled to a full rack of 32 trays, this provides 256 LPUs, delivering massive memory and scale-up bandwidth.
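The tray-to-rack math above composes directly; a quick sketch using only the component counts stated in this article:

```python
# Back-of-the-envelope tally of a full Vera Rubin rack, using the
# per-tray figures quoted above: 8 Grok 3 LPUs per compute tray,
# 32 compute trays per rack.
LPUS_PER_TRAY = 8
TRAYS_PER_RACK = 32

lpus_per_rack = LPUS_PER_TRAY * TRAYS_PER_RACK
print(lpus_per_rack)  # 256, matching the figure quoted for a full rack
```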
This modular design significantly improves serviceability. Disassembling a Vera Rubin tray now takes about 5 minutes, a drastic reduction from the 2 hours required for older systems. This speed boosts overall system uptime and efficiency, a metric Nvidia calls “goodput” – the percentage of time compute resources are actively delivering results.
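"Goodput," as Nvidia describes it here, is simply the fraction of wall-clock time that compute resources spend doing useful work. A minimal sketch of the metric, reusing the 5-minute and 2-hour service times quoted above inside an otherwise hypothetical one-week window (one service event per week is an illustrative assumption, not an Nvidia figure):

```python
# "Goodput": percentage of time compute resources actively deliver results.
# Hypothetical scenario: a single service event in a 168-hour week, comparing
# the quoted 2-hour teardown of older systems with Vera Rubin's ~5 minutes.
WEEK_HOURS = 168.0

def goodput(downtime_hours: float, window_hours: float = WEEK_HOURS) -> float:
    """Fraction of the window spent delivering results, as a percentage."""
    return 100.0 * (window_hours - downtime_hours) / window_hours

print(f"older system: {goodput(2.0):.2f}%")      # one 2-hour service event
print(f"Vera Rubin:   {goodput(5 / 60):.2f}%")   # one ~5-minute teardown
```

Even for a single weekly event the gap is small in absolute terms, but across thousands of trays and frequent service events the faster teardown compounds into meaningfully higher fleet-wide goodput.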
The Role of CPUs in AI
Contrary to some expectations, CPUs are becoming more critical in AI workflows. The new Vera CPU, designed for the Vera Rubin platform, offers twice the performance per watt compared to its predecessor. It supports agentic workflows, tool calling, and reinforcement learning, ensuring that CPUs do not become a bottleneck as GPU performance increases.
Advancements in Cooling and Networking
The Vera Rubin platform is 100% liquid-cooled, a significant shift from previous air-cooled or hybrid systems. This advanced cooling allows for greater compute density and power efficiency, enabling more AI processing within the same energy budget. Nvidia is actively working with partners to help data centers transition to liquid cooling, offering reference architectures for both new and existing facilities.
Networking also sees a major upgrade with the MLink switch tray. This tray consolidates scale-up networking, providing immense bandwidth to connect GPUs efficiently. A full rack can achieve 260 terabytes per second of bandwidth, allowing data to move at speeds that can transmit the equivalent of the entire internet in about a second.
Seven Chips Powering Innovation
Nvidia’s strategy involves co-designing multiple chips to work together seamlessly. The Vera Rubin ecosystem includes GPUs (Rubin), CPUs (Vera), inference accelerators (Grok 3 LPU), FPGAs, network switch chips (MLink), Data Processing Units (BlueField-4), and previously announced CX9s. This integrated approach aims to maximize performance and efficiency across the entire system.
The BlueField-4 DPU, for example, manages network traffic and provides security isolation by separating data input/output from core processing. This ensures data integrity and enhances system security.
Rack-Scale Performance Gains
The Vera Rubin platform represents a substantial performance upgrade over the previous Blackwell generation. A single rack now delivers approximately 3.6 exaflops of AI performance, nearly a fourfold increase. Crucially, this performance boost comes with only a roughly 50% increase in power consumption, meaning power efficiency has significantly improved.
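Taking the figures above at face value, the implied efficiency gain can be checked directly. The ratio below is a back-of-the-envelope estimate derived from this article's numbers, not an Nvidia-published specification:

```python
# Rough perf-per-watt check from the figures above: ~4x AI performance
# at ~1.5x power implies roughly 2.7x better performance per watt.
perf_gain = 4.0    # "nearly a fourfold increase" over Blackwell
power_gain = 1.5   # "roughly 50% increase in power consumption"

perf_per_watt_gain = perf_gain / power_gain
print(f"~{perf_per_watt_gain:.1f}x performance per watt")  # ~2.7x
```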
When paired with the Grok 3 LPU rack-scale accelerator, the Vera Rubin NVL72 system achieves up to a 35x improvement in token throughput per megawatt. This translates to handling much larger volumes of data and more complex requests for the same amount of energy.
Intelligence, Speed, and Throughput Combined
Nvidia highlights three key areas of improvement: speed, intelligence, and throughput. “Intelligence” here refers to the ability to process larger models and longer context windows, yielding more accurate and nuanced AI outputs. The Vera Rubin platform and Grok LPUs are designed to excel on all three, delivering nearly 40x more memory bandwidth when paired together.
This combination allows for flexible optimization, whether prioritizing high throughput for cost efficiency or low latency for real-time interactions. It addresses the challenge of balancing speed with accuracy, ensuring AI agents are not just fast but also capable and reliable.
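The throughput-versus-latency trade-off described above can be illustrated with a toy batching model. All numbers below are illustrative assumptions, not measurements of any Nvidia system: larger batches amortize a fixed per-step cost (better throughput per unit of compute), but every request in the batch waits for the whole step (worse latency):

```python
# Toy model of the throughput/latency trade-off in batched inference.
# Larger batches amortize the fixed cost of a forward pass, raising
# throughput, but each request's latency grows with batch size.
FIXED_STEP_MS = 10.0   # hypothetical fixed cost per forward pass
PER_REQ_MS = 1.0       # hypothetical marginal cost per batched request

def step_time_ms(batch: int) -> float:
    """Wall-clock time for one batched forward pass (and thus per-request latency)."""
    return FIXED_STEP_MS + PER_REQ_MS * batch

def throughput(batch: int) -> float:
    """Requests completed per second at a given batch size."""
    return batch / (step_time_ms(batch) / 1000.0)

for b in (1, 8, 64):
    print(f"batch {b:3d}: {throughput(b):6.0f} req/s, {step_time_ms(b):.0f} ms latency")
```

In this toy model, batch size 1 minimizes latency while large batches approach the hardware's peak throughput; a serving stack picks a point on that curve depending on whether cost efficiency or real-time responsiveness matters more.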
The Rise of Agentic AI
The future of AI is increasingly seen in agentic systems – AI that can perform tasks autonomously. These agents need to be fast, intelligent, and cost-effective to be practical. Nvidia’s new hardware is built to support this evolution, enabling AI to go beyond simple question-answering to actively optimizing tasks, executing complex workflows, and delivering tangible value.
Dion Harris, Senior Director of AI and HPC Infrastructure at Nvidia, expressed excitement about this shift, envisioning AI agents that can work tirelessly to improve productivity. Stuart Pittz added that the enhanced capabilities will empower individual developers to create sophisticated AI applications previously only imaginable on supercomputers.
Looking Ahead: AI Agents for Everyone
Nvidia sees this as a “big bang moment” for a new era of AI, making advanced AI agents accessible to everyone, not just developers. The Vera Rubin platform and Grok LPUs are foundational to unlocking these future capabilities, promising to reshape personal expectations for AI experiences across various industries, from robotics to image generation.
Source: E25: NVIDIA's 7 Breakthrough AI Chips Change Everything (YouTube)





