CoreWeave and Nvidia have completed system-level validation of the Nvidia Vera Rubin NVL72 architecture on CoreWeave Cloud, a significant advance in computing infrastructure designed specifically for autonomous AI agents. The work targets rapidly scaling AI workloads that require persistent reasoning, real-time operation, and capacity that scales unpredictably in production environments.
The Vera Rubin NVL72 platform combines 72 Rubin GPUs and 36 Vera CPUs with 260 terabytes per second of internal bandwidth over NVLink 6, a figure described as exceeding the throughput of the entire global internet. That bandwidth is intended to let continuous agentic AI tasks such as code writing, experimentation, and multi-step reasoning loops run without interruption, a demand that traditional systems struggle to meet.
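To put the rack-level figures in per-device terms, the short Python sketch below divides the quoted totals. It assumes the 260 TB/s aggregate is shared evenly across the 72 GPUs, which the article does not state; the function name is illustrative only.

```python
# Figures quoted in the article for one Vera Rubin NVL72 rack.
AGGREGATE_NVLINK_TBPS = 260  # terabytes per second across the rack
GPU_COUNT = 72
CPU_COUNT = 36


def per_gpu_bandwidth_tbps(aggregate_tbps: float, gpus: int) -> float:
    """Return bandwidth per GPU, ASSUMING an even split across all GPUs."""
    return aggregate_tbps / gpus


per_gpu = per_gpu_bandwidth_tbps(AGGREGATE_NVLINK_TBPS, GPU_COUNT)
gpus_per_cpu = GPU_COUNT // CPU_COUNT

print(f"{per_gpu:.2f} TB/s per GPU, {gpus_per_cpu} GPUs per CPU")
# prints "3.61 TB/s per GPU, 2 GPUs per CPU"
```

Under that even-split assumption, each GPU would see roughly 3.6 TB/s of NVLink bandwidth, with two GPUs paired to each CPU.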
CoreWeave’s approach to this architecture emphasizes more than high GPU density. The company integrates a liquid cooling system, called Valvey, with fine-grained, software-defined controls that monitor flow rate, temperature, pressure, and leaks with sub-second responsiveness. The goal is system reliability and safety under intensive operating conditions.
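As a rough illustration of what software-defined cooling checks of this kind might look like, the Python sketch below evaluates a single coolant reading against fixed limits. Everything in it is hypothetical: the `CoolantReading` type, the `check_reading` function, the units, and the threshold values are assumptions for illustration and do not describe CoreWeave's actual system or any real API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CoolantReading:
    """One sensor sample (hypothetical fields and units)."""
    flow_lpm: float       # coolant flow, litres per minute
    temp_c: float         # coolant temperature, degrees Celsius
    pressure_kpa: float   # loop pressure, kilopascals
    leak_detected: bool   # leak sensor state


# Hypothetical limits, chosen only to make the example concrete.
MIN_FLOW_LPM = 20.0
MAX_TEMP_C = 45.0
MIN_PRESSURE_KPA = 150.0
MAX_PRESSURE_KPA = 400.0


def check_reading(r: CoolantReading) -> List[str]:
    """Return the alarms raised by one reading; empty means healthy."""
    alarms: List[str] = []
    if r.leak_detected:
        alarms.append("leak")
    if r.flow_lpm < MIN_FLOW_LPM:
        alarms.append("low_flow")
    if r.temp_c > MAX_TEMP_C:
        alarms.append("high_temp")
    if not (MIN_PRESSURE_KPA <= r.pressure_kpa <= MAX_PRESSURE_KPA):
        alarms.append("pressure_out_of_range")
    return alarms


healthy = CoolantReading(flow_lpm=30.0, temp_c=35.0, pressure_kpa=250.0, leak_detected=False)
faulty = CoolantReading(flow_lpm=10.0, temp_c=35.0, pressure_kpa=250.0, leak_detected=True)

print(check_reading(healthy))  # prints []
print(check_reading(faulty))   # prints ['leak', 'low_flow']
```

A production controller would run a check like this on every sample in a sub-second loop and act on the alarms, for example by throttling or isolating the affected rack; that control logic is outside the scope of this sketch.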
The validation also highlighted the importance of rack-level orchestration, which enables secure multi-tenant environments and high-performance networking for production-scale AI inference and agentic workloads. Collaboration with partners such as Dell Technologies points to a growing ecosystem dedicated to building infrastructure for the next generation of AI applications that rely on sustained reasoning and agent autonomy.
CoreWeave executives emphasized that the Vera Rubin system is a foundational shift rather than an incremental upgrade, designed from the ground up for the workload profile of agentic AI, which demands continuous operation alongside dynamic scalability. They framed it as a decisive step in evolving computing architecture to handle real-time, persistent AI-driven processes at scale on cloud platforms.

