
Plug In, Deliver AI: Mirantis k0rdent AI Eliminates Infrastructure Complexity For Multi-Tenant GPU Clouds with NVIDIA NCX Infra Controller


Managing GPU infrastructure in a multi-tenant production environment requires a level of automation, precision, security, and guaranteed performance that traditional infrastructure stacks were not designed to deliver. Operators must handle bare-metal provisioning, DPU-enforced network isolation, multiple network fabric types, secure tenant sanitization, and full-stack lifecycle management simultaneously and reliably, in some cases across thousands of servers. Yet most GPU cloud providers today still rely on a fragmented set of tools for provisioning, networking, storage, and monitoring. Assembling a custom stack to fill the gaps is slow, risky, and hard to maintain and adapt as the hardware portfolio continues to evolve.

To eliminate this management complexity and allow cloud providers to focus on delivering value to their customers, Mirantis offers a complete, full-stack AI cloud infrastructure platform: k0rdent AI.

Under the hood, k0rdent AI is now powered by NCX Infra Controller, an open-source technology battle-tested inside NVIDIA to manage large-scale GPU fleets. With this technology embedded at its core, k0rdent AI provides a unified control plane that orchestrates the entire stack, from hardware lifecycle management through to AI services delivery, across the full NVIDIA reference architecture portfolio: NVIDIA Ampere, NVIDIA Hopper, and NVIDIA Blackwell GPU series; NVIDIA Quantum InfiniBand, NVIDIA Spectrum-X Ethernet, and NVIDIA NVLink network switches; and NVIDIA DGX, NVIDIA HGX, and NVIDIA MGX platforms. Using k0rdent AI, cloud providers can simply plug in NVIDIA certified hardware, and the platform takes care of the rest.

Zero-Trust Hardware Lifecycle Automation

k0rdent AI abstracts away the friction of bare-metal management, directly delivering:

  • Hardware-Enforced Tenant Isolation - Using NVIDIA BlueField DPUs as an active enforcement layer, k0rdent AI configures the offloading of networking operations and enforces tenant isolation at the hardware layer, preventing cross-tenant leakage even if a host is compromised.

  • Unified Network Fabric - k0rdent AI configures Ethernet, InfiniBand partitions, and NVLink. This ensures isolated, high-performance interconnect fabrics between tenant workloads.

  • Secure Tenant Management - Tenant management and multi-cluster Kubernetes provisioning become single-step operations driven by a powerful open-source software stack.

  • Metal-to-Model Operations - From OS provisioning to dynamic resource allocation and deploying NVIDIA NIM microservices for LLM serving, k0rdent AI builds the complete IaaS, PaaS, and AI services stack natively.


Validated on NVIDIA GB200 NVL72 for Maximum Performance

Management challenges grow with rack-scale systems like the NVIDIA GB200/GB300 NVL72, one of the most demanding platforms on the market today, which interconnects 72 GPUs within a single rack over NVLink. k0rdent AI addresses this by configuring the unified network fabric, automatically handling Ethernet, InfiniBand partitions, and NVLink. The result is isolated, high-performance interconnect fabrics between tenant workloads, allowing cloud providers to tame the complexity of rack-scale operations.

As part of the NVIDIA AI Cloud-Ready initiative, Mirantis validated k0rdent AI with NCX Infra Controller directly on the NVIDIA GB200 NVL72 reference architecture. Testing showed that k0rdent AI preserves the full performance of the underlying hardware when deploying Kubernetes clusters. In virtualization tests, workloads on GB200 NVL72 ran with less than 0.1% performance degradation compared to bare metal, well within NVIDIA's 5% maximum threshold.
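
For readers who want to run the same kind of comparison on their own clusters, the degradation figure is simple arithmetic: the throughput lost under virtualization, expressed as a percentage of the bare-metal result. The sketch below is one way to compute it and check it against a maximum threshold. The function names and the sample numbers are hypothetical and chosen only for illustration; they are not Mirantis measurements, and the sketch assumes a higher-is-better metric such as TFLOPS or GB/s.

```python
def degradation_pct(bare_metal: float, virtualized: float) -> float:
    """Percent of throughput lost relative to bare metal.

    Assumes a higher-is-better metric (for example TFLOPS or GB/s).
    """
    if bare_metal <= 0:
        raise ValueError("bare_metal must be positive")
    return (bare_metal - virtualized) / bare_metal * 100.0


def within_threshold(bare_metal: float, virtualized: float,
                     max_pct: float = 5.0) -> bool:
    """True if the measured degradation does not exceed max_pct percent."""
    return degradation_pct(bare_metal, virtualized) <= max_pct


# Hypothetical sample values for illustration only (not measured results).
baseline = 1000.0      # bare-metal score
virtualized = 999.2    # score for the same workload when virtualized

print(f"degradation: {degradation_pct(baseline, virtualized):.2f}%")
print("within 5% threshold:", within_threshold(baseline, virtualized))
```

The 5% default mirrors the maximum threshold cited above; for latency-style metrics where lower is better, the sign of the comparison would need to be flipped.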

To confirm the platform can perform at full capability under k0rdent AI management, Mirantis ran the NVIDIA HPC Benchmark Suite on a partial NVL72 rack (4x GB200 GPUs).

Note: This integration will be showcased at the Mirantis booth at GTC.

Category           | Metric                                                    | Result (per node)
Compute            | cuBLAS FP32 (SGEMM)                                       | 72.7 TFLOPS/GPU
Compute            | TF32 Tensor Performance (HPL-MxP GEMM)                    | 1,081 TFLOPS
Interconnect       | NVLink 5 GPU-to-GPU Bandwidth                             | 765 GB/s
Interconnect       | NCCL AllReduce Peak                                       | 670 GB/s
Host-GPU Transfer  | Host-to-GPU Bandwidth                                     | 140 GB/s
AI Workload        | LLM Inference Throughput (Llama 3.3 70B, NVIDIA NIM)      | 7,141 tok/s

Test configuration: 4x NVIDIA Blackwell GPUs (partial NVL72 rack), 186 GB HBM3e per GPU, NVLink 5, CUDA 13.0, TensorRT-LLM backend. Full validation report available upon request.

Focus on AI, Not Infrastructure

For service providers, building complex infrastructure stacks delays revenue. k0rdent AI provides a turnkey solution that supports the full spectrum of consumption models (bare metal, Kubernetes, VMs, and managed AI inference) within a single platform. Now with NVIDIA NCX Infra Controller inside, operators are assured day-zero readiness with NVIDIA certified models. Simply add hardware and begin serving tenants.

Learn More at NVIDIA GTC

Join one of our demos at Mirantis Booth 102 and register for the AI Executive Salon, an executive-only reception co-hosted by Mirantis, Netris, and Saturn Cloud.

Information is also available on our resources page.

Kevin Kamel

Kevin Kamel is the Vice President of Product Management at Mirantis. Kevin is an innovative, data-driven executive with 20 years’ experience improving processes and procedures to drive revenue, efficiency, and market share. He has multi-disciplinary experience in all aspects of product development and management from requirements gathering through launch; user experience and customer support; and marketing functions, including product positioning, market trend research, and competitive analysis.
