Plug In, Deliver AI: Mirantis k0rdent AI Eliminates Infrastructure Complexity For Multi-Tenant GPU Clouds with NVIDIA NCX Infra Controller
Managing GPU infrastructure in a multi-tenant production environment requires a level of automation, precision, security, and guaranteed performance that traditional infrastructure stacks simply were not designed to deliver. Operators need to manage bare-metal provisioning, DPU-enforced network isolation, multiple network fabric types, secure tenant sanitization, and full-stack lifecycle management, all simultaneously, reliably, and, for some providers, across thousands of servers. Yet most GPU cloud providers today still rely on a fragmented set of tools to handle provisioning, networking, storage, and monitoring. Assembling a custom stack to fill the gaps is slow, risky, and hard to maintain and adapt as the hardware portfolio continues to evolve.
To eliminate this management complexity and allow cloud providers to focus on delivering value to their customers, Mirantis offers a complete, full-stack AI cloud infrastructure platform: k0rdent AI.
Now, under the hood, k0rdent AI is powered by the NCX Infra Controller, an open-source technology battle-tested inside NVIDIA to manage large-scale GPU fleets. By embedding this technology into its core, k0rdent AI provides a unified control plane that orchestrates the entire stack across the full NVIDIA reference architecture portfolio: the NVIDIA Ampere, NVIDIA Hopper, and NVIDIA Blackwell GPU generations; NVIDIA Quantum InfiniBand and NVIDIA Spectrum-X Ethernet; NVIDIA NVLink network switches; and the NVIDIA DGX, NVIDIA HGX, and NVIDIA MGX platforms. It covers everything from hardware lifecycle management through to AI services delivery: cloud providers can simply plug in NVIDIA-certified hardware, and k0rdent AI takes care of the rest.
Zero-Trust Hardware Lifecycle Automation
k0rdent AI abstracts away the friction of bare-metal management, directly delivering:
Hardware-Enforced Tenant Isolation - Using NVIDIA BlueField DPUs as an active enforcement layer, k0rdent AI offloads networking operations to the DPU and enforces tenant isolation at the hardware layer, preventing cross-tenant leakage even if a host is compromised.
Unified Network Fabric - k0rdent AI configures Ethernet networks, InfiniBand partitions, and NVLink domains, ensuring isolated, high-performance interconnect fabrics between tenant workloads.
Secure Tenant Management - Tenant Management and Multi-Cluster Kubernetes Provisioning become single-step operations driven by a powerful open-source software stack.
Metal-to-Model Operations - From OS provisioning to dynamic resource allocation and deploying NVIDIA NIM microservices for LLM serving, k0rdent AI builds the complete IaaS, PaaS, and AI services stack natively.
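To make the single-step provisioning idea above concrete, here is a minimal sketch of a declarative cluster request in the style of k0rdent's ClusterDeployment custom resource. The template name, credential reference, and node counts are illustrative assumptions, not exact values from the product; consult the k0rdent documentation for the authoritative schema.

```yaml
# Hypothetical sketch: template/credential names and node counts
# are illustrative assumptions, not product defaults.
apiVersion: k0rdent.mirantis.com/v1alpha1
kind: ClusterDeployment
metadata:
  name: tenant-a-gpu-cluster
  namespace: kcm-system
spec:
  template: gpu-cluster-template    # assumed template name
  credential: tenant-a-credential   # assumed credential object
  config:
    controlPlaneNumber: 3
    workersNumber: 4                # e.g., four GPU worker nodes
```

Applying a single object like this (`kubectl apply -f cluster.yaml`) is the kind of one-step operation that would drive the full provisioning flow for a tenant.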

Validated on NVIDIA GB200 NVL72 for Maximum Performance
Management challenges become even more complex with rack-scale systems like the NVIDIA GB200/GB300 NVL72, among the most demanding platforms on the market today, which features 72 GPUs interconnected within a single rack via NVLink to leverage the full power of the technology. k0rdent AI tames this rack-scale complexity by automatically configuring the unified network fabric, spanning Ethernet, InfiniBand partitions, and the NVLink domain, so that each tenant workload runs on an isolated, high-performance interconnect.
As part of the NVIDIA AI Cloud-Ready initiative, Mirantis validated k0rdent AI with the NCX Infra Controller directly on the NVIDIA GB200 NVL72 reference architecture. Throughout testing, we proved that k0rdent AI preserves the full performance of the underlying hardware when deploying Kubernetes clusters. In virtualization tests, workloads on the GB200 NVL72 ran with less than 0.1% performance degradation compared to bare metal, well within NVIDIA's 5% maximum-degradation threshold.
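As a back-of-the-envelope illustration of what that bound means, the relative degradation is simply (bare-metal throughput minus virtualized throughput) divided by bare-metal throughput. The throughput figures below are arbitrary illustrative numbers, not measured values from the validation:

```python
def virtualization_overhead(bare_metal: float, virtualized: float) -> float:
    """Relative performance degradation of a virtualized run vs. bare metal."""
    return (bare_metal - virtualized) / bare_metal

# Hypothetical throughput figures (arbitrary units), chosen only to
# illustrate the sub-0.1% bound reported above.
bare = 10_000.0
virt = 9_992.0

overhead = virtualization_overhead(bare, virt)
print(f"{overhead:.2%}")  # 0.08%
assert overhead < 0.001   # under the 0.1% result reported for k0rdent AI
assert overhead < 0.05    # well within NVIDIA's 5% threshold
```

At these illustrative numbers the virtualization layer costs 0.08%, roughly sixty times below the 5% ceiling.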
To confirm the platform can perform at full capability under k0rdent AI management, Mirantis ran the NVIDIA HPC Benchmark Suite on a partial NVL72 rack (4x GB200 GPUs).
Note: This integration will be showcased at the Mirantis booth at GTC.
| Category | Metric | Results per node | Status |
| --- | --- | --- | --- |
| Compute | cuBLAS FP32 (SGEMM) | 72.7 TFLOPS/GPU | ✓ |
| Compute | TF32 Tensor Performance (HPL-MxP GEMM) | 1,081 TFLOPS | ✓ |
| Interconnect | NVLink 5 GPU-to-GPU Bandwidth | 765 GB/s | ✓ |
| Interconnect | NCCL AllReduce Peak | 670 GB/s | ✓ |
| Host-GPU Transfer | Host-to-GPU Bandwidth | 140 GB/s | ✓ |
| AI Workload | LLM Inference Throughput (Llama 3.3 70B, NVIDIA NIM) | 7,141 tok/s | ✓ |
Test configuration: 4x NVIDIA Blackwell GPUs (partial NVL72 rack), 186GB HBM3e per GPU, NVLink 5, CUDA 13.0, TensorRT-LLM backend. Full validation report available upon request.
Focus on AI, Not Infrastructure
For service providers, building complex infrastructure stacks delays revenue. k0rdent AI provides a turnkey solution that supports the full spectrum of consumption models—bare metal, Kubernetes, VMs, and managed AI inference—within a single platform. Now with the NVIDIA NCX Infra Controller inside, operators are assured day-zero readiness with NVIDIA-certified models. Simply add hardware and begin serving tenants.
Learn More at NVIDIA GTC
Join one of our demos at Mirantis Booth 102 and register for the AI Executive Salon, an executive-only reception co-hosted by Mirantis, Netris, and Saturn Cloud.
Information is also available on our resources page.
