GPU Infrastructure's 15-Minute Miracle: When Complexity Meets Composability
Let's be honest for a second. If you've ever tried to stand up a production GPU cluster for AI workloads, you know the drill. It's a multi-week, if not multi-month, odyssey through dependency hell, where every framework wants its own special snowflake configuration, and by the time you're done, your best engineers have aged visibly.
At AI Infrastructure Field Day, we had the opportunity to show the industry a different way to build AI infrastructure. When our CTO Shaun O'Meara claimed we could spin up a fully operational NVIDIA Run.ai inference cluster in about 15 minutes, the room full of infrastructure experts was understandably interested.
The Problem We're All Facing
Here's what we're seeing across the industry: companies are dumping millions into GPU hardware, expecting magic. What they're getting instead is a pile of expensive silicon that sits there while teams struggle to make it actually useful. The gap between "we bought GPUs" and "we're running production AI workloads" is measured in months, not minutes.
The secret? It's not the hardware that's hard. It's everything else: the orchestration, the multi-tenancy, the resource scheduling, the observability stack. Oh, and making sure your $40,000 GPU isn't sitting idle running someone's hello-world experiment.
Our VP of Product Management, Kevin Kamel, opened our presentation by addressing three brutal realities we hear from customers every day:
Converting single-tenant GPU hardware into multi-tenant services is a nightmare
The talent shortage means you can't just hire your way out of the problem
Everyone expects hyperscaler-level experiences now - self-service portals, integrated observability, efficient monetization
We've spent years building infrastructure for some of the world's largest clouds. From our early days as OpenStack pioneers to stewarding Kubernetes and acquiring Docker Enterprise and Lens, we've been in the trenches of infrastructure complexity. GPU infrastructure is just the latest chapter - but arguably a critical one.
Showing, Not Just Telling
Our Product Marketing Specialist, Anjelica Ambrosio, took the stage to prove our point. No pre-baked environments, no smoke and mirrors - just a real deployment from scratch. Using the Mirantis k0rdent AI Cloud Service Provider portal, she:
Used the Customer Portal to create an inference cluster template defining a new AI-optimized host cluster (and showed how this could be added to the customer’s Marketplace as a one-click option).
Used the CSP Operator Portal Product Builder to create a new service: building an AI host cluster with metrics built in and integrated with the provider's Grafana front end.
Used the IaaS Portal to demonstrate bare metal provisioning, cluster creation within a tenant, and showed dashboards displaying integrated Grafana monitoring for Kubernetes and for VMs hosted with KubeVirt.
Used the GPU PaaS portal to demonstrate deploying a complete inference cluster: selecting GPU nodes, configuring Kubernetes, adding Run.ai dependencies like ArgoCD and Knative, and closing by accessing the completed cluster and its running workloads through the automatically integrated Run.ai web UI.
Fifteen minutes. That's all it took. And most of that (almost 14 minutes) was waiting for AWS machines to boot and downloading the configuration and credential information needed to manage them.
All those painful dependencies that normally take weeks to configure — cert-manager, GPU operators, Argo workflows — were automatically provisioned and configured. No manual YAML wrangling, no debugging version conflicts.
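To make "composable" concrete, here is a minimal sketch of what a declarative cluster-plus-services definition can look like. The API group, field layout, and template names below are illustrative, not the literal k0rdent AI schema; the point is that the cluster and its dependencies are declared together and resolved from a catalog instead of being hand-wired one YAML file at a time.

```yaml
# Hypothetical cluster deployment: one declarative spec pulls in
# the cluster template plus its dependent services from a catalog.
# Field names and API group are illustrative, not the exact schema.
apiVersion: example.mirantis.com/v1alpha1
kind: ClusterDeployment
metadata:
  name: inference-cluster
spec:
  template: aws-gpu-standalone        # AI-optimized host cluster template
  credential: aws-credential          # cloud credential used to provision
  config:
    region: us-west-2
    workersNumber: 2
    worker:
      instanceType: g5.2xlarge        # GPU worker nodes
  serviceSpec:
    services:                         # dependencies as building blocks
      - name: cert-manager
        template: cert-manager
      - name: gpu-operator
        template: nvidia-gpu-operator
      - name: argo-workflows
        template: argo-workflows
```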
The Architecture Behind the Magic
Shaun walked through how we've organized the platform services layer above the GPU infrastructure. We didn't just automate the existing painful process - we fundamentally rethought it. Instead of forcing everyone to build from scratch, we've created composable service templates for training, inference, and data services.
The key insight? Services should be building blocks, not monoliths. They can be chained, extended, and validated without custom integration work for every new workload. When we demonstrated adding Run.ai to the cluster, it wasn't a special case requiring custom work - it was just another building block from our catalog.
Our labeling system automatically tags GPU nodes during cluster creation, and Run.ai validates these labels to ensure workloads land where they belong. GPU workloads on GPU nodes, everything else on CPU nodes. Simple? Yes. But it's the kind of simple that only comes from learning what breaks in production.
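Under the hood, this is plain Kubernetes scheduling. Here is a minimal sketch, assuming the provisioning step applies a standard GPU node label (the label, image, and names below are illustrative): the workload selects GPU nodes and requests the GPU resource, so it can't land anywhere else, and CPU-only pods have no reason to land on GPU nodes.

```yaml
# Illustrative: a label applied to GPU nodes at cluster creation is
# selected by GPU workloads, keeping them off CPU-only nodes.
apiVersion: v1
kind: Pod
metadata:
  name: llm-inference
spec:
  nodeSelector:
    nvidia.com/gpu.present: "true"    # label applied during provisioning
  containers:
    - name: server
      image: ghcr.io/example/inference-server:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1           # device plugin enforces GPU allocation
```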
Answering the Hard Questions
The Field Day delegates had many questions, and that's exactly what we wanted. Here's what real operators care about:
"Can you mix frameworks - say Run.ai with Kubeflow - in the same deployment?"
Absolutely. Our catalog approach means you can compose based on what teams actually need. Today it's Run.ai, tomorrow add MLflow, next week swap in something else. No architectural redesign required. It's Lego blocks for AI infrastructure.
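In the spirit of the earlier sketch, composing frameworks amounts to appending entries to the services list; the template names here are illustrative, not the literal catalog entries:

```yaml
# Hypothetical: mixing frameworks is appending catalog entries,
# not re-architecting the cluster. Template names are illustrative.
serviceSpec:
  services:
    - name: runai
      template: runai
    - name: argo-cd                   # Run.ai dependency
      template: argo-cd
    - name: knative                   # Run.ai dependency
      template: knative-serving
    - name: mlflow                    # added later, no redesign needed
      template: mlflow
```

Removing or swapping a component is the same operation in reverse, which is what keeps the catalog approach from turning every new workload into an architectural redesign.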
"What about sovereign clouds and air-gapped deployments?"
We shared the story of Nebul, a sovereign AI cloud in the Netherlands. They were drowning - managing thousands of Kubernetes clusters, enforcing strict multi-tenancy, dealing with stranded GPU resources. After adopting k0rdent AI, their small team could focus on business growth instead of infrastructure firefighting. And yes, it works completely disconnected from the internet.
"How do you handle the skills gap?"
We've been here before. During the early OpenStack era, we helped enterprises build private clouds when nobody knew how. Same playbook, different decade - we offer everything from managed services to skills transfer, meeting organizations wherever they are on the expertise spectrum.
Composability: The Real Game-Changer
We're not trying to build the One Platform to Rule Them All. We're building Lego blocks for GPU infrastructure.
Our Product Builder demo showed this philosophy in action. An operator can log into the self-service portal and within minutes:
Create new cluster products
Set parameters
Deploy to an internal marketplace
Monitor everything with real-time observability dashboards
This isn't just about deployment speed. It's about being able to evolve your AI infrastructure without starting from scratch every time requirements change.
Making GPU Infrastructure a Business Asset
Let's talk about what really matters to the business. Every minute your GPU cluster isn't running production workloads is money burning. Our platform doesn't just deploy infrastructure - it makes it billable.
Whether you need internal chargeback or want to sell services externally, k0rdent AI transforms racks of GPUs into metered AI services. We've built in flexible pricing models too:
OPEX consumption-based pricing for clouds that want to pay as they grow
CAPEX-aligned licensing for enterprises with budget constraints
FedRAMP support for government contracts
This isn't a science project; it's a business platform designed for real-world deployment.
Why We Built This
We've been building and operating infrastructure for over a decade. We've seen every "revolutionary" automation platform that works great in demos and falls apart in production. That's why we didn't just automate the existing painful process - we redesigned it from first principles.
GPU infrastructure doesn't have to be special. It doesn't need its own unique operational model that only three people in your organization understand. It can be as consumable as traditional compute - if you approach it right.
See It For Yourself
The entire AI Infrastructure Field Day presentation is available to watch, including the full demo and technical deep dives. We believe in showing our work, not just talking about it.
If you're sitting on GPU hardware wondering why it's so hard to make it useful, or if you're a service provider trying to compete with hyperscalers without their army of engineers, let's talk. The question isn't whether you need better GPU infrastructure automation - you do. The question is whether you want to spend the next six months building it yourself or fifteen minutes deploying something that already works.
GPU complexity doesn't have to be your reality. We've proven it can be solved. Now it's time to put that solution to work in your environment.