
AI Means Big Changes To IT Infrastructure


This article originally appeared here as part of the Forbes Technology Council.

As the world races to deploy increasingly powerful AI models, the limitations of existing technology infrastructure are becoming evident. Today’s public clouds and colocation facilities weren’t built for 40-billion-parameter models, 100-kilowatt-per-rack densities or real-time inference for billions of users.

A new class of high-performance computing (HPC) clouds, often called neoclouds, is emerging, purpose-built for AI. The hope is that these can help enterprises avoid costly retrofits of existing data centers, which often stretch facilities beyond their intended design life.

Why Existing Infrastructure Can’t Handle AI Workloads

Most colocation facilities and first-generation cloud platforms were built to serve generalized enterprise computing: websites, single-server workloads and client/server apps, requiring only hyperconverged compute, solid-state storage devices (SSDs) and 25 Gb or 40 Gb interconnects. Legacy data centers often cap rack density at 8-10 kW with power, cooling and backup scaled accordingly. For traditional workloads, these limits were reasonable. For AI, they are a hard stop.

How AI Changes Everything

AI workloads—especially model training and real-time inference—reshape data center design in the following fundamental ways:

Huge Power Density Requirements

A DGX H100 system, per NVIDIA specifications, draws up to 10.2 kW. NVIDIA recommends that a pair of systems (drawing 20.4 kW combined) be provided with 32.7 kW of circuit capacity. This density rivals the total power consumption of some small data centers.
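To make the gap concrete, here is a back-of-the-envelope sketch in Python comparing GPU rack draw against a typical legacy rack budget. The 10.2 kW per-system figure is NVIDIA's specification quoted above; the roughly 1.6x circuit-provisioning factor is an assumption inferred from the 20.4 kW versus 32.7 kW figures, not an official NVIDIA parameter.

```python
# Back-of-the-envelope rack power check. The per-system draw comes from the
# NVIDIA DGX H100 specification quoted above; the provisioning factor is an
# assumption inferred from the 20.4 kW draw vs. 32.7 kW circuit figures.

DGX_H100_KW = 10.2                   # max draw per system (NVIDIA spec)
LEGACY_RACK_LIMIT_KW = 10.0          # upper end of a typical 8-10 kW legacy rack
PROVISIONING_FACTOR = 32.7 / 20.4    # ~1.6x circuit headroom (assumed)

def rack_requirements(systems_per_rack: int) -> tuple[float, float]:
    """Return (steady-state draw, assumed circuit capacity) in kW."""
    draw = systems_per_rack * DGX_H100_KW
    return draw, draw * PROVISIONING_FACTOR

for n in (1, 2, 4):
    draw, circuit = rack_requirements(n)
    print(f"{n} x DGX H100: draw {draw:.1f} kW, circuit {circuit:.1f} kW, "
          f"{draw / LEGACY_RACK_LIMIT_KW:.1f}x a 10 kW legacy rack")
```

Even a single system consumes an entire legacy rack budget; four systems roughly quadruple it before cooling overhead is counted.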

New Power Distribution, Backup And Cooling

Densities this high require new power distribution units (PDUs), backup systems and cooling. Between 2017 and 2020, average rack densities in data centers rose from 5.6 kW to 8.4 kW. And while racks above 15 kW were considered ‘extreme density’ less than a decade ago, rack densities supporting HPC applications like GenAI now reach 200 kW. Traditional air cooling often cannot cope; liquid or hybrid approaches are increasingly discussed.

New Bandwidth And Networking

AI workloads demand 100-400 Gbps interconnects, typically InfiniBand or RoCEv2 with remote direct memory access (RDMA), bypassing CPUs for direct GPU-to-GPU transfers. This requires new cabling, switch fabrics and careful topology planning.
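As a rough illustration of what GPU-to-GPU transfers look like in practice, the sketch below uses PyTorch's NCCL backend, which rides on InfiniBand or RoCE with GPUDirect RDMA when the fabric and drivers support it. It assumes a torchrun launch with one GPU per process; it is a minimal sketch, not a production training loop.

```python
# Minimal sketch: a GPU-to-GPU collective over NCCL, which uses InfiniBand/RoCE
# with GPUDirect RDMA when the fabric and drivers support it. Assumes launch
# via `torchrun --nproc-per-node=<gpus> script.py`, one GPU per process.
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")      # NCCL handles the transport layer
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    # Gradient-sized buffer living entirely in GPU memory; the all-reduce
    # below does not stage through the CPU.
    grads = torch.randn(64 * 1024 * 1024, device="cuda")
    dist.all_reduce(grads, op=dist.ReduceOp.SUM)

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```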

Infrastructure must be tuned end-to-end: airflow paths, rack spacing, cable routing and even floor loading to prevent GPU vibration failures.

Data Gravity Reverses The Cloud Model

Traditional cloud sent data to compute. AI inverts this.

Training of foundation models requires petabytes of proprietary data—customer interactions, sensor logs, internal documents and R&D archives. Moving that much is slow, costly and insecure. Compute must go where data resides—colocated with enterprise stores or at the edge. Increasingly, that means blending centralized capacity with distributed nodes close to data sources.

New GPU Economics

GPUs, central to AI computation, are in constrained supply. NVIDIA allocates chips to customers who commit to volume and multi-year deals. Smaller firms often buy at high prices through distributors.

Neoclouds aggregate demand, pool access and provide fractional GPUs, enabling broader participation. This shifts the advantage toward providers that can secure steady allocations and spread them efficiently.

Training Versus Inference: The Next Flip

Most infrastructure is still optimized for training—long jobs where efficiency per watt matters more than responsiveness. But the balance is shifting. Analysts note that training still dominates, but inference workloads are growing quickly and may soon surpass training.

Infrastructure must support both. Systems designed only for training will lag. Edge computing places resources closer to users and data, improving latency, resilience and customer experience.

What Enterprises Can Do Now

How can cloud service providers and enterprises meet these challenges?

Treat AI infrastructure as a portfolio.

Enterprises must balance training (building new models), fine-tuning (adapting open-source models) and inference (applications that deliver business results). For many enterprises, inference will dominate, often supported by vector databases and retrieval-augmented generation (RAG) for domain expertise.
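As a rough illustration of the RAG pattern mentioned above, here is a minimal, framework-free sketch: embed the question, retrieve the closest documents from a vector index, and ground the model's answer in them. The embed and generate functions and the documents are placeholders for whatever embedding model, LLM endpoint and corpus an enterprise actually uses.

```python
# Minimal, framework-free sketch of retrieval-augmented generation (RAG).
# `embed` and `generate` are placeholders for a real embedding model and LLM
# endpoint; the documents are purely illustrative.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: call your embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

def generate(prompt: str) -> str:
    """Placeholder: call your LLM inference endpoint here."""
    return f"[model answer grounded in]\n{prompt}"

documents = [
    "Warranty claims must be filed within 90 days of purchase.",
    "Sensor model X-200 logs vibration data at 1 kHz.",
    "Q3 churn was driven primarily by onboarding friction.",
]
index = np.stack([embed(d) for d in documents])   # stand-in for a vector database

def answer(question: str, k: int = 2) -> str:
    q = embed(question)
    top_k = np.argsort(index @ q)[::-1][:k]       # cosine similarity on unit vectors
    context = "\n".join(documents[i] for i in top_k)
    return generate(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("How long do customers have to file a warranty claim?"))
```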

Define service-level objectives (SLOs).

For each portfolio use case (training, fine-tuning, inference), set requirements for latency, throughput, availability and data residency. Use these to sequence investments and select cost-effective placements.
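One way to make those objectives concrete is to record them per workload class, as in the Python sketch below. The workload names and every number here are illustrative assumptions meant to show the shape of the exercise, not recommended targets.

```python
# Sketch of per-workload service-level objectives. All names and numbers are
# illustrative assumptions, not recommended targets.
from dataclasses import dataclass

@dataclass
class SLO:
    workload: str
    p95_latency_ms: float | None   # None where latency is not the constraint
    throughput_target: str
    availability_pct: float
    data_residency: str

slos = [
    SLO("training",    None,  "90% cluster utilization per job", 99.0, "any region"),
    SLO("fine-tuning", None,  "nightly runs complete by 06:00",  99.5, "EU only"),
    SLO("inference",   300.0, "500 requests/sec sustained",      99.9, "in-country"),
]

for s in slos:
    print(s)
```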

Build comparative TCO/ROI models.

Evaluate retrofitting facilities, procuring from colocation or neocloud providers, or renting cloud GPU time. Model capex (build and upgrade costs) against opex (power, cooling, support). Include sensitivity analysis (testing assumptions) and stranded-capacity risk (underutilized infrastructure). These models help CFOs and boards weigh investment decisions with a clear line to financial outcomes.
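A minimal sketch of that comparison might look like the following. Every figure is a placeholder to be replaced with real vendor quotes and utility rates, and the utilization loop stands in for a fuller sensitivity analysis.

```python
# Minimal TCO comparison sketch over a fixed horizon. Every figure is a
# placeholder; the utilization loop is a stand-in for sensitivity analysis.
HORIZON_YEARS = 3
GPU_HOURS_NEEDED = 200_000          # annual accelerator-hours the portfolio requires

options = {
    # option: (upfront capex $, fixed opex $/yr, variable cost $/GPU-hour)
    "retrofit_own_dc": (4_000_000, 600_000, 0.40),
    "neocloud_colo":   (  500_000, 900_000, 1.20),
    "public_cloud":    (        0,       0, 3.50),
}

def tco(capex: float, opex_per_year: float, per_hour: float, utilization: float) -> float:
    # Lower utilization means paying for more hours to get the same useful work.
    effective_hours = GPU_HOURS_NEEDED / utilization
    return capex + HORIZON_YEARS * (opex_per_year + effective_hours * per_hour)

for util in (0.4, 0.7, 0.9):        # sensitivity: how busy the GPUs actually are
    print(f"\nutilization = {util:.0%}")
    for name, (capex, opex, rate) in options.items():
        print(f"  {name:16s} ${tco(capex, opex, rate, util):,.0f}")
```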

Secure power early.

Lock in long-term power purchase agreements, as hyperscalers do. Recent billion-dollar deals highlight power as a competitive advantage.

Plan the cooling transition.

Most operators still use air cooling, but rising densities make liquid cooling inevitable, though adoption remains gradual. Pilot liquid-ready designs where density is highest and ensure facilities can scale.

Design for scale.

Align clusters with modular topologies and interconnects supporting RDMA at 100-400 Gbps. Keep hot data close to compute and ensure high-throughput ingest for retraining.

Site with governance in mind.

Data sovereignty and power availability are top factors. Build compliance in from the start with frameworks like the NIST AI Risk Management Framework and EU DORA requirements.

De-risk GPU procurement.

Secure multi-year reservations where possible and diversify with fractional access or managed services. This reduces idle capacity and exposure to supply shocks.

Adopt a hybrid deployment.

Keep large-scale training where power and cooling are most efficient, but push inference closer to users. Edge nodes meet response-time goals while controlling costs.

Measure business outcomes, not just floating point operations per second.

Track cost per token, deployment speed, failure rate and emissions alongside product KPIs. These metrics connect infrastructure choices to tangible business performance.
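Cost per token, for example, falls out of two numbers you can measure: what a GPU-hour costs you and how many tokens that GPU serves per second. The sketch below shows the arithmetic; both inputs are placeholders to be replaced with measured values.

```python
# Cost-per-token back-of-the-envelope. Both inputs are placeholders; plug in
# your measured serving throughput and blended $/GPU-hour.
GPU_HOUR_COST_USD = 2.50        # blended cost per GPU-hour (placeholder)
TOKENS_PER_SECOND = 1_200       # measured serving throughput per GPU (placeholder)

tokens_per_gpu_hour = TOKENS_PER_SECOND * 3600
cost_per_million_tokens = GPU_HOUR_COST_USD / tokens_per_gpu_hour * 1_000_000

print(f"{tokens_per_gpu_hour:,} tokens per GPU-hour")
print(f"${cost_per_million_tokens:.2f} per million tokens")
```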

Importance Of Infrastructure

AI is not just a software race. It is an infrastructure race. Analysts forecast AI infrastructure spending could exceed $200 billion by 2028 and reach into the trillions by 2030. This is where I believe the future of intelligence will be built.

Dominic Wilde

Dom is an incisive business leader and creative product strategist with 30-plus years of experience in the cloud and data center industries.
