
From AI Clouds to Air-gapped Data Centers—Mirantis OpenStack for Kubernetes 25.2 Powers the Future of Private Cloud


We’re excited to introduce Mirantis OpenStack for Kubernetes (MOSK) 25.2, the latest update to our Kubernetes-native OpenStack distribution. This release brings major advances in networking, storage, and operational resilience, while keeping MOSK aligned with upstream OpenStack and Kubernetes innovation. From support for the new Epoxy release to enhanced bare-metal features for AI workloads, air-gapped operations, and smarter observability, MOSK 25.2 equips enterprises and telcos to run modern private clouds with more scale, flexibility, and confidence than ever before.

Major component updates

MOSK 25.2 introduces OpenStack 2025.1 “Epoxy” for both new deployments and upgrades from 2024.1 (Caracal). Epoxy is supported with the OVS, OpenSDN (Tungsten Fabric), and Open Virtual Network (OVN) networking backends. OpenStack Antelope is no longer supported.

Networking moves forward with OpenSDN 24.1, the first release under its new name, now supported alongside Caracal and Epoxy. This version modernizes the codebase, expands IPv6 capabilities, and drops legacy Analytics services for a lighter footprint. At the same time, OVN 24.03 brings bug fixes, performance enhancements, and security improvements, and MOSK now includes a validated migration path for moving away from OVS.

On the storage side, Ceph 19.2 “Squid” replaces Reef (18.2), which reaches end of life in 2025. Along with the new features from Squid, the update aligns dependencies with Rook 1.15/1.16 and CephCSI 3.12 (with csi-provisioner 5.0.1 where supported).

The underlying platform also steps up: Ubuntu 24.04 with kernel 6.8 LTS is now the base OS for management clusters, with support for MOSK clusters planned for introduction early next year. MOSK also now runs on MKE 3.7.mosk.x with Kubernetes 1.30—a special flavor of MKE 3.7 created specifically for MOSK that will continue to receive security updates beyond the end of life of standard MKE 3.7.

Air-gapped clusters

MOSK 25.2 makes air-gapped deployments and updates a first-class, fully supported scenario. In environments where Internet access is prohibited—whether for regulatory, security, or operational reasons—operators can run MOSK without compromise while keeping clusters aligned with upstream over time.

The design centers on a local artifact mirror that hosts all required container images, OS packages, and charts. Mirantis tooling fetches and verifies artifacts outside the restricted network, then populates the mirror; lifecycle management parameters point both management and workload clusters to this mirror for bootstrap, scaling, and updates. Telemetry adapts to offline mode so metrics are retained locally without generating false connectivity alerts.
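
The "fetch and verify" step can be pictured with a minimal sketch. It assumes a staging directory and a manifest of expected SHA-256 digests; the file names, the manifest layout, and the helper functions are hypothetical and are not the Mirantis tooling itself.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifacts(staging_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return names of artifacts that are missing or whose digest differs.

    `manifest` maps a relative file name to its expected SHA-256 digest
    (a hypothetical format). An empty result means every artifact matched
    and may be copied into the mirror.
    """
    rejected = []
    for name, expected in manifest.items():
        candidate = staging_dir / name
        if not candidate.is_file() or sha256_of(candidate) != expected:
            rejected.append(name)
    return rejected


def demo() -> list[str]:
    """Stage two made-up artifacts, one with unexpected content, one absent."""
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp)
        (staging / "image.tar").write_bytes(b"container image layers")
        (staging / "chart.tgz").write_bytes(b"unexpected content")
        manifest = {
            "image.tar": hashlib.sha256(b"container image layers").hexdigest(),
            "chart.tgz": hashlib.sha256(b"helm chart").hexdigest(),
            "packages.deb": hashlib.sha256(b"os package").hexdigest(),
        }
        return verify_artifacts(staging, manifest)


if __name__ == "__main__":
    print(demo())
```

In this sketch only artifacts that pass the digest check would be promoted from staging into the mirror; anything in the returned list is held back for review.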

This provides a repeatable, documented path to operate MOSK entirely offline—meeting the needs of finance, government/defense, and other security-sensitive sectors where every artifact must be scanned and approved before entering the datacenter.

Full L3 networking on bare metal

MOSK 25.2 introduces full L3 networking on bare metal as a technical preview, addressing long-standing challenges with scaling multi-rack environments. Traditionally, operators have had to stretch VLANs across racks to deliver consistent network connectivity for Kubernetes services and control planes. This design often leads to fragile dependencies and limits how data centers can be structured as environments grow.

With this new approach, each rack can operate as an isolated L2 domain, while external IPs such as the Kubernetes API VIP and MetalLB service IPs are announced via BGP. By shifting service advertisement to L3, operators can eliminate cross-rack VLAN stretching, simplify network architecture, and align MOSK clusters with modern datacenter designs that favor routed rather than bridged topologies.
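
To see why routed announcements remove the need for stretched VLANs, consider a small sketch using Python's standard ipaddress module. The rack subnets and the external IPs are invented for illustration, and announcing each external IP as a host route (/32 for IPv4, /128 for IPv6) is an assumption about a typical BGP setup, not a description of the MOSK implementation.

```python
import ipaddress

# Hypothetical per-rack L2 subnets, for illustration only.
RACK_SUBNETS = {
    "rack-1": ipaddress.ip_network("10.0.1.0/24"),
    "rack-2": ipaddress.ip_network("10.0.2.0/24"),
    "rack-3": ipaddress.ip_network("10.0.3.0/24"),
}


def host_routes(external_ips: list[str]) -> list[str]:
    """Return one host route (/32 or /128) per external IP."""
    routes = []
    for ip in external_ips:
        addr = ipaddress.ip_address(ip)
        routes.append(str(ipaddress.ip_network(f"{addr}/{addr.max_prefixlen}")))
    return routes


def racks_on_link(ip: str) -> list[str]:
    """Return racks whose L2 subnet contains the IP.

    An empty list means no rack can reach the IP at L2, so it has to be
    reachable through a routed (L3) announcement instead.
    """
    addr = ipaddress.ip_address(ip)
    return [name for name, net in RACK_SUBNETS.items() if addr in net]


if __name__ == "__main__":
    api_vip = "10.0.100.10"  # hypothetical Kubernetes API VIP
    print(racks_on_link(api_vip), host_routes([api_vip]))
```

The VIP in this example belongs to no rack subnet, so no VLAN has to span racks to carry it; whichever node currently holds the address simply advertises the route to its upstream router.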

The change makes MOSK better suited for large-scale bare metal deployments where resiliency and manageability are key. By supporting L3 networking at the cluster level, MOSK enables operators to scale out across racks without complex VLAN coordination, reducing the risk of misconfiguration and improving long-term operability in demanding environments.

Advanced Bare Metal as a Service

Training large language models consumes so much GPU, CPU, and storage bandwidth that it often makes more sense to allocate entire bare-metal servers instead of virtualized instances. At the same time, operators still need the flexibility of OpenStack to run VMs and bare-metal machines side by side in a single project and wire them into complex network topologies. MOSK 25.2 introduces features that make the Bare Metal service (OpenStack Ironic) both technically robust and easier to operate.

Remote serial console for bare metal machines. Exposes the serial console of bare-metal nodes through the OpenStack CLI, allowing direct console access when guest networking fails. This makes it possible to debug and recover GPU servers without reinstalling, which is critical in long-running AI workloads.

Port trunking for bare metal machines in clusters with OVN networking. Lets bare-metal machines attach to multiple tenant networks via a single LACP bond with VLAN tagging. This allows management, storage, and training traffic to be isolated yet delivered over high-bandwidth links, providing the network flexibility that AI/LLM training clusters require.

Network infrastructure monitoring

MOSK 25.2 expands observability with network infrastructure monitoring integrated into Mirantis StackLight. At its core is a lightweight “net checker” that continuously verifies connectivity across critical network paths, feeding metrics, alerts, and dashboards into the monitoring stack. Operators can enable or disable the feature through MOSK management, ensuring flexibility in how it is applied across different environments.

This capability gives operators visibility into issues that traditional host- or service-level metrics cannot catch. By actively probing connectivity, it helps detect misconfigurations, routing errors, or switch-level problems before they cascade into outages. When alerts fire, detailed metrics are already available in StackLight dashboards, reducing mean time to diagnose and resolve.
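
Active probing of this kind can be sketched in a few lines. The example below assumes plain TCP connection attempts and renders the results in the Prometheus text exposition format; the metric names, the `ProbeResult` type, and the probe targets are hypothetical and do not describe how the StackLight net checker is built.

```python
import socket
import time
from dataclasses import dataclass


@dataclass
class ProbeResult:
    target: str
    reachable: bool
    latency_seconds: float


def probe_tcp(host: str, port: int, timeout: float = 1.0) -> ProbeResult:
    """Attempt one TCP connection; record success and elapsed time."""
    started = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        reachable = False
    return ProbeResult(f"{host}:{port}", reachable, time.monotonic() - started)


def to_metrics(results: list[ProbeResult]) -> str:
    """Render results as Prometheus text lines (metric names are made up)."""
    lines = []
    for r in results:
        lines.append(f'net_probe_success{{target="{r.target}"}} {int(r.reachable)}')
        lines.append(
            f'net_probe_duration_seconds{{target="{r.target}"}} {r.latency_seconds:.6f}'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical target; replace with an address on a path worth watching.
    print(to_metrics([probe_tcp("127.0.0.1", 22)]))
```

Because each probe exercises the real network path, a success value that drops to 0 points to a network fault even while host-level and service-level metrics on both endpoints still look healthy.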

The feature is especially valuable in environments where network changes are frequent or externally managed. Past incidents—such as firmware upgrades on top-of-rack switches causing Ceph service interruptions, or recurring misconfigurations in large-scale deployments—highlight how fragile network dependencies can be. With MOSK 25.2, operators gain an early warning system that surfaces these failures quickly, helping them prevent or minimize downtime.

Operational excellence

MOSK 25.2 continues to raise the bar for cloud operations with improvements that simplify day-to-day tasks and reduce risk. Updates are now managed exclusively through the ClusterUpdatePlan API and the new cluster update interface in the MOSK management console, providing a consistent, guided workflow while retiring the older, monolithic update path.

Cluster stability during maintenance is further improved with graceful shutdown of Kubernetes nodes enabled by default. Workloads and system services are drained cleanly before nodes go offline, lowering the chance of inconsistency or disruption.

Bare-metal operations are more reliable thanks to a reworked node deployment workflow: during MOSK cluster initial deployment or when adding new servers to an existing cluster, operators can inspect and validate actual hardware before software installation proceeds, reducing failed attempts and rework.

Operators can now use two new fields in MiraCeph, the Ceph life cycle management API, to describe which disks should be used across many servers rather than listing them node by node: deviceFilter and devicePathFilter. This streamlines cluster growth and hardware swaps, reduces errors in large environments, and makes day-2 operations more predictable.
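
The benefit of a filter over a per-node list is easy to show with a short sketch. It assumes the filter is a regular expression matched against device names, as the similarly named Rook settings are; the inventory, the `select_devices` helper, and the exact matching semantics are illustrative assumptions, so check the MiraCeph documentation before relying on any particular pattern.

```python
import re

# Hypothetical inventory: node name -> block devices found on that node.
INVENTORY = {
    "storage-01": ["sda", "sdb", "sdc", "nvme0n1"],
    "storage-02": ["sda", "sdb", "nvme0n1", "nvme1n1"],
}


def select_devices(
    inventory: dict[str, list[str]], device_filter: str
) -> dict[str, list[str]]:
    """Apply one regular expression to every node's device names."""
    pattern = re.compile(device_filter)
    return {
        node: [dev for dev in devices if pattern.search(dev)]
        for node, devices in inventory.items()
    }


if __name__ == "__main__":
    # One expression covers every node, skipping the sda system disk.
    print(select_devices(INVENTORY, r"^sd[b-z]$"))
    print(select_devices(INVENTORY, r"^nvme"))
```

When a server with extra disks joins the inventory, the same expression picks them up without any change to the specification, which is the property that keeps large clusters manageable.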

Finally, the BareMetalHostInventory API is fully represented in the MOSK management console, superseding BareMetalHost, which now serves as an internal interface. This aligns the operator experience with supported public APIs going forward. 

Security and safety

MOSK 25.2 strengthens resilience and compliance with a set of focused enhancements. Operators can now configure OpenSDN database backups to S3 endpoints, ensuring SDN state is stored securely off-cloud in the same way as OpenStack DB backups. This makes it easier to align with enterprise backup policies and meet regulatory requirements.

Routine lifecycle tasks are also simpler: LBaaS (OpenStack Octavia) certificate rotation is now automated to reduce manual effort and minimize the chance of misconfiguration or service disruption.

Finally, the results of the CIS benchmark for the Ubuntu host OS are published in the MOSK Security Guide, giving operators and auditors a transparent view of compliance out of the box and helping plan remediation where needed.

Monitoring and alerting

MOSK 25.2 expands observability so operators can detect issues earlier and respond with confidence. A new OpenStack Horizon availability probe simulates real user actions against the OpenStack dashboard and raises alerts when authentication or navigation fails, so problems surface before they impact users.

Several new alerts improve resilience in production. MOSK now warns when /var/lib/nova is remounted read-only, giving teams a chance to evacuate workloads before failures cascade. It also tracks AMD EPYC server uptime against erratum 1474, prompting action well before stability issues occur.
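
As an illustration of what a read-only remount looks like from the host, the sketch below scans text in the /proc/mounts layout for watched mount points that carry the `ro` option. It assumes /var/lib/nova is its own mount point (otherwise the containing filesystem would be the one to watch), and the function name and sample data are hypothetical; MOSK's actual alert rule is not shown in this post.

```python
def read_only_mounts(mounts_text: str, watched: set[str]) -> list[str]:
    """Return watched mount points whose mount options include `ro`.

    `mounts_text` follows the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        # Exact match on "ro" so that "errors=remount-ro" is not a false hit.
        if mount_point in watched and "ro" in options:
            found.append(mount_point)
    return found


# Sample data only: the second filesystem has been remounted read-only.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /var/lib/nova ext4 ro,relatime,errors=remount-ro 0 0
"""

if __name__ == "__main__":
    print(read_only_mounts(SAMPLE, {"/var/lib/nova"}))
```

On a live host the same function could be fed the contents of /proc/mounts; a non-empty result is the condition that warrants evacuating workloads.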

Monitoring coverage itself has grown: MOSK’s central IAM system, Keycloak, is now directly monitored with blackbox probes, dashboards, and alerts, ensuring authentication services are visible end-to-end. Node Exporter has been hardened with fewer false positives and new configuration options, making host telemetry more adaptable to large-scale environments.

Finally, MOSK simplifies the stack by retiring legacy components. The RabbitMQ exporter has been fully replaced by the native Prometheus plugin, Ceph alerts have refined severities and troubleshooting steps, and obsolete telegraf-openstack metrics have been removed from dashboards. These changes streamline monitoring while keeping focus on actionable signals.

Conclusion

With MOSK 25.2, Mirantis continues its mission to make private clouds powerful, secure, and future-proof. The update delivers improvements across the stack: networking evolves with L3 bare-metal support and updated SDN components, storage moves forward with Ceph Squid, operations become smoother with new update flows and air-gapped capabilities, and observability expands with deeper monitoring and alerting. Together, these changes help operators simplify complexity and support demanding workloads—from enterprise applications to AI training clusters—on a platform they can trust.

To learn more about MOSK 25.2, please see the release notes.

Artem Andreev

Artem Andreev is a Senior Engineering Manager at Mirantis
