Kubernetes on Bare Metal or VMs?

Preface

Lately, I’ve been asked the same question by an international bank in London, a trading firm in Switzerland, and a startup in Finland: should Kubernetes live on bare metal or inside virtual machines? Each is finding its own way through the jungle of infrastructure modernization, and all have ended up asking versions of the same thing.

After repeating these conversations enough times to start hearing echoes, I figured it might be useful to share my perspective more widely. What follows is a mix of real-world experience, a few hard-earned lessons, and a desire to keep things grounded.

Kubernetes on Bare Metal or VMs? It Depends.

Would you install Kubernetes on VMs or on bare metal?
That’s a question I’m hearing more lately, and honestly, it’s about time.

Because while Kubernetes used to be the new kid on the virtualization block, it’s grown into a central piece of infrastructure, and people are finally starting to ask how and where it should really live.

As always, the answer is: “It depends.”

Let’s take a step back.

While I appreciate Kubernetes for what it offers, I won’t pretend it’s always my favourite tool. It often complicates things that used to be simple, and troubleshooting certain Kubernetes applications is enough to make anyone nostalgic for top, netstat and strace/truss.

But we can’t ignore reality: the industry now treats Kubernetes like an operating system for distributed applications. Helm charts and containers have become the default packaging and delivery format. So if you’re building or modernising infrastructure, you have to take this shift into account.

From VMs to Containers (and Back Again)

Historically, most workloads have run as virtual machines under VMware, Proxmox, OpenStack, or the cloud flavour of the month. Kubernetes clusters were often deployed within VMs because VMs were the common currency of infrastructure.

However, we are now seeing a role reversal: Kubernetes is hosting more workloads directly, even virtual machines, via projects like KubeVirt. In many orgs, containers outnumber VMs, and platforms like Red Hat OpenShift and SUSE Harvester reflect this shift, blurring the line between VM and container platforms.

Vendors are increasingly aligned on pushing Kubernetes onto bare metal, positioning it as the next-generation abstraction layer, not just for containers, but as a full replacement for traditional virtualization platforms. Based on my observations, this shift is largely vendor-driven, rather than pulled by customer demand, and reflects a broader effort to consolidate infrastructure stacks around Kubernetes as a universal control plane.

However, while solutions like KubeVirt are relatively mature, the surrounding ecosystem, particularly container networking, hasn’t fully caught up. Projects like OVN-Kubernetes are still evolving, and many deployments rely on Multus with basic VLAN or bridge setups. These are functional but fall short of what users expect from platforms like VMware or even OpenStack Neutron. That disconnect risks repeating the same mismatch that once existed between OpenStack and VMware, especially as many organizations are still operating with a “VM-first” mindset.
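To make “basic VLAN or bridge setups” concrete, here is roughly what that looks like today: a Multus NetworkAttachmentDefinition backed by the bridge CNI plugin. Treat it as a sketch; the bridge name, VLAN ID and subnet below are illustrative, not a recommendation.

```yaml
# A Multus NetworkAttachmentDefinition backed by the bridge CNI plugin.
# Bridge name, VLAN ID and subnet are placeholders for illustration only.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: vlan100-bridge
  namespace: default
spec:
  config: |
    {
      "cniVersion": "0.4.0",
      "type": "bridge",
      "bridge": "br-vlan100",
      "vlan": 100,
      "ipam": {
        "type": "host-local",
        "subnet": "192.168.100.0/24"
      }
    }
```

A pod opts into that secondary network with the k8s.v1.cni.cncf.io/networks: vlan100-bridge annotation. It works, but compare it with what a VMware or Neutron admin takes for granted (security groups, floating IPs, self-service networks) and the gap becomes obvious.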

So… Bare Metal or VMs?

Like I said, it depends. And it depends not just on the tech, but on organisational reality.

Kubernetes still lacks true multi-tenancy, although that’s evolving. It offers some multi-tenancy primitives, like namespaces and RBAC, but no strong, first-class tenant isolation, which is why many enterprises prefer separate clusters per team or business unit, or split clusters for security and compliance reasons.
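For context, those primitives look something like the sketch below: a namespace per team plus a RoleBinding that scopes an identity-provider group to it (team and group names are made up). It limits what the API lets each team touch, but it doesn’t isolate the kernel, the nodes, or cluster-wide resources, which is exactly why separate clusters remain popular.

```yaml
# Namespace-per-team plus RBAC: scoping at the API level, not hard isolation.
apiVersion: v1
kind: Namespace
metadata:
  name: team-payments
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-payments-edit
  namespace: team-payments
subjects:
  - kind: Group
    name: team-payments-devs          # hypothetical group from your IdP
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                          # built-in "edit" role, bound per namespace
  apiGroup: rbac.authorization.k8s.io
```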

In large enterprises, such as those in finance or telecommunications, every department typically wants its own cluster. Running dozens of clusters on bare metal can work, but only if you have full automation in place.

Another factor that many overlook is the political layer, and I don’t mean bureaucracy. I mean realities such as long-term storage vendor contracts, licensing terms, or legacy network setups that the new platform will have to coexist with. These agreements often dictate whether you can move fast with bare metal or need to build on top of existing virtualised infrastructure to reuse what’s already in place.

At one large telco where I worked, we opted to run Kubernetes on top of OpenStack. Each business unit had its own OpenStack tenant and dedicated Kubernetes clusters, which were deployed automatically using SUSE Rancher. It was a clean separation that matched the organisational structure and supported real production use cases, such as the containerised 5G control plane.

Yes, that means the mobile calls you make in the UK might be routed through a Kubernetes cluster running on top of OpenStack. It’s that real.

When Bare Metal is the Right Answer

Bare metal isn’t glamorous, and for many it isn’t sexy (though it is for me!). Yet when workloads push the limits of performance, GPU access, or latency, it proves to be the solid choice. And let’s be honest: some workloads, GPUs and virtual machines especially, simply demand bare metal; there’s no way around it.

For instance, you can technically run Kubernetes in a VM and rely on nested virtualisation or GPU passthrough. But in real-world production? It’s a performance and management headache.

I’m currently advising on a European research cluster that plans to deploy bare-metal Kubernetes with GPU nodes, precisely for these reasons.
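For reference, asking Kubernetes for a GPU is the easy part: assuming the NVIDIA device plugin (or GPU Operator) is running on the node, a pod just requests the extended resource, as in this sketch (the image tag is illustrative). The hard part is everything underneath, drivers, firmware, PCIe and NUMA topology, and that is precisely where bare metal earns its keep.

```yaml
# A pod requesting one GPU through the device plugin's extended resource.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04   # illustrative tag
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1    # only schedulable on nodes advertising GPUs
```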

One thing to keep in mind when running Kubernetes on bare metal is that more direct access to resources comes with more responsibility. As pod density increases (say, beyond 250–300 pods per node), you may need to tune system parameters like conntrack limits, inotify watchers, and the kube-proxy mode. These aren’t usually problems in managed or VM-based clusters because the underlying platform shields you, but in bare-metal setups, tuning makes the difference between high performance and strange, hard-to-debug edge cases.
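To give a feel for the kind of tuning I mean (treat the numbers as starting points, not recommendations): conntrack sizing and the proxy mode live in the KubeProxyConfiguration, while the inotify limits are plain sysctls on the node.

```yaml
# kube-proxy (KubeProxyConfiguration): IPVS mode and a larger conntrack table.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
conntrack:
  maxPerCore: 65536    # double the 32768 default; dense nodes burn through entries
  min: 524288          # floor for nf_conntrack_max regardless of core count

# And on the node itself, e.g. /etc/sysctl.d/99-kubernetes.conf:
#   fs.inotify.max_user_watches = 1048576
#   fs.inotify.max_user_instances = 8192
```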

Why VMs Still Have a Role

While I’ve already expressed my unconditional love for bare metal, virtual machines still have their charm, particularly as Kubernetes matures but still lacks true multi-tenancy. They can offer practical isolation, especially when multiple internal teams want their own clusters on shared physical infrastructure.

This is a less-discussed topic in many Kubernetes architecture conversations. Organizational fragmentation, where different departments or business units want their own isolated cluster, often makes a VM-based approach more practical. Rather than spinning up multiple bare metal Kubernetes clusters, which can become operationally burdensome, deploying Kubernetes inside VMs allows teams to carve out dedicated environments on a shared substrate.

Of course, isolation and failure domains must be thoughtfully planned, for example, using anti-affinity rules to ensure Kubernetes worker VMs don’t end up concentrated on the same hypervisor. Still, VMs tend to offer better manageability for general-purpose workloads compared to bare metal.
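On OpenStack, for instance, that spreading is handled below Kubernetes entirely, with an anti-affinity server group at the Nova scheduler level. A hedged Heat fragment is below; the names, image and flavour are illustrative, and “soft-anti-affinity” is often the pragmatic choice when you have fewer hypervisors than workers.

```yaml
# Heat fragment: spread Kubernetes worker VMs across hypervisors.
heat_template_version: 2018-08-31
resources:
  k8s_worker_group:
    type: OS::Nova::ServerGroup
    properties:
      name: k8s-workers
      policies: ["anti-affinity"]   # or "soft-anti-affinity" with few hosts

  k8s_worker_0:
    type: OS::Nova::Server
    properties:
      name: k8s-worker-0
      image: ubuntu-22.04           # illustrative image, flavor and network
      flavor: m1.xlarge
      networks:
        - network: k8s-mgmt
      scheduler_hints:
        group: { get_resource: k8s_worker_group }
```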

Another significant advantage of using VMs is the ability to hard contain Kubernetes workers. On bare metal, Kubernetes enforces resource limits through cgroups and quotas, but these are ultimately soft constraints. A pod that misbehaves can still impact the host under certain conditions. With VMs, you can cap CPU and memory at the hypervisor level, using hardware-enforced isolation to create strict, predictable boundaries. This can be critical in shared environments or multi-tenant setups where noisy neighbors and workload sprawl must be tightly controlled.
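For clarity, this is what the cgroup side looks like: per-container requests and limits, as sketched below with an illustrative image. The limits cap CPU and memory for that container, but anything they don’t cover (kernel memory pressure, conntrack entries, disk and network I/O without extra machinery) can still spill over to neighbours on the same host, which is the gap a hypervisor boundary closes.

```yaml
# Per-container limits: enforced by cgroups on the node, not by a hypervisor.
apiVersion: v1
kind: Pod
metadata:
  name: bounded-workload
spec:
  containers:
    - name: app
      image: nginx:1.27        # illustrative image
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "1"             # CPU throttled above this (CFS quota)
          memory: "1Gi"        # container is OOM-killed above this
```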

And this doesn’t necessarily require a full-blown cloud platform like OpenStack. Virtual machines can be provisioned on top of a simpler virtualization stack such as plain KVM (e.g. libvirt/virsh), orchestrated with tools like MAAS, Ansible, or others, depending on your environment. Platforms like Proxmox are gaining momentum as lightweight alternatives, especially as organizations look beyond VMware. While Proxmox lacks native multi-tenancy, it’s a solid option for teams looking to automate virtual infrastructure without the full complexity of OpenStack. That said, cloud platforms do simplify many aspects of lifecycle and tenant orchestration, particularly in environments that rely on Terraform/OpenTofu or other declarative “Infrastructure as Code” tools.
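As a sketch of that simpler stack, assuming Ansible with the community.libvirt collection (the host group, template and VM name are all hypothetical), defining and starting a worker VM on plain KVM can be as small as this:

```yaml
# Minimal Ansible play: define and start a Kubernetes worker VM on a KVM host.
- hosts: kvm_hosts                    # hypothetical inventory group
  become: true
  tasks:
    - name: Define the worker VM from a libvirt domain XML template
      community.libvirt.virt:
        command: define
        xml: "{{ lookup('template', 'k8s-worker.xml.j2') }}"

    - name: Ensure the worker VM is running
      community.libvirt.virt:
        name: k8s-worker-01
        state: running
```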

It’s worth noting that most hyperscalers and cloud providers offer Kubernetes on top of virtual machines, but not necessarily because it’s technically superior. Just like enterprises managing multiple internal teams, providers choose this model because it’s more manageable: VMs are easier to schedule, isolate, migrate, and bill for. That’s why bare metal is typically reserved for niche cases like GPUs, SR-IOV, or direct hardware access.

But that doesn’t mean Kubernetes-on-VMs is always the right choice, just that it’s the more comfortable one for providers. For those managing infrastructure directly, there’s more room to choose the right tool based on workload needs, team skills, and operational context.

Don’t Forget Storage & Networking

Everyone obsesses over CPU and memory, but storage (CSI) and networking (CNI) are the unsung heroes. Things can get complicated in that space, especially when mixing bare metal and VMs.

You can also mix and match: some Kubernetes nodes on bare metal (for GPU workloads), others virtualised, using node pools and labels to direct workloads. It’s doable, but it takes planning and discipline. We were prototyping this approach using MAAS and Ansible for bare-metal provisioning.
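In practice, that mixing is mostly a matter of labels and taints. A sketch follows; the label and taint keys are our own conventions, not anything Kubernetes defines.

```yaml
# Bare-metal GPU nodes get a label and a taint at provisioning time, e.g.:
#   kubectl label node metal-01 nodepool=baremetal-gpu
#   kubectl taint node metal-01 dedicated=gpu:NoSchedule
# Workloads that belong there opt in; everything else lands on the VM pool.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training-job
spec:
  nodeSelector:
    nodepool: baremetal-gpu
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"
  containers:
    - name: train
      image: busybox:1.36            # placeholder workload
      command: ["sleep", "3600"]
```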

Tech is “Easy”, but People Are Harder

Let me spill a secret: infrastructure itself is rarely the villain. In my experience, what breaks projects most often is that people don’t understand the platform they’re deploying on.

I’ve seen countless teams try to treat containers like VMs: asking for ReadWriteMany volumes because they want to share storage between multiple containers, even when the backing storage is OpenStack Cinder, which only supports ReadWriteOnce (because it’s block storage). That’s not cloud-native, that’s legacy architecture wrapped in a container.
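The difference is a single line in the PersistentVolumeClaim, but it implies a completely different storage backend: block storage such as Cinder (via its CSI driver) can satisfy the first claim below, while the second needs a shared filesystem like NFS or CephFS behind it. The StorageClass names here are illustrative.

```yaml
# ReadWriteOnce: one node mounts the volume; block storage like Cinder fits.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-rwo
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: cinder-csi       # illustrative class names
  resources:
    requests:
      storage: 10Gi
---
# ReadWriteMany: many nodes mount it at once; needs a shared filesystem
# (NFS, CephFS, ...) behind the StorageClass, not a block device.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-uploads-rwx
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: cephfs
  resources:
    requests:
      storage: 50Gi
```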

I once had to explain why a custom third-party PHP app that used shared directories across containers couldn’t be deployed as-is. The underlying architecture was blamed for not being able to accommodate everything, but the real issue was an architectural and cultural mismatch.

Which is why success doesn’t come just from infrastructure working. It comes from training, setting expectations, and defining boundaries. Developers (both internal and external) must understand the context into which they’re deploying.

No Buzzfeed Quiz Here

If you came looking for a quiz like, “Should you deploy Kubernetes on bare metal or VMs? Take this test to find out what kind of cluster you are!”… well, I’m afraid you’re out of luck. I’m not that kind of consultant.

Every decision I make is about balancing reality: the tech stack, the team’s skills, procurement constraints, workload profiles, and the organizational shape of your business. There’s no algorithm, no magic checklist, no cookie-cutter answer. Just real-world trade-offs and choices that make sense for your situation.

After decades of wrangling infrastructure, from telco 5G cores to banking clusters, I can tell you this: there’s no one-size-fits-all. But maybe reading about the choices, the pitfalls, and the occasional absurdity of enterprise life will help you frame your own thinking… and perhaps avoid a few headaches along the way.

2025-08-29