Proxmox Clustering: Everything You Need to Know
A comprehensive overview of Proxmox VE clustering covering benefits, requirements, corosync internals, the pmxcfs distributed filesystem, and when you should avoid clustering.
What Is a Proxmox Cluster?
A Proxmox VE cluster is a group of physical servers (nodes) that are joined together and managed as a single entity. Under the hood, Proxmox uses corosync for cluster communication and pmxcfs (Proxmox Cluster File System) to keep configuration files synchronized across every node. Once clustered, you can manage all nodes from any single web interface, migrate VMs between hosts, and enable high availability for critical workloads.
Key Benefits of Clustering
- Centralized management – Access all nodes, VMs, and containers from one web UI or API endpoint. Tools like ProxmoxR make this even easier by letting you monitor your entire cluster from your phone.
- Live migration – Move running VMs between nodes with near-zero downtime, using shared storage or, with local disks, by migrating the storage along with the VM.
- High availability (HA) – If a node fails, the HA manager automatically restarts VMs on surviving nodes.
- Shared configuration – User accounts, storage definitions, firewall rules, and ACLs are replicated to all nodes automatically.
- Distributed storage – Deploy Ceph across cluster nodes for redundant, high-performance storage without external hardware.
Cluster Requirements
Before creating a cluster, verify your environment meets these prerequisites:
- Odd number of nodes (3+) – Quorum requires a majority vote. With two nodes, losing one means losing quorum (a QDevice can work around this, but three or more nodes is the ideal).
- Same Proxmox VE major version – All nodes must run the same release (e.g., all PVE 8.x).
- Low-latency network – Corosync needs round-trip latency below 2 ms. Nodes in different data centers connected over a WAN are not supported.
- Dedicated cluster network – Separate corosync traffic from VM and storage traffic to avoid contention.
- Static IPs and working DNS/hosts resolution between all nodes.
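The majority rule behind the odd-node recommendation is simple integer math. A quick sketch in plain POSIX shell (the node counts are just examples) shows why two nodes cannot tolerate a failure but three can:

```shell
# Quorum needs a strict majority of votes; each node normally casts one.
for nodes in 2 3 5; do
    majority=$(( nodes / 2 + 1 ))
    after_failure=$(( nodes - 1 ))
    if [ "$after_failure" -ge "$majority" ]; then
        survives="yes"
    else
        survives="no"
    fi
    echo "$nodes nodes: majority=$majority, survives one failure: $survives"
done
# Output:
# 2 nodes: majority=2, survives one failure: no
# 3 nodes: majority=2, survives one failure: yes
# 5 nodes: majority=3, survives one failure: yes
```

Note that going from two nodes to three raises the majority by zero but adds a spare vote, which is exactly what makes the cluster survive a single failure.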
How Corosync Works
Corosync is the messaging layer that keeps cluster nodes in sync. It uses the Totem Single-Ring Ordering and Membership Protocol to guarantee that all nodes receive messages in the same order. Every node sends periodic heartbeats; if a node misses enough heartbeats, it is declared offline and removed from the quorum.
You can inspect the current corosync configuration on any node:
# View the corosync config
cat /etc/corosync/corosync.conf
# Check corosync link (ring) status
corosync-cfgtool -s
# See the quorum status
pvecm status
The output of pvecm status shows the quorum state, the expected votes, and the total votes currently available. If total votes drop below the majority, the cluster loses quorum and will refuse to start or migrate VMs (a safety measure to prevent split-brain scenarios).
The pmxcfs Distributed Filesystem
Proxmox mounts a small FUSE filesystem at /etc/pve. This is pmxcfs, and it stores all cluster-wide configuration: VM configs, user databases, firewall rules, storage definitions, datacenter settings, and more. Changes made on any node are replicated to all other nodes via corosync within milliseconds.
# List cluster-wide config files
ls /etc/pve/nodes/
# View a VM's config (stored in pmxcfs)
cat /etc/pve/nodes/pve1/qemu-server/100.conf
# Check the pmxcfs status (pmxcfs runs as the pve-cluster service)
systemctl status pve-cluster
Because pmxcfs relies on corosync, it becomes read-only when quorum is lost. This prevents conflicting changes from being written when the cluster is in a degraded state.
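This read-only behavior is something you can probe directly: a write to /etc/pve fails the moment quorum is lost. A minimal sketch (the function name and probe filename are illustrative, not part of Proxmox):

```shell
# Returns success if the given directory accepts writes.
# On a cluster node you would call it as: pmxcfs_writable /etc/pve
pmxcfs_writable() {
    dir=$1
    if touch "$dir/.rw-probe" 2>/dev/null; then
        rm -f "$dir/.rw-probe"
        echo "read-write (quorate)"
    else
        echo "read-only (quorum lost)"
        return 1
    fi
}
```

A non-zero exit here is a strong hint that the node has dropped out of the quorate partition.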
Creating and Joining a Cluster
Setting up a cluster takes just a few commands. On the first node, create the cluster:
# On the first node, create the cluster
pvecm create my-cluster
# On each additional node, join using the first node's IP
pvecm add 192.168.1.10
After joining, verify that all nodes are visible:
# List all cluster nodes
pvecm nodes
# Expected output:
# Membership information
# ----------------------
#     Nodeid      Votes Name
#          1          1 pve1 (local)
#          2          1 pve2
#          3          1 pve3
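With the cluster formed and quorate, the high availability described earlier can be enabled per guest via ha-manager. A sketch (VM ID 100 is an example; the commands assume a quorate cluster):

```shell
# Put VM 100 under HA management and request that it be kept running
ha-manager add vm:100 --state started
# Review the HA resource and service states
ha-manager status
```

Once registered, the HA manager will restart the VM on a surviving node if its current host fails.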
When NOT to Cluster
Clustering is not always the right choice. Avoid creating a cluster if:
- Your nodes are geographically separated with more than 2 ms latency. Corosync is not designed for WAN links, and timeouts will cause constant node fencing.
- You only have one server. A single-node “cluster” adds complexity with no benefit.
- You have exactly two nodes and no quorum device. Losing one node means losing quorum, which locks the remaining node. Use a QDevice (corosync-qdevice) or accept manual intervention.
- The nodes serve completely independent purposes. If you never need to migrate VMs between them or share configuration, managing them separately is simpler.
- Your network is unreliable. Frequent packet loss or jitter will trigger false fencing events, causing VMs to restart unexpectedly.
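For the two-node case above, a QDevice adds a tie-breaking third vote from a small external host that does not run Proxmox itself. A setup sketch (192.168.1.50 is a hypothetical IP for the external host):

```shell
# On the external host: run the qnetd daemon
apt install corosync-qnetd
# On every cluster node: install the qdevice client
apt install corosync-qdevice
# From any one cluster node: register the QDevice
pvecm qdevice setup 192.168.1.50
# Verify that the extra vote appears
pvecm status
```

The external host can be as small as a Raspberry Pi or a VM elsewhere, since it only casts a vote and carries no workloads.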
Monitoring Your Cluster
Once your cluster is running, regular monitoring prevents surprises. Check quorum health, corosync ring status, and node availability:
# Quick cluster health check
pvecm status
pvecm expected 1 # DANGER: only use in emergency to force quorum
# Monitor corosync rings
corosync-cfgtool -s
# Watch cluster logs
journalctl -u corosync -f
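These checks are easy to turn into a cron-able script. A small helper, assuming `pvecm status` prints a `Quorate: Yes` line when healthy (the function name is illustrative):

```shell
# Reads `pvecm status` output on stdin; exits non-zero when not quorate.
# Typical use on a node:  pvecm status | check_quorate
check_quorate() {
    if grep -q 'Quorate:.*Yes'; then
        echo "cluster quorate"
    else
        echo "cluster NOT quorate" >&2
        return 1
    fi
}
```

Wiring the non-zero exit into your alerting of choice gives you an early warning before a degraded cluster becomes an outage.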
For on-the-go monitoring, mobile apps like ProxmoxR let you check node status, receive alerts, and even trigger migrations directly from your phone — which is invaluable when you are away from your desk and something goes wrong at 2 AM.
Summary
Proxmox clustering transforms a collection of independent servers into a resilient, centrally managed infrastructure. The combination of corosync for reliable messaging and pmxcfs for distributed configuration makes it straightforward to set up and maintain. Just make sure your network meets the latency and bandwidth requirements, use an odd number of nodes for proper quorum, and keep your nodes on the same Proxmox VE version. With those fundamentals in place, you will have a solid foundation for live migration, high availability, and scalable storage.
Take Proxmox management mobile
All the features discussed in this guide — accessible from your phone with ProxmoxR. Real-time monitoring, power control, firewall management, and more.