Proxmox Cluster Split-Brain: Causes, Prevention, and Recovery

Learn what causes split-brain in Proxmox clusters, how fencing and network redundancy prevent it, and step-by-step instructions for manual recovery.

What Is Split-Brain?

Split-brain occurs when nodes in a Proxmox VE cluster lose communication with each other but remain individually operational. Each partition believes the other nodes have failed and may attempt to take ownership of shared resources. This can lead to multiple nodes running the same VM simultaneously, causing data corruption on shared storage, conflicting configuration changes, and a cluster that is extremely difficult to recover.

Split-brain is one of the most dangerous failure modes in any clustered system. Understanding its causes and prevention strategies is essential for running a reliable Proxmox infrastructure.

What Causes Split-Brain

The most common causes are network-related:

  • Single network link failure between cluster nodes
  • Switch failure on the cluster communication network
  • Firewall misconfiguration blocking corosync ports
  • Network congestion causing corosync timeouts
  • Incorrect expected_votes override allowing both partitions to claim quorum

# Identify if your cluster is in split-brain
pvecm status

# If you see "Quorate: No" on some nodes but "Quorate: Yes" on others,
# you likely have a partition problem

# Check corosync membership on each node
corosync-cfgtool -s

# Look for nodes with "status: disconnected"

How Quorum Prevents Split-Brain

The quorum system is the primary defense against split-brain. When a cluster partitions, only the partition with more than half the votes (quorum) continues operating. The minority partition stops all operations and refuses to modify shared state.
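With the default of one vote per node, the quorum threshold is simple integer arithmetic. A tiny helper (the function name is illustrative) makes the majority counts concrete:

```shell
# Quorum = more than half of the total votes.
# With one vote per node: floor(n / 2) + 1
quorum() {
    echo $(( $1 / 2 + 1 ))
}

quorum 3   # -> 2: one node may fail
quorum 4   # -> 3: still only one node may fail, hence "use odd node counts"
quorum 5   # -> 3: two nodes may fail
```

Note that going from 3 to 4 nodes raises the threshold without raising the number of tolerable failures, which is why odd-sized clusters are recommended.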

# In a 3-node cluster, quorum requires 2 votes
# If network splits: 2 nodes vs 1 node
# - The 2-node partition has quorum -> continues operating
# - The 1-node partition loses quorum -> stops

# Check expected votes and current quorum
pvecm status | grep -E "Expected|Quorum|Quorate"

# Dangerous: never force expected_votes lower to "fix" a partition
# This defeats quorum protection and can cause split-brain
# BAD: pvecm expected 1  (on a 3-node cluster)

Fencing: The Last Line of Defense

Fencing (also called STONITH - Shoot The Other Node In The Head) is a mechanism that forcibly shuts down a node when it cannot be confirmed as healthy. This prevents a partitioned node from accessing shared storage.

# Proxmox HA manager relies on fencing for HA-managed VMs
# When a node drops out of the quorate partition, recovery works like this:
# 1. The lost node self-fences: its watchdog expires and the node reboots
# 2. The CRM on the quorate side waits for the lost node's lock to expire
# 3. Only then are the HA VMs restarted on surviving nodes

# Check HA manager status and fencing events
ha-manager status

# View fencing logs
journalctl -u pve-ha-lrm -n 50
journalctl -u pve-ha-crm -n 50

# Proxmox ships its own watchdog multiplexer (watchdog-mux) for self-fencing;
# it falls back to the softdog kernel module when no hardware watchdog is configured
# To use a hardware watchdog instead, set the module in /etc/default/pve-ha-manager:
# WATCHDOG_MODULE=iTCO_wdt   # example: Intel TCO watchdog

Network Redundancy to Prevent Partitions

The best way to prevent split-brain is to ensure cluster communication never fails. Use redundant network paths between nodes.

# Option 1: Network bonding (LACP)
# Configure bonded interfaces for the cluster network
# In /etc/network/interfaces:
auto bond0
iface bond0 inet static
    address 10.10.10.1/24
    bond-slaves ens18 ens19
    bond-mode 802.3ad
    bond-miimon 100

# Option 2: Multiple corosync links (knet supports this natively)
# In corosync.conf, define multiple links per node:
# node {
#     name: pve1
#     nodeid: 1
#     ring0_addr: 10.10.10.1
#     ring1_addr: 10.20.20.1
# }

# Verify both links are active
corosync-cfgtool -s
# Link 0: connected
# Link 1: connected

Using dual corosync links on separate switches or VLANs provides the most robust protection against network-induced split-brain.
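To control which link carries traffic, the totem section can assign link priorities. A sketch of such a corosync.conf fragment, with illustrative values; under knet's passive link mode, corosync should keep traffic on the higher-priority link and fail over only when it drops:

```
# In corosync.conf (knet is the default transport on Proxmox VE 6+)
totem {
    version: 2
    cluster_name: mycluster        # illustrative name
    transport: knet
    link_mode: passive             # one active link at a time, fail over on loss
    interface {
        linknumber: 0
        knet_link_priority: 10     # preferred link
    }
    interface {
        linknumber: 1
        knet_link_priority: 5      # backup link
    }
}
```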

Manual Split-Brain Recovery

If split-brain has already occurred, recovery requires careful manual intervention on the nodes themselves: identify which partition holds the authoritative state, stop the minority side, and rejoin it to the cluster.

# Step 1: Identify the authoritative partition
# Check which nodes have the most recent, correct configuration
# Compare timestamps on both sides:
ls -la /etc/pve/nodes/*/qemu-server/
ls -la /etc/pve/nodes/*/lxc/

# Step 2: Shut down all VMs on the minority partition
qm stop 100
qm stop 101

# Step 3: Stop cluster services on the minority partition
systemctl stop pve-ha-lrm
systemctl stop pve-ha-crm
systemctl stop pve-cluster
systemctl stop corosync

# Step 4: Verify the majority partition has quorum
pvecm status  # Run on majority side

# Step 5: Force the minority node to rejoin
# On the minority node, start pmxcfs in local mode so /etc/pve stays accessible:
pmxcfs -l

# Remove the stale corosync config, then stop the local pmxcfs instance
rm -f /etc/corosync/corosync.conf
killall pmxcfs

# Copy the config from the authoritative node
scp pve1:/etc/pve/corosync.conf /etc/corosync/corosync.conf

systemctl start corosync
systemctl start pve-cluster

Dealing with pmxcfs FUSE Issues

The Proxmox cluster filesystem (pmxcfs) uses FUSE to mount /etc/pve. During split-brain or after unclean recovery, pmxcfs can get stuck.

# Check if /etc/pve is mounted and responsive
ls /etc/pve/

# If it hangs, pmxcfs may be stuck
# Check pmxcfs status
systemctl status pve-cluster

# Force unmount and restart
fusermount -u /etc/pve
systemctl restart pve-cluster

# If that fails, a more aggressive approach
killall -9 pmxcfs
systemctl start pve-cluster

# Verify the filesystem is working again
ls /etc/pve/nodes/
cat /etc/pve/corosync.conf
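Because a hung FUSE mount blocks any process that touches it, checks against /etc/pve are best wrapped in a timeout. A minimal helper (the function name is hypothetical) that reports whether a path answers within a few seconds:

```shell
# Probe a path with a hard timeout so a stuck FUSE mount
# cannot hang the calling shell.
check_mount() {
    if timeout 3 ls "$1" > /dev/null 2>&1; then
        echo "responsive"
    else
        echo "stuck or missing"
    fi
}

check_mount /etc/pve
```

This is handy in monitoring scripts: a plain `ls /etc/pve` in a cron job can pile up hung processes when pmxcfs wedges, while the timeout variant fails fast.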

Prevention Checklist

Follow these practices to minimize split-brain risk in your Proxmox clusters:

  • Use odd numbers of nodes (3, 5, 7) for natural quorum majority
  • Configure redundant corosync links on separate network paths
  • Use a QDevice for two-node and even-numbered clusters
  • Never override expected_votes unless you fully understand the implications
  • Enable hardware watchdog for self-fencing capability
  • Monitor cluster health continuously and alert on quorum degradation
  • Test failover scenarios regularly in a maintenance window
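For the QDevice item above, a sketch of the setup, assuming an external vote host at the illustrative address 10.10.10.50: the qnetd daemon contributes a tie-breaking vote, so a two-node cluster keeps quorum when one node dies.

```shell
# On both cluster nodes:
apt install corosync-qdevice

# On the external vote host (any Debian machine outside the cluster):
apt install corosync-qnetd

# Back on one cluster node, register the QDevice:
pvecm qdevice setup 10.10.10.50

# Verify: pvecm status should now list a Qdevice vote
pvecm status
```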

Split-brain is preventable with proper network design and quorum configuration. Invest in redundant cluster networking upfront to avoid the painful recovery process later.
