

# Fixing Docker Bridge Network MTU Mismatches in Nested Virtualization: Resolving Packet Loss Between Proxmox VMs and Container Networks

## Why I Had to Fix This

I run Proxmox as my main virtualization platform, and inside one of my VMs, I host multiple Docker containers. For months, everything worked fine, until I started noticing strange packet loss patterns. Large file transfers would stall. API calls to certain containers would time out at random. Small requests worked perfectly, but anything over a certain size would just… fail.

The weird part? It only happened when communicating between containers on different VMs, or when traffic went from my physical network into a container. Local container-to-container communication on the same Docker host was flawless.

This wasn’t a DNS issue. It wasn’t a firewall rule. It was something lower in the network stack, and it took me longer than I’d like to admit to figure out it was an MTU mismatch.

## My Setup and Where the Problem Showed Up

Here’s what I was running:

  • Proxmox VE 8.x on bare metal
  • Multiple Ubuntu VMs (22.04 LTS) running Docker
  • Docker using the default bridge network (`docker0`)
  • VMs connected to a Proxmox bridge (`vmbr0`) with VirtIO network adapters
  • Physical network MTU set to 1500 (standard Ethernet)

The Docker bridge inside each VM was also set to 1500 by default. On paper, everything matched. But the reality was messier.

When I ran `ip link show` on the Proxmox host, `vmbr0` showed MTU 1500. Inside the VM, the main interface (`ens18`) also showed 1500. But when I checked the actual effective MTU for packets traveling through the nested layers, things didn’t line up.
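A quick way to compare all the layers at once is to read every interface’s MTU straight from sysfs. This is just a convenience sketch; it runs the same on the Proxmox host, inside the VM, or anywhere else with a Linux `/sys`:

```shell
# Print the MTU of every network interface on this machine, one per line.
# Handy for eyeballing vmbr0 on the host, ens18 in the VM, and the
# docker0/veth pairs side by side without repeating "ip link show".
for d in /sys/class/net/*; do
  printf '%-12s %s\n' "$(basename "$d")" "$(cat "$d/mtu")"
done
```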

The problem was this: Proxmox adds VLAN tags and some encapsulation overhead depending on your network setup. Even though the interfaces reported 1500, the actual usable payload size for packets traveling from a container, through Docker’s bridge, through the VM’s network stack, through Proxmox’s bridge, and out to the physical network was smaller.

Large packets were being fragmented or dropped silently. TCP would eventually retransmit, but UDP-based services and certain APIs just timed out.

## How I Confirmed It Was an MTU Issue

I used ping with the “Don’t Fragment” flag to test the actual path MTU:

```shell
ping -M do -s 1472 target-ip
```

This sends 1472 bytes of ICMP payload; add 20 bytes of IP header and 8 bytes of ICMP header and the packet totals exactly 1500 bytes, the standard Ethernet MTU. If this worked, I tried larger sizes:

```shell
ping -M do -s 1473 target-ip
```

On my setup, anything above 1472 bytes failed with “Frag needed” errors when going from a container to an external host. That told me the effective MTU was lower than 1500 somewhere in the path.
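Stepping the size down by hand gets tedious, so the probe can be scripted. This is a sketch, not something from my original debugging session: `target-ip` is the same placeholder as above, and the candidate sizes are illustrative (they correspond to plain 1500, a VLAN-tagged 1496, and a 1450 path):

```shell
# Probe the path MTU by trying payload sizes from largest to smallest
# with the Don't Fragment bit set; the first one that passes wins.
payload_to_mtu() {
  # ping -s sets only the ICMP payload; add 20 (IP) + 8 (ICMP) header bytes.
  echo $(( $1 + 28 ))
}

TARGET="${1:-target-ip}"   # placeholder, point at a host across the suspect path
for size in 1472 1468 1422; do
  if ping -M do -c 1 -W 1 -s "$size" "$TARGET" >/dev/null 2>&1; then
    echo "largest working payload: $size (path MTU ~$(payload_to_mtu "$size"))"
    break
  fi
done
```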

I also checked Docker’s bridge MTU directly:

```shell
ip link show docker0
```

It showed 1500. But the VM’s main interface and Proxmox’s bridge were also 1500, so the problem wasn’t obvious from interface settings alone.

## What I Changed to Fix It

I lowered the Docker bridge MTU to 1450. This gave enough headroom for the encapsulation overhead introduced by Proxmox’s virtualization layer and any VLAN tagging.

Here’s how I did it:

### Option 1: Temporary Fix (For Testing)

On the VM running Docker, I manually set the MTU:

```shell
sudo ip link set docker0 mtu 1450
```

This worked immediately but wouldn’t survive a reboot or Docker restart.

### Option 2: Permanent Fix (What I Actually Use)

I edited Docker’s daemon configuration at `/etc/docker/daemon.json`:

```json
{
  "mtu": 1450
}
```
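If your `daemon.json` already has other keys (log settings, registry mirrors, and so on), merge the `mtu` key in rather than overwriting the file. A sketch using `python3` since a JSON-aware tool like `jq` isn’t always installed; it works on a temp copy here, so point `CONF` at `/etc/docker/daemon.json` (with sudo) on a real host:

```shell
# Merge "mtu": 1450 into an existing daemon.json without clobbering
# whatever else is in it. The temp file and its contents are stand-ins
# for a real /etc/docker/daemon.json.
CONF=$(mktemp)
echo '{"log-driver": "json-file"}' > "$CONF"   # pretend this is the existing config
python3 - "$CONF" <<'EOF'
import json, sys

path = sys.argv[1]
with open(path) as f:
    cfg = json.load(f)
cfg["mtu"] = 1450  # the bridge MTU Docker hands to new containers
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
EOF
cat "$CONF"
```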

Then restarted Docker:

```shell
sudo systemctl restart docker
```

After this, all new containers inherited the 1450 MTU. Existing containers needed to be recreated (not just restarted) to pick up the new setting. I use Docker Compose for most services, so I just ran:

```shell
docker-compose down
docker-compose up -d
```

That forced container recreation with the new MTU.
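It’s worth verifying the recreated containers actually picked up the new value. Reading sysfs avoids needing `iproute2` inside minimal images; the container name below is a placeholder, and the snippet falls back to another interface so it runs anywhere:

```shell
# Confirm an interface's effective MTU straight from sysfs. Inside a
# container the interface is usually eth0, e.g.:
#   docker exec my-service cat /sys/class/net/eth0/mtu   # "my-service" is a placeholder
# The same check works on the host for docker0 after the daemon restart:
iface="${1:-docker0}"
[ -d "/sys/class/net/$iface" ] || iface=lo   # fall back so the snippet runs anywhere
echo "$iface mtu: $(cat "/sys/class/net/$iface/mtu")"
```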

## Why 1450 and Not Something Else

I didn’t calculate this scientifically. I started at 1450 because it’s a common safe value when you’re unsure about encapsulation overhead. It worked, so I stopped there.

Could I have used 1460 or 1480? Probably. But I wasn’t trying to squeeze every byte of efficiency out of the network. I just needed reliable packet delivery, and 1450 gave me that without any noticeable performance hit.

If you’re running jumbo frames (MTU 9000) on your physical network, the calculation changes. But my network is standard Ethernet, so 1450 was the right compromise.
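For reference, here’s the back-of-envelope arithmetic behind the common safe values. These header sizes are the textbook figures for each encapsulation, not measurements from my setup:

```shell
# MTU budget for a standard 1500-byte physical network, minus the
# overhead of common encapsulations. Generic header sizes, not measured.
PHYS_MTU=1500
VLAN_TAG=4   # one 802.1Q tag
PPPOE=8      # PPPoE, common on DSL uplinks
VXLAN=50     # outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14)
echo "physical:        $PHYS_MTU"
echo "minus VLAN tag:  $((PHYS_MTU - VLAN_TAG))"
echo "minus PPPoE:     $((PHYS_MTU - PPPOE))"
echo "minus VXLAN:     $((PHYS_MTU - VXLAN))"
```

1450 happens to line up exactly with the VXLAN case, which is part of why it shows up so often as a safe default for containers behind overlays and virtual bridges.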

## What Didn’t Work (Or What I Tried First)

Before landing on the MTU fix, I wasted time on:

  • Firewall rules: I thought maybe Proxmox’s firewall or the VM’s `iptables` rules were dropping packets. They weren’t. The rules were fine.
  • DNS issues: I checked `/etc/resolv.conf` and DNS resolution. Everything resolved correctly. This wasn’t a name resolution problem.
  • Docker network drivers: I briefly considered switching from bridge to macvlan, but that would have introduced more complexity without solving the root cause.
  • Proxmox bridge settings: I looked at the `vmbr0` configuration, but changing it would have affected all VMs, and the problem was specific to Docker containers.

The real clue came when I noticed the packet loss was size-dependent. Small requests always worked. That pattern screams MTU mismatch.

## How to Check If This Is Your Problem

If you’re seeing similar issues, here’s how to confirm:

  1. Run the ping test with the “Don’t Fragment” flag (`ping -M do`) from inside a container to an external host.
  2. Check if large HTTP requests fail but small ones succeed.
  3. Look for retransmissions in `netstat -s` or `ss -ti`.
  4. Use `tcpdump` on both the VM and the container to see if packets are being fragmented or dropped.

If you see fragmentation or “Frag needed” errors, you have an MTU problem.
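Step 3 doesn’t even require `netstat`: Linux exposes the same counters under `/proc`. A sketch that pulls the TCP retransmission counter; watching it climb steadily while a large transfer stalls is consistent with a path problem like an MTU mismatch:

```shell
# Read the TCP retransmission counter from /proc/net/snmp, the same
# source netstat -s uses. The file has a header line of field names
# followed by a line of values; map names to columns, then print.
awk '/^Tcp:/ {
       if (!hdr) { for (i = 1; i <= NF; i++) col[$i] = i; hdr = 1 }
       else      print "RetransSegs:", $col["RetransSegs"]
     }' /proc/net/snmp
```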

## Key Takeaways

MTU mismatches are subtle. Interfaces can all report the same MTU, but encapsulation overhead in nested virtualization (Proxmox + Docker) can make the effective MTU smaller.

Lowering Docker’s MTU to 1450 solved my packet loss issues without any performance trade-offs I could measure. It’s a safe default when you’re running containers inside VMs.

The fix is simple, but diagnosing it took time because the symptoms were inconsistent. Small packets worked fine, which made it easy to miss the pattern.

If you’re running Docker on Proxmox VMs and seeing random timeouts or stalled transfers, check your MTU before diving into firewall rules or DNS settings.
