Why I worked on this
I run a single-node Proxmox box at home that hosts about 40 Docker containers. Most are hobby stuff, but one stack is a Traefik reverse proxy that fronts a handful of public services. One night I noticed the logs were full of 502s and sporadic “no route to host” errors. Inside the containers everything looked fine: health checks passed, CPU was idle, memory usage was normal. The failures felt random: one request would work, the next would time out, then three more would succeed. After an hour of tailing logs I SSH’d to the host, ran dmesg, and saw the line:
nf_conntrack: table full, dropping packet
I hadn’t touched the firewall in months, so I didn’t expect nftables to be the culprit. But once I saw that message, the weird symptoms made sense: the kernel was silently discarding new connections before Traefik or the containers ever saw them.
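If you want to check for the same condition quickly, these two commands cover it (conntrack -S comes from the conntrack-tools package and is optional):
# the tell-tale kernel message
dmesg -T | grep -i nf_conntrack
# per-CPU conntrack statistics; the drop counter climbs while the table is full
conntrack -S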
My real setup
- Proxmox 8.1 on a 2018-era Xeon E-2174G, 64 GB RAM, 1 Gbps symmetric fibre.
- One VM (Debian 12, 5.15 kernel) that runs everything: Docker 24, Traefik 3.0, and about 40 micro-services.
- nftables ruleset I built myself (no firewalld, no ufw). It’s only 30 lines: allow established, SSH, HTTP(S), then drop the rest (a simplified sketch follows this list).
- Traefik listens on 80/443 and routes >1 000 requests/minute to 15 different containers. Some of those containers call each other through Traefik as well (I know, not ideal, but it keeps TLS simple).
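For context, the ruleset is shaped roughly like this. It’s a simplified sketch, not the literal file, but it matches the description above:
# /etc/nftables.conf (simplified sketch)
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;
    iif "lo" accept
    ct state established,related accept
    ct state invalid drop
    tcp dport { 22, 80, 443 } accept
  }
}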
What worked (and why)
1. Confirm the table was actually full
cat /proc/sys/net/netfilter/nf_conntrack_count
131060
cat /proc/sys/net/netfilter/nf_conntrack_max
131072
12 entries left, i.e. basically zero headroom. I also checked the breakdown:
awk '$6 == "TIME_WAIT" {tw++} $6 == "ESTABLISHED" {est++} END {print "TW:",tw,"EST:",est}' /proc/net/nf_conntrack
TW: 86542 EST: 28432
TIME_WAIT dominated. That told me connections were closing properly but the default 120 s timeout was keeping them in the table too long for this traffic pattern.
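If conntrack-tools is installed, the same breakdown is a bit quicker to eyeball; this one-liner is a sketch of the idea rather than what I actually ran:
# count entries per TCP state ($4 is the state column in `conntrack -L` output for tcp flows)
conntrack -L 2>/dev/null | awk '$1 == "tcp" {print $4}' | sort | uniq -c | sort -rn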
2. Bump the limit immediately
I doubled it first to survive the night:
sysctl -w net.netfilter.nf_conntrack_max=262144
Packet drops stopped within seconds. Errors in Traefik dropped to zero. That confirmed the diagnosis, but 262 k still felt tight for a box that might see 5–10 k new connections per minute during peak cron jobs.
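If you want to watch the headroom at a glance while picking a final number, any variation of this throwaway sketch does the job:
# watch current usage against the limit every 5 seconds
watch -n5 'echo "$(cat /proc/sys/net/netfilter/nf_conntrack_count) / $(cat /proc/sys/net/netfilter/nf_conntrack_max)"'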
3. Make the change persistent and add timeouts
I added a drop-in file /etc/sysctl.d/90-conntrack.conf:
net.netfilter.nf_conntrack_max = 524288
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 15
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 15
Then sysctl --system. I picked 512 k because memory is cheap on this box (each entry ≈ 320 bytes, so ~160 MB worst case) and I never want to think about it again.
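One caveat I’ll flag as an assumption rather than something I benchmarked on this box: raising nf_conntrack_max alone leaves the hash table (nf_conntrack_buckets) at its old size, so lookups walk longer chains. It’s worth a look; on kernels where the sysctl is read-only, the module parameter is the knob.
# check the current hash table size
sysctl net.netfilter.nf_conntrack_buckets
# if it's read-only on your kernel, resize via the module parameter instead
# (131072 here is just an example value, not a recommendation)
echo 131072 > /sys/module/nf_conntrack/parameters/hashsize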
4. Enable HTTP keep-alive inside the Docker network
Most of my services speak HTTP/1.1 but I’d never enabled keep-alive in their reverse-proxy configs. I edited the three busiest services to reuse connections:
# whoami service (nginx) snippet
upstream traefik {
    server traefik:80;
    keepalive 32;   # pool of idle connections kept open to the upstream
}
# keepalive only takes effect when proxied requests use HTTP/1.1 with the
# Connection header cleared, so the proxy_pass location also needs:
#   proxy_http_version 1.1;
#   proxy_set_header Connection "";
Connection churn dropped by ~40 %, which lowered the rate at which new entries were inserted into the table.
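If you want to put a number on the churn instead of guessing, watching new-flow events works; a rough sketch that assumes conntrack-tools and pv are installed:
# stream NEW conntrack events and print the rate (lines per second)
conntrack -E -e NEW | pv -l -r -i 10 > /dev/null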
5. Add a tiny Prometheus alert
I already run node-exporter in a container. I added a small alerting rule:
- alert: ConntrackTableUsage
  expr: node_nf_conntrack_entries / node_nf_conntrack_entries_limit > 0.8
It fired once during the next night’s backup job; headroom was still fine, but now I’ll know before users do.
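For completeness, this is how the rule sits in a Prometheus rule file; the group name, the for: window, and the labels are illustrative choices, nothing load-bearing:
groups:
  - name: conntrack
    rules:
      - alert: ConntrackTableUsage
        expr: node_nf_conntrack_entries / node_nf_conntrack_entries_limit > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "conntrack table above 80% on {{ $labels.instance }}"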
What didn’t work
- Shortening nf_conntrack_tcp_timeout_established below a day. I tried 12 h first; next morning a long-running websocket kept reconnecting every few minutes. Going back to 24 h keeps my Home Assistant socket happy.
- Switching to nftables “flow offload” objects. My Debian 12 VM runs a 5.15 kernel; the flowtable module is there, but I couldn’t get it to attach to my bridge interface without breaking Docker’s userland proxy. I gave up after two reboots; raising the limit was easier.
- Disabling conntrack entirely. I tried a notrack rule for the internal Docker bridge (roughly the rule sketched after this list). Traffic between containers stopped cold because Docker relies on NAT for the userland-proxy hair-pinning, and NAT can’t work without conntrack. I reverted in ten minutes.
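For reference, a notrack rule for a Docker bridge looks roughly like this; treat the bridge name as a placeholder, and don’t actually do this on a Docker host for the reason above:
# notrack has to live in raw-priority chains, which run before conntrack
table inet raw {
  chain prerouting {
    type filter hook prerouting priority raw; policy accept;
    iifname "docker0" notrack
  }
  chain output {
    type filter hook output priority raw; policy accept;
    oifname "docker0" notrack
  }
}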
Key takeaways
- Random 502/time-out errors that don’t appear in application logs often originate one layer below; check dmesg first.
- On a box that terminates thousands of short-lived HTTP calls, the default 128 k conntrack table is too small. A single evening of cron scripts plus health checks can chew through it.
- TIME_WAIT entries are the biggest consumer in my workload; cutting that timeout from 120 s to 30 s freed half the table with no side effects.
- Keep-alive between internal services is worth the five-minute config change—it halves connection churn and buys headroom.
- Make the limit big once, monitor it, and move on. Conntrack tuning isn’t glamorous, but it beats 3 a.m. pages.