Implementing HAProxy health checks to load balance Jellyfin transcoding across multiple Proxmox VMs with Intel QuickSync

Why I Built This Setup

I run Jellyfin on multiple Proxmox VMs, each with Intel QuickSync for hardware transcoding. The problem was simple: transcoding load wasn't balanced. One VM would hit 100% CPU usage while others sat idle, causing playback stuttering and wasted resources.

I needed a way to distribute transcoding requests across VMs automatically and pull traffic away from any VM that went down or became overloaded. HAProxy seemed like the right tool, but I'd never used it for health checks on media servers before.

My Real Setup

I run three Proxmox VMs, each hosting a Jellyfin instance:

  • VM1: 192.168.1.101:8096
  • VM2: 192.168.1.102:8096
  • VM3: 192.168.1.103:8096

Each VM has an Intel iGPU passed through for QuickSync transcoding. All three share the same media library via NFS from my Synology NAS. Jellyfin's database is also shared, so any VM can serve the same content with the same metadata.
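For reference, the library mount on each VM is just a regular NFS entry in /etc/fstab. A minimal sketch of what that looks like, with a placeholder NAS address and export path rather than my real ones:

192.168.1.50:/volume1/media    /mnt/media    nfs    defaults,_netdev    0    0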

HAProxy runs on a separate lightweight LXC container in Proxmox (192.168.1.100), acting as the single entry point for all Jellyfin traffic.

Installing HAProxy

I installed HAProxy on an Ubuntu 24.04 LXC container:

sudo apt update
sudo apt install haproxy

I verified the installation:

sudo haproxy -v

Then enabled and started the service:

sudo systemctl enable haproxy
sudo systemctl start haproxy

The default config file lives at /etc/haproxy/haproxy.cfg. I backed it up before making changes:

sudo cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.backup

Configuring Load Balancing

I opened the config file:

sudo nano /etc/haproxy/haproxy.cfg

And added a frontend to accept incoming traffic on port 8096:

frontend jellyfin_front
    bind *:8096
    default_backend jellyfin_back

Then defined the backend with all three Jellyfin VMs:

backend jellyfin_back
    balance roundrobin
    option httpchk GET /health
    http-check expect string Healthy
    server vm1 192.168.1.101:8096 check inter 5000
    server vm2 192.168.1.102:8096 check inter 5000
    server vm3 192.168.1.103:8096 check inter 5000

The balance roundrobin line distributes requests evenly across VMs. The check inter 5000 option tells HAProxy to run health checks every 5 seconds.
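Two related server keywords are fall and rise, which set how many consecutive failed or successful checks flip a server's state. The values below are just HAProxy's defaults written out explicitly, not a change I made:

server vm1 192.168.1.101:8096 check inter 5000 fall 3 rise 2

So by default a backend has to fail three checks in a row before it's pulled from the pool, and pass two before it comes back.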

I saved the file and restarted HAProxy:

sudo systemctl restart haproxy
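HAProxy can also validate a config file without touching the running service, which is a cheap way to catch syntax errors before a restart:

sudo haproxy -c -f /etc/haproxy/haproxy.cfg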

Setting Up Jellyfin Health Checks

Jellyfin has a built-in health endpoint at /health. When the server is running normally, it returns a simple "Healthy" response. HAProxy uses this to determine if a VM should receive traffic.

I tested the endpoint manually on each VM:

curl http://192.168.1.101:8096/health
curl http://192.168.1.102:8096/health
curl http://192.168.1.103:8096/health

Each returned:

Healthy

This confirmed the endpoint was working. HAProxy's http-check expect string Healthy line looks for this exact response. If it doesn't get it, the VM is marked down and removed from the pool.
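An equivalent check can key on the status code instead of the body; as far as I can tell the endpoint returns HTTP 200 along with the Healthy text, so either form works:

http-check expect status 200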

Testing Load Distribution

I pointed my browser to the HAProxy IP:

http://192.168.1.100:8096

Jellyfin loaded normally. I refreshed the page a few times and checked the HAProxy stats page to see which backend was handling requests. To enable stats, I added this to the config:

listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 10s
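Since this exposes the stats page to anyone on the LAN, it's worth adding basic auth with one more line (placeholder credentials here):

    stats auth admin:changeme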

Then restarted HAProxy and opened:

http://192.168.1.100:8404/stats

The stats page showed all three VMs marked green and traffic being distributed roughly evenly.
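The same information is available from the command line through HAProxy's runtime API, assuming the admin socket the stock Ubuntu package sets up at /run/haproxy/admin.sock (socat has to be installed separately):

echo "show servers state jellyfin_back" | sudo socat stdio /run/haproxy/admin.sock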

What Worked

Load balancing itself worked immediately. Transcoding requests spread across VMs, and CPU usage stayed balanced. When I started multiple streams, each VM picked up roughly the same load.

Health checks also worked as expected. I stopped Jellyfin on VM2:

sudo systemctl stop jellyfin

Within 5 seconds, HAProxy marked VM2 as down on the stats page. New requests went only to VM1 and VM3. When I restarted Jellyfin on VM2, it came back online automatically.

The /health endpoint was reliable. It didn't just check if the port was open—it confirmed Jellyfin was actually responding.

What Didn't Work

Session persistence was a problem. Jellyfin uses cookies and session tokens, so when HAProxy switched a user between VMs mid-session, playback would break. The player would lose its position or fail to resume.

I tried adding sticky sessions using cookie directives:

backend jellyfin_back
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server vm1 192.168.1.101:8096 check inter 5000 cookie vm1
    server vm2 192.168.1.102:8096 check inter 5000 cookie vm2
    server vm3 192.168.1.103:8096 check inter 5000 cookie vm3

This helped, but didn't fully solve it. Some clients (especially mobile apps) still had issues when a VM went down and HAProxy moved them to a different backend.
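Something I want to try next, sketched here but untested, is source-IP stickiness with a stick table instead of cookies, since the Jellyfin apps authenticate with API token headers rather than browser cookies and may simply ignore the SERVERID cookie:

backend jellyfin_back
    balance roundrobin
    stick-table type ip size 200k expire 30m
    stick on src
    server vm1 192.168.1.101:8096 check inter 5000
    server vm2 192.168.1.102:8096 check inter 5000
    server vm3 192.168.1.103:8096 check inter 5000

The tradeoff is that every client behind the same NAT lands on the same VM, which partially defeats the load balancing.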

Another issue: health checks only verified that Jellyfin was running, not that transcoding was working. If a VM's QuickSync device failed or the transcoder crashed, HAProxy still sent traffic to it. I haven't found a clean way to check transcoding health without building a custom script.
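The rough idea would be a small probe script on each VM that exercises QuickSync directly, for example with a one-second test encode through ffmpeg's h264_qsv encoder. Everything below is a hypothetical, untested sketch, including the render device path:

#!/bin/bash
# Hypothetical transcoding probe (untested sketch).
# Fails if the render device is missing or a short QSV test encode doesn't complete.
set -e
test -e /dev/dri/renderD128
ffmpeg -hide_banner -loglevel error \
    -init_hw_device qsv=hw \
    -f lavfi -i testsrc=duration=1:size=640x360:rate=25 \
    -vf format=nv12 -c:v h264_qsv -f null -
echo "Healthy"

HAProxy's external-check or agent-check features look like the way to feed a script like this back into the pool decision, but I haven't built that part yet.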

Monitoring and Logs

I enabled HAProxy logging to track health check failures:

sudo nano /etc/rsyslog.d/49-haproxy.conf

Added the following (the $UDPServerAddress line has to come before $UDPServerRun, otherwise rsyslog ignores the bind address and listens on all interfaces):

$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514

local2.* /var/log/haproxy.log

Then restarted rsyslog:

sudo systemctl restart rsyslog
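One dependency worth spelling out: rsyslog only writes what HAProxy actually sends to the local2 facility. The stock Ubuntu config logs to local0 via /dev/log, so the global section needs a log line pointed at the UDP listener (followed by an HAProxy restart), something like:

global
    log 127.0.0.1:514 local2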

Now I can see health check failures in real time:

sudo tail -f /var/log/haproxy.log

When a VM goes down, I see:

Server jellyfin_back/vm2 is DOWN, reason: Layer7 wrong status, code: 503

This helps me catch issues quickly.
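For a quick history of flaps without tailing the log, grepping the state transitions is enough:

sudo grep -E "is (DOWN|UP)" /var/log/haproxy.log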

Performance Impact

HAProxy itself uses almost no resources. The LXC container runs with 1GB RAM and a single CPU core, and it barely registers any load even with multiple active streams.

Transcoding performance didn't change. Each VM still uses QuickSync at the same speed. The only difference is that load is now distributed instead of concentrated on one VM.

Network latency added by HAProxy is negligible—under 1ms on my local network.

Key Takeaways

HAProxy is effective for basic load balancing across Jellyfin instances. Health checks work well for detecting when a VM is down or unresponsive.

Session persistence is tricky with media servers. Sticky sessions help, but they're not perfect. If a VM fails mid-stream, users will still experience interruptions.

Health checks only verify that Jellyfin is running, not that transcoding is functional. A more robust setup would need custom health scripts that test actual transcoding capability.

The stats page is invaluable for monitoring. I check it regularly to see if any VM is consistently slower or handling more load than others.

This setup isn't perfect, but it solved my immediate problem: unbalanced transcoding load. It's stable enough for daily use, and I can iterate on it as I learn more about HAProxy's capabilities.