Why I Built This (And Why I Stopped)
I wanted to solve a real problem: my single Jellyfin instance would choke when multiple users tried to transcode 4K HDR content at the same time. The CPU would spike, playback would stutter, and the whole server would become unresponsive. I thought the solution was obvious—build a multi-node cluster with HAProxy handling session persistence so each user's stream would stick to one node.
I spent weeks on this. I got it working. And then I realized it was the wrong approach entirely.
My Real Setup
I run Jellyfin in an LXC container on Proxmox, backed by a Synology NAS for media storage. The initial setup was straightforward:
- Single Jellyfin instance (Ubuntu 18.04 LXC)
- Intel CPU with QuickSync for hardware transcoding
- HAProxy on pfSense for reverse proxy
- NFS mounts from Synology for media access
When I decided to build the cluster, I spun up three identical Jellyfin nodes, each with access to the same media library via NFS. The architecture looked clean on paper:
- HAProxy frontend listening on ports 80/443
- Three Jellyfin backends (192.168.50.111, .112, .113)
- Session persistence using source IP hashing
- Health checks via /health endpoint
I configured HAProxy with this backend setup:
    backend jellyfin
        balance source
        hash-type consistent
        option httpchk
        option forwardfor
        http-check send meth GET uri /health
        http-check expect string Healthy
        server jellyfin1 192.168.50.111:8096 check
        server jellyfin2 192.168.50.112:8096 check
        server jellyfin3 192.168.50.113:8096 check
The balance source directive was supposed to ensure that each client IP always hit the same backend node. This mattered because Jellyfin's transcoding sessions are stateful—if a client switches nodes mid-stream, the transcode job dies and playback fails.
What Actually Happened
The cluster worked, technically. HAProxy distributed connections. Session persistence mostly held. Each node could transcode independently. But the problems started immediately:
Database Synchronization Was a Nightmare
Jellyfin stores its database in SQLite by default. I tried running three nodes against the same NFS-mounted database file. This was a mistake. SQLite locks the entire database for writes, and over NFS, the locking behavior became unpredictable. Sometimes writes would succeed. Sometimes they'd hang. Sometimes the database would corrupt.
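Concretely, the shared setup amounted to every node mounting the same data directory from the NAS, along these lines (the hostname, export path, and mount options here are placeholders rather than my exact values; /var/lib/jellyfin/data is where the Ubuntu packages keep the SQLite files):

    # /etc/fstab on each node: all three containers pointed at one NFS-hosted data dir
    synology:/volume1/jellyfin-data  /var/lib/jellyfin/data  nfs  rw,hard,vers=4.1  0  0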
I switched to having each node maintain its own local database, with a script to sync them every few minutes. This created new problems: watch status would drift between nodes, new library items wouldn't appear consistently, and user settings would occasionally revert.
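The sync job was roughly this shape: take a snapshot of the local SQLite file and push it to the other nodes on a cron timer. The paths and hosts below are assumptions based on the default Ubuntu package layout, and as the drift problems above show, the approach itself was flawed:

    #!/bin/bash
    # Rough shape of the periodic sync job (run from cron every few minutes).
    # Not a recommended approach: this is what caused the watch-status drift described above.
    DB_DIR=/var/lib/jellyfin/data             # default Jellyfin data dir on the Ubuntu package
    PEERS="192.168.50.112 192.168.50.113"     # the other two nodes

    # Take a consistent snapshot instead of copying the live (possibly locked) file
    sqlite3 "$DB_DIR/library.db" ".backup /tmp/library.db.snapshot"

    # Push the snapshot over the other nodes' copies
    for peer in $PEERS; do
        rsync -a /tmp/library.db.snapshot "root@$peer:$DB_DIR/library.db"
    done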
Transcode Cache Collisions
Each Jellyfin node creates temporary transcode files. I initially pointed all nodes at a shared transcode directory on NFS. Bad idea. Nodes would occasionally try to write to the same temp file, causing transcode failures. I moved to local transcode directories, but this meant wasted disk space and CPU cycles—if two users on different nodes watched the same 4K file, both nodes would transcode it separately.
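The fix on each node was trivial, something like the following (the directory is just an example; what matters is that it lives on local disk), with Jellyfin's transcode path pointed at it:

    # On each node: give transcodes a local directory instead of the shared NFS path
    mkdir -p /var/cache/jellyfin/transcodes
    chown jellyfin:jellyfin /var/cache/jellyfin/transcodes
    # Then set this as the transcode path under Dashboard -> Playback -> Transcoding.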
Session Persistence Wasn't Reliable Enough
HAProxy's source IP hashing worked most of the time, but not always. If a client's IP changed (mobile users switching between WiFi and cellular), the session would break. If HAProxy detected a node as unhealthy and failed over, the client would lose their transcode session. If a user paused for too long and the connection dropped, resuming would sometimes hit a different node.
I tried cookie-based persistence, but Jellyfin's web client and mobile apps didn't always respect the cookies consistently.
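For reference, the cookie-based attempt looked roughly like this (a sketch rather than my exact config): HAProxy inserts a SRVID cookie so returning clients stick to the backend that issued it.

    backend jellyfin
        balance roundrobin
        cookie SRVID insert indirect nocache
        option httpchk
        http-check send meth GET uri /health
        http-check expect string Healthy
        server jellyfin1 192.168.50.111:8096 check cookie j1
        server jellyfin2 192.168.50.112:8096 check cookie j2
        server jellyfin3 192.168.50.113:8096 check cookie j3

Clients that ignore the cookie simply fall back to round-robin, which is exactly where this broke down.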
Hardware Transcoding Didn't Scale
Each node had access to the same Intel QuickSync device via VA-API. But QuickSync has limits—it can handle maybe 10-15 simultaneous transcodes depending on resolution and codec. Spreading users across three nodes didn't increase this limit. It just meant each node was underutilized most of the time, and when transcoding demand spiked, all three nodes would hit their limits simultaneously.
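Worth spelling out: all three containers run on the same Proxmox host, so they share one physical iGPU, passed through in the usual LXC way (the device numbers below are the typical DRM values; check ls -l /dev/dri on your host, and older Proxmox releases use lxc.cgroup instead of lxc.cgroup2):

    # /etc/pve/lxc/<ctid>.conf on the Proxmox host, repeated for each Jellyfin container
    lxc.cgroup2.devices.allow: c 226:0 rwm
    lxc.cgroup2.devices.allow: c 226:128 rwm
    lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir

Three containers bound to the same /dev/dri is still one QuickSync engine, which is why the per-device ceiling never moved.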
What I Should Have Done Instead
After months of fighting this setup, I tore it down and went back to a single Jellyfin instance with these changes:
- Upgraded to a CPU with better QuickSync support (11th gen Intel)
- Moved transcoding to a dedicated NVMe drive instead of NFS
- Set strict transcode limits per user
- Encouraged direct play wherever possible by optimizing media files upfront (see the sketch below)
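The last point is the one that paid off most. A one-off pass like this (library path, CRF, preset, and audio bitrate are all illustrative; tune to taste) converts the files that would otherwise always transcode into something nearly every client can direct play:

    # Batch-convert problem files to H.264 + AAC in MKV, which most clients direct play.
    # /mnt/media/movies is a placeholder for your library path.
    for f in /mnt/media/movies/*.mkv; do
        ffmpeg -i "$f" -map 0 -c:v libx264 -crf 18 -preset slow \
               -c:a aac -b:a 192k -c:s copy "${f%.mkv}.h264.mkv"
    done

For 4K HDR sources this trades away HDR, so I only do it for files and clients where that trade is acceptable.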
The single-node setup has been far more reliable. No database sync issues. No session persistence problems. No wasted transcoding work. And when the server does hit its limits, I know exactly where the bottleneck is.
Why Multi-Node Jellyfin Doesn't Make Sense
Jellyfin wasn't designed to run as a distributed system. Its architecture assumes a single server with local storage and a local database. You can force it into a cluster, but you're fighting the design at every step.
The real problems with transcoding aren't solved by horizontal scaling:
- Hardware acceleration is device-limited, not CPU-limited
- Storage I/O is often the bottleneck, and NFS doesn't help
- Session state makes load balancing fragile
- Most users don't actually need transcoding if you optimize your media library
If you truly need more capacity, you're better off running multiple independent Jellyfin servers with separate libraries, or investing in better hardware for a single powerful node.
What I Learned
This project taught me to question my assumptions. I assumed that "more nodes = more capacity" because that's true for stateless web applications. But Jellyfin isn't stateless, and transcoding isn't a problem you can solve by throwing more servers at it.
I also learned that just because you can make something work doesn't mean you should. The multi-node setup technically functioned, but it was fragile, hard to maintain, and didn't actually solve the underlying problem.
The simplest solution—a single well-configured server—turned out to be the right one. Sometimes the boring answer is the correct answer.