Why I Needed This
I run MongoDB in Docker containers for several self-hosted services on my Proxmox cluster. When MongoDB announced critical security patches last year, I had a problem: my containers were running production workloads, and I couldn’t afford downtime during business hours. The standard “stop, update, restart” approach would break active connections and potentially corrupt in-flight writes.
I needed a way to update MongoDB containers without interrupting service. This meant building a multi-stage setup where I could spin up a new container, sync data, switch traffic, and only then kill the old one.
My Initial Setup
Before this work, I had a straightforward Docker Compose file running MongoDB 6.0.x:
```yaml
version: '3.8'
services:
  mongodb:
    image: mongo:6.0
    container_name: mongodb_prod
    volumes:
      - /mnt/docker-data/mongodb:/data/db
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    restart: unless-stopped
```
This worked fine for normal operations, but updating meant downtime. My services—n8n workflows, a custom inventory tracker, and some automation scripts—would all lose their database connections during the update window.
What I Built
The Multi-Stage Approach
I decided to run two MongoDB containers temporarily: the existing one (primary) and a new one (secondary) with the updated image. The plan was:
- Start the new container with the updated MongoDB version
- Initialize it as a replica set member
- Let it sync data from the primary
- Switch application connections to the new container
- Remove the old container
This required MongoDB’s replica set feature, which I hadn’t used before in my self-hosted setup.
Modified Docker Compose Configuration
Here’s the compose file I used during the transition:
```yaml
version: '3.8'
services:
  mongodb_primary:
    image: mongo:6.0.11
    container_name: mongodb_primary
    command: ["--replSet", "rs0", "--bind_ip_all"]
    volumes:
      - /mnt/docker-data/mongodb:/data/db
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    networks:
      - mongo_network
    restart: unless-stopped

  mongodb_secondary:
    image: mongo:6.0.13  # Updated version with security patches
    container_name: mongodb_secondary
    command: ["--replSet", "rs0", "--bind_ip_all"]
    volumes:
      - /mnt/docker-data/mongodb_secondary:/data/db
    ports:
      - "27018:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    networks:
      - mongo_network
    restart: unless-stopped
    depends_on:
      - mongodb_primary

networks:
  mongo_network:
    driver: bridge
```
Key changes:
- Added `--replSet rs0` to enable replica set mode
- Used `--bind_ip_all` so the containers could talk to each other
- Separate data volumes to avoid conflicts
- Different external ports (27017 and 27018) for testing
- Shared Docker network for internal communication

One caveat worth knowing: when access control is enabled (as it is here via the root username and password), MongoDB also requires internal authentication between replica set members, typically a shared keyFile mounted into both containers and passed with `--keyFile`. If the members refuse to start or can't authenticate to each other, check the mongod logs for that first.
Initializing the Replica Set
After starting both containers, I had to configure them as a replica set. I connected to the primary:
```shell
docker exec -it mongodb_primary mongosh -u admin -p ${MONGO_PASSWORD}
```
Then initialized the replica set:
```javascript
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongodb_primary:27017", priority: 2 },
    { _id: 1, host: "mongodb_secondary:27017", priority: 1 }
  ]
})
```
The priority values ensured the primary stayed primary during normal operation. The secondary started syncing immediately.
Monitoring the Sync
I watched the sync progress with:
```javascript
rs.status()
```
This showed each member's state and replication lag. For my ~8GB database, the initial sync took about 12 minutes. The `stateStr` field for the new member changed from `STARTUP2` to `SECONDARY` when it was ready.
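Rather than eyeballing the full `rs.status()` output every time, the member states can be pulled out with a small shell helper. This is just a sketch; the container name and admin credentials are the ones from my compose file above:

```shell
# dump_status: print rs.status() as a single JSON document
# (same container name and credentials as the compose file above).
dump_status() {
  docker exec mongodb_primary mongosh -u admin -p "$MONGO_PASSWORD" \
    --quiet --eval 'JSON.stringify(rs.status())'
}

# member_states: read rs.status() JSON on stdin and print one stateStr
# per member (e.g. PRIMARY, STARTUP2, SECONDARY).
member_states() {
  grep -o '"stateStr" *: *"[^"]*"' | sed 's/.*"\(.*\)"/\1/'
}
```

Running `dump_status | member_states` during the sync shows the new member move through `STARTUP2` to `SECONDARY`.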
The Switchover Process
Testing the Secondary
Before switching production traffic, I tested the secondary by pointing a test n8n instance to port 27018. I ran a few workflows that wrote and read data. Everything worked, but there was a catch: a secondary is read-only, and by default it won't even serve reads unless the client opts in.

To allow reads during testing, I had to enable that explicitly:

```javascript
db.getMongo().setReadPref("secondaryPreferred")
```

This wasn't ideal for production, but it confirmed the data was intact.
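The same preference can also be baked into a client's connection string rather than set per session; `readPreference` is a standard MongoDB URI option. The host and database below are placeholders for my test instance, not values from the setup above:

```
mongodb://admin:${MONGO_PASSWORD}@localhost:27018/test?authSource=admin&readPreference=secondaryPreferred
```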
Promoting the Secondary
When I was ready to switch, I first had to deal with the priorities: I'd given the secondary a lower priority, so it would lose any election. I raised it above the old primary's:

```javascript
cfg = rs.conf()
cfg.members[1].priority = 3
rs.reconfig(cfg)
```

Then I stepped down the old primary:

```javascript
rs.stepDown()  // run on the old primary
```

This triggered an election, and the secondary became primary within 10-15 seconds.
Updating Application Connections
I updated my application connection strings to point to the new primary. For n8n, this meant changing the MongoDB URL in the environment variables:
```
DB_MONGODB_CONNECTION_URL=mongodb://admin:${MONGO_PASSWORD}@mongodb_secondary:27017/n8n?authSource=admin
```
I restarted the services one by one. Because MongoDB clients handle replica set failovers automatically, most connections recovered without errors.
What Didn’t Work
Initial Sync Failures
My first attempt failed because I didn’t allocate enough disk space for the secondary’s data volume. MongoDB’s initial sync creates a full copy, and I’d underestimated the space needed. The sync stopped at 85% with a “no space left” error.
I had to delete the secondary's data directory, expand the LVM volume on the Proxmox host, and restart the sync from scratch.
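A pre-flight check would have caught this before the sync started. Here's a rough sketch; the 20% headroom is my own guess at a margin for oplog and sync overhead, not a MongoDB-documented figure:

```shell
# needs_space: rough pre-flight check that the target filesystem has room
# for a full copy of the source data directory, with ~20% headroom
# (my own margin for oplog and sync overhead).
# usage: needs_space <source_data_dir> <target_path>
needs_space() {
  used_kb=$(du -sk "$1" | cut -f1)
  free_kb=$(df -kP "$2" | awk 'NR==2 {print $4}')
  required_kb=$((used_kb * 12 / 10))
  if [ "$free_kb" -lt "$required_kb" ]; then
    echo "insufficient: need ${required_kb}K, have ${free_kb}K"
    return 1
  fi
  echo "ok: need ${required_kb}K, have ${free_kb}K"
}

# Example: needs_space /mnt/docker-data/mongodb /mnt/docker-data
```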
Connection String Confusion
I initially tried using a replica set connection string like:
```
mongodb://admin:pass@mongodb_primary:27017,mongodb_secondary:27017/dbname?replicaSet=rs0
```
This should have worked, but my applications couldn't resolve the container hostnames from outside the Docker network. My options were:
- Use IP addresses instead of hostnames
- Add the containers to the host’s network (not ideal)
- Switch connections manually (what I ended up doing)
The replica set connection string would have been cleaner, but my network setup wasn’t ready for it.
Write Concerns During Transition
Some writes failed during the ~15 seconds when the old primary stepped down and the new one took over. MongoDB clients retried automatically, but I saw a spike in my n8n error logs. For truly critical writes, I should have paused workflows during the switchover.
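For those critical writes, retryable writes and a majority write concern can also be requested explicitly in the connection string. Shown here on my n8n URL; `retryWrites` and `w` are standard MongoDB URI options, not anything specific to this setup:

```
DB_MONGODB_CONNECTION_URL=mongodb://admin:${MONGO_PASSWORD}@mongodb_secondary:27017/n8n?authSource=admin&retryWrites=true&w=majority
```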
Removing the Old Container
Once everything was stable on the new primary, I removed the old container from the replica set:
```javascript
rs.remove("mongodb_primary:27017")  // run on the new primary
```
Then stopped and removed the container:
```shell
docker-compose stop mongodb_primary
docker-compose rm mongodb_primary
```
I kept the old data volume for a week as a backup before deleting it.
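If you'd rather keep a compressed archive than a live directory, something like this does the job (a sketch; the example paths are hypothetical):

```shell
# archive_dir: tar up a data directory so the original can be deleted.
# usage: archive_dir <data_dir> <archive.tar.gz>
archive_dir() {
  tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Example (hypothetical paths):
# archive_dir /mnt/docker-data/mongodb /mnt/backups/mongodb-pre-6.0.13.tar.gz
```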
Final Simplified Setup
After the migration, I simplified the compose file back to a single container:
```yaml
version: '3.8'
services:
  mongodb:
    image: mongo:6.0.13
    container_name: mongodb_prod
    volumes:
      - /mnt/docker-data/mongodb_secondary:/data/db  # Using the new data
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    restart: unless-stopped
```
I removed the replica set configuration since I didn't need it for normal operations. Started without the `--replSet` flag, mongod came back up as a standalone node; it keeps the old replica set metadata in its local database but, aside from a startup warning, ignores it.
Key Takeaways
- Replica sets work well for zero-downtime updates, but they add complexity. For my small self-hosted setup, it’s overkill for daily use.
- Disk space matters. Initial sync requires enough space for a full copy of the data.
- Network configuration is critical. Container hostnames only work within Docker networks, which complicates replica set connection strings.
- Testing is non-negotiable. I should have tested the entire process on a staging setup first. I got lucky that my mistakes were recoverable.
- The switchover window isn’t truly zero. There’s a 10-20 second gap during election where writes may fail. For my use case, that was acceptable.
This approach worked for a security update, but I wouldn’t use it for every MongoDB upgrade. For minor patches, I now schedule brief maintenance windows. For major version upgrades or critical security fixes, the multi-stage replica set method is worth the effort.
The entire process took about 3 hours including testing and cleanup. Most of that was waiting for the initial sync and verifying data integrity. The actual downtime was under 30 seconds.