
Building a Multi-Stage Docker Compose Setup for Zero-Downtime MongoDB Security Updates

Why I Needed This

I run MongoDB in Docker containers for several self-hosted services on my Proxmox cluster. When MongoDB announced critical security patches last year, I had a problem: my containers were running production workloads, and I couldn't afford downtime during business hours. The standard "stop, update, restart" approach would break active connections and drop in-flight writes.

I needed a way to update MongoDB containers without interrupting service. This meant building a multi-stage setup where I could spin up a new container, sync data, switch traffic, and only then kill the old one.

My Initial Setup

Before this work, I had a straightforward Docker Compose file running MongoDB 6.0.x:

version: '3.8'
services:
  mongodb:
    image: mongo:6.0
    container_name: mongodb_prod
    volumes:
      - /mnt/docker-data/mongodb:/data/db
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    restart: unless-stopped

This worked fine for normal operations, but updating meant downtime. My services—n8n workflows, a custom inventory tracker, and some automation scripts—would all lose their database connections during the update window.

What I Built

The Multi-Stage Approach

I decided to run two MongoDB containers temporarily: the existing one (primary) and a new one (secondary) with the updated image. The plan was:

  1. Start the new container with the updated MongoDB version
  2. Initialize it as a replica set member
  3. Let it sync data from the primary
  4. Switch application connections to the new container
  5. Remove the old container

This required MongoDB’s replica set feature, which I hadn’t used before in my self-hosted setup.

Modified Docker Compose Configuration

Here’s the compose file I used during the transition:

version: '3.8'

services:
  mongodb_primary:
    image: mongo:6.0.11
    container_name: mongodb_primary
    command: ["--replSet", "rs0", "--bind_ip_all"]
    volumes:
      - /mnt/docker-data/mongodb:/data/db
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    networks:
      - mongo_network
    restart: unless-stopped

  mongodb_secondary:
    image: mongo:6.0.13  # Updated version with security patches
    container_name: mongodb_secondary
    command: ["--replSet", "rs0", "--bind_ip_all"]
    volumes:
      - /mnt/docker-data/mongodb_secondary:/data/db
    ports:
      - "27018:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    networks:
      - mongo_network
    restart: unless-stopped
    depends_on:
      - mongodb_primary

networks:
  mongo_network:
    driver: bridge

Key changes:

  • Added --replSet rs0 to enable replica set mode
  • Used --bind_ip_all so containers could talk to each other
  • Separate data volumes to avoid conflicts
  • Different external ports (27017 and 27018) for testing
  • Shared Docker network for internal communication
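One gap worth flagging in the compose file above: when access control is enabled (the MONGO_INITDB_ROOT_* variables), MongoDB also requires a shared keyFile for internal authentication between replica set members, otherwise the members refuse to authenticate to each other. A minimal sketch of generating one (the filename and mount path here are my own choices, not from the original setup):

```shell
# Generate a shared key for internal replica set authentication.
# MongoDB accepts a base64 keyfile of up to 1024 characters.
openssl rand -base64 756 > mongo-keyfile
chmod 400 mongo-keyfile

# Each service in the compose file would then mount the key read-only
# and pass it to mongod, along the lines of:
#   volumes:
#     - ./mongo-keyfile:/etc/mongo-keyfile:ro
#   command: ["--replSet", "rs0", "--bind_ip_all", "--keyFile", "/etc/mongo-keyfile"]
# Inside the official image the file must be readable by the mongodb
# user, so ownership may need adjusting on the host.
```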

Initializing the Replica Set

After starting both containers, I had to configure them as a replica set. I connected to the primary:

docker exec -it mongodb_primary mongosh -u admin -p ${MONGO_PASSWORD}

Then initialized the replica set:

rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongodb_primary:27017", priority: 2 },
    { _id: 1, host: "mongodb_secondary:27017", priority: 1 }
  ]
})

The priority values ensured the primary stayed primary during normal operation. The secondary started syncing immediately.

Monitoring the Sync

I watched the sync progress with:

rs.status()

This showed replication lag. For my ~8GB database, it took about 12 minutes to fully sync. The stateStr field changed from STARTUP2 to SECONDARY when ready.
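Rather than re-running rs.status() by hand, the wait can be scripted. A hedged sketch, using the container name and credentials from the compose file above; the member lookup assumes the secondary's reported name starts with mongodb_secondary:

```shell
# Poll rs.status() until the new member reports SECONDARY.
wait_for_secondary() {
  local state
  while true; do
    state=$(docker exec mongodb_primary mongosh --quiet -u admin -p "$MONGO_PASSWORD" \
      --eval '(rs.status().members.find(m => m.name.startsWith("mongodb_secondary")) || {stateStr: "UNKNOWN"}).stateStr')
    echo "secondary state: $state"
    [ "$state" = "SECONDARY" ] && return 0
    sleep 10
  done
}
```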

The Switchover Process

Testing the Secondary

Before switching production traffic, I tested the secondary by pointing a test n8n instance at port 27018. I ran a few workflows that wrote and read data. Everything worked, but there was a catch: replica set secondaries reject reads by default.

To route reads to the secondary during testing, I had to set the read preference explicitly in mongosh:

db.getMongo().setReadPref("secondaryPreferred")

This wasn't something I'd want in production, but it confirmed the data was intact.

Promoting the Secondary

When I was ready to switch, the plan was to step down the old primary:

rs.stepDown()  // on the old primary

But because I'd given the secondary a lower priority, an election would simply have re-elected the old primary. I had to raise the secondary's priority first:

cfg = rs.conf()
cfg.members[1].priority = 3
rs.reconfig(cfg)

Then I stepped down the old primary. The election completed and the secondary became primary within 10-15 seconds.
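In hindsight, both steps can go into a single command. A sketch, assuming the same container names and that member index 1 is the secondary, as in the rs.initiate() call above; run it only after the sync has finished:

```shell
# Raise the secondary's priority, then ask the current primary to step
# down, in one shot.
promote_secondary() {
  docker exec mongodb_primary mongosh --quiet -u admin -p "$MONGO_PASSWORD" --eval '
    cfg = rs.conf();
    cfg.members[1].priority = 3;  // mongodb_secondary
    rs.reconfig(cfg);
    rs.stepDown();  // may drop this connection; an error here is not
                    // necessarily a failed stepdown
  '
}
```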

Updating Application Connections

I updated my application connection strings to point to the new primary. For n8n, this meant changing the MongoDB URL in the environment variables:

DB_MONGODB_CONNECTION_URL=mongodb://admin:${MONGO_PASSWORD}@mongodb_secondary:27017/n8n?authSource=admin

I restarted the services one by one. Because MongoDB clients handle replica set failovers automatically, most connections recovered without errors.

What Didn’t Work

Initial Sync Failures

My first attempt failed because I didn’t allocate enough disk space for the secondary’s data volume. MongoDB’s initial sync creates a full copy, and I’d underestimated the space needed. The sync stopped at 85% with a “no space left” error.

I had to delete the secondary’s data directory, expand the LVM volume on the Proxmox host, and restart the sync.
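This failure is avoidable with a quick pre-flight check: compare the primary's data size against free space on the target volume before starting the sync. A sketch; the 30% margin is my own rule of thumb, not a MongoDB-documented figure:

```shell
# required_kb adds a safety margin on top of the current data size to
# cover the initial sync's temporary files and the oplog.
required_kb() {
  local data_kb=$1
  echo $(( data_kb + data_kb * 30 / 100 ))
}

# Usage sketch against the running primary (commented out so this file
# can be sourced without Docker):
# db_kb=$(docker exec mongodb_primary mongosh --quiet -u admin -p "$MONGO_PASSWORD" \
#   --eval 'Math.ceil(db.adminCommand({listDatabases: 1}).totalSize / 1024)')
# free_kb=$(df -k /mnt/docker-data/mongodb_secondary | awk 'NR==2 {print $4}')
# [ "$free_kb" -ge "$(required_kb "$db_kb")" ] || echo "not enough space for initial sync"
```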

Connection String Confusion

I initially tried using a replica set connection string like:

mongodb://admin:pass@mongodb_primary:27017,mongodb_secondary:27017/dbname?replicaSet=rs0

This should have worked, but my applications couldn’t resolve the container hostnames from outside the Docker network. I had to either:

  • Use IP addresses instead of hostnames
  • Add the containers to the host’s network (not ideal)
  • Switch connections manually (what I ended up doing)

The replica set connection string would have been cleaner, but my network setup wasn’t ready for it.

Write Concerns During Transition

Some writes failed during the ~15 seconds when the old primary stepped down and the new one took over. MongoDB clients retried automatically, but I saw a spike in my n8n error logs. For truly critical writes, I should have paused workflows during the switchover.

Removing the Old Container

Once everything was stable on the new primary, I removed the old container from the replica set:

rs.remove("mongodb_primary:27017")

Then stopped and removed the container:

docker-compose stop mongodb_primary
docker-compose rm mongodb_primary

I kept the old data volume for a week as a backup before deleting it.

Final Simplified Setup

After the migration, I simplified the compose file back to a single container:

version: '3.8'
services:
  mongodb:
    image: mongo:6.0.13
    container_name: mongodb_prod
    volumes:
      - /mnt/docker-data/mongodb_secondary:/data/db  # Using the new data
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: ${MONGO_PASSWORD}
    restart: unless-stopped

I removed the replica set configuration since I didn't need it for normal operations. Restarted without --replSet, MongoDB came back up in standalone mode, though the old replica set metadata lingers in the local database, which explains the startup warnings I saw in the logs afterwards.
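A quick way to confirm the simplified container really came up standalone: db.hello() should no longer report a setName. A sketch, using the container name from the compose file above:

```shell
# Prints "standalone" when mongod no longer reports replica set membership.
check_standalone() {
  docker exec mongodb_prod mongosh --quiet -u admin -p "$MONGO_PASSWORD" \
    --eval '"setName" in db.hello() ? "replica set member" : "standalone"'
}
```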

Key Takeaways

  • Replica sets work well for zero-downtime updates, but they add complexity. For my small self-hosted setup, it’s overkill for daily use.
  • Disk space matters. Initial sync requires enough space for a full copy of the data.
  • Network configuration is critical. Container hostnames only work within Docker networks, which complicates replica set connection strings.
  • Testing is non-negotiable. I should have tested the entire process on a staging setup first. I got lucky that my mistakes were recoverable.
  • The switchover window isn’t truly zero. There’s a 10-20 second gap during election where writes may fail. For my use case, that was acceptable.

This approach worked for a security update, but I wouldn’t use it for every MongoDB upgrade. For minor patches, I now schedule brief maintenance windows. For major version upgrades or critical security fixes, the multi-stage replica set method is worth the effort.

The entire process took about 3 hours including testing and cleanup. Most of that was waiting for the initial sync and verifying data integrity. The actual downtime was under 30 seconds.
