Why I Migrated from Docker Compose to Swarm Mode
I've been running Docker Compose stacks for years—mostly on a single Proxmox node with various services like n8n, monitoring tools, and internal apps. The setup worked fine until I started hitting situations where I needed to update a container without taking the service offline.
With standalone Compose, updating meant stopping the old container and starting the new one. Even if it only took 30 seconds, that's still downtime. For personal services, I could live with it. But once I started hosting tools that other people in my household relied on, those brief outages became annoying.
I looked at Swarm mode because it's built into Docker Engine and doesn't require learning an entirely new orchestration system. The promise was simple: rolling updates with zero downtime. I wanted to test if that actually worked in practice.
My Real Setup Before Migration
I had a single-node Proxmox VM running Docker Engine with about a dozen services defined across several compose.yaml files. Each stack was self-contained:
- A web app with Redis backend (similar to the example in Docker's docs)
- n8n with a PostgreSQL database
- Monitoring stack with Prometheus and Grafana
- A few internal tools with persistent volumes
Everything used bind mounts or named volumes. No external registry—I built images locally and referenced them directly in Compose files.
The migration goal was to move these stacks to Swarm mode without taking services offline, then use rolling updates for future changes.
What I Had to Change
Setting Up a Local Registry
Swarm mode requires images to be available in a registry because nodes pull images independently. Even with a single-node swarm, the deployment process expects this.
I started a throwaway registry as a Swarm service:
docker service create --name registry --publish published=5000,target=5000 registry:2
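A quick sanity check that the registry is actually serving is its standard HTTP API; a fresh instance returns an empty repository list:
curl http://127.0.0.1:5000/v2/_catalog
# {"repositories":[]}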
This worked, but I hit an issue immediately: my existing Compose files referenced images by name only (like my-app:latest), not by registry address. I had to rebuild and tag them properly:
docker build -t 127.0.0.1:5000/my-app:latest .
docker push 127.0.0.1:5000/my-app:latest
This step was tedious but necessary. I scripted it for each service so I wouldn't forget later.
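The script was nothing clever; a minimal sketch (the service names and build-context layout here are placeholders, not my actual setup):
#!/usr/bin/env bash
# Rebuild, retag, and push every locally built image to the local registry.
set -euo pipefail

REGISTRY=127.0.0.1:5000

for svc in my-app internal-tool; do   # placeholder service names
  docker build -t "$REGISTRY/$svc:latest" "./$svc"
  docker push "$REGISTRY/$svc:latest"
done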
Adjusting Compose Files for Swarm
Swarm mode (docker stack deploy) consumes the legacy version 3 Compose file format, which is older than the current Compose spec and differs from it in syntax. Some directives I used regularly don't work in Swarm:
- build: is ignored (you must push images beforehand)
- depends_on: doesn't control startup order in Swarm
- restart: is replaced by deploy.restart_policy
I had to rewrite sections of my Compose files. For example, this standalone Compose config:
services:
  web:
    build: .
    ports:
      - "8000:8000"
    restart: unless-stopped
Became this for Swarm:
services:
  web:
    image: 127.0.0.1:5000/my-app:latest
    ports:
      - "8000:8000"
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
I also had to add deploy: blocks to control update behavior:
deploy:
  update_config:
    parallelism: 1
    delay: 10s
    order: start-first
The order: start-first setting was critical—it starts the new container before stopping the old one, which is what prevents downtime.
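If the stack is already running, the same setting can also be flipped from the CLI; --update-order is the flag equivalent of order: in update_config:
docker service update --update-order start-first my-stack_web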
Migrating Without Downtime
To migrate a running Compose stack to Swarm without taking it offline, I followed this process:
- Initialized Swarm mode on the node:
docker swarm init
- Deployed the stack using the modified Compose file:
docker stack deploy --compose-file compose.yaml my-stack
- Waited for the Swarm service to start (checked with docker stack services my-stack)
- Once the Swarm service was healthy, stopped the old Compose stack:
docker compose down
This worked because both the Compose stack and the Swarm stack could run simultaneously as long as they didn't conflict on ports. I mapped the Swarm service to a different port temporarily, tested it, then switched traffic over.
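Concretely, the cutover for a stateless service went roughly like this (the compose.swarm.yaml filename and port 8001 are illustrative, not my real names):
# Swarm copy publishes a temporary port (8001) next to the Compose original (8000)
docker stack deploy --compose-file compose.swarm.yaml my-stack

# Smoke-test the Swarm copy before touching the old stack
curl -f http://localhost:8001/

# Retire the Compose stack, then redeploy the Swarm service on the real port
docker compose down
docker stack deploy --compose-file compose.yaml my-stack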
For services with databases, I had to be more careful. I stopped writes to the Compose version, let the Swarm service start with the same volume, then verified data consistency before removing the old containers.
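A minimal consistency check, assuming a psql-accessible database (the container, database, and table names here are placeholders):
# Row count from the old Compose container before the switch...
docker exec old-postgres psql -U postgres -d appdb -c 'SELECT count(*) FROM executions;'
# ...and from the Swarm task afterwards; the numbers should match
docker exec "$(docker ps -q -f name=my-stack_db)" psql -U postgres -d appdb -c 'SELECT count(*) FROM executions;'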
Testing Rolling Updates
Once everything was running in Swarm mode, I tested the actual rolling update process.
I made a small change to one of my apps (updated a Python dependency), rebuilt the image, pushed it to the registry with a new tag, and updated the Compose file to reference the new tag:
image: 127.0.0.1:5000/my-app:v2
Then I redeployed:
docker stack deploy --compose-file compose.yaml my-stack
Swarm detected the image change and started a new container. Because I had set order: start-first, the new container came up, passed health checks, and only then did Swarm stop the old container.
I monitored this with:
docker service ps my-stack_web --no-trunc
The output showed both the old and new tasks running briefly, then the old one shutting down. Total observed downtime: zero. The service stayed reachable the entire time.
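One cheap way to back up a claim like this is to run a probe loop in a second terminal while the update happens; any non-200 status code would expose a gap:
# Hit the service once a second and print the HTTP status code
while true; do
  curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8000/
  sleep 1
done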
What Didn't Work
Health Checks Are Critical
My first rolling update failed because I didn't define a proper health check in the Dockerfile. Swarm started the new container, assumed it was healthy immediately, and killed the old one. The new container was still initializing, so there was a brief outage.
I fixed this by adding a health check:
HEALTHCHECK --interval=10s --timeout=3s --start-period=30s \
  CMD curl -f http://localhost:8000/health || exit 1
Swarm now waits for the health check to pass before it routes traffic to the new container and shuts down the old one.
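Health status is also visible from the engine itself, which is handy when an update hangs in the starting state (replace <container> with the task's container name):
# Prints starting, healthy, or unhealthy
docker inspect --format '{{.State.Health.Status}}' <container>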
Persistent Data Requires Planning
Rolling updates work great for stateless services. For services with databases or persistent volumes, I had to think through the migration more carefully.
I ran into an issue with a PostgreSQL container where Swarm tried to start a new replica while the old one was still writing to the volume. This caused a lock conflict and the new container failed to start.
The solution was to keep replicas: 1 and set update_config.order to stop-first (the default), which makes Swarm stop the old container before starting the new one. This reintroduces brief downtime, but it's unavoidable for single-instance databases.
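For the database service the deploy block ended up looking roughly like this (stop-first is the default, spelled out here for clarity):
deploy:
  replicas: 1
  update_config:
    parallelism: 1
    order: stop-first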
For truly zero-downtime database updates, I'd need to run a clustered setup, which I haven't done yet.
Networking Behavior Changed
In standalone Compose, I used custom bridge networks to isolate stacks. Swarm creates overlay networks instead, which behave differently.
I noticed that DNS resolution between services was slower in Swarm mode—sometimes taking a few seconds on the first request. This didn't break anything, but it was noticeable.
I also had to adjust firewall rules because Swarm uses additional ports for internal communication (like port 7946 for the gossip protocol).
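If the host runs ufw (an assumption; substitute your firewall of choice), opening the standard Swarm ports looks like this. A node talking only to itself rarely needs them, but they matter as soon as a second node joins:
sudo ufw allow 2377/tcp    # cluster management
sudo ufw allow 7946/tcp    # node gossip
sudo ufw allow 7946/udp
sudo ufw allow 4789/udp    # overlay (VXLAN) traffic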
Key Takeaways
- Rolling updates in Swarm mode actually work as advertised—I've had zero-downtime deployments for stateless services.
- The migration from Compose to Swarm is straightforward but requires discipline: you must push images to a registry, adjust Compose syntax, and define proper health checks.
- For single-node setups, Swarm adds complexity without much benefit unless you specifically need rolling updates. If you're happy with brief downtime, standalone Compose is simpler.
- Stateful services (databases, message queues) are harder to update without downtime. Rolling updates help, but you still need to plan for data consistency.
- Swarm mode is a middle ground between standalone Docker and full Kubernetes. It's less powerful but also less complicated.
I'm still running most of my services in Swarm mode because the rolling update capability is genuinely useful. But I don't pretend it's a silver bullet—it's just another tool that works well for specific use cases.