Why I Started Looking Into This
I run multiple Docker Compose stacks on my Proxmox homelab—some for n8n workflows, some for monitoring tools, and a few experimental projects. The pattern was always the same: run docker-compose up, watch the web container crash 2-3 times with “connection refused” errors, wait about 20 seconds, then everything works fine.
It annoyed me, but I lived with it. Until I added a new container that needed to connect to PostgreSQL during its initialization phase. That container would fail, exit, and never restart automatically. I had to manually restart the stack every single time.
That’s when I stopped accepting “just restart it” as a solution and actually dug into why depends_on wasn’t doing what I thought it should.
What I Actually Learned About depends_on
The Docker documentation is clear about this, but I had never read it carefully:
depends_on only controls startup order, not readiness.
When you list db under depends_on, Docker starts the database container first. But “started” just means the container process is running. It doesn’t mean PostgreSQL has finished initializing, loading configuration, or is ready to accept connections.
For PostgreSQL specifically, here’s what happens:
- Container starts (depends_on releases here)
- PostgreSQL initializes the data directory
- Loads configuration files
- Runs any init scripts
- Finally opens the port and accepts connections
That whole process takes 10-15 seconds on my setup. If your application tries to connect at second 2, it fails.
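The failing pattern looks roughly like this (a minimal sketch — image names and commands are placeholders for my actual services):

```yaml
# Minimal sketch of the failing pattern: depends_on short syntax
# only waits for the db *container* to start, not for PostgreSQL.
services:
  web:
    image: node:20-alpine
    command: npm start
    depends_on:
      - db          # releases as soon as the container process exists
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: password
```

With this config, web starts within a second or two of db — well before PostgreSQL has opened its port.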
The Three Conditions Nobody Talks About
I didn’t know this until I read the Compose spec carefully: depends_on supports three different conditions.
services:
  web:
    depends_on:
      db:
        condition: service_started                  # Default
        # condition: service_healthy                # What we actually need
        # condition: service_completed_successfully # For init containers
service_started is the default: dependent services start as soon as the container process is running, which is why we see connection failures.
service_healthy waits for the container’s healthcheck to pass before starting dependent services. This is what I needed.
service_completed_successfully waits for the container to exit with code 0. I use this for migration containers that need to run once before the main app starts.
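For the migration case, the pattern looks roughly like this (a sketch — service names and commands are illustrative, not my exact setup):

```yaml
# Sketch: a one-shot migration container gates the main app.
services:
  migrate:
    image: node:20-alpine
    command: npm run migrate      # runs once, must exit with code 0
    depends_on:
      db:
        condition: service_healthy
  web:
    image: node:20-alpine
    command: npm start
    depends_on:
      migrate:
        condition: service_completed_successfully
```

The migrate service itself waits for a healthy database, so the whole chain is ordered: db healthy, then migrations, then the app.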
My PostgreSQL Healthcheck Configuration
PostgreSQL’s official image includes pg_isready, which checks if the database is actually accepting connections. Here’s what I use:
version: '3.8'

services:
  web:
    image: node:20-alpine
    depends_on:
      db:
        condition: service_healthy
        restart: true
    environment:
      DATABASE_URL: postgresql://postgres:password@db:5432/mydb
    command: npm start

  db:
    image: postgres:16
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
      POSTGRES_DB: mydb
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres -d mydb"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 60s
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
Why These Specific Values
interval: 10s — Checks every 10 seconds. I tried 5s initially but it felt excessive for my use case.
timeout: 5s — If pg_isready hangs for more than 5 seconds, something is wrong. This has never actually triggered in my setup.
retries: 3 — Three consecutive failures before marking unhealthy. This handles brief network hiccups or database load spikes.
start_period: 60s — This is the most important one. Failures during the first 60 seconds don’t count toward the retry limit. PostgreSQL needs time to initialize, and without this grace period, the container would be marked unhealthy before it even finished starting.
I initially set start_period to 30s and saw occasional false negatives on slower hardware. 60s has been reliable across my Proxmox VMs and even on my Synology NAS.
The -U and -d Flags Matter
I originally used just pg_isready without arguments. It worked, but filled my logs with warnings about connecting as the wrong user. Adding -U postgres -d mydb specifies the actual user and database, which silences those warnings and makes the check more accurate.
What Didn’t Work
Using Shell Scripts
I tried the popular wait-for-it.sh approach first—adding a script to the container that polls the database before starting the application. It worked, but it felt messy. I had to add the script to my images, modify entrypoints, and maintain extra code that Docker’s healthcheck system already handles better.
Short start_period Values
My first healthcheck used start_period: 15s. On my main Proxmox host with NVMe storage, it worked fine. But when I deployed the same stack to my older backup server with spinning disks, PostgreSQL took 25 seconds to initialize. The container was marked unhealthy before it even had a chance.
I bumped it to 60s everywhere. Better to wait a bit longer than deal with intermittent failures.
Checking Port Availability Instead of Service Health
I experimented with using nc -z localhost 5432 to check if the port was open. The port opens before PostgreSQL is fully ready, so this check passed too early and my application still crashed.
pg_isready actually attempts a connection handshake, which is a much better indicator of readiness.
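Side by side, the two checks look like this (the nc variant is the one I abandoned):

```yaml
# Port check I tried first — only verifies the port is open,
# which happens before PostgreSQL is actually ready:
#   test: ["CMD-SHELL", "nc -z localhost 5432"]
# What works — pg_isready performs a real connection handshake:
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres -d mydb"]
```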
MySQL Is Similar But Different
I run MySQL for one legacy project. The healthcheck uses mysqladmin ping instead:
db:
  image: mysql:8.0
  environment:
    MYSQL_ROOT_PASSWORD: password
    MYSQL_DATABASE: mydb
  healthcheck:
    test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-ppassword"]
    interval: 10s
    timeout: 5s
    retries: 3
    start_period: 60s
Note the -ppassword with no space. That’s how mysqladmin expects it.
The Password Environment Variable Trap
I wanted to avoid hardcoding the password, so I tried:
test: ["CMD-SHELL", "mysqladmin ping -u root -p$MYSQL_ROOT_PASSWORD"]
This failed because Docker Compose interpolates $MYSQL_ROOT_PASSWORD on my host machine before the container even starts. The healthcheck received whatever value was set in my host environment (empty, in my case), not the value from the container’s environment.
The fix is using $$:
test: ["CMD-SHELL", "mysqladmin ping -u root -p$$MYSQL_ROOT_PASSWORD"]
The double dollar sign tells Compose to leave it alone and let the container’s shell parse it.
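The behavior is easy to reproduce without Docker. In plain sh, double quotes expand a variable immediately (like a single $ in Compose, expanded on the host), while single quotes pass the literal $ through (like $$ in Compose). A rough analogy, not Compose itself:

```shell
#!/bin/sh
# Rough analogy in plain sh (no Docker needed) — not Compose itself.
export DB_PASS=host-secret

# Single $ in Compose: substituted on the host, like double quotes here.
early="mysqladmin ping -p$DB_PASS"
echo "$early"    # mysqladmin ping -phost-secret

# $$ in Compose: a literal $ survives, like single quotes here,
# so only the container's shell would expand it later.
late='mysqladmin ping -p$DB_PASS'
echo "$late"     # mysqladmin ping -p$DB_PASS
```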
Debugging Unhealthy Containers
When a container stays unhealthy, I check the health status directly:
docker inspect --format='{{json .State.Health}}' container_name | jq
This shows the actual healthcheck output, exit codes, and timestamps. Most of my issues were either:
- Wrong username or database name in the healthcheck command
- start_period too short
- Actual PostgreSQL startup failures (usually permission issues with mounted volumes)
What I Do Now
Every new Docker Compose stack I create includes healthchecks from the start. It takes 5 minutes to configure and eliminates an entire class of startup race conditions.
My standard template:
- Database containers: always have healthchecks with a 60s start_period
- Application containers: use condition: service_healthy in depends_on
- Migration containers: use condition: service_completed_successfully
I also set restart: true in the depends_on block so if the database container restarts, dependent containers restart too. This has saved me from subtle issues where the database restarted but the application kept a stale connection.
Key Takeaways
depends_on alone is not enough. It only controls startup order, not readiness. Use condition: service_healthy to wait for actual service availability.
start_period matters more than you think. Set it high enough to cover slow hardware and initialization scripts. 60 seconds is a safe default for databases.
Use the right tool for the check. PostgreSQL has pg_isready, MySQL has mysqladmin ping. Don’t reinvent these with shell scripts or port checks.
Test on your slowest hardware. A healthcheck that works on your development machine might fail on slower production hardware or when the system is under load.
The goal isn’t perfect startup orchestration—it’s predictable, reliable container initialization without manual intervention. Healthchecks get you there.