Why I Built This Workflow
I run n8n in a Docker container on my Proxmox setup, and like most self-hosted services, it needs regular updates. The problem is that cron-triggered containers don't always fail loudly. A scheduled job might stop running, a container might restart silently, or a webhook might never fire—and I wouldn't know until I checked manually.
I needed a way to confirm that my critical automation containers were actually doing their jobs, not just appearing "healthy" in Portainer. That's where healthchecks.io came in. It's a simple service that expects regular pings from your systems. If a ping doesn't arrive on schedule, it alerts you.
The challenge was connecting Portainer's container state with healthchecks.io in a way that caught silent failures, not just container crashes.
My Setup
Here's what I was working with:
- n8n: Running in Docker, handling various automation workflows
- Portainer: Managing my containers via its API
- healthchecks.io: Free tier, monitoring several scheduled tasks
- Cron-triggered containers: Services that run on schedules (backups, scrapers, data sync jobs)
The containers themselves reported as "running" in Portainer, but that didn't mean their scheduled tasks were actually executing. I needed to verify execution, not just uptime.
How the Workflow Works
I built an n8n workflow that does three things:
1. Polls Portainer's API
The workflow runs every 15 minutes and queries Portainer's API to check the state of specific containers. I'm looking for:
- Container status (running, stopped, restarting)
- Last restart time
- Exit codes from recent runs
The HTTP Request node hits Portainer's endpoint:
GET https://portainer.local/api/endpoints/1/docker/containers/json
I filter the response to only include containers tagged with a specific label I added: healthcheck.monitor=true. This keeps the workflow focused on services that matter.
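Inside n8n I do that filtering in a Code node right after the HTTP Request (which authenticates against Portainer with an API key in the X-API-Key header). This is a rough sketch rather than the exact node from my workflow; the field names come from the Docker Engine API that Portainer proxies, and note that the list endpoint only returns running containers unless you add all=true to the query string:

// Code node ("Run Once for All Items"): keep only containers labeled for monitoring
const containers = $input.all().flatMap(item =>
  Array.isArray(item.json) ? item.json : [item.json]
);

return containers
  .filter(c => c.Labels && c.Labels['healthcheck.monitor'] === 'true')
  .map(c => ({
    json: {
      id: c.Id,
      name: ((c.Names && c.Names[0]) || '').replace(/^\//, ''), // Docker prefixes names with "/"
      state: c.State,   // summary string: "running", "exited", ...
      status: c.Status, // e.g. "Up 3 hours"
    },
  }));

The list response only carries that summary string; the detailed State object used in the next step (ExitCode, Restarting, StartedAt) comes from the per-container inspect endpoint, /api/endpoints/1/docker/containers/{id}/json, so each container that survives the filter gets one more HTTP Request.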
2. Checks for Silent Failures
This is where it gets specific. A container can be "running" but still failing silently if:
- It restarted recently due to an error
- Its last exit code was non-zero
- It's been running for less than 5 minutes (indicating a crash loop)
I use an IF node with three conditions combined with OR, so any one of them flags the container:
{{ $json.State.ExitCode !== 0 }}
{{ $json.State.Restarting === true }}
{{ new Date($json.State.StartedAt).getTime() > Date.now() - 300000 }}
Docker reports StartedAt as an ISO timestamp, so it has to be parsed before it can be compared against the five-minute window (300000 ms).
If any of these are true, the container is flagged as unhealthy, even if Portainer shows it as "running."
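If you'd rather keep that logic in one place instead of chaining IF conditions, the same checks fit in a Code node. A minimal sketch, assuming each incoming item already carries the container's inspect State object (unhealthy is just the flag name I'm using here):

// Code node: flag containers whose inspect data points to a silent failure
const FIVE_MINUTES_MS = 5 * 60 * 1000;

return $input.all().map(item => {
  const state = item.json.State || {};
  const startedMs = new Date(state.StartedAt).getTime();

  const unhealthy =
    state.ExitCode !== 0 ||                   // last run ended with an error
    state.Restarting === true ||              // Docker is mid-restart
    Date.now() - startedMs < FIVE_MINUTES_MS; // just (re)started: possible crash loop

  return { json: { ...item.json, unhealthy } };
});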
3. Pings healthchecks.io
For each monitored container, I send a ping to its unique healthchecks.io URL:
- If the container is healthy: GET https://hc-ping.com/your-uuid
- If the container failed: GET https://hc-ping.com/your-uuid/fail
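Rather than two separate branches, a single HTTP Request node can build the right URL with one expression. A sketch, assuming the item carries the check's UUID as hcUuid (a field name I've made up here; store the UUID wherever suits your workflow) and an unhealthy boolean like the one set in step 2:

https://hc-ping.com/{{ $json.hcUuid }}{{ $json.unhealthy ? '/fail' : '' }}

healthchecks.io treats a plain ping as success and anything sent to /fail as a failure, so one node covers both outcomes.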
healthchecks.io expects these pings on a schedule. If I don't send one within the expected window, it alerts me via email or Telegram.
The key insight here: I'm not just pinging on success. I'm explicitly reporting failures, so healthchecks.io alerts me as soon as a run goes bad instead of waiting for the next expected ping to go missing. It ends up tracking both execution and outcome.
What Worked
This setup caught several issues I wouldn't have noticed otherwise:
- A backup container that was restarting every 10 minutes due to a misconfigured volume mount
- A scraper that appeared "running" but had crashed during its last cron execution
- A data sync job that silently failed because an API key expired
In all three cases, Portainer showed the containers as healthy. The workflow caught them because it checked execution state, not just container state.
The 15-minute polling interval works well for my use case. It's frequent enough to catch issues quickly but not so aggressive that it hammers the Portainer API.
What Didn't Work
My first attempt used Portainer's webhook feature to trigger the workflow on container events. The problem: Portainer only fires webhooks on state changes (start, stop, restart). It doesn't fire them when a container is already running but failing internally.
I also tried the Docker health checks that Portainer surfaces, but those require giving each container a healthcheck: a HEALTHCHECK instruction in the Dockerfile or a healthcheck block in the Compose file. That works, but it's not flexible enough for cron-based jobs that only run periodically.
Another issue: healthchecks.io's free tier has a limit on the number of checks. I had to be selective about which containers to monitor. I ended up prioritizing:
- Backup services
- Data sync jobs
- Critical scrapers
I'm not monitoring every container, just the ones where silent failure would actually matter.
Key Takeaways
- Container "health" is not the same as execution success. You need to check both.
- Polling Portainer's API is more reliable than relying on webhooks for cron-triggered services.
- healthchecks.io is simple but effective. The free tier is enough for most self-hosted setups.
- Label your containers in Docker Compose. It makes filtering in n8n much easier.
- Don't over-monitor. Focus on services where failure actually has consequences.
This workflow has been running for three months now. It's caught enough real issues that I trust it. I'm not checking Portainer manually anymore unless I get an alert.