Why I Built a Cron-Based UPS Monitor
I run a small homelab with Proxmox hosts, a Synology NAS, and a few other Linux systems. All of them sit behind a UPS connected to a central NUT server running in Docker. The theory was simple: when the battery runs low, each machine should shut down gracefully.
In practice, Synology's built-in UPS client would hang during shutdown. The system would enter some kind of "safe mode" and become completely unresponsive. The only way to recover was a hard power cycle, which is exactly what a UPS is supposed to prevent. Proxmox had similar issues with timing and state synchronization.
I needed something that would actually work, so I wrote a simple Bash script that runs every minute via cron. It checks the UPS status directly from the NUT server's REST API and shuts down the system if the battery is critically low. No daemons, no complex state machines, just a flag file and the standard shutdown command.
My Setup
I have a single UPS connected to a dedicated Linux VM that runs the NUT server inside Docker. This server exposes a REST API on port 5000 that provides the current UPS status.
Every other machine in my network runs the monitoring script as a cron job under root. The script queries the API every minute and makes a simple decision:
- If the UPS is on battery and low (OB LB), start a countdown
- If power returns before the countdown ends, cancel the shutdown
- If the countdown completes, execute /sbin/shutdown -h now
I set the countdown to 5 minutes because my UPS can hold the load for about 15 minutes, and I wanted to give short power blips a chance to resolve without triggering shutdowns.
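For reference, the crontab entry on each client is an ordinary every-minute root cron line, something like this (the install path is just an example; Synology needs a different location, as I'll get to later):

# Check the UPS every minute; log output for troubleshooting (path is illustrative)
* * * * * /usr/local/bin/ups_monitor.sh >> /var/log/ups_monitor.log 2>&1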
The State File
The script uses a flag file at /tmp/ups_shutdown_pending.flag to track whether a shutdown is pending. When the low battery condition is first detected, the script writes the current Unix timestamp into this file. On subsequent runs, it reads the timestamp and calculates how much time has elapsed.
If power is restored, the script simply deletes the flag file. No complicated cleanup, no leftover processes.
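A minimal sketch of that logic, assuming the NUT status has already been fetched into a UPS_STATUS variable (variable names here are illustrative rather than the exact ones in my script):

FLAG_FILE="/tmp/ups_shutdown_pending.flag"
CURRENT_TIME=$(date +%s)

if echo "$UPS_STATUS" | grep -q "OB LB"; then
    if [ ! -f "$FLAG_FILE" ]; then
        # First detection: record when the countdown started
        echo "$CURRENT_TIME" > "$FLAG_FILE"
    else
        # Countdown already running: check how much time has elapsed
        START_TIME=$(cat "$FLAG_FILE")
        if [ $(( CURRENT_TIME - START_TIME )) -ge $(( SHUTDOWN_DELAY_MINUTES * 60 )) ]; then
            /sbin/shutdown -h now
        fi
    fi
else
    # Power is back (or the battery is no longer low): cancel any pending shutdown
    rm -f "$FLAG_FILE"
fi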
Configuration
All settings live in a file called ups.env that sits next to the script. Mine looks like this:
API_SERVER_URI="http://192.168.1.50:5000"
API_TOKEN="my_secret_token"
SHUTDOWN_DELAY_MINUTES=5
The API token must match what's configured on the NUT server. The script uses it to authenticate when fetching UPS status.
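At startup the script simply sources ups.env and uses the values for the API call, roughly like this (a simplified sketch; the authentication header the API expects is left out):

# Load settings from ups.env sitting next to the script
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
. "$SCRIPT_DIR/ups.env"

# Fetch the current UPS data from the NUT server's REST API
UPS_JSON=$(curl -s "$API_SERVER_URI/upsc")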
What Worked
The cron-based approach turned out to be far more reliable than any built-in UPS client I've used. Running every minute provides enough granularity without creating noticeable load. The script itself takes less than a second to execute.
Using /sbin/shutdown -h now directly avoids all the complexity of NUT's native shutdown coordination. There's no waiting for other systems, no network timeouts, no dependency on daemon health. The operating system's own shutdown path is well-tested and predictable.
The flag file keeps state to a bare minimum: the script doesn't need to remember anything between runs except what's in that one file. If the script crashes or is killed, the next run picks up exactly where it left off by reading the timestamp.
Hub Mode
I later added what I call "hub mode" because managing ups.env files across multiple machines became tedious. In hub mode, the script fetches its configuration from the central NUT server's API instead of reading it from the local file.
The server returns the shutdown delay and any other settings as JSON. The script parses this with jq and caches it back to the local ups.env file. If the API is unreachable, the script falls back to the cached values.
This means I can change the shutdown delay for all clients by updating a single config file on the server, and each client will pick up the new value on its next run.
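In outline, the hub-mode fetch looks something like this (the endpoint path and JSON field name are illustrative; the real API differs in the details):

# Hub mode: ask the central server for this host's settings
CONFIG_JSON=$(curl -s --fail "$API_SERVER_URI/config/$(hostname)")

if [ -n "$CONFIG_JSON" ]; then
    SHUTDOWN_DELAY_MINUTES=$(echo "$CONFIG_JSON" | jq -r '.shutdown_delay_minutes')
    # Cache the fetched value back into ups.env for runs where the API is unreachable
    sed -i "s/^SHUTDOWN_DELAY_MINUTES=.*/SHUTDOWN_DELAY_MINUTES=$SHUTDOWN_DELAY_MINUTES/" "$(dirname "$0")/ups.env"
fi
# If the fetch failed, the values already sourced from ups.env stay in effect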
Status Reporting
The script also reports its current state back to the server API on every run. It sends either online or shutdown_pending along with a timestamp. This gives me a live dashboard showing which machines are running normally and which are in countdown mode.
The reporting happens after the main logic, so even if the API call fails, the shutdown decision is unaffected.
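The report itself is just one more curl call at the end of each run; the endpoint name and payload shape shown here are illustrative:

# Report current state to the server; a failure here never affects the shutdown decision
FLAG_FILE="/tmp/ups_shutdown_pending.flag"
if [ -f "$FLAG_FILE" ]; then STATE="shutdown_pending"; else STATE="online"; fi
curl -s -X POST -H "Content-Type: application/json" \
    -d "{\"host\":\"$(hostname)\",\"state\":\"$STATE\",\"timestamp\":$(date +%s)}" \
    "$API_SERVER_URI/report" || true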
What Didn't Work
Initial API Design
My first version of the API returned the raw output of upsc as plain text. I thought this would be simpler than JSON, but parsing it reliably in Bash turned out to be messy. I switched to a proper JSON response and the code became much cleaner.
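With JSON, pulling a value out is a one-liner with jq instead of a pile of grep and awk. The key name follows NUT's own variable naming:

# Extract the NUT status string, e.g. "OL", "OB DISCHRG" or "OB LB"
UPS_STATUS=$(curl -s "$API_SERVER_URI/upsc" | jq -r '.["ups.status"]')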
Shutdown Coordination
I initially tried to implement a feature where the script would check if other machines were still online before shutting down. The idea was to avoid shutting down the NUT server before its clients.
This added a lot of complexity and created race conditions. If two machines checked each other's status at the same time, both might delay shutdown indefinitely. I removed this logic and instead rely on setting different shutdown delays for different machines. The NUT server has a 10-minute delay while clients have 5 minutes.
Synology Paths
Synology DSM doesn't have an /opt directory, and scripts placed in system directories get wiped during updates. I had to document that Synology users should put the script on a data volume like /volume1/scripts.
I also learned that Synology's cron doesn't use the standard crontab command. You have to configure scheduled tasks through the web UI under Task Scheduler, which is fine but not obvious.
Alerting with ntfy
I wanted push notifications when a shutdown was triggered, so I added ntfy integration. The script sends a critical priority message to my ntfy topic whenever it detects low battery or cancels a pending shutdown.
The relevant code looks like this:
if [ -n "$NTFY_TOPIC" ]; then
curl -H "Priority: urgent" -d "UPS low battery detected on $(hostname)" "$NTFY_TOPIC"
fi
I set NTFY_TOPIC in ups.env to point to my self-hosted ntfy server. The priority is set to urgent so the notification bypasses Do Not Disturb on my phone.
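The corresponding line in ups.env is just the full topic URL (the hostname here is made up):

NTFY_TOPIC="https://ntfy.example.lan/ups-alerts"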
This has been useful a few times when the power went out while I was away from home. I got the alert, checked the UPS dashboard remotely, and saw that everything was shutting down cleanly.
Alert Fatigue
One issue I ran into was getting duplicate alerts. If the script runs every minute and the shutdown countdown is 5 minutes, I'd get 5 identical notifications.
I fixed this by only sending the alert once when the flag file is first created. The script checks if the file already exists before sending the notification:
if [ ! -f "$FLAG_FILE" ]; then
echo "$CURRENT_TIME" > "$FLAG_FILE"
# Send alert here
fi
Now I get exactly one alert when the countdown starts and another when power is restored.
Battery Health Monitoring
The script only cares about OB LB status, which is the critical "shut down now" signal. But I also wanted to track battery health over time to know when the UPS itself was failing.
NUT exposes a battery.charge value that shows the current charge percentage. I added a separate cron job that logs this value to a file once per hour:
0 * * * * curl -s http://192.168.1.50:5000/upsc | jq -r '.["battery.charge"]' >> /var/log/ups_battery.log
I then set up a simple script to parse this log and alert me if the charge drops below 90% while on mains power. This would indicate the battery is no longer holding a full charge.
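The check itself is only a few lines: compare the most recent logged value against the threshold while the UPS reports it is on mains power. A sketch, reusing the same API endpoint and ntfy topic as the main script (the ups.env path is illustrative):

#!/bin/bash
# Battery health check: alert if the latest logged charge is below 90% while on mains (OL)
. /usr/local/bin/ups.env   # illustrative path; same file the main script uses

STATUS=$(curl -s "$API_SERVER_URI/upsc" | jq -r '.["ups.status"]')
LATEST_CHARGE=$(tail -n 1 /var/log/ups_battery.log)

if echo "$STATUS" | grep -qw "OL" && [ -n "$LATEST_CHARGE" ] && [ "${LATEST_CHARGE%.*}" -lt 90 ]; then
    curl -s -d "UPS battery only at ${LATEST_CHARGE}% on mains power ($(hostname))" "$NTFY_TOPIC"
fi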
This approach is basic but effective. I could feed the data into Grafana or something similar, but for now a text log and a threshold check is enough.
Key Takeaways
- Cron-based monitoring is more reliable than daemon-based clients for simple tasks like UPS shutdown
- Using shutdown -h now directly avoids the complexity and bugs in vendor-specific shutdown logic
- A single flag file with a timestamp is sufficient for tracking shutdown state
- Centralizing configuration through an API reduces maintenance when managing multiple machines
- Fallback to cached config ensures the system stays protected even if the API server is down
- ntfy provides a simple way to get critical alerts without setting up email or SMS
- Logging battery charge separately from shutdown logic helps identify failing batteries before they cause problems
The entire script is about 150 lines of Bash. It has no dependencies beyond curl, jq, and the standard Unix tools. It runs on Proxmox, Synology, Debian, and Ubuntu without modification.
I've been using this setup for over a year now across five machines. It has triggered real shutdowns during two extended power outages and has never failed to either shut down cleanly or cancel properly when power returned.