Setting up automated model updates for Ollama using systemd timers and custom bash scripts to pull quantized GGUF files from Hugging Face

Why I Built This System

I run Ollama on my home server to handle various AI tasks—document analysis, code review, automated responses. The problem wasn't Ollama itself, which works reliably. The problem was keeping models current without manual intervention.

Every few weeks, new quantized GGUF models appear on Hugging Face that offer better performance or reduced memory usage. Manually checking for updates, downloading files, and updating Ollama felt like wasted effort. I needed automation that ran on its own schedule and handled the entire update cycle.

I also wanted control over which models I used. Ollama's built-in pull system works fine for standard models, but I prefer specific quantizations from Hugging Face—often the Q4_K_M or Q5_K_M variants that balance quality and resource usage for my hardware.

My Actual Setup

I run Ollama in a Docker container on Proxmox, but the automation scripts run directly on the Proxmox host. This separation keeps the update logic independent of the container lifecycle.
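A minimal sketch of such a container start, assuming the staging directory used by the scripts below is bind-mounted into the container so the import step can reference it (the flags here are illustrative defaults for the ollama/ollama image, not the exact command from my setup):

docker run -d --name ollama \
  -v ollama-data:/root/.ollama \
  -v /opt/ollama-updates/staging:/opt/ollama-updates/staging \
  -p 11434:11434 \
  ollama/ollama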

The setup uses:

  • A bash script that downloads GGUF files from Hugging Face using their API
  • Another script that converts downloaded GGUFs into Ollama modelfiles and loads them
  • A systemd timer that triggers these scripts on a schedule I control
  • Simple logging to track what succeeded or failed

I store everything in /opt/ollama-updates/ on the host. Downloaded models go into a staging directory that is also mounted into the Ollama container, and the import step only touches them after the download has completed successfully.
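Assuming the paths used in the scripts below, the whole layout can be created in one line:

mkdir -p /opt/ollama-updates/{scripts,staging,logs}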

The Download Script

The first script handles downloading from Hugging Face. I wrote it to pull specific model files by repository and filename, not entire repositories.

#!/bin/bash
# /opt/ollama-updates/scripts/download-model.sh

MODEL_REPO="$1"
MODEL_FILE="$2"
STAGING_DIR="/opt/ollama-updates/staging"
LOG_FILE="/opt/ollama-updates/logs/download.log"

mkdir -p "$STAGING_DIR"
mkdir -p "$(dirname "$LOG_FILE")"

echo "[$(date)] Starting download: $MODEL_REPO/$MODEL_FILE" >> "$LOG_FILE"

# Download from the Hugging Face resolve URL; -c resumes interrupted transfers
wget -q --show-progress -c \
  -P "$STAGING_DIR" \
  "https://huggingface.co/$MODEL_REPO/resolve/main/$MODEL_FILE"

if [ $? -eq 0 ]; then
  echo "[$(date)] Download complete: $MODEL_FILE" >> "$LOG_FILE"
  exit 0
else
  echo "[$(date)] Download failed: $MODEL_FILE" >> "$LOG_FILE"
  exit 1
fi

This script takes two arguments: the repository path and the specific GGUF filename. I don't download entire repositories, just the quantization I want. If the download fails, the script exits with a non-zero status so the orchestration script can skip the import for that model.
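For example, pulling the Q4_K_M Mistral build listed later in the orchestration script looks like this:

/opt/ollama-updates/scripts/download-model.sh \
  TheBloke/Mistral-7B-Instruct-v0.2-GGUF \
  mistral-7b-instruct-v0.2.Q4_K_M.gguf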

I use wget instead of curl because its -c flag resumes an interrupted download instead of starting over. Hugging Face files can be large, and my connection occasionally drops.

The Model Import Script

After downloading, a second script converts the GGUF file into an Ollama model and imports it.

#!/bin/bash
# /opt/ollama-updates/scripts/import-model.sh

MODEL_FILE="$1"
MODEL_NAME="$2"
STAGING_DIR="/opt/ollama-updates/staging"
LOG_FILE="/opt/ollama-updates/logs/import.log"

if [ ! -f "$STAGING_DIR/$MODEL_FILE" ]; then
  echo "[$(date)] Model file not found: $MODEL_FILE" >> "$LOG_FILE"
  exit 1
fi

echo "[$(date)] Importing model: $MODEL_NAME from $MODEL_FILE" >> "$LOG_FILE"

# Create a minimal Modelfile pointing at the downloaded GGUF
cat > "$STAGING_DIR/Modelfile" << EOF
FROM $STAGING_DIR/$MODEL_FILE
EOF

# Import into Ollama. The command runs inside the container, so the staging
# directory must be bind-mounted into it at the same path for the FROM line
# and the -f argument to resolve.
docker exec ollama ollama create "$MODEL_NAME" -f "$STAGING_DIR/Modelfile"

if [ $? -eq 0 ]; then
  echo "[$(date)] Import successful: $MODEL_NAME" >> "$LOG_FILE"
  rm "$STAGING_DIR/$MODEL_FILE"
  rm "$STAGING_DIR/Modelfile"
  exit 0
else
  echo "[$(date)] Import failed: $MODEL_NAME" >> "$LOG_FILE"
  exit 1
fi

This script creates a minimal Modelfile pointing to the downloaded GGUF, then uses Ollama's create command to import it. The docker exec call runs the command inside the Ollama container, which is why the staging directory has to be visible inside the container at the same path: both the FROM line and the -f argument are resolved against the container's filesystem.

After successful import, I delete the staging files to save disk space. If import fails, I leave the files in place so I can debug manually.
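Run by hand, an import looks like this (filename and model name taken from the orchestration script below), and ollama list inside the container confirms the model registered:

/opt/ollama-updates/scripts/import-model.sh \
  mistral-7b-instruct-v0.2.Q4_K_M.gguf \
  mistral-7b-instruct

docker exec ollama ollama list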

The Orchestration Script

A third script ties everything together and defines which models to update.

#!/bin/bash
# /opt/ollama-updates/scripts/update-models.sh

LOG_FILE="/opt/ollama-updates/logs/update.log"
SCRIPT_DIR="/opt/ollama-updates/scripts"

echo "[$(date)] Starting model update cycle" >> "$LOG_FILE"

# Define models to update
declare -A MODELS=(
  ["mistral-7b-instruct"]="TheBloke/Mistral-7B-Instruct-v0.2-GGUF:mistral-7b-instruct-v0.2.Q4_K_M.gguf"
  ["codellama-13b"]="TheBloke/CodeLlama-13B-Instruct-GGUF:codellama-13b-instruct.Q5_K_M.gguf"
)

for MODEL_NAME in "${!MODELS[@]}"; do
  IFS=':' read -r REPO FILE <<< "${MODELS[$MODEL_NAME]}"
  
  echo "[$(date)] Processing: $MODEL_NAME" >> "$LOG_FILE"
  
  # Download
  "$SCRIPT_DIR/download-model.sh" "$REPO" "$FILE"
  if [ $? -ne 0 ]; then
    echo "[$(date)] Skipping import due to download failure" >> "$LOG_FILE"
    continue
  fi
  
  # Import
  "$SCRIPT_DIR/import-model.sh" "$FILE" "$MODEL_NAME"
done

echo "[$(date)] Update cycle complete" >> "$LOG_FILE"

I use a bash associative array to map model names to their Hugging Face locations. This makes it easy to add or remove models without changing the core logic.

The script runs each download and import in sequence. If a download fails, it skips the import for that model but continues with the rest. This prevents one broken model from blocking updates to others.
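Running a cycle manually and tailing the log is the quickest way to test the whole chain before handing it to the timer:

/opt/ollama-updates/scripts/update-models.sh
tail -f /opt/ollama-updates/logs/update.log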

The Systemd Timer

I created a systemd service and timer to run the orchestration script weekly.

# /etc/systemd/system/ollama-update.service
[Unit]
Description=Ollama Model Update Service
After=network-online.target docker.service
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/opt/ollama-updates/scripts/update-models.sh
User=root
StandardOutput=journal
StandardError=journal

# /etc/systemd/system/ollama-update.timer
[Unit]
Description=Weekly Ollama Model Update Timer
Requires=ollama-update.service

[Timer]
OnCalendar=Sun 03:00
Persistent=true

[Install]
WantedBy=timers.target

The timer runs every Sunday at 3 AM. I chose that time because server load is lowest then, and I can check logs on Sunday morning if something breaks.

The Persistent=true setting ensures that if the server is down during the scheduled time, the update runs as soon as it boots back up.
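systemd-analyze can normalize an OnCalendar expression and show the next trigger time, which is a quick way to sanity-check the schedule:

systemd-analyze calendar "Sun 03:00"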

To enable the timer:

systemctl daemon-reload
systemctl enable ollama-update.timer
systemctl start ollama-update.timer
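Once enabled, the timer shows up in the scheduler list, and each run's output lands in the journal:

systemctl list-timers ollama-update.timer
journalctl -u ollama-update.service -n 50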

What Worked

The system has run reliably for several months. Models update automatically, and I only check logs when I remember to look.

Using separate scripts for download and import made debugging easier. When a model failed to import, I could run just the import script manually without re-downloading.

Logging everything to plain text files worked better than I expected. I can grep for errors or check the last few runs with tail. No need for complex logging infrastructure.
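In practice, reviewing the logs comes down to a couple of commands against /opt/ollama-updates/logs/:

grep -i fail /opt/ollama-updates/logs/*.log
tail -n 20 /opt/ollama-updates/logs/update.log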

The associative array approach for defining models keeps the configuration readable. Adding a new model takes one line.

What Didn't Work

My first version tried to check Hugging Face for new model versions before downloading. I wrote code to compare timestamps and only pull updates if files changed. This added complexity and broke frequently when Hugging Face's API responses changed format.

I removed the version checking. Now the script just downloads and imports every time. If the model hasn't changed, Ollama's content-addressed storage recognizes the identical layers and reuses them rather than storing a second copy. Simpler and more reliable.

I initially ran the scripts as a non-root user, but Docker permissions caused problems. The user couldn't execute commands inside the Ollama container without adding it to the docker group, which felt like a security risk. Running as root was simpler and matched how I manage other Proxmox automation.

Error handling took several iterations. Early versions would fail silently if a download stalled. I added explicit exit codes and log messages to make failures visible.

Key Takeaways

Systemd timers work well for scheduled tasks that don't need complex orchestration. They're built into the system and don't require additional dependencies.

Keeping scripts focused on single tasks—download, import, orchestrate—made the system easier to maintain. When something breaks, I know which script to check.

Downloading specific GGUF files instead of entire model repositories saves bandwidth and disk space. I only get what I need.

Plain text logs are sufficient for this kind of automation. I don't need structured logging or dashboards for a task that runs once a week.

The system isn't perfect. If Hugging Face changes their file structure or Ollama updates its import process, I'll need to adjust the scripts. But for now, it handles model updates without requiring my attention, which was the goal.