Building Automated Container Security Scanning Pipelines: Integrating Trivy with Portainer Webhooks for Pre-Deployment Vulnerability Blocking

Why I Built This

I run a Proxmox cluster with multiple Docker hosts managed through Portainer. Over time, I’ve deployed dozens of containers—some from trusted sources, others from community repositories I barely vetted. The problem hit me when a security advisory came out for a base image I’d been using in production for months. I had no systematic way to know what vulnerabilities existed in my running containers until something broke or made the news.

I needed a way to scan images before they went live, not after. Manual scanning before every deployment wasn’t realistic—I update containers frequently, test new tools, and run automated workflows through n8n. The solution had to integrate with my existing Portainer setup and block deployments automatically if critical vulnerabilities were found.

My Setup

I’m using:

  • Portainer Business (self-hosted) managing 4 Docker environments across my Proxmox cluster
  • Trivy installed on a dedicated Debian 12 VM with 4GB RAM
  • n8n for webhook handling and orchestration
  • A simple Flask API I wrote to act as the scanning gateway
  • PostgreSQL database to track scan results over time

Portainer has webhook functionality that fires on specific events. I configured webhooks to trigger before container creation and updates. The webhook payload includes image details, which I extract and pass to Trivy for scanning.

How I Built the Pipeline

Installing and Configuring Trivy

I installed Trivy directly on the VM rather than running it in a container. This gave me more control over caching and database updates:

wget https://github.com/aquasecurity/trivy/releases/download/v0.48.3/trivy_0.48.3_Linux-64bit.deb
sudo dpkg -i trivy_0.48.3_Linux-64bit.deb

I configured Trivy to use a local cache directory and set up a daily cron job to update the vulnerability database:

0 2 * * * /usr/local/bin/trivy image --download-db-only

This runs at 2 AM and keeps the database current without blocking scans during the day.

Building the Scanning API

I wrote a lightweight Flask API to receive webhook calls from Portainer and execute Trivy scans. The API does four things:

  1. Receives the webhook payload
  2. Extracts the image name and tag
  3. Runs Trivy with specific severity filters
  4. Returns a pass/fail response based on findings

Here’s the core scanning function I use:

import subprocess
import json

def scan_image(image_name):
    """Run Trivy against an image and count CRITICAL/HIGH findings."""
    cmd = [
        'trivy', 'image',
        '--severity', 'CRITICAL,HIGH',
        '--format', 'json',
        '--quiet',
        image_name
    ]

    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Treat scanner errors as failures rather than silent passes.
        raise RuntimeError(f"Trivy scan failed: {result.stderr.strip()}")
    scan_data = json.loads(result.stdout)

    critical_count = 0
    high_count = 0

    # Trivy groups findings per target (OS packages, language packages, ...).
    # 'Vulnerabilities' can be null for a clean target, hence the 'or []'.
    for target in scan_data.get('Results', []):
        for vuln in target.get('Vulnerabilities') or []:
            if vuln.get('Severity') == 'CRITICAL':
                critical_count += 1
            elif vuln.get('Severity') == 'HIGH':
                high_count += 1

    return {
        'critical': critical_count,
        'high': high_count,
        'passed': critical_count == 0
    }

I initially blocked on any HIGH severity findings, but that was too aggressive. Many images have HIGH vulnerabilities that aren’t actually exploitable in my use case. I settled on blocking only CRITICAL vulnerabilities and logging HIGH ones for manual review.
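As a minimal sketch of that policy (the function name and return shape are my illustration here, not the exact code from my API):

```python
def enforce_policy(scan: dict) -> dict:
    """Turn scan_image()'s summary dict into a deployment decision.

    Policy: any CRITICAL finding blocks the deployment; HIGH findings
    pass but are flagged for manual review.
    """
    return {
        "block": scan["critical"] > 0,
        "needs_review": scan["high"] > 0,
    }
```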

Configuring Portainer Webhooks

Portainer’s webhook system is straightforward but limited. I set up webhooks at the environment level, not per-stack. This means every container creation or update in that environment triggers the webhook.

The webhook configuration in Portainer:

  • Event: Container creation and container update
  • Endpoint URL: My Flask API endpoint
  • Method: POST

The payload Portainer sends includes the image name, but not always in a consistent format. I had to handle variations like:

  • nginx:latest
  • docker.io/library/nginx:latest
  • registry.example.com/custom-image:v1.2.3

My API normalizes these before passing them to Trivy.
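A sketch of that normalization logic (the heuristics below are my reconstruction of the idea, not the exact code from my API):

```python
def normalize_image(raw: str) -> str:
    """Normalize an image reference to registry/repo:tag form.

    Heuristic: the first path segment is a registry host only if it
    contains a dot, a port colon, or is "localhost" (Docker's own rule).
    """
    name = raw.strip()
    first = name.split("/", 1)[0]
    has_registry = "/" in name and (
        "." in first or ":" in first or first == "localhost"
    )
    if not has_registry:
        # Bare official images like "nginx" live under docker.io/library/.
        if "/" not in name:
            name = "docker.io/library/" + name
        else:
            name = "docker.io/" + name
    # Default the tag to :latest; check only the last path segment so a
    # registry port (host:5000/img) isn't mistaken for a tag.
    last = name.rsplit("/", 1)[-1]
    if ":" not in last and "@" not in last:
        name += ":latest"
    return name
```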

Blocking Deployments

This is where I hit a limitation: Portainer webhooks are notification-only. They fire after an action starts, not before it completes. I cannot actually prevent a container from starting through the webhook alone.

My workaround:

  1. The webhook fires when a container is created
  2. My API scans the image
  3. If the scan fails (CRITICAL vulnerabilities found), the API calls Portainer’s REST API to immediately stop and remove the container
  4. I log the event and send a notification through n8n
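Step 3 can be sketched like this (the base URL, API key, and helper names are placeholders; the paths rely on Portainer proxying the Docker Engine API under /api/endpoints/{id}/docker/):

```python
import urllib.request

PORTAINER_URL = "https://portainer.example.com:9443"  # placeholder
API_KEY = "ptr_REDACTED"                              # placeholder

def portainer_request(method: str, path: str) -> urllib.request.Request:
    # Portainer authenticates REST calls via the X-API-Key header.
    return urllib.request.Request(
        PORTAINER_URL + path,
        method=method,
        headers={"X-API-Key": API_KEY},
    )

def kill_container(endpoint_id: int, container_id: str):
    """Build the stop + force-remove requests for a failed container.

    Because Portainer proxies the Docker Engine API, the paths mirror
    Docker's own /containers endpoints. The real handler would send
    each request with urllib.request.urlopen().
    """
    base = f"/api/endpoints/{endpoint_id}/docker/containers/{container_id}"
    stop = portainer_request("POST", base + "/stop")
    remove = portainer_request("DELETE", base + "?force=true")
    return stop, remove
```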

This means there’s a brief window (usually 2-5 seconds) where a vulnerable container is technically running. For my use case, this is acceptable because these containers aren’t exposed externally during startup, and my network is segmented.

Handling Private Registries

I use a private registry for some custom images. Trivy needs credentials to scan these. I configured Trivy to use Docker’s credential store:

docker login registry.example.com

Trivy automatically picks up the credentials from ~/.docker/config.json. This works, but I had to ensure the user running the Flask API has access to this file.
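If sharing ~/.docker/config.json with the API's service user is awkward, Trivy also reads credentials from the TRIVY_USERNAME and TRIVY_PASSWORD environment variables. A sketch (the helper name is mine):

```python
import os

def private_scan_cmd(image: str, username: str, password: str):
    """Build the command and environment for scanning a private-registry
    image via Trivy's TRIVY_USERNAME / TRIVY_PASSWORD variables, instead
    of relying on the Docker credential store."""
    env = dict(os.environ, TRIVY_USERNAME=username, TRIVY_PASSWORD=password)
    cmd = [
        "trivy", "image",
        "--severity", "CRITICAL,HIGH",
        "--format", "json",
        "--quiet",
        image,
    ]
    # Pass env to subprocess.run(cmd, env=env, ...) in the real scanner.
    return cmd, env
```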

What Didn’t Work

Scanning Large Images

Some of my images are 2-3GB. Trivy takes 30-60 seconds to scan these, which causes webhook timeouts. Portainer expects a response within 10 seconds.

I solved this by making the API respond immediately with “scan in progress” and performing the actual scan asynchronously. If the scan fails, I stop the container afterward. This isn’t perfect, but it’s the best I could do without modifying Portainer’s timeout settings (which aren’t exposed in the UI).

False Positives

Trivy reports vulnerabilities based on package versions, not actual exploitability. I’ve had scans flag vulnerabilities in base OS packages that are never executed by my application.

I added a suppression list in my API where I can manually mark specific CVEs as acceptable for certain images. This is not ideal—it requires manual curation—but it’s better than ignoring all HIGH severity findings.
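The suppression check itself is simple; a sketch (the image and CVE entries below are made-up examples, not real suppression decisions):

```python
# Per-image sets of CVE IDs I've reviewed and accepted. Example data only.
SUPPRESSED = {
    "docker.io/library/nginx:latest": {"CVE-2023-00000"},
}

def filter_vulns(image: str, vulns: list) -> list:
    """Drop suppressed findings for this image before counting severities."""
    accepted = SUPPRESSED.get(image, set())
    return [v for v in vulns if v.get("VulnerabilityID") not in accepted]
```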

Database Bloat

I log every scan result to PostgreSQL for historical tracking. After a few weeks, the database grew to several GB because I was storing full JSON scan outputs. I switched to storing only summary data (counts by severity) and now keep full results for failed scans only.

Performance Considerations

Trivy caches image layers, which significantly speeds up repeat scans. The first scan of an image takes 15-30 seconds. Subsequent scans of the same image (even with updated vulnerability data) take 2-5 seconds.

I allocated 4GB RAM to the scanning VM. Trivy can use more during scans of large images, but 4GB has been sufficient for my workload. CPU usage spikes briefly during scans but averages under 10%.

Integration with n8n

When a scan fails, my API sends a webhook to n8n, which:

  • Logs the failure to a dedicated Slack channel
  • Creates a ticket in my self-hosted Vikunja instance
  • Sends me a Pushover notification if it’s a production environment

This gives me visibility into what’s being blocked without having to constantly check logs.
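The payload my API sends to n8n is ordinary JSON; a sketch of how I shape it (field names are illustrative, since n8n's webhook node accepts any body):

```python
def build_notification(image: str, critical: int, high: int, environment: str) -> dict:
    """Shape the alert body POSTed to the n8n webhook.

    n8n branches on these fields: every failure goes to Slack and
    Vikunja, and "urgent" additionally triggers a Pushover push for
    production environments.
    """
    return {
        "image": image,
        "critical": critical,
        "high": high,
        "environment": environment,
        "urgent": environment == "production",
    }
```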

Limitations I’m Living With

This setup is not a true pre-deployment gate. There’s always a brief moment where a vulnerable container starts before being killed. For truly critical environments, this wouldn’t be acceptable. I’d need to implement scanning at the CI/CD level or use Portainer’s GitOps features with a custom admission controller.

I also don’t scan running containers continuously. Trivy can do this, but I haven’t integrated it yet. My pipeline only scans at deployment time. If a new vulnerability is discovered in an image I deployed last month, I won’t know unless I manually trigger a rescan or redeploy.

The suppression list requires manual maintenance. Every time I suppress a CVE, I document why in a separate markdown file, but this is manual work that doesn’t scale well.

What I Learned

Portainer’s webhook system is useful but limited. It’s designed for notifications, not enforcement. If I were building this again, I’d look into Portainer’s API-first deployment approach or move to a GitOps model where scanning happens before the deployment request even reaches Portainer.

Trivy is fast and accurate, but interpreting its results requires context. Not every vulnerability matters in every deployment. I spent more time building the suppression and filtering logic than I did integrating Trivy itself.

The biggest value isn’t in blocking every vulnerable image—it’s in having visibility. Before this pipeline, I had no idea what was running in my containers. Now I have a record of every image deployed, its vulnerabilities at deployment time, and a process to review findings before they go live.

This system has caught several images with known RCE vulnerabilities that I would have deployed without thinking twice. It’s not perfect, but it’s significantly better than deploying blind.
