
Monitoring n8n Workflow Execution Costs: Tracking Self-Hosted Resource Usage vs SaaS API Pricing with Prometheus and Grafana

Why I Started Tracking n8n Costs

I run n8n self-hosted on Proxmox. Most of my workflows call external APIs—OpenAI, Anthropic, Perplexity, sometimes Google's models. Every execution costs money, but I had no visibility into how much.

The problem wasn't the total monthly bill from OpenAI or Anthropic. I could see that in their dashboards. The problem was understanding which workflows were expensive, which nodes burned through tokens, and whether self-hosting was actually cheaper than using n8n Cloud with built-in AI credits.

I needed to answer:

  • What does each workflow execution actually cost me in API calls?
  • Are certain nodes wasteful with tokens?
  • Would n8n Cloud's pricing be cheaper than my infrastructure + API costs?
  • How do I catch runaway costs before they show up on the bill?

n8n doesn't expose this data natively. You can see execution counts, but not token usage or API costs per workflow. So I built my own monitoring setup using Prometheus and Grafana.

My Setup: Self-Hosted n8n on Proxmox

I run n8n in a Docker container on Proxmox. It connects to a PostgreSQL database (also containerized) and uses Redis for queue mode. The whole stack sits on a dedicated VM with 4 CPU cores and 8GB RAM.

My workflows primarily use:

  • OpenAI nodes for GPT-4 and GPT-3.5-turbo
  • Anthropic nodes for Claude models
  • HTTP Request nodes to call Perplexity and other APIs
  • Code nodes for custom token counting when needed

Most workflows run on webhooks or schedules. Some are triggered by other workflows. Token usage varies wildly—a simple summarization might use 500 tokens, while a complex analysis can hit 10,000+.

What I Actually Wanted to Track

I didn't need perfect accounting. I needed useful visibility:

  • Cost per workflow execution (total API spend)
  • Token usage by node (which steps are expensive)
  • Model-specific costs (GPT-4 vs Claude vs cheaper alternatives)
  • Daily/weekly trends (catch unexpected spikes)
  • Self-hosted vs Cloud comparison (infrastructure + API costs vs n8n's all-in pricing)

The goal wasn't to optimize every penny. It was to avoid surprises and make informed decisions about which models to use where.

How I Built the Monitoring System

Step 1: Extracting Token Data from n8n

n8n doesn't expose token usage in its execution logs by default. But many AI nodes return token counts in their output data. For example, OpenAI nodes include:

{
  "usage": {
    "prompt_tokens": 245,
    "completion_tokens": 892,
    "total_tokens": 1137
  }
}

I created a dedicated monitoring workflow that:

  1. Gets triggered at the end of each AI-heavy workflow
  2. Receives the execution ID as input
  3. Uses n8n's API to fetch full execution data
  4. Extracts token counts from AI node outputs
  5. Calculates costs based on model pricing
  6. Writes metrics to a PostgreSQL table

This required enabling "Return Intermediate Steps" in agent configurations so token data wasn't stripped from the execution payload.
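Here's a condensed sketch of the fetch-and-extract step as plain Node.js (not my exact workflow code). The /api/v1/executions/{id} endpoint, the includeData parameter, and the X-N8N-API-KEY header come from n8n's public API; the recursive walk is a simplification, and N8N_URL is a placeholder for your own instance:

// Condensed sketch: fetch one execution via n8n's public API and sum
// token usage. N8N_URL and N8N_API_KEY are placeholders.
const N8N_URL = 'http://n8n.internal:5678';
const N8N_API_KEY = process.env.N8N_API_KEY;

async function getTokenUsage(executionId) {
  // includeData=true makes the API return full node output payloads
  const res = await fetch(
    `${N8N_URL}/api/v1/executions/${executionId}?includeData=true`,
    { headers: { 'X-N8N-API-KEY': N8N_API_KEY } }
  );
  const execution = await res.json();

  // Collect every OpenAI-style "usage" object anywhere in the payload
  const usages = [];
  (function walk(value) {
    if (value === null || typeof value !== 'object') return;
    if (typeof value.prompt_tokens === 'number') usages.push(value);
    Object.values(value).forEach(walk);
  })(execution);

  return usages.reduce(
    (sum, u) => ({
      prompt: sum.prompt + u.prompt_tokens,
      completion: sum.completion + (u.completion_tokens ?? 0),
    }),
    { prompt: 0, completion: 0 }
  );
}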

Step 2: Calculating Costs

I maintain a simple pricing table in the monitoring workflow:

const MODEL_PRICING = {
  'gpt-4': { input: 0.03, output: 0.06 },  // per 1K tokens
  'gpt-3.5-turbo': { input: 0.0015, output: 0.002 },
  'claude-3-opus': { input: 0.015, output: 0.075 },
  'claude-3-sonnet': { input: 0.003, output: 0.015 }
};

For each AI node execution, I calculate:

const promptCost = (promptTokens / 1000) * MODEL_PRICING[model].input;
const completionCost = (completionTokens / 1000) * MODEL_PRICING[model].output;
const totalCost = promptCost + completionCost;

This gets stored with workflow name, node name, timestamp, and model used.
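The write itself is a plain INSERT. A sketch with the pg library, where the workflow_costs table and its columns are illustrative stand-ins for the fields above:

// Sketch: persist one cost record. The workflow_costs table and its
// columns are illustrative; adapt to whatever schema you use.
const { Pool } = require('pg');
const pool = new Pool(); // reads PG* environment variables

async function recordCost(row) {
  await pool.query(
    `INSERT INTO workflow_costs
       (workflow_name, node_name, model, prompt_tokens, completion_tokens, cost_usd, recorded_at)
     VALUES ($1, $2, $3, $4, $5, $6, NOW())`,
    [row.workflow, row.node, row.model, row.promptTokens, row.completionTokens, row.totalCost]
  );
}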

Step 3: Exposing Metrics to Prometheus

I run a small Node.js service (also containerized) that:

  1. Queries the cost tracking table every 60 seconds
  2. Exposes metrics in Prometheus format on /metrics
  3. Runs on port 9090 inside my Proxmox network

The metrics include:

# HELP n8n_workflow_cost_usd Total cost in USD per workflow
# TYPE n8n_workflow_cost_usd gauge
n8n_workflow_cost_usd{workflow="content_analyzer",model="gpt-4"} 0.42

# HELP n8n_token_usage Total tokens used
# TYPE n8n_token_usage counter
n8n_token_usage{workflow="content_analyzer",type="prompt"} 8420
n8n_token_usage{workflow="content_analyzer",type="completion"} 3210

Prometheus scrapes this endpoint every minute. I didn't use the official n8n metrics endpoint because it doesn't include API cost data.
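The exporter boils down to a polling loop plus one HTTP endpoint. Here's a stripped-down sketch using express, prom-client, and pg; the SQL query and the workflow_costs table are simplified stand-ins, and the token counter works the same way as the gauge shown:

// Sketch of the metrics exporter: poll PostgreSQL, expose Prometheus metrics.
// Table and column names are simplified stand-ins.
const express = require('express');
const client = require('prom-client');
const { Pool } = require('pg');

const pool = new Pool();
const costGauge = new client.Gauge({
  name: 'n8n_workflow_cost_usd',
  help: 'Total cost in USD per workflow',
  labelNames: ['workflow', 'model'],
});

async function refresh() {
  const { rows } = await pool.query(
    `SELECT workflow_name, model, SUM(cost_usd) AS cost
       FROM workflow_costs GROUP BY workflow_name, model`
  );
  for (const r of rows) {
    costGauge.set({ workflow: r.workflow_name, model: r.model }, Number(r.cost));
  }
}
setInterval(refresh, 60_000); // re-query every 60 seconds

const app = express();
app.get('/metrics', async (_req, res) => {
  res.set('Content-Type', client.register.contentType);
  res.end(await client.register.metrics());
});
app.listen(9090);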

Step 4: Building Grafana Dashboards

I created two main dashboards:

Dashboard 1: Workflow Cost Overview

  • Total daily/weekly/monthly API spend
  • Cost breakdown by workflow
  • Cost breakdown by model
  • Top 10 most expensive executions

Dashboard 2: Self-Hosted vs Cloud Comparison

  • My actual costs: infrastructure ($40/month VPS) + API costs
  • Estimated n8n Cloud costs based on execution count
  • Break-even analysis

The Cloud comparison uses n8n's pricing tiers (I track my execution count separately) and adds my API costs on top, since Cloud doesn't include external API credits.
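The break-even math behind that panel is simple enough to show inline. The tier prices below are placeholders pinned to the numbers in this post; n8n's actual tiers change, so check current pricing:

// Sketch: self-hosted vs Cloud break-even. Prices are illustrative
// placeholders; verify current n8n Cloud tiers before relying on this.
function monthlyCosts({ apiCostUsd, infraCostUsd = 40, cloudPlanUsd = 50 }) {
  const selfHosted = infraCostUsd + apiCostUsd; // infrastructure is fixed
  const cloud = cloudPlanUsd + apiCostUsd;      // API spend applies either way
  return { selfHosted, cloud, savings: cloud - selfHosted };
}

// At my usage: ~$80/month in API costs
console.log(monthlyCosts({ apiCostUsd: 80 }));
// -> { selfHosted: 120, cloud: 130, savings: 10 }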

What Actually Worked

The monitoring workflow is reliable. It runs after every AI-heavy workflow without failures. Passing the execution ID and fetching data via n8n's API works consistently.

Cost visibility changed my behavior. I discovered that one workflow was using GPT-4 for simple tasks that GPT-3.5-turbo could handle. Switching that one workflow saved about $30/month.

Grafana alerts prevent surprises. I set an alert for daily costs exceeding $5. It triggered once when a workflow got stuck in a loop, burning through API calls. Caught it within 20 minutes.

The self-hosted vs Cloud comparison is clear. At my current usage (~15,000 executions/month, ~$80/month in API costs), self-hosting costs me about $120/month total. n8n Cloud Pro would be $50/month for executions, but I'd still pay the $80 in API costs, totaling $130. The savings are marginal, but self-hosting gives me queue mode and no execution limits.

What Didn't Work

Real-time metrics are impossible. n8n doesn't expose token data until after execution completes. You can't monitor costs during a workflow run. My monitoring always lags by at least one execution cycle.

Manual price updates are tedious. Every time OpenAI or Anthropic changes pricing, I have to update my table manually. I haven't automated this because pricing changes are infrequent enough that it's not worth the effort.

Non-AI costs are invisible. This system only tracks LLM API costs. It doesn't account for other external APIs I call (web scraping services, data enrichment tools, etc.). Those are separate line items I track manually.

The metrics service is fragile. If my Node.js metrics exporter crashes, Prometheus gets stale data. I should add proper health checks and restart logic, but I haven't prioritized it because failures are rare.

Key Lessons from Running This

Token counting only matters if you use it. Just collecting metrics doesn't save money. You have to review the data regularly and make changes based on what you find.

Model choice has the biggest cost impact. Switching from GPT-4 to Claude Sonnet for certain tasks cut costs by 60% with minimal quality loss. Switching to GPT-3.5-turbo where possible cut costs by 90%.
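For a feel of the raw gap, here's a hypothetical 1K-prompt / 1K-completion call priced against the Step 2 table (realized savings come out lower than the list-price gap because the workload mix changes too):

// Illustrative: cost of one 1K-prompt / 1K-completion call per model,
// using the MODEL_PRICING table from Step 2.
for (const [model, p] of Object.entries(MODEL_PRICING)) {
  console.log(model.padEnd(16), `$${(p.input + p.output).toFixed(4)}`);
}
// gpt-4            $0.0900
// gpt-3.5-turbo    $0.0035
// claude-3-opus    $0.0900
// claude-3-sonnet  $0.0180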

Self-hosting is cheaper at scale, but not by much. If you're running under 10,000 executions/month with low API usage, n8n Cloud's convenience might be worth the small premium. Above that, self-hosting wins clearly.

Execution count is a poor cost predictor. A workflow with 50 nodes running simple operations costs almost nothing. A workflow with 5 nodes calling GPT-4 can cost dollars per run. Execution-based pricing (like n8n Cloud uses) doesn't reflect actual resource consumption.
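A quick illustration with the earlier GPT-4 rates: five calls in one run, each with an assumed 8K prompt and 2K completion, already lands in dollar territory:

// Illustrative only: five GPT-4 calls per execution,
// each with an assumed 8K prompt / 2K completion.
const perCall = (8000 / 1000) * 0.03 + (2000 / 1000) * 0.06; // $0.36
console.log(`$${(5 * perCall).toFixed(2)} per run`); // "$1.80 per run"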

Infrastructure costs are fixed, API costs scale. My $40/month VPS handles 10,000 or 50,000 executions equally well. But API costs scale linearly with usage. That's the real variable to watch.

When This Monitoring Setup Makes Sense

This approach works if:

  • You run n8n self-hosted with significant AI/LLM usage
  • You want cost visibility without paying for n8n Cloud
  • You already run Prometheus and Grafana (or are willing to set them up)
  • You're comfortable writing custom workflows and small services
  • You care about optimizing API costs over time

It doesn't make sense if:

  • You're on n8n Cloud (they provide execution metrics already)
  • Your API costs are negligible (under $20/month)
  • You don't want to maintain custom monitoring infrastructure
  • You rarely use AI nodes

What I'd Do Differently

If I were starting over, I'd:

  • Automate price updates. Scrape OpenAI and Anthropic pricing pages weekly and update the table automatically.
  • Add health monitoring for the metrics service. Use Proxmox's built-in monitoring to restart the container if it crashes.
  • Track non-LLM API costs too. Build a more comprehensive cost tracking system that includes all external API calls, not just AI models.
  • Set up budget alerts. Use Grafana to send alerts when monthly costs are trending toward a predefined limit.

But honestly, the current setup works well enough that I haven't prioritized these improvements. The core monitoring does what I need.