Why I Built This Setup
I run several AI inference services on my Proxmox cluster—mostly LLM APIs and embedding models exposed over gRPC. The problem was simple: each service had its own port, no TLS by default, and managing certificates manually across multiple endpoints was tedious. I needed a single entry point that could:
- Handle TLS automatically
- Route requests to the right backend based on hostname or path
- Balance load across multiple inference workers when needed
- Not require constant certificate maintenance
I had used Nginx for years, but the gRPC configuration always felt brittle—lots of directives, manual HTTP/2 setup, and certificate renewal scripts. Caddy kept coming up in self-hosting circles, and the claim was that it "just works" with gRPC and automatic TLS. I decided to test that claim.
My Real Setup
Here's what I was working with:
- Backend services: Two LXC containers on Proxmox, each running a Python-based gRPC server (port 50051) for text generation
- Domain: A subdomain (ai.vipinpg.com) pointed to my home IP via Cloudflare DNS
- Reverse proxy host: A separate LXC container running Debian 12, where I installed Caddy
- Goal: External clients hit ai.vipinpg.com, Caddy terminates TLS, forwards gRPC traffic to the backends using h2c (HTTP/2 Cleartext), and balances load when both workers are healthy (sketched below)
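The traffic flow, schematically:

```
gRPC client --TLS (HTTP/2)--> Caddy :443 --h2c--> 192.168.1.101:50051
                                          \-h2c--> 192.168.1.102:50051
```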
I did not use Docker for this—I prefer LXC containers for infrastructure services because they're lighter and easier to snapshot on Proxmox. The principles are identical for Docker setups, but the file paths and systemd commands differ slightly.
Installing Caddy
Debian 12's own repos ship an older Caddy build, so I added Caddy's official apt repository instead:
sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update
sudo apt install caddy
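A quick sanity check before touching any config:

```bash
caddy version            # confirm which build apt installed
systemctl status caddy   # the Debian package starts the service on install
```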
This gave me Caddy 2.8.x. The systemd service started automatically, but I stopped it immediately because the default config listens on port 80 and serves a placeholder page—not what I needed.
Configuring the Caddyfile
Caddy's configuration lives in /etc/caddy/Caddyfile. The default file had some commented examples, which I deleted. Here's what I wrote for my first working gRPC proxy:
ai.vipinpg.com {
    reverse_proxy 192.168.1.101:50051 192.168.1.102:50051 {
        transport http {
            versions h2c
        }
        lb_policy round_robin
        health_uri /health
        health_interval 10s
    }
}
Breaking this down:
- ai.vipinpg.com: The hostname Caddy listens for. It automatically provisions a Let's Encrypt certificate for this domain.
- reverse_proxy: The directive that forwards traffic. I listed both backend IPs directly.
- transport http { versions h2c }: This tells Caddy to use HTTP/2 Cleartext when talking to backends. gRPC requires HTTP/2, but my backends don't do TLS internally—they expect plain h2c.
- lb_policy round_robin: Distributes requests evenly across both backends.
- health_uri and health_interval: Caddy checks /health on each backend every 10 seconds. If a backend fails, it's removed from rotation until it recovers.
I saved the file and validated it:
sudo caddy validate --config /etc/caddy/Caddyfile
No errors. I restarted Caddy:
sudo systemctl restart caddy
Within 30 seconds, Caddy had obtained a TLS certificate from Let's Encrypt. I checked the logs:
sudo journalctl -u caddy -f
I saw successful ACME challenge completion and certificate storage. The automatic TLS claim was real: I did nothing beyond pointing my DNS at the host and writing a ten-line Caddyfile.
Testing with grpcurl
I use grpcurl to test gRPC endpoints. First, I verified the backend directly:
grpcurl -plaintext 192.168.1.101:50051 list
This returned the service definition. Then I tested through Caddy:
grpcurl ai.vipinpg.com:443 list
It worked. Caddy terminated TLS, converted the request to h2c, and forwarded it to the backend. I sent a few inference requests and confirmed both backends were being used (I logged which worker handled each request).
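An actual inference call through the proxy looked roughly like this; the service and method names here are placeholders for my real ones:

```bash
grpcurl -d '{"prompt": "Hello"}' \
  ai.vipinpg.com:443 textgen.TextGenerator/Generate
```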
What Didn't Work Initially
Missing h2c transport: My first attempt didn't include the transport http { versions h2c } block. Caddy tried to connect to backends using HTTP/1.1, which gRPC doesn't support. Requests failed with malformed HTTP response errors. Adding the h2c directive fixed it immediately.
Health check endpoint: I assumed my gRPC service would respond to /health by default. It didn't. Caddy marked both backends as unhealthy and stopped routing traffic. I had to implement a basic HTTP health endpoint in my Python service that returned 200 OK: about 10 lines of Flask running in a separate thread, roughly as sketched below.
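The port (8080) and layout here are illustrative rather than my exact code; if the health server listens on a different port than the gRPC service, Caddy's health_port subdirective points the checks there.

```python
# Minimal HTTP health endpoint running beside the gRPC server.
import threading

from flask import Flask

health_app = Flask(__name__)

@health_app.route("/health")
def health():
    # Return 200 only once the worker is actually ready to serve.
    return "OK", 200

def start_health_server(port: int = 8080) -> None:
    # daemon=True ties the thread's lifetime to the gRPC process
    threading.Thread(
        target=lambda: health_app.run(host="0.0.0.0", port=port),
        daemon=True,
    ).start()
```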
Port confusion: I initially forgot that Caddy listens on 443 for HTTPS by default. My firewall rules only allowed 80 and 8080. External clients couldn't connect until I opened 443. Obvious in hindsight, but it cost me 20 minutes of confusion.
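The fix was one rule; ufw shown here as an illustration, since the same idea applies to whatever firewall you run:

```bash
# I had 80 open already (Caddy also uses it for ACME HTTP challenges
# and HTTP->HTTPS redirects); 443 was the missing rule
sudo ufw allow 443/tcp
```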
Load Balancing Behavior
Caddy's round-robin policy worked as expected. I sent 100 requests and logged which backend handled each one—the distribution was 50/50. When I stopped one backend, Caddy detected the failure within 10 seconds (the health check interval) and routed all traffic to the remaining worker. When I restarted the stopped backend, Caddy added it back to rotation automatically.
I did not test other load balancing policies (random, least_conn, ip_hash) because round-robin met my needs. The documentation suggests they work similarly, but I can't confirm from experience.
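If you want to experiment, switching is a one-line change inside the reverse_proxy block (per the docs; I haven't verified these myself):

```
lb_policy least_conn
```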
Adding a Second Service
I later added an embedding service on a different subdomain. The Caddyfile became:
ai.vipinpg.com {
    reverse_proxy 192.168.1.101:50051 192.168.1.102:50051 {
        transport http {
            versions h2c
        }
        lb_policy round_robin
        health_uri /health
        health_interval 10s
    }
}

embed.vipinpg.com {
    reverse_proxy 192.168.1.103:50052 {
        transport http {
            versions h2c
        }
    }
}
Caddy provisioned a second certificate automatically. No additional configuration needed. This is where Caddy's simplicity shines—adding services is just adding blocks.
Certificate Management
Caddy stores certificates in /var/lib/caddy/.local/share/caddy/certificates. I checked this directory and found PEM files for both domains. Renewal happens automatically 30 days before expiration. I've been running this setup for four months now, and certificates have renewed twice without intervention.
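To eyeball the expiry dates yourself (the exact subdirectory layout depends on the ACME issuer):

```bash
sudo find /var/lib/caddy/.local/share/caddy/certificates -name '*.crt' \
  -exec openssl x509 -in {} -noout -subject -enddate \;
```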
I did not configure any certificate hooks or custom ACME servers. The defaults worked perfectly for my use case.
Limitations and Trade-offs
No gRPC awareness: Caddy proxies gRPC as opaque HTTP/2 traffic and doesn't understand the protocol deeply. Server reflection passes through fine (the grpcurl list test above relied on it), but Caddy itself can't make routing or load-balancing decisions based on gRPC methods or status codes. If your client can't use reflection, you need to provide the proto files to it separately.
Health checks are HTTP-based: Caddy's health checks use HTTP, not gRPC. This means you need a separate HTTP endpoint for health monitoring. If your service is pure gRPC with no HTTP support, you'll need to add that capability or skip health checks.
No built-in rate limiting for gRPC: Caddy has rate limiting plugins for HTTP, but they don't apply cleanly to gRPC. If you need per-client rate limiting, you'll have to implement it in your backend service.
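I haven't needed this yet, so treat the following as a sketch of one possible approach: a fixed-window limiter implemented as a grpcio server interceptor. The 60-per-minute limit and the x-client-id metadata key are assumptions, not part of my setup.

```python
# ratelimit.py: naive fixed-window rate limiting for unary RPCs.
import time
from collections import defaultdict

import grpc

class RateLimitInterceptor(grpc.ServerInterceptor):
    def __init__(self, max_per_minute: int = 60):  # illustrative limit
        self.max_per_minute = max_per_minute
        self.requests = defaultdict(list)  # client id -> timestamps

    def intercept_service(self, continuation, handler_call_details):
        metadata = dict(handler_call_details.invocation_metadata)
        client = metadata.get("x-client-id", "anonymous")  # assumed key
        now = time.time()
        recent = [t for t in self.requests[client] if now - t < 60.0]
        recent.append(now)
        self.requests[client] = recent
        if len(recent) > self.max_per_minute:
            def reject(request, context):
                context.abort(grpc.StatusCode.RESOURCE_EXHAUSTED,
                              "rate limit exceeded")
            # Covers unary-unary RPCs; streaming needs matching handlers.
            return grpc.unary_unary_rpc_method_handler(reject)
        return continuation(handler_call_details)

# Wired in at server construction:
# server = grpc.server(executor, interceptors=[RateLimitInterceptor()])
```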
Logging is basic: Caddy logs successful proxying but doesn't log gRPC method names or status codes by default. If you need detailed request logging, you'll need to configure structured logging or add middleware.
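One partial remedy: since a gRPC method is just the HTTP/2 request path, Caddy's per-site log directive captures it in the logged URI. A minimal sketch, with a log path of my choosing:

```
ai.vipinpg.com {
    log {
        output file /var/log/caddy/ai-access.log
        format json
    }
    # reverse_proxy block as shown earlier
}
```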
Performance Notes
I ran informal load tests using ghz (a gRPC benchmarking tool). With Caddy in the middle, I saw about 5-8% latency overhead compared to hitting backends directly. This is acceptable for my use case—most of the time is spent in inference anyway.
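For reference, the runs looked roughly like this; the proto file, method name, and payload are placeholders for my real service:

```bash
ghz --proto ./textgen.proto \
    --call textgen.TextGenerator/Generate \
    -d '{"prompt": "Hello"}' \
    -n 200 -c 10 \
    ai.vipinpg.com:443
```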
CPU usage on the Caddy container stayed under 10% during normal load (around 50 requests per second). Memory usage was stable at ~40MB. Caddy is efficient, but I didn't stress-test it beyond a few hundred concurrent connections.
Why This Worked for Me
Caddy solved my certificate management problem completely. I went from manually renewing certs every 90 days to zero maintenance. The Caddyfile syntax is clear enough that I can modify it months later without re-reading documentation.
The automatic h2c handling for gRPC was the other big win. Nginx required explicit grpc_pass directives and careful HTTP/2 configuration. Caddy just worked once I specified the transport.
Load balancing with health checks gave me basic high availability without needing a separate tool like HAProxy. For small-scale self-hosted setups, this is enough.
Key Takeaways
- Caddy's automatic TLS is not marketing—it genuinely works without intervention.
- gRPC proxying requires the h2c transport directive. Don't skip this.
- Health checks need HTTP endpoints. If your service is pure gRPC, add a basic HTTP server.
- The Caddyfile format is readable and easy to version control. Changes are low-risk.
- Caddy is not a full observability solution. You'll still need backend logging for detailed request tracking.
This setup has been running in production for my personal AI services since September 2024. It's required zero maintenance beyond updating Caddy once when a security patch was released. For self-hosted gRPC APIs, it's the simplest reverse proxy I've used.