Why I Needed Network Segmentation for AI Workloads
I run Ollama and LM Studio on my home network for local AI processing. Both tools listen on network ports and accept requests from any device that can reach them. This worked fine when I was the only user, but the moment I started running automated workflows through n8n and allowing family members to access AI tools from their devices, I realized I had a problem.
My AI servers were sitting on the same network segment as everything else—IoT devices, guest WiFi, security cameras. If someone compromised a smart bulb or a visitor’s laptop, they could potentially hammer my AI endpoints with requests, consume GPU resources, or worse, use them as a pivot point into more sensitive systems.
I needed isolation without breaking functionality. VLAN tagging and firewall rules became my solution.
My Network Before Segmentation
My setup ran on Proxmox with several VMs and containers. Ollama ran in an LXC container, and LM Studio ran on a Windows VM with GPU passthrough. Both exposed their APIs on standard ports:
- Ollama: 11434
- LM Studio: 1234
Everything connected through a single flat network (192.168.1.0/24). My Synology NAS, n8n instance, desktop, and AI servers all shared the same broadcast domain. No VLANs, no access controls beyond basic firewall rules that allowed everything from my LAN.
This meant any device on my network could talk to any other device without restriction.
Setting Up VLANs in Proxmox
I created three VLANs to separate traffic:
- VLAN 10: Management and trusted devices (my desktop, admin tools)
- VLAN 20: AI workloads (Ollama, LM Studio)
- VLAN 30: General devices (IoT, guest access)
Proxmox doesn’t create VLANs directly—it tags traffic at the network interface level. I edited /etc/network/interfaces on my Proxmox host to define VLAN-aware bridges:
auto vmbr0
iface vmbr0 inet manual
    bridge-ports enp3s0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 10 20 30
This allowed my physical switch (a managed TP-Link) to handle VLAN tagging while Proxmox passed tagged frames to VMs and containers. I assigned each VM/container to its VLAN by setting the VLAN tag in the Proxmox GUI under Network Device settings.
My Ollama container got VLAN 20. My LM Studio VM got VLAN 20. My desktop stayed on VLAN 10. My IoT devices moved to VLAN 30.
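The same tags can be applied from the Proxmox CLI instead of the GUI. A sketch, assuming the Ollama container has ID 200 and the LM Studio VM has ID 300 (both IDs are examples; adjust the bridge, IPs, and gateway to your own setup):

```shell
# Tag the Ollama LXC container's interface with VLAN 20
# (container ID 200 is hypothetical)
pct set 200 -net0 name=eth0,bridge=vmbr0,tag=20,ip=192.168.20.10/24,gw=192.168.20.1

# Tag the LM Studio VM's virtio NIC with VLAN 20
# (VM ID 300 is hypothetical)
qm set 300 -net0 virtio,bridge=vmbr0,tag=20
```

Both commands rewrite the guest's net0 definition, so include every option you want to keep, not just the tag.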
The Problem I Hit Immediately
After applying VLAN tags, nothing could talk to anything else. My desktop on VLAN 10 couldn’t reach Ollama on VLAN 20. My n8n instance (also VLAN 10) couldn’t trigger LM Studio workflows.
VLANs isolate by default. I needed routing rules to allow specific traffic between segments.
Configuring Inter-VLAN Routing
I used pfSense as my router/firewall (running in a Proxmox VM). It handles routing between VLANs and enforces access policies.
I created firewall rules that allowed:
- VLAN 10 (management) → VLAN 20 (AI) on ports 11434 and 1234
- VLAN 20 (AI) → Internet for model downloads
- VLAN 30 (general) → blocked from VLAN 20 entirely
Here’s what the pfSense rule looked like for allowing my desktop to reach Ollama:
Action: Pass
Interface: VLAN10
Protocol: TCP
Source: VLAN10 net
Destination: 192.168.20.10 (Ollama container)
Destination Port: 11434
I created a similar rule for LM Studio on port 1234. For n8n workflows, I added the n8n container’s IP (192.168.10.50) to an alias called “AI_CLIENTS” and referenced it in the source field instead of the entire VLAN 10 subnet.
This gave me granular control—only specific devices could talk to AI servers, and only on the ports they needed.
Locking Down Ollama and LM Studio Directly
VLANs and firewall rules handled network-level isolation, but I also configured the AI tools themselves to bind only to their VLAN interfaces.
Ollama Configuration
By default, Ollama listens on 127.0.0.1:11434, which is localhost-only. I needed it to accept requests from my VLAN 10 devices, so I set the OLLAMA_HOST environment variable in my LXC container:
export OLLAMA_HOST=192.168.20.10:11434
I added this to /etc/environment so it persisted across reboots. Now Ollama only listens on its VLAN 20 IP address. Devices on other VLANs can’t reach it unless pfSense explicitly allows the traffic.
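A quick way to confirm the binding behaves as intended, assuming the pfSense rule from earlier is in place:

```shell
# From an allowed VLAN 10 device: should return the list of local models
curl -s http://192.168.20.10:11434/api/tags

# From inside the container itself: loopback no longer answers, because
# Ollama is now bound only to the VLAN 20 address
curl -s --max-time 2 http://127.0.0.1:11434/api/tags || echo "refused, as expected"
```

The second check matters: any local tooling on the container that assumed localhost access needs to be pointed at the VLAN 20 IP instead.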
LM Studio Configuration
LM Studio doesn’t have a config file for binding addresses. I had to set it through the GUI under Server Settings. I changed the bind address from 0.0.0.0 (all interfaces) to 192.168.20.20 (its VLAN 20 IP).
This prevented it from accidentally listening on other network interfaces, like a VPN tunnel or a secondary NIC.
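LM Studio's local server speaks the OpenAI-compatible API, so the same kind of sanity check works here. From an allowed VLAN 10 device:

```shell
# Lists the models LM Studio currently has loaded; a connection
# timeout from a VLAN 30 device confirms the firewall block works
curl -s http://192.168.20.20:1234/v1/models
```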
What Broke and How I Fixed It
n8n Couldn’t Resolve Hostnames
My n8n workflows used http://ollama.local:11434 to reach Ollama. After VLANs, DNS resolution stopped working because mDNS (Avahi) doesn’t cross VLAN boundaries.
I fixed this by updating my pfSense DNS resolver (Unbound) to serve static host entries for my AI servers:
ollama.local → 192.168.20.10
lmstudio.local → 192.168.20.20
Now n8n resolves these names correctly, and I don’t have to hardcode IP addresses in workflows.
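In pfSense these entries live under Services > DNS Resolver > Host Overrides; behind the scenes they translate to Unbound local-data records, roughly like this sketch:

```
# Equivalent Unbound host-override entries
# (pfSense generates these from the Host Overrides GUI)
local-data: "ollama.local. A 192.168.20.10"
local-data: "lmstudio.local. A 192.168.20.20"
```

One assumption to watch: every client has to actually use pfSense as its resolver for this to work, so devices with hardcoded public DNS servers will still fail to resolve these names.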
Model Downloads Failed
Ollama needs internet access to pull models from its own registry, and LM Studio downloads models from Hugging Face. My initial firewall rules blocked all outbound traffic from VLAN 20, so both failed.
I added a rule allowing VLAN 20 to reach the internet on port 443 (HTTPS):
Action: Pass
Interface: VLAN20
Protocol: TCP
Source: VLAN20 net
Destination: any
Destination Port: 443
This let Ollama pull models without opening unnecessary ports. One caveat: name resolution also has to work. If you block everything else outbound, add a rule allowing DNS (port 53) from VLAN 20 to the firewall's resolver address, or model pulls will fail before the HTTPS connection is ever attempted.
GPU Passthrough Stopped Working
When I moved my LM Studio VM to VLAN 20, GPU passthrough broke. Proxmox couldn’t initialize the GPU because the VM’s network configuration changed.
This had nothing to do with VLANs—it was a PCI passthrough issue caused by restarting the VM during network changes. I had to reboot the Proxmox host to reset the GPU state. After that, passthrough worked normally.
Monitoring and Logging
I set up pfSense logging to track connections to my AI servers. Under Firewall > Rules, I enabled logging for the rules that allow VLAN 10 → VLAN 20 traffic.
Logs show up in Status > System Logs > Firewall. I can see which devices are hitting Ollama and LM Studio, how often, and whether any unexpected sources are trying to connect.
I also run a simple script on my Proxmox host to log active connections to port 11434:
#!/bin/bash
while true; do
    ss -tn | grep :11434 >> /var/log/ollama-connections.log
    sleep 60
done
This runs in a systemd service and gives me a record of who’s using Ollama over time.
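The unit file for that service is minimal. A sketch, assuming the script above is saved as /usr/local/bin/ollama-connlog.sh and marked executable (both the path and unit name are examples):

```
# /etc/systemd/system/ollama-connlog.service
[Unit]
Description=Log active TCP connections to Ollama (port 11434)

[Service]
ExecStart=/usr/local/bin/ollama-connlog.sh
Restart=always

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl enable --now ollama-connlog.service. Since the log grows forever, it's worth pairing this with a logrotate entry for /var/log/ollama-connections.log.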
What This Setup Gives Me
- AI servers are isolated from general network traffic
- Only specific devices can reach AI endpoints
- IoT devices and guest WiFi have no path to AI resources
- I can see exactly who’s using AI tools and when
- If something on VLAN 30 gets compromised, it can’t pivot to my AI infrastructure
The trade-off is complexity. I now have to manage VLAN tags, firewall rules, and DNS entries. If I add a new device that needs AI access, I have to update pfSense rules. If I change IP addresses, I have to update DNS.
But for me, the security and visibility are worth it.
What I’d Do Differently
I should have set up VLANs before running AI workloads in production. Retrofitting segmentation meant downtime and troubleshooting DNS issues that wouldn’t exist if I’d planned ahead.
I also wish I’d documented my firewall rules better from the start. I ended up with overlapping rules that I had to clean up later. A simple spreadsheet tracking which VLANs can talk to which services would have saved time.
Finally, I didn’t account for IPv6. My network uses IPv6, and I only configured IPv4 firewall rules. Devices with IPv6 addresses could bypass my restrictions until I added equivalent IPv6 rules in pfSense.
Key Takeaways
- VLANs alone don’t secure anything—you need firewall rules to control inter-VLAN traffic
- Bind AI services to specific interfaces, not 0.0.0.0
- DNS resolution breaks across VLANs unless you configure static entries or enable mDNS forwarding
- Log firewall activity so you can see what’s actually happening
- Plan VLAN topology before deploying services, not after
Network segmentation isn’t complicated, but it requires attention to detail. If you skip steps or assume things will “just work,” you’ll spend hours debugging why nothing can talk to anything else.