Why I Worked on This
I run a Proxmox host with multiple Docker stacks managed through Portainer. Some containers run background jobs that occasionally spike CPU or memory usage, which can starve other services on the same node. Docker's built-in resource limits work at the container level, but they don't help when one process inside a container monopolizes resources while others wait.
I needed finer control—per-process limits within containers—without rewriting my entire stack or splitting every workload into separate containers. That meant working directly with cgroups v2 inside running containers.
My Real Setup
My Proxmox host runs Debian 12, which uses cgroups v2 by default. I manage containers through Portainer, but all resource isolation work happens at the Docker CLI level. The host has:
- Docker Engine 24.x
- Portainer CE for stack management
- Multiple containers running n8n workflows, monitoring tools, and background scrapers
The problem container was running n8n with several workflows. One workflow would occasionally spawn a memory-hungry process that caused the entire container to hit its limit, pausing other workflows mid-execution.
What Worked
Verifying Cgroups v2
First, I confirmed the host was using cgroups v2:
```
stat -fc %T /sys/fs/cgroup
```
The output was `cgroup2fs`, which meant I was working with the unified hierarchy. This matters because the file structure and limit syntax differ between v1 and v2.
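Another way to see the same thing: the unified hierarchy exposes a single `cgroup.controllers` file at its root, listing the controllers the kernel makes available (the exact list varies by kernel and configuration):
```
# Present only on cgroups v2; a v1 host has a tmpfs at /sys/fs/cgroup
# with one mount per controller instead of a single unified tree
cat /sys/fs/cgroup/cgroup.controllers
# typical output: cpuset cpu io memory hugetlb pids
```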
Bind-Mounting the Cgroup Filesystem
I needed the container to access its own cgroup subtree. I added a bind mount to the Docker run command:
```
docker run -d \
  --name n8n-isolated \
  -v /sys/fs/cgroup:/sys/fs/cgroup:rw \
  n8nio/n8n:latest
```
The `:rw` flag is required because the container needs write access to create sub-cgroups. This is not the same as running `--privileged`, but note that the mount exposes the whole host hierarchy: a root process inside the container could in principle write outside its own subtree, so the isolation rests on the container only touching its own cgroup path (more on this under Security Trade-offs).
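A quick sanity check that the mount works from inside the container:
```
# The controllers file at the hierarchy root should be readable from inside
docker exec n8n-isolated cat /sys/fs/cgroup/cgroup.controllers
```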
Finding the Container's Cgroup Path
Inside the running container, I checked where Docker placed it in the cgroup hierarchy:
```
cat /proc/self/cgroup
```
Output looked like:
```
0::/docker/a1b2c3d4e5f6
```
The path after `::` is relative to `/sys/fs/cgroup`. So the full path on the host (and inside the container via the bind mount) was:
```
/sys/fs/cgroup/docker/a1b2c3d4e5f6
```
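If you script this step, the path can be derived in one line; a sketch assuming the single-entry `0::<path>` format that cgroups v2 uses:
```
# Field 3 of the colon-separated v2 entry is the path under /sys/fs/cgroup
CG="/sys/fs/cgroup$(cut -d: -f3 /proc/self/cgroup)"
echo "$CG"   # /sys/fs/cgroup/docker/a1b2c3d4e5f6
```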
Creating a Sub-Cgroup for the Noisy Process
I created a sub-cgroup called worker to isolate the memory-hungry workflow process:
```
mkdir -p /sys/fs/cgroup/docker/a1b2c3d4e5f6/worker
```
Then I set a memory limit of 512MB on this sub-cgroup:
```
echo 536870912 > /sys/fs/cgroup/docker/a1b2c3d4e5f6/worker/memory.max
```
The number is bytes (512 * 1024 * 1024). Cgroups v2 uses `memory.max`, not `memory.limit_in_bytes` like v1.
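One caveat if you're reproducing this: in cgroups v2, the `memory.*` files only appear in a child cgroup once the memory controller is enabled in the parent's `cgroup.subtree_control`. Whether Docker has already done that depends on your setup, so if the write above fails because `memory.max` doesn't exist, check the parent first:
```
CG=/sys/fs/cgroup/docker/a1b2c3d4e5f6
cat "$CG/cgroup.subtree_control"            # controllers children can use
echo +memory > "$CG/cgroup.subtree_control" # enable memory for sub-cgroups
# Caveat: v2's "no internal processes" rule can make this write fail while
# the container cgroup still has processes attached directly to it
```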
Assigning the Process to the Sub-Cgroup
I identified the PID of the problematic workflow process with `ps aux` inside the container. Then I moved it into the sub-cgroup:
```
echo 1234 > /sys/fs/cgroup/docker/a1b2c3d4e5f6/worker/cgroup.procs
```
Note that cgroups v2 uses `cgroup.procs`, not `tasks`. Writing a PID here moves the whole process, including all of its threads; v1's `tasks` file worked at single-thread granularity.
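Verifying the move is just reading the same files back:
```
# The worker cgroup should now list the PID...
cat /sys/fs/cgroup/docker/a1b2c3d4e5f6/worker/cgroup.procs
# ...and the process's own view should show the new subtree
cat /proc/1234/cgroup   # expect 0::/docker/a1b2c3d4e5f6/worker
```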
Testing the Limit
I triggered the workflow and watched memory usage with `cat /sys/fs/cgroup/docker/a1b2c3d4e5f6/worker/memory.current`. When the process tried to exceed 512MB, it was killed by the OOM killer. The rest of the container continued running normally.
This was exactly what I needed—the noisy process was isolated and couldn't take down the entire container.
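If you want to confirm after the fact what actually happened, `memory.events` in the same directory keeps running counters:
```
# "max" counts hits on the memory.max boundary; "oom_kill" counts processes
# the kernel killed in this cgroup, so it increments after a test like this
cat /sys/fs/cgroup/docker/a1b2c3d4e5f6/worker/memory.events
```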
What Didn't Work
Using Docker Compose Volume Syntax
I initially tried adding the bind mount in my `docker-compose.yml`:
```
volumes:
  - /sys/fs/cgroup:/sys/fs/cgroup:rw
```
This worked, but Portainer's stack editor doesn't show warnings when you use potentially risky mounts. I had to document this separately because it's easy to forget why that volume exists months later.
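For context, the fragment belongs under the service definition; a sketch using the same image and name as the `docker run` example above:
```
services:
  n8n-isolated:
    image: n8nio/n8n:latest
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
```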
Automating Process Assignment
I tried writing a script to automatically detect and move high-memory processes into the sub-cgroup. The problem: by the time the script detected high usage, the process had already caused issues. Cgroups enforce limits, but they don't predict behavior.
I ended up hardcoding the PID assignment for known problematic workflows. Not elegant, but reliable.
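Roughly what that looks like (a reconstruction rather than my exact script; `my-workflow-bin` stands in for the real process name):
```
#!/bin/sh
# Move a known-problematic process into the pre-created worker cgroup
WORKER=/sys/fs/cgroup/docker/a1b2c3d4e5f6/worker
PID=$(pgrep -n my-workflow-bin) || exit 1   # newest matching process
echo "$PID" > "$WORKER/cgroup.procs"
```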
CPU Limits
I also tried setting CPU limits using `cpu.max`:
```
echo "50000 100000" > /sys/fs/cgroup/docker/a1b2c3d4e5f6/worker/cpu.max
```
The two numbers are a quota and a period in microseconds: 50,000 out of every 100,000, i.e. 50% of one core. It worked in testing, but in production the workflow process would occasionally hang instead of throttling smoothly. I suspect this was due to how n8n handles async operations, but I didn't dig deeper. I removed the CPU limit and kept only the memory constraint.
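For reference, removing the limit again is just writing the default back:
```
# "max" clears the quota; the 100000 microsecond period stays in place
echo "max 100000" > /sys/fs/cgroup/docker/a1b2c3d4e5f6/worker/cpu.max
```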
Portainer UI Integration
Portainer doesn't expose cgroup controls in its UI. Everything I did required SSH access to the container or host. This isn't a failure of cgroups—it's just a gap in the tooling. If you rely on Portainer's UI for everything, this approach won't feel native.
Security Trade-offs
Bind-mounting `/sys/fs/cgroup` into a container is not the same as running with `--privileged`, but it's not risk-free either. The container can modify its own cgroup subtree, so a compromised process could remove its own limits, and since the mount covers the whole hierarchy, root inside the container could in principle reach beyond its own subtree as well.
I decided this was acceptable because:
- The container runs trusted code (n8n workflows I wrote)
- The alternative—splitting every workflow into its own container—would complicate my Portainer setup
- While the rw mount technically exposes the whole hierarchy, modifying other containers' limits would take deliberately malicious code running as root inside this container
If I were running untrusted workloads, I wouldn't use this approach. I'd use Kubernetes with proper pod resource quotas instead.
Key Takeaways
- Docker's resource limits apply to the entire container. If you need per-process control, you have to work with cgroups directly.
- Cgroups v2 uses a unified hierarchy and different file names than v1 (`memory.max` instead of `memory.limit_in_bytes`, `cgroup.procs` instead of `tasks`).
- Bind-mounting `/sys/fs/cgroup` with `:rw` gives the container enough access to manage sub-cgroups without full privileges.
- CPU limits can cause unexpected hangs with async workloads. Test thoroughly before using them in production.
- Portainer doesn't expose cgroup controls. You'll need CLI access for this kind of work.
- Automating process assignment is harder than it sounds. Manual PID assignment worked better for my use case.
When This Approach Makes Sense
This is useful if:
- You have a multi-process container where one process occasionally misbehaves
- Splitting the container isn't practical (shared filesystem, complex inter-process communication, etc.)
- You're running on a cgroups v2 host (most modern Linux distros)
- You have SSH access to containers and don't rely solely on a web UI
It's not useful if:
- You're running untrusted code
- You need automatic, dynamic resource allocation
- Your host uses cgroups v1 (the syntax is different and more fragmented)
- You prefer orchestration tools like Kubernetes that handle this natively
For my Proxmox homelab running Portainer, this was the right balance between control and complexity. It solved the noisy neighbor problem without forcing me to redesign my entire container architecture.