
Debugging Synology Container Manager Performance After DSM 7.2 Update: Fixing cgroup v2 Resource Limits and overlay2 Storage Issues

Why I worked on this

Two weeks after DSM 7.2 landed on my DS920+, every container that used to sit at 2–4 % CPU started pegging one core.
The Container Manager GUI showed “unlimited” for every limit field, yet docker stats printed impossible values like 150 % CPU and 0 B memory.
Backups to USB took twice as long, and the nightly Plex scan that normally finished in 12 min was still running at breakfast.
I wanted the old behaviour back—containers staying inside the cgroup walls I had set months earlier—without rolling back DSM.

My real setup or context

  • DSM 7.2-64570 Update 3 on a DS920+ (Intel Celeron J4125, 20 GB RAM)
  • Container Manager 20.10.23-1437 (the package that replaced “Docker”)
  • Four active stacks: Plex, n8n, Postgres for Teslamate, and a handful of Go microservices
  • Single volume /volume1, SSD cache read-only, no btrfs compression
  • All containers originally created with the old Docker package and later imported by Container Manager

What worked (and why)

1. cgroup v2 is mandatory now—use it instead of fighting it

Container Manager 20.10 ships a Docker binary compiled with CGROUP_V2=1.
Trying to boot the daemon with --exec-opt native.cgroupdriver=cgroupfs (the old fix) just makes the service exit with status 2.
I left the daemon alone and moved the limits into the compose files where v2 can see them.

Example snippet from my plex-compose.yml:

services:
  plex:
    image: plexinc/pms-docker:latest
    cpus: 1.5          # works with v2
    mem_limit: 2g      # v2 rewrites this to memory.max
    cgroup_parent: system.slice

After a docker compose up -d, the container finally obeyed the ceiling; systemd-cgtop showed 100 % instead of the earlier 150 % spikes.
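Before moving limits into compose, it's worth confirming the NAS really is on the unified hierarchy. A minimal check over SSH (the /sys/fs/cgroup probe is standard Linux, nothing Synology-specific):

```shell
#!/bin/sh
# Report which cgroup hierarchy the kernel mounted at /sys/fs/cgroup.
# cgroup2fs => unified v2 hierarchy; tmpfs => legacy v1 per-controller mounts.
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || true)
if [ "$fstype" = "cgroup2fs" ]; then
    cgroup_version="v2"
else
    cgroup_version="v1"
fi
echo "cgroup $cgroup_version"
```

On a v2 host the mem_limit ceiling then appears as a memory.max file under the container's slice (somewhere below /sys/fs/cgroup/system.slice/ when cgroup_parent is set as above), which is what systemd-cgtop is reading.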

2. Re-create the containers—don’t “edit” them

Container Manager imported the old containers but left them in the legacy docker cgroup tree.
The GUI “Edit” button only patches the on-disk JSON; it does not move the task into the unified cgroup hierarchy.
I exported each stack to a compose file, deleted the containers (volumes untouched), then ran:

docker compose -f exported.yml up -d

That single step cut idle CPU from 25 % to 8 % on the host.
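The per-stack cycle is easy to script. A sketch of what I ran for each exported file (echo-only here so nothing is destroyed by accident; the stack names and paths are from my layout, adjust to yours):

```shell
#!/bin/sh
# Re-create every imported stack so it lands in the unified cgroup tree.
# run() only echoes; swap it for: run() { "$@"; }  to execute for real.
run() { echo "+ $*"; }

for stack in plex n8n teslamate; do
    compose="/volume1/docker/$stack/exported.yml"
    run docker compose -f "$compose" down    # removes containers, keeps named volumes
    run docker compose -f "$compose" up -d   # re-creates them under cgroup v2
done
```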

3. overlay2 cache prune and quota re-enable

DSM 7.2 also updated the kernel to 4.4.302+ and turned off quota accounting on /volume1/@docker.
Drives that used to show 180 MB/s during nightly copy jobs dropped to 70 MB/s because every write triggered a metadata lookup against an overlay2 layer that no longer had quota accounting.

I remounted with quota again (this survives reboots because DSM stores the flag in /etc/fstab):

sudo mount -o remount,prjquota /volume1

Then removed three months of dangling and unused images (careful: the --volumes flag also deletes any volume no container currently references):

docker system prune -af --volumes

Cache throughput immediately went back to 190 MB/s and the Plex container stopped pausing every 30 s.
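To confirm the remount actually stuck, I check the live mount options. A small sketch, with VOL standing in for my volume path:

```shell
#!/bin/sh
# Check whether the volume is currently mounted with project quota enabled.
# VOL is my volume; change it if yours differs.
VOL=/volume1
opts=$(findmnt -no OPTIONS "$VOL" 2>/dev/null || true)
case "$opts" in
    *prjquota*) quota_state="enabled" ;;
    *)          quota_state="disabled" ;;
esac
echo "project quota on $VOL: $quota_state"
```

If it reports disabled after a reboot, the prjquota flag did not make it into /etc/fstab and the remount has to be repeated.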

What didn’t work

  • Editing /usr/syno/etc/packages/ContainerManager/dockerd.json to add exec-opt keys—DSM overwrites the file on every package restart.
  • Installing the previous Docker SPK by hand—Container Manager blocks the older package signature.
  • Setting limits in the GUI after import—the values are ignored unless the container is re-created.
  • Switching to vfs storage driver—performance was 5× worse and DSM kept flagging “unsupported storage driver” alerts.

Key takeaways

  1. DSM 7.2 forces cgroup v2; embrace it and place limits in compose, not the GUI.
  2. Imported containers sit in the wrong cgroup node—re-create them once and the problem disappears.
  3. overlay2 without quota slows the whole volume—remount with prjquota and prune old layers.
  4. Container Manager is just a re-branded Docker 20.10; almost every docker CLI trick still works if you SSH in.

I kept the new DSM version, containers stay inside their CPU/memory fences, and nightly jobs finish before coffee again.
