Why I worked on this
Two weeks after DSM 7.2 landed on my DS920+, every container that used to sit at 2–4 % CPU started pegging one core.
The Container Manager GUI showed “unlimited” for every limit field, yet `docker stats` printed implausible values like 150 % CPU and 0 B memory.
Back-ups to USB took twice as long and the nightly Plex scan that normally finished in 12 min was still running at breakfast.
I wanted the old behaviour back—containers staying inside the cgroup walls I had set months earlier—without rolling back DSM.
My real setup or context
- DSM 7.2-64570 Update 3 on a DS920+ (Intel Celeron J4125, 20 GB RAM)
- Container Manager 20.10.23-1437 (the package that replaced “Docker”)
- Four active stacks: Plex, n8n, Postgres for Teslamate, and a handful of Go microservices
- Single volume `/volume1`, SSD cache read-only, no btrfs compression
- All containers originally created with the old Docker package and later imported by Container Manager
What worked (and why)
1. cgroup v2 is mandatory now—use it instead of fighting it
Container Manager 20.10 ships a Docker binary compiled with CGROUP_V2=1.
Trying to boot the daemon with --exec-opt native.cgroupdriver=cgroupfs (the old fix) just makes the service exit with status 2.
I left the daemon alone and moved the limits into the compose files where v2 can see them.
Example snippet from my plex-compose.yml:
```yaml
services:
  plex:
    image: plexinc/pms-docker:latest
    cpus: 1.5                      # works with v2
    mem_limit: 2g                  # v2 rewrites this to memory.max
    cgroup_parent: system.slice
```
After a docker compose up -d the container finally obeyed the ceiling; systemd-cgtop showed 100 % instead of the earlier 150 % spikes.
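You can confirm the limits actually landed in the unified hierarchy by reading them straight out of `/sys/fs/cgroup`. A minimal sketch, assuming the container is named `plex` and Docker is using its default systemd scope naming:

```shell
# Read the cgroup v2 limit files for the container.
# Assumptions: container name "plex", default docker-<id>.scope layout.
CID=$(docker inspect --format '{{.Id}}' plex)
SCOPE=/sys/fs/cgroup/system.slice/docker-"$CID".scope

cat "$SCOPE"/memory.max   # mem_limit: 2g -> 2147483648 (2 * 1024^3 bytes)
cat "$SCOPE"/cpu.max      # cpus: 1.5   -> "150000 100000" (quota/period in µs)
```

If `memory.max` still reads `max`, the limit never made it into the unified tree and the container needs to be re-created, not edited.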
2. Re-create the containers—don’t “edit” them
Container Manager imported the old containers but left them in the legacy docker cgroup tree.
The GUI “Edit” button only patches the on-disk JSON; it does not move the task into the unified cgroup hierarchy.
I exported each stack to a compose file, deleted the containers (volumes untouched), then ran:
```shell
docker compose -f exported.yml up -d
```
That single step cut idle CPU from 25 % to 8 % on the host.
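A quick way to tell whether a container is still parked in the legacy tree or has moved into the unified hierarchy is to look at its init process's cgroup file. A sketch, with the container name `plex` assumed:

```shell
# Which cgroup does the container's init process live in?
# Assumption: container name "plex" from my stack.
PID=$(docker inspect --format '{{.State.Pid}}' plex)
cat /proc/"$PID"/cgroup
# Legacy-imported containers show a path under /docker/<id>;
# re-created ones show 0::/system.slice/docker-<id>.scope
```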
3. overlay2 cache prune and quota re-enable
DSM 7.2 also updated the kernel to 4.4.302+ and turned off quota accounting on /volume1/@docker.
Drives that used to show 180 MB/s during nightly copy jobs dropped to 70 MB/s because every write triggered a metadata lookup that hit the un-quotaed overlay2 layer.
I remounted with quota again (this survives reboots because DSM stores the flag in /etc/fstab):
```shell
sudo mount -o remount,prjquota /volume1
```
Then removed three months of dangling images:
```shell
docker system prune -af --volumes
```
Cache throughput immediately went back to 190 MB/s and the Plex container stopped pausing every 30 s.
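To verify the remount stuck (both now and after a reboot), check the live mount options rather than trusting `/etc/fstab` alone. A sketch that reads `/proc/mounts`, which is always present, so it does not depend on `findmnt` being installed:

```shell
# Check the active mount options for /volume1 via /proc/mounts.
# Field 2 is the mount point, field 4 the comma-separated option list.
awk '$2 == "/volume1" { print $4 }' /proc/mounts \
  | tr ',' '\n' \
  | grep -x prjquota \
  && echo "quota accounting active" \
  || echo "quota accounting missing"
```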
What didn’t work
- Editing `/usr/syno/etc/packages/ContainerManager/dockerd.json` to add `exec-opt` keys: DSM overwrites the file on every package restart.
- Installing the previous Docker SPK by hand: Container Manager blocks the older package signature.
- Setting limits in the GUI after import—the values are ignored unless the container is re-created.
- Switching to the `vfs` storage driver: performance was 5× worse and DSM kept flagging “unsupported storage driver” alerts.
Key takeaways
- DSM 7.2 forces cgroup v2; embrace it and place limits in compose, not the GUI.
- Imported containers sit in the wrong cgroup node—re-create them once and the problem disappears.
- overlay2 without quota slows the whole volume: remount with `prjquota` and prune old layers.
- Container Manager is just a re-branded Docker 20.10; almost every `docker` CLI trick still works if you SSH in.
I kept the new DSM version, containers stay inside their CPU/memory fences, and nightly jobs finish before coffee again.