Why I Needed to Move 200GB of Ollama Models
I run Ollama on TrueNAS SCALE inside a Docker container. When I first set it up, I pointed the model storage to a fast NVMe drive because I wanted quick load times. That worked fine until I accumulated about 200GB of models—llama3.1, mistral, codellama, and several others I tested for different tasks.
The problem: my NVMe drive is only 500GB, and I use it for VMs, container images, and system operations. Storing AI models there was eating up space I needed for other things. Meanwhile, I have a ZFS pool with multiple terabytes sitting mostly empty.
The obvious solution was to move the models to ZFS storage. But re-downloading 200GB over my home connection would take days, and I'd already spent time pulling these models. I needed a way to physically move the files without breaking Ollama's internal structure.
How Ollama Actually Stores Models
Before moving anything, I had to understand what I was working with. Ollama doesn't just dump model files into a folder. It uses a specific directory structure:
/models/
├── blobs/
│   └── sha256-[hash files]
└── manifests/
    └── registry.ollama.ai/
        └── library/
            └── [model-name]/
                └── [tag files]
The actual model weights live in the blobs/ directory as content-addressed files—meaning each file is named by its SHA256 hash. The manifests/ directory contains JSON files that tell Ollama which blobs belong to which model and tag.
This matters because you can't just copy one model's files. Multiple models share layers (blobs), so moving them requires preserving the entire structure.
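You can see that sharing for yourself. The helper below is my own sketch (`list_manifest_digests` is a hypothetical name, and it assumes blob digests appear in the manifest JSON as `sha256:<hash>` strings, which matches the OCI-style manifests Ollama uses); run it against a models directory with two related models and you'll usually see overlapping hashes:

```shell
# Sketch: dump which blobs each manifest references.
# $1 is the models directory (the one mounted at /root/.ollama/models).
list_manifest_digests() {
  find "$1/manifests" -type f | while read -r manifest; do
    echo "== $manifest"
    # Each digest here corresponds to a file named sha256-<hash> under blobs/
    grep -o 'sha256:[0-9a-f]\{64\}' "$manifest" | sort -u
  done
}
```

Any hash that shows up under more than one manifest is a shared layer, which is exactly why selective copying breaks things.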
My Migration Process
Step 1: Created a ZFS Dataset
I created a dedicated dataset on my ZFS pool for Ollama models:
zfs create tank/ollama-models
I set the mountpoint to /mnt/tank/ollama-models and made sure the permissions matched what Docker would need (UID 1000 in my case, since that's what the Ollama container runs as).
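The setup boiled down to something like this (a sketch, assuming a pool named `tank`; the `recordsize` tweak is my own optional addition for large sequential blob files, and you should confirm your container's UID with `docker exec ollama id` rather than trusting my 1000):

```shell
# Create the dataset with its mountpoint in one step
zfs create -o mountpoint=/mnt/tank/ollama-models tank/ollama-models

# Optional: model blobs are large files read sequentially,
# so a bigger recordsize can suit them better than the 128K default
zfs set recordsize=1M tank/ollama-models

# Match ownership to the UID/GID the Ollama container runs as
chown 1000:1000 /mnt/tank/ollama-models
```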
Step 2: Stopped the Ollama Container
This is non-negotiable. Copying files while Ollama is running can leave you with incomplete or corrupt blobs, as I found out the hard way later. I stopped the container through TrueNAS SCALE's web UI, but you can also do:
docker stop ollama
Step 3: Copied the Entire Models Directory
I didn't try to be clever about selective copying. I used rsync to move everything:
rsync -avh --progress /mnt/nvme/ollama/models/ /mnt/tank/ollama-models/
The -a (archive) flag recurses and preserves permissions, timestamps, ownership, and symlinks; -h prints human-readable sizes. The --progress flag let me watch the transfer, which took about 45 minutes for 200GB over my internal storage network.
I kept the trailing slash on the source path (models/) to copy the contents, not the directory itself.
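That trailing slash matters more than it looks. You can convince yourself with throwaway directories:

```shell
# Throwaway demo of rsync's trailing-slash rule (safe to run anywhere)
src=$(mktemp -d)
dst1=$(mktemp -d)
dst2=$(mktemp -d)
touch "$src/model.bin"

rsync -a "$src/" "$dst1/"   # "src/" copies the contents: dst1/model.bin
rsync -a "$src" "$dst2/"    # "src" copies the directory itself: dst2/<src-name>/model.bin
```

Without the slash, I would have ended up with a nested models directory and a volume mount pointing one level too high.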
Step 4: Updated the Docker Compose File
My original volume mount looked like this:
volumes:
  - /mnt/nvme/ollama/models:/root/.ollama/models
I changed it to:
volumes:
  - /mnt/tank/ollama-models:/root/.ollama/models
Then recreated the container:
docker-compose up -d
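Put together, the relevant service looks roughly like this (a minimal sketch: the image name, container name, and restart policy are assumptions based on a common `ollama/ollama` setup, and any GPU passthrough config is omitted):

```yaml
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    restart: unless-stopped
    volumes:
      - /mnt/tank/ollama-models:/root/.ollama/models
```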
Step 5: Verified Everything Still Worked
I ran ollama list inside the container to confirm all models showed up:
docker exec ollama ollama list
Every model was there. I tested loading a few to make sure they actually worked and weren't just listed but broken.
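Content-addressing also makes a deeper check possible than just loading models: every blob's filename should match the SHA256 of its contents. Here's a sketch of that check (`verify_blobs` is my own hypothetical helper, assuming the `blobs/sha256-<hash>` naming described above):

```shell
# Sketch: verify each blob file hashes to the digest in its own filename.
# $1 is the models directory containing blobs/
verify_blobs() {
  for blob in "$1"/blobs/sha256-*; do
    expected="${blob##*/sha256-}"                  # hash from the filename
    actual=$(sha256sum "$blob" | cut -d' ' -f1)    # hash of the contents
    if [ "$actual" = "$expected" ]; then
      echo "OK  $blob"
    else
      echo "BAD $blob"
    fi
  done
}
```

Hashing 200GB takes a while, so this is overkill for routine moves, but it's the definitive answer to "did the copy actually work."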
What Didn't Work at First
I initially tried to be smart and only copy the models I actively used. I manually identified their manifest files and the corresponding blobs. This broke immediately because Ollama couldn't find shared layers that other models referenced.
I also made the mistake of not checking permissions on the ZFS dataset. The first time I started the container, it couldn't write to the directory. I had to go back and run:
chown -R 1000:1000 /mnt/tank/ollama-models
Another issue: I forgot to stop the container before copying files. Ollama was still writing to the old location while rsync was running, which caused some blob files to be incomplete. I had to delete the partial copy and start over.
Performance Differences
I was worried moving from NVMe to spinning rust (my ZFS pool uses HDDs with SSD cache) would slow down model loading. In practice, I didn't notice much difference.
Model inference speed is bottlenecked by GPU and RAM, not disk I/O. The initial model load takes maybe 2-3 seconds longer on ZFS compared to NVMe, but once it's in memory, performance is identical.
For my use case—running models for hours at a time—the extra few seconds on load is irrelevant. If you're constantly switching between models, you might care more.
Why I Didn't Use a Management Script
I found a Python script on GitHub (ollama-models-manager) that claims to handle copying and moving models safely. I looked at the code, and it does seem to understand Ollama's structure.
But I didn't use it for two reasons:
- I wanted to understand exactly what was happening to my data. Running an automated script would have worked, but I wouldn't have learned how Ollama's storage actually works.
- For a one-time migration, rsync was simpler. No dependencies, no Python environment to set up, just a single command.
If I were managing models across multiple systems or frequently moving them around, the script might be worth it. For my case, it was overkill.
What I'd Do Differently
If I had to do this again, I would have set up the ZFS dataset from the start. There was no good reason to use NVMe for model storage when I knew I'd eventually accumulate a large library.
I also would have documented the exact UID/GID that the Ollama container uses before starting the migration. I wasted 20 minutes troubleshooting permission errors that I could have avoided by checking this upfront.
One thing I should have done but didn't: verify the rsync transfer with checksums. I trusted that rsync copied everything correctly, and it did, but running rsync --checksum on a second pass would have given me certainty.
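That verification pass can be wrapped as a small helper (`verify_copy` is my own name for it; -c forces rsync to compare file contents by checksum instead of size and timestamp, and -n makes it a dry run that changes nothing):

```shell
# Sketch: dry-run checksum comparison of two directory trees.
# Prints the name of every file whose contents differ; no output
# (beyond possible directory entries) means the copy matches.
verify_copy() {
  rsync -rcn --out-format='%n' "$1/" "$2/"
}

# For this migration, that would have been:
# verify_copy /mnt/nvme/ollama/models /mnt/tank/ollama-models
```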
Key Takeaways
- Ollama's model storage is content-addressed. You can't selectively copy models without breaking references.
- Always stop the Ollama service before moving model files.
- Use rsync with -a to preserve permissions and timestamps.
- Check container UID/GID and match filesystem permissions before starting the container.
- Model inference performance is not meaningfully affected by moving from NVMe to ZFS with HDD storage.
- For one-time migrations, simple tools like rsync are often better than specialized scripts.
The whole process took about an hour, including the time I wasted fixing permission errors. My NVMe drive now has 200GB of free space, and my models are safely stored on redundant ZFS storage where they belong.