Docker Compose Best Practices for Self-Hosting

If you've been self-hosting for a while, you've probably accumulated a messy collection of docker-compose files scattered across random directories. Maybe you've got secrets hardcoded in YAML files, networks you don't remember creating, and containers that crash silently without anyone noticing.

Let's fix that. Here's how to structure your Docker Compose setup like someone who actually knows what they're doing.

Folder Structure That Doesn't Suck

The single biggest improvement you can make is organizing your files consistently. Here's a structure that scales from a few services to dozens:

~/docker/
├── traefik/
│   ├── docker-compose.yml
│   ├── .env
│   └── config/
│       └── traefik.yml
├── nextcloud/
│   ├── docker-compose.yml
│   ├── .env
│   └── data/
├── jellyfin/
│   ├── docker-compose.yml
│   ├── .env
│   └── config/
└── monitoring/
    ├── docker-compose.yml
    ├── .env
    └── prometheus/
        └── prometheus.yml

Each service gets its own directory. Each directory contains the compose file, an env file, and any config or data subdirectories. This makes it dead simple to back up a service, move it to another server, or nuke it entirely.

Some people prefer a single monolithic compose file with everything in it. That works until it doesn't. When you need to restart just Jellyfin without touching your database containers, you'll wish you'd split things up.
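A nice side effect of one-directory-per-service is that bulk operations become trivial to script. Here's a minimal sketch, assuming the ~/docker layout above and that every subdirectory holds its own docker-compose.yml:

```shell
#!/bin/sh
# Run a docker compose command in every service directory under a root.
# Usage: compose_all ~/docker pull
#        compose_all ~/docker up -d
compose_all() {
  root="$1"
  shift
  for dir in "$root"/*/; do
    # Skip directories that don't contain a compose file.
    [ -f "$dir/docker-compose.yml" ] || continue
    echo "==> $(basename "$dir")"
    (cd "$dir" && docker compose "$@")
  done
}
```

One call with pull updates every image; a second call with up -d recreates anything that changed. Since each directory is self-contained, a service that fails to update doesn't block the others.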

Environment Files: Stop Hardcoding Secrets

If your compose file contains actual passwords, fix that today. Docker Compose automatically loads variables from a .env file in the project directory, which by default is the directory containing your compose file.

Here's a typical .env file for Nextcloud:

# .env
MYSQL_ROOT_PASSWORD=supersecretpassword
MYSQL_PASSWORD=nextclouddbpass
MYSQL_DATABASE=nextcloud
MYSQL_USER=nextcloud
NEXTCLOUD_ADMIN_USER=admin
NEXTCLOUD_ADMIN_PASSWORD=adminpassword
NEXTCLOUD_TRUSTED_DOMAINS=cloud.example.com

And the compose file references these variables:

services:
  db:
    image: mariadb:10.11
    environment:
      - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
      - MYSQL_PASSWORD=${MYSQL_PASSWORD}
      - MYSQL_DATABASE=${MYSQL_DATABASE}
      - MYSQL_USER=${MYSQL_USER}
    volumes:
      - ./db:/var/lib/mysql

  nextcloud:
    image: nextcloud:latest
    environment:
      - MYSQL_HOST=db
      - MYSQL_DATABASE=${MYSQL_DATABASE}
      - MYSQL_USER=${MYSQL_USER}
      - MYSQL_PASSWORD=${MYSQL_PASSWORD}
      - NEXTCLOUD_ADMIN_USER=${NEXTCLOUD_ADMIN_USER}
      - NEXTCLOUD_ADMIN_PASSWORD=${NEXTCLOUD_ADMIN_PASSWORD}
      - NEXTCLOUD_TRUSTED_DOMAINS=${NEXTCLOUD_TRUSTED_DOMAINS}
    volumes:
      - ./data:/var/www/html
    depends_on:
      - db

Add .env to your .gitignore immediately. Commit a .env.example file with placeholder values so you remember what variables are needed.
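A .env.example for the file above might look like this (the values are placeholders, not defaults):

```shell
# .env.example -- copy to .env and fill in real values
MYSQL_ROOT_PASSWORD=changeme
MYSQL_PASSWORD=changeme
MYSQL_DATABASE=nextcloud
MYSQL_USER=nextcloud
NEXTCLOUD_ADMIN_USER=admin
NEXTCLOUD_ADMIN_PASSWORD=changeme
NEXTCLOUD_TRUSTED_DOMAINS=cloud.example.com
```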

Docker Secrets for Sensitive Data

Environment variables are fine for most things, but they're visible in docker inspect output and get inherited by every process inside the container. For truly sensitive data, Docker secrets are better.

services:
  db:
    image: postgres:15
    environment:
      - POSTGRES_PASSWORD_FILE=/run/secrets/db_password
    secrets:
      - db_password

secrets:
  db_password:
    file: ./secrets/db_password.txt

Many official images support the _FILE suffix convention, which reads the secret from a file instead of an environment variable. Postgres, MySQL, MariaDB, and others all support this.
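Under the hood, the entrypoint scripts of these images resolve the variable with a small helper. This is a simplified sketch of that pattern; the real docker-library scripts differ in the details:

```shell
#!/bin/sh
# Simplified sketch of the _FILE resolution pattern used by official images.
# file_env VAR reads the file named by VAR_FILE if set, else falls back to VAR.
file_env() {
  var="$1"
  eval "val=\${$var:-}"
  eval "file=\${${var}_FILE:-}"
  if [ -n "$val" ] && [ -n "$file" ]; then
    echo "error: both $var and ${var}_FILE are set" >&2
    return 1
  fi
  if [ -n "$file" ]; then
    val="$(cat "$file")"
  fi
  export "$var=$val"
}
```

This is also why setting both POSTGRES_PASSWORD and POSTGRES_PASSWORD_FILE is an error in the official Postgres image: the entrypoint refuses to guess which one you meant.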

Keep your secrets directory outside of version control and set restrictive permissions:

chmod 600 ./secrets/*
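To create the secret file in the first place, something like this works (assuming openssl is available; the tr strips the trailing newline, which some images would otherwise treat as part of the password):

```shell
#!/bin/sh
# Generate a random database password and store it with tight permissions.
mkdir -p ./secrets
umask 077                            # new files readable by owner only
openssl rand -base64 32 | tr -d '\n' > ./secrets/db_password.txt
chmod 600 ./secrets/db_password.txt  # belt and braces if the file already existed
```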

Networking: Use Custom Networks

The default bridge network works, but custom networks give you DNS resolution between containers and better isolation. Create networks for logical groupings:

networks:
  frontend:
    name: frontend
  backend:
    name: backend

services:
  traefik:
    image: traefik:v3.0
    networks:
      - frontend

  app:
    image: myapp:latest
    networks:
      - frontend
      - backend

  db:
    image: postgres:15
    networks:
      - backend

In this setup, Traefik can reach the app, and the app can reach the database, but Traefik cannot directly access the database. Your database has no reason to be on the same network as your reverse proxy.

For services that need to communicate across different compose files, use external networks:

# Create the network once
docker network create proxy

# In traefik/docker-compose.yml
networks:
  proxy:
    external: true

# In any other service
networks:
  proxy:
    external: true

Labels: The Secret Sauce for Traefik

If you're using Traefik (and you probably should be), labels are how you configure routing without touching Traefik's config files. Here's a real example for Vaultwarden:

services:
  vaultwarden:
    image: vaultwarden/server:latest
    environment:
      - DOMAIN=https://vault.example.com
    volumes:
      - ./data:/data
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.vaultwarden.rule=Host(`vault.example.com`)"
      - "traefik.http.routers.vaultwarden.entrypoints=websecure"
      - "traefik.http.routers.vaultwarden.tls.certresolver=letsencrypt"
      - "traefik.http.services.vaultwarden.loadbalancer.server.port=80"
    networks:
      - proxy

Labels also work great for container management tools. Here's how to add metadata for Portainer or Homepage:

labels:
  - "homepage.group=Media"
  - "homepage.name=Jellyfin"
  - "homepage.icon=jellyfin.png"
  - "homepage.href=https://jellyfin.example.com"
  - "homepage.description=Media streaming"

Healthchecks: Know When Things Break

Containers can be "running" while completely broken. A healthcheck lets Docker (and you) know if a service is actually working.

services:
  postgres:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s

  redis:
    image: redis:7
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3

  nginx:
    image: nginx:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost"]
      interval: 30s
      timeout: 10s
      retries: 3

The start_period is important for services that take a while to initialize. During this period, failed health checks don't count against the retry limit.
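You can also query the health state from the command line. A tiny helper (the container name is whatever you've called yours):

```shell
#!/bin/sh
# Print a container's health status: "healthy", "unhealthy", or "starting".
# Containers without a healthcheck have no .State.Health at all.
health_status() {
  docker inspect --format '{{.State.Health.Status}}' "$1"
}
```

When a check keeps failing, docker inspect --format '{{json .State.Health}}' <name> dumps the recent probe results including their output, which usually tells you why.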

You can also use healthchecks to control startup order more reliably than depends_on alone:

services:
  app:
    image: myapp:latest
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy

Now your app won't start until both the database and Redis are actually ready, not just running.

Putting It All Together

Here's a complete example for a typical self-hosted app stack with all these practices combined:

services:
  immich-server:
    image: ghcr.io/immich-app/immich-server:release
    environment:
      - DB_HOSTNAME=immich-db
      - DB_USERNAME=${DB_USERNAME}
      - DB_PASSWORD=${DB_PASSWORD}
      - DB_DATABASE_NAME=${DB_DATABASE}
      - REDIS_HOSTNAME=immich-redis
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
    depends_on:
      immich-db:
        condition: service_healthy
      immich-redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3001/api/server-info/ping"]
      interval: 30s
      timeout: 10s
      retries: 3
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.immich.rule=Host(`photos.example.com`)"
      - "traefik.http.routers.immich.entrypoints=websecure"
      - "traefik.http.routers.immich.tls.certresolver=letsencrypt"
      - "traefik.http.services.immich.loadbalancer.server.port=3001"
    networks:
      - proxy
      - immich-internal
    restart: unless-stopped

  immich-db:
    image: tensorchord/pgvecto-rs:pg14-v0.2.0
    environment:
      - POSTGRES_PASSWORD_FILE=/run/secrets/db_password
      - POSTGRES_USER=${DB_USERNAME}
      - POSTGRES_DB=${DB_DATABASE}
    volumes:
      - ./postgres:/var/lib/postgresql/data
    secrets:
      - db_password
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USERNAME} -d ${DB_DATABASE}"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    networks:
      - immich-internal
    restart: unless-stopped

  immich-redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
    networks:
      - immich-internal
    restart: unless-stopped

networks:
  proxy:
    external: true
  immich-internal:
    internal: true

secrets:
  db_password:
    file: ./secrets/db_password.txt

Quick Wins You Can Implement Today

  • Add restart: unless-stopped to every service. Your containers will survive reboots.
  • Pin your image versions. Using :latest is fine for testing, but postgres:15.4 is better for production.
  • Set resource limits with mem_limit and cpus for greedy applications.
  • Use docker compose logs -f servicename to tail logs when debugging.
  • Run docker compose config to validate your compose file and see the fully interpolated result.
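The resource-limit bullet deserves a concrete example. Outside Swarm mode, Compose accepts the short form directly on the service; the numbers below are illustrative, not recommendations:

```yaml
services:
  jellyfin:
    image: jellyfin/jellyfin:latest   # pin a real tag here, per the bullet above
    mem_limit: 2g                     # hard cap: container is OOM-killed past this
    cpus: 2.0                         # at most two CPUs' worth of time
    restart: unless-stopped
```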

Wrapping Up

Good Docker Compose hygiene isn't about being pedantic. It's about making your life easier at 2 AM when something breaks and you need to figure out what's wrong. A well-organized setup with proper healthchecks, isolated networks, and externalized configuration is one you can actually maintain.

Start with one service and get it right. Then apply the same patterns to everything else. Future you will be grateful.