How to Monitor Your Homelab with Grafana and Prometheus
You've built your homelab. Maybe it's a single Proxmox node running a handful of containers, or maybe you've gone full rack mode with multiple servers humming away in a closet. Either way, there's one question that eventually hits every homelabber: "What's actually going on with my stuff?"
That's where monitoring comes in. And when it comes to monitoring in 2024, the combo of Prometheus and Grafana is basically the gold standard. It's what the pros use, it's free, and once you set it up, you'll wonder how you ever lived without those pretty graphs.
Let's get you set up.
Why Bother Monitoring?
Before we dive into the how, let's talk about the why. Monitoring your homelab gives you:
- Early warning signs - Catch a failing drive or runaway process before it takes down your Plex server during movie night
- Historical data - See trends over time. Is your RAM usage slowly creeping up? When did that start?
- Capacity planning - Know when you actually need to upgrade versus when you're just being paranoid
- The cool factor - Let's be honest, dashboards look awesome on that spare monitor
The Stack: What We're Building
Here's what each piece does:
- Prometheus - The data collector. It scrapes metrics from your services at regular intervals and stores them in a time-series database. Think of it as the warehouse where all your numbers live.
- Grafana - The visualization layer. It connects to Prometheus and turns those numbers into beautiful, interactive dashboards.
- Node Exporter - A small agent that exposes system metrics (CPU, memory, disk, network) in a format Prometheus understands.
- cAdvisor - Container metrics. If you're running Docker, this tells you what each container is doing.
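To make "a format Prometheus understands" a bit more concrete: exporters serve plain text over HTTP, one sample per line (Node Exporter does this at /metrics on port 9100). Here's a minimal Python sketch that parses a few hand-written sample lines. The values are made up for illustration, and the parser is deliberately tiny (no escaped quotes, no timestamps), so treat it as a way to see the format rather than a real client library.

```python
import re

# Hand-written sample in the Prometheus text exposition format.
# The numbers are illustrative, not taken from a real host.
SAMPLE = """\
# HELP node_memory_MemTotal_bytes Memory information field MemTotal_bytes.
# TYPE node_memory_MemTotal_bytes gauge
node_memory_MemTotal_bytes 8.0e+09
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
node_cpu_seconds_total{cpu="0",mode="user"} 234.5
"""

# metric_name, optional {labels}, then the value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

Each line is just a metric name, optional labels in braces, and a number. Prometheus fetches this page on every scrape and stores each value with a timestamp.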
The Docker Compose Setup
Here's a complete, working docker-compose.yml that you can drop into your homelab and run today. Create a new directory for your monitoring stack:
mkdir ~/monitoring && cd ~/monitoring
Create your docker-compose.yml:
version: '3.8'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
      - '--web.enable-lifecycle'
    ports:
      - "9090:9090"
    networks:
      - monitoring
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=changeme
      - GF_USERS_ALLOW_SIGN_UP=false
    ports:
      - "3000:3000"
    networks:
      - monitoring
    depends_on:
      - prometheus
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    ports:
      - "9100:9100"
    networks:
      - monitoring
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    restart: unless-stopped
    privileged: true
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    ports:
      - "8080:8080"
    networks:
      - monitoring
networks:
  monitoring:
    driver: bridge
volumes:
  prometheus_data:
  grafana_data:
Now create the Prometheus configuration. Make a directory and config file:
mkdir prometheus
Create prometheus/prometheus.yml:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
  # Add more hosts here as your homelab grows
  # - job_name: 'other-server'
  #   static_configs:
  #     - targets: ['192.168.1.50:9100']
Fire it up:
docker compose up -d
Give it a minute to start, then hit http://your-server-ip:3000 and log in with admin/changeme (change this immediately, obviously).
Connecting Grafana to Prometheus
Once you're logged into Grafana:
- Go to Connections > Data sources
- Click Add data source
- Select Prometheus
- For the URL, enter: http://prometheus:9090
- Scroll down and click Save & test
You should see a green "Successfully queried the Prometheus API" message. If you don't, double-check that both containers are on the same Docker network.
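If the test fails and you want to rule out Prometheus itself, you can query its HTTP API directly. The sketch below is Python and makes a few assumptions: the helper names are hypothetical, the sample response is hand-written to match the JSON shape of an instant query, and the base URL is http://localhost:9090 because the prometheus hostname only resolves inside the Docker network, not from your host shell.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API (/api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def targets_up(response_text):
    """Map each target's instance label to True/False from an `up` query result."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("Prometheus returned a non-success status")
    # Instant-query values arrive as [timestamp, "value-as-string"].
    return {
        item["metric"].get("instance", "unknown"): item["value"][1] == "1"
        for item in payload["data"]["result"]
    }


# Hand-written example of the response shape; not captured from a real server.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "instance": "node-exporter:9100",
                        "job": "node-exporter"},
             "value": [1700000000.0, "1"]},
            {"metric": {"__name__": "up", "instance": "cadvisor:8080",
                        "job": "cadvisor"},
             "value": [1700000000.0, "0"]},
        ],
    },
})

print(build_query_url("http://localhost:9090", "up"))
print(targets_up(SAMPLE_RESPONSE))
```

Against a live stack you would fetch the URL (for example with urllib.request.urlopen) and pass the body to targets_up. A target showing False means Prometheus can't scrape it, which narrows the problem down before you start poking at Grafana.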
What Metrics Should You Track?
With Node Exporter and cAdvisor running, you've got access to hundreds of metrics. Here are the ones that actually matter for a homelab:
System Health Basics
- CPU Usage - Both overall and per-core. Helps identify if a single core is getting hammered.
- Memory Usage - Total, used, cached, and available. Linux loves to use RAM for cache, so "used" can be misleading.
- Disk Space - Nothing kills a server faster than a full disk. Set alerts at 80% and 90%.
- Disk I/O - Read/write speeds and IOPS. Useful for spotting bottlenecks.
- Network Traffic - Bytes in/out per interface. Great for seeing which service is hogging your bandwidth.
Container Metrics
- Container CPU - Which containers are working hardest?
- Container Memory - Spot memory leaks before they become problems.
- Container Network - See exactly how much traffic each container generates.
- Container Restarts - A container that keeps restarting is trying to tell you something.
Temperature (If Available)
Node Exporter can expose CPU and other hardware temps if your system supports it. Worth monitoring if your homelab lives somewhere warm or you're pushing overclocked hardware.
Setting Up Your First Dashboard
You could build dashboards from scratch, but why would you? The Grafana community has already done the hard work. Here are some excellent pre-built dashboards you can import in seconds:
Node Exporter Full (Dashboard ID: 1860)
This is the classic. It gives you everything about your host system in one comprehensive view. To import it:
- Go to Dashboards > New > Import
- Enter 1860 in the "Import via grafana.com" field
- Click Load
- Select your Prometheus data source
- Click Import
Boom. Instant professional-looking dashboard with CPU, memory, disk, network, and a ton more.
Docker and System Monitoring (Dashboard ID: 893)
Great for container-focused monitoring. Shows resource usage per container alongside system metrics.
cAdvisor Exporter (Dashboard ID: 14282)
A clean, modern dashboard focused specifically on container metrics from cAdvisor.
Building a Custom Overview Dashboard
Imported dashboards are great, but you'll probably want a custom "at a glance" dashboard for your specific setup. Here are some useful PromQL queries to get you started:
CPU Usage Percentage
100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
Memory Usage Percentage
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100
Disk Usage Percentage (Root Filesystem)
100 - ((node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100)
Network Traffic (Received, bytes per second; replace eth0 with your interface name)
irate(node_network_receive_bytes_total{device="eth0"}[5m])
Container Memory Usage
container_memory_usage_bytes{name!=""}
Container CPU Usage
rate(container_cpu_usage_seconds_total{name!=""}[5m]) * 100
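If the percentage queries feel like magic, the arithmetic behind them is easy to check by hand. This Python sketch mirrors the memory and CPU expressions with made-up numbers. The function names are hypothetical, and simple_rate is a simplification of what PromQL's rate() does (no counter-reset handling, no extrapolation), so treat it as a mental model only.

```python
def memory_used_percent(mem_available_bytes, mem_total_bytes):
    """Mirror of (1 - MemAvailable / MemTotal) * 100."""
    return (1 - mem_available_bytes / mem_total_bytes) * 100


def simple_rate(first_value, last_value, window_seconds):
    """Per-second increase of a counter across a window (simplified)."""
    return (last_value - first_value) / window_seconds


def cpu_used_percent(idle_first, idle_last, window_seconds):
    """Mirror of 100 - (idle rate * 100) for a single core."""
    idle_fraction = simple_rate(idle_first, idle_last, window_seconds)
    return 100 - idle_fraction * 100


# 16 GiB total with 4 GiB available means three quarters is in use.
mem = memory_used_percent(4 * 1024**3, 16 * 1024**3)

# The idle counter grew by 240 s over a 300 s window, so the core
# sat idle for 80% of the window and was busy for the rest.
cpu = cpu_used_percent(1000.0, 1240.0, 300)

print(f"memory used: {mem:.1f}%")
print(f"cpu used: {cpu:.1f}%")
```

The key idea is that node_cpu_seconds_total is a counter of seconds spent in each mode, so its per-second rate is a fraction between 0 and 1. Subtracting the idle fraction from 100% gives you "busy".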
Setting Up Alerts
Pretty graphs are nice, but alerts are where monitoring becomes actually useful. Grafana can send notifications to email, Discord, Slack, Telegram, and dozens of other services when things go wrong.
Some alerts every homelabber should have:
- Disk space above 85% - Gives you time to clean up or add storage
- Memory above 90% for 5+ minutes - Might indicate a memory leak
- Container restart count increasing - Something's crashlooping
- Host unreachable - If you're monitoring multiple machines
- High CPU for extended periods - Could be crypto mining malware or a runaway process
To set up alerts in Grafana, go to Alerting > Alert rules and create rules based on your PromQL queries. Then configure a contact point (Discord webhook, email, etc.) to receive notifications.
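The "for 5+ minutes" part matters more than it looks: it's what stops a brief spike from pinging your phone. Grafana handles this with a pending period on the rule. The Python sketch below is a simplified model of that idea, not Grafana's actual evaluation logic, and the function name and sample data are invented for illustration.

```python
def should_fire(samples, threshold, hold_seconds):
    """Decide whether a sustained-threshold alert should fire.

    samples: list of (timestamp_seconds, value) pairs, sorted by time.
    Fires only if the value has stayed above threshold continuously
    for at least hold_seconds up to the most recent sample.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            # Remember when the current unbroken breach began.
            if breach_start is None:
                breach_start = timestamp
        else:
            # Any dip below the threshold resets the clock.
            breach_start = None
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= hold_seconds


# Memory percentage sampled once a minute (made-up values).
spike = [(0, 95), (60, 50), (120, 95)]
sustained = [(0, 92), (60, 93), (120, 94), (180, 95), (240, 95), (300, 96)]

print(should_fire(spike, threshold=90, hold_seconds=300))
print(should_fire(sustained, threshold=90, hold_seconds=300))
```

The spike never holds long enough to count, while the sustained run has been over the line for the full five minutes. Pick the hold time per alert: a few minutes works for memory or CPU, while disk space usually doesn't need one at all.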
Monitoring Multiple Hosts
Got more than one server? No problem. Just install Node Exporter on each machine and add them to your Prometheus config:
  - job_name: 'proxmox-node'
    static_configs:
      - targets: ['192.168.1.10:9100']
        labels:
          instance: 'proxmox'
  - job_name: 'nas'
    static_configs:
      - targets: ['192.168.1.20:9100']
        labels:
          instance: 'synology'
After updating the config, tell Prometheus to reload:
curl -X POST http://localhost:9090/-/reload
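Once you have more than a couple of hosts, typing these blocks by hand gets error-prone, mostly because YAML is picky about indentation. One option is to generate them. This is a small Python sketch, assuming a hypothetical HOSTS inventory and the same job layout as above; it only prints text for you to paste under scrape_configs and doesn't touch your config file.

```python
def scrape_job(job_name, target, instance_label):
    """Render one static scrape job as YAML text, indented to sit under scrape_configs."""
    return "\n".join([
        f"  - job_name: '{job_name}'",
        "    static_configs:",
        f"      - targets: ['{target}']",
        "        labels:",
        f"          instance: '{instance_label}'",
    ])


# Hypothetical inventory: (job name, host:port, friendly instance label).
HOSTS = [
    ("proxmox-node", "192.168.1.10:9100", "proxmox"),
    ("nas", "192.168.1.20:9100", "synology"),
]

print("\n".join(scrape_job(*host) for host in HOSTS))
```

Paste the output into prometheus/prometheus.yml, then run the reload command.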
Tips for Long-Term Success
- Set reasonable retention - The 30 days configured in the compose file above is fine for most homelabs (Prometheus itself defaults to 15 days if you don't set the flag). Going longer eats disk space fast.
- Don't over-monitor - You don't need 1-second scrape intervals. 15-30 seconds is plenty for homelab use.
- Back up your Grafana - The grafana_data volume holds your dashboards and settings, which is your hard work. Include it in your backup strategy.
- Use labels wisely - Good labeling makes filtering and grouping in Grafana much easier.
- Start simple - Import one dashboard, get comfortable, then expand. You don't need to monitor everything on day one.
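To put rough numbers on the retention and scrape-interval tips, Prometheus' storage documentation offers a sizing rule of thumb: retention in seconds, times samples ingested per second, times bytes per sample (roughly 1 to 2 bytes after compression). Here's that arithmetic as a Python sketch. The series count of 2000 is a placeholder assumption; check your own count in the Prometheus UI, since a busy Docker host with cAdvisor can be much higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimated_disk_bytes(active_series, scrape_interval_seconds,
                         retention_days, bytes_per_sample=2.0):
    """Rough Prometheus disk estimate: retention * samples/sec * bytes/sample."""
    samples_per_second = active_series / scrape_interval_seconds
    retention_seconds = retention_days * SECONDS_PER_DAY
    return retention_seconds * samples_per_second * bytes_per_sample


# Placeholder workload: 2000 series, 15 s scrapes, 30 days of retention.
baseline = estimated_disk_bytes(2000, 15, 30)

# Same workload scraped every second instead of every 15 seconds.
one_second = estimated_disk_bytes(2000, 1, 30)

print(f"15s interval: {baseline / 1e9:.2f} GB")
print(f"1s interval:  {one_second / 1e9:.2f} GB")
```

Disk use scales linearly with retention, with the number of series, and inversely with the scrape interval, so a 1-second interval costs 15 times what a 15-second one does.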
Wrapping Up
Monitoring might seem like overkill for a homelab, but trust me, the first time you catch a problem before it becomes a disaster, you'll be glad you set this up. Plus, there's something deeply satisfying about watching those graphs tick along, knowing exactly what your hardware is doing at any moment.
The Prometheus and Grafana combo gives you enterprise-grade monitoring for free. It scales from a single Raspberry Pi to a full rack of servers. And once you get the basics running, there's a whole world of exporters for specific applications: databases, web servers, smart home devices, and pretty much anything else you can think of.
Now go forth and graph all the things.