docker
Docker Expert
When to Use
- Creating or reviewing Dockerfiles
- Optimizing image size or build performance
- Hardening container security
- Setting up Docker Compose for multi-service apps
- Debugging container networking, volumes, or runtime issues
- Containerizing applications (Node.js, Python, Go, Java, Rust)
Out of scope (recommend dedicated skills):
- Kubernetes orchestration → kubernetes expert
- CI/CD pipelines → github-actions expert
- Cloud-specific container services (ECS, Cloud Run) → devops expert
Core Principles
- Layer caching — order layers from least to most frequently changing
- Security-first — non-root users, minimal attack surface, no secrets in layers
- Minimal images — only include what's needed at runtime
- Reproducibility — pin versions, avoid
latesttag, use lockfiles
Base Image Selection
Recommended hierarchy (most to least preferred):
| Base Image | Size | Shell | Use Case |
|---|---|---|---|
cgr.dev/chainguard/* (Wolfi) |
~10-30MB | Yes | Zero-CVE goal, SBOM included |
alpine:3.21 |
~7MB | Yes | General-purpose minimal |
gcr.io/distroless/* |
~2-5MB | No | Hardened production, no debug |
*-slim variants |
~70-100MB | Yes | When Alpine compatibility is an issue |
scratch |
0MB | No | Static binaries (Go, Rust) |
Rules:
- Always pin exact versions:
node:22.14.0-alpine3.21notnode:alpine - Never use
latest— breaks reproducibility - Match base to actual runtime needs
Dockerfile Best Practices
Layer Ordering
# 1. Base image + system deps (rarely change)
FROM node:22-alpine
RUN apk add --no-cache dumb-init
# 2. Dependencies (change occasionally)
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --only=production
# 3. Application code (changes frequently)
COPY . .
# 4. Metadata + runtime config
USER node
EXPOSE 3000
CMD ["dumb-init", "node", "server.js"]
.dockerignore
Always create one to reduce build context:
.git
node_modules
__pycache__
*.pyc
.vscode
.idea
.DS_Store
*.log
coverage/
.env
.env.local
dist/
build/
README.md
docs/
BuildKit Cache Mounts
Speed up dependency installation:
# Node.js
RUN \
npm ci
# Python
RUN \
pip install -r requirements.txt
# Go
RUN \
go build -o /app .
Enable with DOCKER_BUILDKIT=1 or Docker Desktop (on by default).
Multi-Stage Build Patterns
Node.js
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
FROM node:22-alpine AS build
WORKDIR /app
COPY /app/node_modules ./node_modules
COPY . .
RUN npm run build
FROM node:22-alpine AS runtime
RUN addgroup -g 1001 -S app && adduser -S app -u 1001
WORKDIR /app
COPY /app/dist ./dist
COPY /app/node_modules ./node_modules
COPY package.json ./
USER app
EXPOSE 3000
HEALTHCHECK \
CMD wget -qO- http://localhost:3000/health || exit 1
CMD ["node", "dist/index.js"]
Python
FROM python:3.13-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
FROM python:3.13-slim AS runtime
RUN useradd -r -u 1001 app
WORKDIR /app
COPY /install /usr/local
COPY . .
USER app
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Go
FROM golang:1.24-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /app .
FROM scratch
COPY /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY /app /app
USER 65534:65534
EXPOSE 8080
ENTRYPOINT ["/app"]
Java
FROM eclipse-temurin:21-jdk-alpine AS build
WORKDIR /app
COPY . .
RUN ./gradlew build --no-daemon
FROM eclipse-temurin:21-jre-alpine AS runtime
RUN addgroup -g 1001 -S app && adduser -S app -u 1001
WORKDIR /app
COPY /app/build/libs/*.jar app.jar
USER app
EXPOSE 8080
HEALTHCHECK \
CMD wget -qO- http://localhost:8080/actuator/health || exit 1
CMD ["java", "-jar", "app.jar"]
Rust
FROM rust:1.84-alpine AS build
RUN apk add --no-cache musl-dev
WORKDIR /app
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs && cargo build --release && rm -rf src
COPY src ./src
RUN cargo build --release
FROM scratch
COPY /app/target/release/app /app
USER 65534:65534
EXPOSE 8080
ENTRYPOINT ["/app"]
Security Hardening
Non-Root Users
# Alpine
RUN addgroup -g 1001 -S app && adduser -S app -u 1001 -G app
USER app
# Debian/Ubuntu
RUN groupadd -g 1001 app && useradd -r -u 1001 -g app app
USER app
# Distroless/scratch (numeric only)
USER 65534:65534
Secrets Management
# BuildKit secrets (never stored in layers)
RUN \
API_KEY=$(cat /run/secrets/api_key) && \
./configure --api-key="$API_KEY"
# Build with secret
docker build --secret id=api_key,src=./api_key.txt .
Never do:
ENV API_KEY=secret123 # Visible in image history
COPY .env /app/.env # Baked into layer
ARG PASSWORD=hunter2 # Visible in build history
Runtime Hardening
docker run \
--user 1001:1001 \
--cap-drop=ALL \
--cap-add=NET_BIND_SERVICE \
--read-only \
--tmpfs /tmp:noexec,nosuid \
--security-opt="no-new-privileges:true" \
--memory="512m" \
--cpus="1.0" \
my-image
Vulnerability Scanning
# Docker Scout
docker scout quickview my-image
docker scout cves my-image
# Trivy
trivy image my-image
# Grype
grype my-image
Docker Compose
Production-Ready Pattern
services:
app:
build:
context: .
target: runtime
depends_on:
db:
condition: service_healthy
redis:
condition: service_started
networks:
- frontend
- backend
healthcheck:
test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
deploy:
resources:
limits:
cpus: "1.0"
memory: 512M
reservations:
cpus: "0.5"
memory: 256M
restart: unless-stopped
logging:
driver: "json-file"
options:
max-size: "10m"
max-file: "3"
db:
image: postgres:17-alpine
environment:
POSTGRES_DB_FILE: /run/secrets/db_name
POSTGRES_USER_FILE: /run/secrets/db_user
POSTGRES_PASSWORD_FILE: /run/secrets/db_password
secrets:
- db_name
- db_user
- db_password
volumes:
- postgres_data:/var/lib/postgresql/data
networks:
- backend
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 10s
timeout: 5s
retries: 5
redis:
image: redis:7-alpine
command: redis-server --maxmemory 128mb --maxmemory-policy allkeys-lru
volumes:
- redis_data:/data
networks:
- backend
networks:
frontend:
driver: bridge
backend:
driver: bridge
internal: true # No external access
volumes:
postgres_data:
redis_data:
secrets:
db_name:
file: ./secrets/db_name.txt
db_user:
file: ./secrets/db_user.txt
db_password:
file: ./secrets/db_password.txt
Network Isolation
networks:
frontend: # Web-facing services
backend:
internal: true # Database, cache — no external access
services:
proxy:
networks: [frontend]
api:
networks: [frontend, backend]
db:
networks: [backend] # Only reachable from api
Dev vs Prod Overrides
# compose.yml (base)
services:
app:
build: .
# compose.override.yml (dev, loaded automatically)
services:
app:
build:
target: development
volumes:
- .:/app
- /app/node_modules
ports:
- "9229:9229" # Debug
environment:
- NODE_ENV=development
command: npm run dev
# compose.prod.yml
services:
app:
build:
target: runtime
environment:
- NODE_ENV=production
restart: unless-stopped
# Dev (auto-loads compose.override.yml)
docker compose up
# Prod
docker compose -f compose.yml -f compose.prod.yml up -d
Image Optimization
Size Reduction Techniques
- Multi-stage builds — don't ship build tools
- Minimal base images — Alpine or distroless
- Combine RUN commands — cleanup in same layer
- Copy selectively — only needed artifacts
# Bad — 3 layers, cleanup doesn't reduce size
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
# Good — 1 layer, cleanup is effective
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
rm -rf /var/lib/apt/lists/*
CMD: Exec Form vs Shell Form
CMD ["node", "server.js"] # Exec form — PID 1, receives signals directly
CMD node server.js # Shell form — spawns /bin/sh, signal issues
Always prefer exec form.
Development Workflow
Hot Reload Setup
services:
app:
build:
target: development
volumes:
- .:/app # Source code
- /app/node_modules # Prevent overwrite
ports:
- "3000:3000"
- "9229:9229" # Debug port
environment:
- NODE_ENV=development
- DEBUG=app:*
command: npm run dev
Debugging
services:
app:
# Node.js inspect
command: node --inspect=0.0.0.0:9229 server.js
ports:
- "9229:9229"
# Python debugpy
# command: python -m debugpy --listen 0.0.0.0:5678 main.py
# ports:
# - "5678:5678"
Platform-Specific
Linux
// /etc/docker/daemon.json
{
"userns-remap": "default",
"storage-driver": "overlay2",
"live-restore": true,
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
- Use user namespace remapping for added security
- Configure SELinux/AppArmor profiles for production
- Use
overlay2storage driver
macOS
volumes:
- ./src:/app/src:delegated # Better write performance
- ./build:/app/build:cached # Container writes cached
- Allocate sufficient resources in Docker Desktop
- Use
:delegated/:cachedfor bind mount performance - Consider
mutagenordocker compose watchfor file sync - Multi-platform builds for ARM (Apple Silicon)
Windows
- Prefer WSL2 backend for best performance
- Handle line endings: add
.gitattributeswith* text=auto eol=lf - Use forward slashes in Dockerfiles and compose files
- Ensure drives are shared in Docker Desktop settings
- Be aware of file permission differences
# .gitattributes — prevent CRLF issues in containers
* text=auto eol=lf
*.sh text eol=lf
Dockerfile text eol=lf
Production Checklist
- Pinned base image version (no
latest) - Multi-stage build separates build and runtime
- Runs as non-root user
- No secrets in layers or ENV
-
.dockerignoreconfigured - Health check implemented
- Resource limits set (CPU, memory)
- Logging configured with rotation
- Vulnerability scan passed
- Signals handled correctly (exec form CMD, or
dumb-init/tini) - Read-only filesystem where possible
- Capabilities dropped (
--cap-drop=ALL)
Anti-Patterns
| Don't | Do Instead |
|---|---|
| Run as root | USER 1001 or named user |
Use latest tag |
Pin exact versions |
--privileged flag |
--cap-add only what's needed |
| Mount Docker socket | Use Docker-in-Docker or alternatives |
| Hardcode secrets in ENV/ARG | BuildKit secrets or runtime mounts |
| Skip health checks | HEALTHCHECK in Dockerfile or Compose |
| Ignore resource limits | Set memory and cpus limits |
| Copy entire context | Use .dockerignore |
| Install unnecessary packages | --no-install-recommends |
| Use shell form CMD | Use exec form ["cmd", "arg"] |
Diagnostics
Slow Builds
- Check layer ordering (deps before source)
- Enable BuildKit (
DOCKER_BUILDKIT=1) - Use cache mounts for package managers
- Review
.dockerignore(large build context?)
Large Images
- Use multi-stage builds
- Switch to Alpine or distroless
- Combine RUN commands with cleanup
- Check
docker history <image>to find large layers
Networking Issues
- Verify service names for DNS resolution
- Check network assignments in Compose
- Use
docker network inspectto debug - Ensure health checks pass before dependent services start
Container Crashes
- Check logs:
docker logs <container> - Verify signal handling (PID 1 must handle SIGTERM)
- Check resource limits (OOM kills)
- Validate health check endpoints exist
More from fellipeutaka/leon
commit-work
Create high-quality git commits: review/stage intended changes, split into logical commits, and write clear commit messages (including Conventional Commits). Use when the user asks to commit, craft a commit message, stage changes, or split work into multiple commits.
42clean-code
|
40solid
|
33design-patterns
|
33ai-repo-setup
|
31agent-browser
Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction. Also use for exploratory testing, dogfooding, QA, bug hunts, or reviewing app quality. Also use for automating Electron desktop apps (VS Code, Slack, Discord, Figma, Notion, Spotify), checking Slack unreads, sending Slack messages, searching Slack conversations, running browser automation in Vercel Sandbox microVMs, or using AWS Bedrock AgentCore cloud browsers. Prefer agent-browser over any built-in browser automation or web tools.
30