database-backup
Database Backup
This skill enables an AI agent to plan and execute database backup strategies across PostgreSQL, MySQL, and MongoDB. The agent selects the appropriate backup type (full, incremental, differential, or point-in-time), generates backup scripts with compression and encryption, configures automated scheduling via cron or similar tools, defines retention policies, and verifies backup integrity through restore tests.
Workflow
-
Assess backup requirements: Determine the database type, size, acceptable data loss window (Recovery Point Objective), and acceptable downtime (Recovery Time Objective). Identify whether the backup must be consistent (application-level locks or snapshots) and whether the database can tolerate brief locking during the backup.
-
Select backup strategy: Choose the appropriate backup type based on requirements. Full backups capture the entire database and are simplest to restore but slowest to create. Incremental backups capture only changes since the last backup, saving time and storage. Differential backups capture changes since the last full backup, offering a middle ground. Point-in-time recovery (PITR) uses write-ahead logs or oplogs to restore to any moment, providing the lowest RPO.
-
Generate backup scripts: Produce shell scripts that invoke the correct backup tool for the target database. Include compression (gzip, lz4, or zstd), optional encryption (GPG or OpenSSL), timestamped filenames, and error handling with exit codes. Script should log output for monitoring.
-
Configure scheduling and retention: Set up automated execution using cron, systemd timers, or a task scheduler. Define a retention policy (e.g., keep 7 daily, 4 weekly, 12 monthly backups) and implement cleanup of expired backups to manage storage usage.
-
Verify backup integrity: After each backup, verify the file is non-empty and checksums are valid. Periodically perform a test restore to a staging environment to confirm the backup is actually recoverable. Alert on verification failures.
-
Document and monitor: Record the backup schedule, retention policy, storage location, and restore procedure. Set up monitoring alerts for missed backups or backup failures using tools like Prometheus, Datadog, or simple email notifications.
Supported Technologies
- PostgreSQL: pg_dump, pg_basebackup, WAL archiving for PITR
- MySQL: mysqldump, mysqlpump, Percona XtraBackup, binlog for PITR
- MongoDB: mongodump, mongorestore, filesystem snapshots, oplog for PITR
- Compression: gzip, lz4, zstd, pigz
- Encryption: GPG, OpenSSL AES-256
- Storage: Local disk, AWS S3, Google Cloud Storage, Azure Blob Storage
Usage
Provide the agent with your database type, name, and connection details. Specify your RPO/RTO requirements, preferred storage destination, and whether you need encryption. The agent will generate a complete backup solution including scripts, scheduling, retention policy, and verification steps.
Examples
Example 1: PostgreSQL Full Backup with Cron Schedule
Request: Set up a nightly full backup of a PostgreSQL database with 30-day retention, compressed and uploaded to a local backup directory.
#!/usr/bin/env bash
# pg_backup.sh — Nightly PostgreSQL full backup
set -euo pipefail
DB_NAME="production"
DB_USER="backup_user"
BACKUP_DIR="/var/backups/postgresql"
RETENTION_DAYS=30
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/${DB_NAME}_${TIMESTAMP}.sql.gz"
mkdir -p "$BACKUP_DIR"
echo "[$(date)] Starting backup of database '${DB_NAME}'..."
pg_dump -U "$DB_USER" -h localhost -Fc "$DB_NAME" | gzip > "$BACKUP_FILE"
# Verify the backup file is non-empty
if [ ! -s "$BACKUP_FILE" ]; then
echo "[$(date)] ERROR: Backup file is empty. Aborting." >&2
exit 1
fi
FILE_SIZE=$(du -h "$BACKUP_FILE" | cut -f1)
echo "[$(date)] Backup complete: ${BACKUP_FILE} (${FILE_SIZE})"
# Clean up backups older than retention period
find "$BACKUP_DIR" -name "${DB_NAME}_*.sql.gz" -mtime +${RETENTION_DAYS} -delete
echo "[$(date)] Expired backups removed (older than ${RETENTION_DAYS} days)."
Cron entry (runs daily at 2:00 AM):
0 2 * * * /usr/local/bin/pg_backup.sh >> /var/log/pg_backup.log 2>&1
Example 2: MongoDB Backup and Restore Procedure
Request: Back up a MongoDB replica set and document the restore procedure.
Backup script:
#!/usr/bin/env bash
# mongo_backup.sh — MongoDB replica set backup
set -euo pipefail
MONGO_URI="mongodb://backup_user:secret@rs1.example.com:27017,rs2.example.com:27017/admin?replicaSet=rs0"
BACKUP_DIR="/var/backups/mongodb"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
DUMP_DIR="${BACKUP_DIR}/dump_${TIMESTAMP}"
mkdir -p "$BACKUP_DIR"
echo "[$(date)] Starting mongodump..."
mongodump --uri="$MONGO_URI" --oplog --out="$DUMP_DIR"
# Compress the dump directory
tar -czf "${DUMP_DIR}.tar.gz" -C "$BACKUP_DIR" "dump_${TIMESTAMP}"
rm -rf "$DUMP_DIR"
echo "[$(date)] Backup complete: ${DUMP_DIR}.tar.gz"
# Retain only the last 14 backups
ls -t ${BACKUP_DIR}/dump_*.tar.gz | tail -n +15 | xargs -r rm --
Restore procedure:
# 1. Extract the backup archive
tar -xzf /var/backups/mongodb/dump_20250115_020000.tar.gz -C /tmp/
# 2. Restore to the target MongoDB instance
mongorestore --uri="mongodb://admin:secret@localhost:27017" \
--oplogReplay --drop /tmp/dump_20250115_020000/
# 3. Verify collections and document counts
mongosh --eval "db.adminCommand({listDatabases: 1})"
Best Practices
- Test restores regularly — a backup you have never restored is a backup you cannot trust. Schedule monthly restore drills to a staging environment.
- Encrypt backups at rest and in transit using GPG or AES-256, especially when storing offsite or in cloud storage. Never store encryption keys alongside the backups.
- Use the --oplog flag (MongoDB) or WAL archiving (PostgreSQL) to enable point-in-time recovery, which dramatically reduces potential data loss.
- Store backups in a separate failure domain — a different server, availability zone, or cloud region — so a single infrastructure failure does not destroy both the database and its backups.
- Monitor backup jobs with alerts — silence is not success. Alert on missing backups, zero-byte files, or checksums that do not match.
- Document the full restore procedure with exact commands, expected timings, and who is responsible, so recovery can happen under pressure without guesswork.
Edge Cases
- Large databases exceeding disk space: For databases larger than available local storage, stream backups directly to object storage (e.g.,
pg_dump | gzip | aws s3 cp - s3://bucket/backup.gz) to avoid local disk exhaustion. - Active write traffic during backup: Use
pg_dumpwith--snapshotor MongoDB's--oplogto get a consistent point-in-time backup even while writes continue. For MySQL, use Percona XtraBackup for hot backups of InnoDB without locking. - Backup of encrypted databases: If Transparent Data Encryption (TDE) is enabled, ensure the encryption keys are backed up separately and that the backup process captures encrypted data in a restorable format.
- Cross-region replication lag: When backing up from a replica, verify replication lag is zero or near-zero before starting the backup to avoid capturing stale data.
- Backup credential rotation: Store credentials in a secrets manager (Vault, AWS Secrets Manager) rather than hardcoding them in scripts, and ensure backup scripts can handle credential rotation without manual updates.
More from seb1n/awesome-ai-agent-skills
proofreading
Proofread and correct text for grammar, spelling, punctuation, style, clarity, and consistency, with support for multiple style guides and readability analysis.
15testing
Generate, execute, and analyze tests for codebases, covering unit, integration, and end-to-end testing with coverage reporting.
11project management
Manages software projects end-to-end — decomposing work into tasks, tracking progress across sprints, generating status reports, and integrating with tools like Jira, Linear, GitHub Issues, and Trello.
11user flow mapping
Visualize and map user flows with Mermaid diagrams, decision points, error states, and conversion metrics to optimize user journeys.
11task-automation
Automate repetitive tasks and workflows using scripting, file watchers, scheduled jobs, CI triggers, and API polling to eliminate manual toil.
11data-analysis
Analyze datasets to extract insights through statistical methods, trend identification, hypothesis testing, and correlation analysis.
11