
Disaster Recovery

Disaster recovery (DR) is about getting back online fast. Whether your server crashed, was compromised, or a bad deployment broke the site, rsync-based backups let you restore services quickly. How quickly depends mostly on how much data you have and how fast you can copy it.

Recovery Scenarios

| Scenario | What Failed | Recovery Method |
| --- | --- | --- |
| Server crash | Hardware/OS failure | Restore files + DB to new server |
| Bad deployment | Code broke the site | Restore from last known-good backup |
| Security breach | Site hacked/malware | Restore from pre-compromise backup |
| Accidental deletion | Files/DB deleted by mistake | Restore specific files from snapshot |
| Data center failure | Entire facility down | Failover to standby or rebuild from offsite |

Restoring from Backups

Restore Full Site (Files + Database)

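# Step 1: Restore files from backup
# (add --delete to also remove files that are not in the backup, e.g. after a breach; preview with -n first)
rsync -av /backup/2024-01-15/files/ /var/www/html/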

# Step 2: Restore database
gunzip < /backup/2024-01-15/databases.sql.gz | mysql -u root -p

# Step 3: Fix permissions
sudo chown -R www-data:www-data /var/www/html

# Step 4: Restart web server
sudo systemctl restart nginx # or apache2, openlitespeed, etc.
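
Before running the steps above, it can help to confirm that the backup itself is usable. The following is a minimal sketch, assuming the layout used on this page (a `files/` directory and a `databases.sql.gz` dump inside each backup); `verify_backup` is a hypothetical helper name, not part of rsync or MySQL.

```shell
# verify_backup DIR: sanity-check a backup directory before restoring from it.
# Assumes DIR/files/ and DIR/databases.sql.gz, as in the examples on this page.
verify_backup() {
    local backup_dir="$1"
    # The files tree must exist and contain at least one entry.
    if [ ! -d "$backup_dir/files" ] || [ -z "$(ls -A "$backup_dir/files")" ]; then
        echo "missing or empty: $backup_dir/files" >&2
        return 1
    fi
    # gzip -t reads the whole archive and fails on truncation or corruption.
    if ! gzip -t "$backup_dir/databases.sql.gz" 2>/dev/null; then
        echo "corrupt or missing: $backup_dir/databases.sql.gz" >&2
        return 1
    fi
    echo "backup looks intact: $backup_dir"
}
```

For example, `verify_backup /backup/2024-01-15 && rsync -av /backup/2024-01-15/files/ /var/www/html/` only starts the restore when both checks pass. This confirms the archive is readable, not that the SQL inside it is complete.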

Restore from Incremental Snapshot

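With --link-dest backups, every snapshot appears as a complete copy (unchanged files are hard links to the previous snapshot), so you can restore directly from any date: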

# List available snapshots
ls -la /backup/
# 2024-01-13/
# 2024-01-14/
# 2024-01-15/
# latest -> 2024-01-15

# Restore from any date
rsync -av /backup/2024-01-14/files/ /var/www/html/
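
When you need the last snapshot taken before a known date (for example, the day a compromise happened), picking the directory by hand is error-prone. Here is a small sketch, assuming snapshots are directories named `YYYY-MM-DD` as in the listing above; `snapshot_on_or_before` is a hypothetical helper.

```shell
# snapshot_on_or_before ROOT DATE: print the newest YYYY-MM-DD snapshot
# directory under ROOT that is dated on or before DATE.
snapshot_on_or_before() {
    local root="$1" cutoff best="" best_key=0 dir name key
    # Drop the dashes so dates compare as plain integers (2024-01-14 -> 20240114).
    cutoff=$(printf '%s' "$2" | tr -d '-')
    for dir in "$root"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
        [ -d "$dir" ] || continue
        name="${dir##*/}"
        key=$(printf '%s' "$name" | tr -d '-')
        if [ "$key" -le "$cutoff" ] && [ "$key" -gt "$best_key" ]; then
            best="$name"
            best_key="$key"
        fi
    done
    # Fail if no snapshot is old enough.
    [ -n "$best" ] || return 1
    printf '%s\n' "$best"
}
```

A restore could then look like `snap=$(snapshot_on_or_before /backup 2024-01-14) && rsync -av "/backup/$snap/files/" /var/www/html/`.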

Restore Specific Files Only

# Restore a single config file (-p keeps mode, ownership, and timestamps)
cp -p /backup/latest/files/wp-config.php /var/www/html/

# Restore an entire directory
rsync -av /backup/latest/files/wp-content/plugins/ /var/www/html/wp-content/plugins/

# Restore from a specific date
rsync -av /backup/2024-01-10/files/wp-content/uploads/2024/ \
/var/www/html/wp-content/uploads/2024/
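
Before overwriting live files, you may want to see which ones actually differ from the backup. The sketch below uses only `find` and `cmp`; `files_differing` is a hypothetical helper, and it assumes file names without embedded newlines.

```shell
# files_differing BACKUP_DIR LIVE_DIR: print paths (relative to BACKUP_DIR)
# whose content differs from, or is missing in, LIVE_DIR.
files_differing() {
    local backup_dir="${1%/}" live_dir="${2%/}" path rel
    find "$backup_dir" -type f | sort | while IFS= read -r path; do
        rel="${path#"$backup_dir"/}"
        # cmp -s exits non-zero when the files differ or one of them is missing.
        if ! cmp -s "$path" "$live_dir/$rel" 2>/dev/null; then
            printf '%s\n' "$rel"
        fi
    done
}
```

For instance, `files_differing /backup/latest/files/wp-content/plugins /var/www/html/wp-content/plugins` lists only the plugin files a restore would change. It does not report files that exist only on the live side.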

Restoring to a New Server

When the original server is unrecoverable:

# On the new server:

# Step 1: Install prerequisites
sudo apt update && sudo apt install -y nginx mariadb-server php-fpm rsync

# Step 2: Pull backup from remote storage
rsync -avzP user@backup-server:/backups/latest/ /restore/

# Step 3: Restore files
rsync -av /restore/files/ /var/www/html/

# Step 4: Restore database
gunzip < /restore/databases.sql.gz | mysql -u root -p

# Step 5: Update DNS to point to new server IP
echo "Update DNS A record to: $(curl -s ifconfig.me)"
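
On a freshly built server it is easy to start a restore and then discover a tool is missing. A quick pre-flight check could look like the sketch below; `missing_tools` is a hypothetical helper, and the tool list you pass in is up to you.

```shell
# missing_tools NAME...: print each command that is not on PATH.
# Returns non-zero if at least one is missing.
missing_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `missing_tools rsync gunzip mysql nginx || echo "install the tools listed above first"` after Step 1 confirms the prerequisites before any data is pulled.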

Hot Standby Server

Keep a secondary server synced on a schedule and ready to take over. With the hourly jobs below, the standby can be up to an hour behind the primary:

# Hourly sync to standby (via cron; each crontab entry must be on a single line)
0 * * * * rsync -avz --delete /var/www/html/ standby@dr-server:/var/www/html/

# Hourly database sync
0 * * * * mysqldump --defaults-file=/root/.my.cnf --single-transaction --all-databases | ssh standby@dr-server "mysql"
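
If a sync ever takes longer than an hour, the next cron run starts while the previous one is still copying. One way to avoid that overlap is a lock directory, sketched below. `run_exclusive` is a hypothetical wrapper and the lock path in the usage note is an assumption; on Linux, `flock(1)` is a common alternative.

```shell
# run_exclusive LOCKDIR CMD...: run CMD only if no other run holds LOCKDIR.
run_exclusive() {
    local lockdir="$1" status=0
    shift
    # mkdir is atomic: only one caller can create the directory.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another run holds $lockdir; skipping" >&2
        return 75   # temporary failure, try again on the next schedule
    fi
    "$@" || status=$?
    # Release the lock whether CMD succeeded or not.
    rmdir "$lockdir"
    return "$status"
}
```

Saved in a script that cron calls, the sync line becomes `run_exclusive /tmp/standby-sync.lock rsync -avz --delete /var/www/html/ standby@dr-server:/var/www/html/`. If the script is killed mid-run the lock directory stays behind and has to be removed by hand.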

Failover Process

flowchart LR
P["Primary Server<br/>(fails)"] -. "rsync hourly" .-> S["Standby Server<br/>(takes over)"]
DNS["DNS Record"] --> P
DNS -. "update during failover" .-> S
  1. Detect primary server failure
  2. Verify standby is current
  3. Update DNS to point to standby IP
  4. Monitor until primary is rebuilt
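
Step 2 ("verify standby is current") can be scripted if the sync job records when it last succeeded. The sketch below assumes the sync job writes the epoch time of its last successful run to a marker file (for example with `date +%s > FILE` after rsync exits cleanly). That marker file and the `standby_is_current` helper are assumptions, not something the cron entries above create.

```shell
# standby_is_current MARKER MAX_AGE_SECONDS [NOW]
# MARKER is a file holding the epoch time of the last successful sync.
# NOW defaults to the current time; it is a parameter mainly for testing.
standby_is_current() {
    local marker="$1" max_age="$2" now="${3:-$(date +%s)}" last age
    if [ ! -r "$marker" ]; then
        echo "no sync marker: $marker" >&2
        return 1
    fi
    last=$(cat "$marker")
    age=$((now - last))
    echo "last sync was ${age}s ago"
    # Succeed only if the last sync is recent enough.
    [ "$age" -le "$max_age" ]
}
```

With hourly syncs, something like `standby_is_current /var/lib/standby/last-sync 7200` on the standby allows one missed run before it reports the copy as stale.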

Automated Recovery Script

#!/bin/bash
# disaster-recovery.sh: restore a server from backup.
# Run as root (e.g. sudo ./disaster-recovery.sh): it writes to /var/log and the web root.
set -euo pipefail

BACKUP_HOST="user@backup-server"
BACKUP_PATH="/backups/latest"
WEB_ROOT="/var/www/html"
LOG="/var/log/disaster-recovery.log"

echo "=== DR started: $(date) ===" | tee -a "$LOG"

# Restore files
echo "Restoring files..." | tee -a "$LOG"
rsync -avz --stats "$BACKUP_HOST:$BACKUP_PATH/files/" "$WEB_ROOT/" >> "$LOG" 2>&1

# Restore database (stream the dump over SSH; nothing is written to local disk)
echo "Restoring database..." | tee -a "$LOG"
ssh "$BACKUP_HOST" "cat $BACKUP_PATH/databases.sql.gz" | \
gunzip | mysql -u root >> "$LOG" 2>&1

# Fix permissions: directories 755, files 644
echo "Fixing permissions..." | tee -a "$LOG"
chown -R www-data:www-data "$WEB_ROOT"
find "$WEB_ROOT" -type d -exec chmod 755 {} +
find "$WEB_ROOT" -type f -exec chmod 644 {} +

# Restart services (the quoted pattern is matched by systemctl, not expanded by the shell)
echo "Restarting services..." | tee -a "$LOG"
systemctl restart nginx mariadb 'php*-fpm.service'

echo "=== DR complete: $(date) ===" | tee -a "$LOG"
echo "Site should be live. Verify at: https://your-domain.com"
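
The script ends by asking you to verify the site by hand. Services can take a few seconds to come up after a restart, so a retry loop is a reasonable follow-up. This is a sketch only: `wait_until_ok` is a hypothetical helper, and the probe command is whatever check makes sense for your site.

```shell
# wait_until_ok ATTEMPTS DELAY_SECONDS CMD...: retry CMD until it succeeds
# or ATTEMPTS runs have failed.
wait_until_ok() {
    local attempts="$1" delay="$2" n=1
    shift 2
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            echo "ok after $n attempt(s)"
            return 0
        fi
        n=$((n + 1))
        # Pause between attempts, but not after the last one.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    echo "still failing after $attempts attempt(s)" >&2
    return 1
}
```

For example, `wait_until_ok 10 3 curl -fsS -o /dev/null https://your-domain.com` probes the site up to ten times, three seconds apart. `curl -f` makes HTTP errors (status 400 and above) count as failures.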

DR Drills (Testing Your Recovery)

Warning: A recovery plan you've never tested is not a plan, it's a wish.

Monthly Restore Test

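# Restore to a staging server (a separate host, not production)
# Caution: a dump made with --all-databases contains USE statements, so it
# restores into the original database names and ignores "staging_db".
rsync -av /backup/latest/files/ /var/www/staging/
gunzip < /backup/latest/databases.sql.gz | mysql staging_db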

# Verify the restored site works
curl -s -o /dev/null -w "%{http_code}" http://staging.example.com
# Should return 200

DR Checklist

  • Can you locate your most recent backup?
  • Do you know which server/path holds offsite copies?
  • Can you restore both files and database?
  • Do you have SSH access to backup storage?
  • Have you tested a full restore in the last 30 days?
  • Is your recovery time acceptable for your SLA?
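
The last checklist item is easier to answer with a number. During a drill you could time the restore and compare it with your target, as in the sketch below. `within_rto` is a hypothetical helper, and the target in seconds is whatever your SLA implies.

```shell
# within_rto TARGET_SECONDS CMD...: run CMD, report how long it took, and
# fail if CMD fails or takes longer than TARGET_SECONDS.
within_rto() {
    local target="$1" start end elapsed status=0
    shift
    start=$(date +%s)
    "$@" || status=$?
    end=$(date +%s)
    elapsed=$((end - start))
    echo "elapsed: ${elapsed}s (target: ${target}s)"
    # Pass only if the command succeeded and finished in time.
    [ "$status" -eq 0 ] && [ "$elapsed" -le "$target" ]
}
```

For instance, `within_rto 900 rsync -av /backup/latest/files/ /var/www/staging/` checks whether the file restore alone fits in a 15-minute window. A full drill would time the database restore as well.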

Common Pitfalls

| Pitfall | Consequence | Prevention |
| --- | --- | --- |
| Restoring without fixing permissions | Site errors (403, 500) | chown and chmod after restore |
| Restoring files but forgetting database | Site looks broken, missing content | Always restore both |
| Restoring a compromised backup | Malware comes back | Identify compromise date, restore from before it |
| No offsite backup | Server room fire = total loss | Always keep 1+ offsite copy |
| Never testing recovery | Discover backup is incomplete during crisis | Monthly DR drills |
| DNS propagation delay | Users still hitting old server | Use low TTL on DNS records |

Quick Reference

# Restore files from latest backup
rsync -av /backup/latest/files/ /var/www/html/

# Restore database
gunzip < /backup/latest/databases.sql.gz | mysql -u root -p

# Restore from specific date
rsync -av /backup/2024-01-14/files/ /var/www/html/

# Restore from remote backup
rsync -avzP user@backup:/backups/latest/files/ /var/www/html/

# Fix permissions after restore
sudo chown -R www-data:www-data /var/www/html

What's Next