Mastering PostgreSQL Backups: A Step-by-Step Guide with pgBackRest
Data is the lifeblood of any application. While optimizing for performance is crucial, ensuring you can recover from a disaster is non-negotiable. In this guide, we will set up pgBackRest, a premier backup solution for PostgreSQL, to perform reliable, encrypted backups to S3-compatible storage.
We will use the configuration files from our previous optimization guide and expand on them with practical commands and verification steps.
Why pgBackRest?
Standard tools like pg_dump are great for logical backups but struggle with large databases and don't support Point-In-Time Recovery (PITR). pgBackRest offers:
- Parallel Processing: Fast backup and restore speeds.
- Incremental/Differential Backups: Save space and time.
- S3 Support: Native integration with object storage.
- Encryption: Secure your data at rest.
- PITR: Restore to a specific second in time.
Prerequisites
- OS: Ubuntu/Debian (or similar Linux distro)
- Database: PostgreSQL 16
- Storage: An S3-compatible bucket (AWS S3, Hetzner, MinIO, etc.)
1. Install pgBackRest
First, ensure the PostgreSQL repository is added (if not already), then install pgBackRest.
# Install pgBackRest
sudo apt-get update
sudo apt-get install pgbackrest
Step 1: Configuration
We need to configure pgBackRest to talk to our PostgreSQL cluster and our S3 bucket.
1. Create the Configuration File
Create or edit /etc/pgbackrest/pgbackrest.conf with the following content.
Note: We avoid hardcoding secrets. We'll handle those with environment variables.
# /etc/pgbackrest/pgbackrest.conf
[global]
# Repo 1: local repository (backups and archived WAL on local disk)
repo1-path=/var/lib/pgbackrest
repo1-retention-full=3
repo1-retention-diff=3
# Repo 2: S3 Object Storage
repo2-type=s3
repo2-path=/
repo2-s3-endpoint=nbg1.your-object-storage.com
repo2-s3-bucket=your-bucket-name
repo2-s3-region=us-east-1
repo2-retention-full=3
repo2-retention-diff=3
# Encryption (Optional but recommended)
repo1-cipher-type=aes-256-cbc
repo2-cipher-type=aes-256-cbc
# Log settings
log-level-console=info
log-level-file=detail
start-fast=y
[main]
pg1-path=/var/lib/postgresql/16/main
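Since the note above says secrets stay out of this file, it can be worth checking that mechanically before the file is shared or committed. The following is a minimal sketch, assuming inline secrets would appear as `repoN-s3-key`, `repoN-s3-key-secret`, or `repoN-cipher-pass` options. The helper name `conf_has_secrets` is hypothetical and not part of pgBackRest.

```shell
# conf_has_secrets FILE
# Hypothetical helper (not a pgBackRest command): succeeds (exit 0) if FILE
# sets an S3 key, S3 secret, or cipher passphrase inline.
conf_has_secrets() {
    grep -Eq '^[[:space:]]*repo[0-9]+-(s3-key|s3-key-secret|cipher-pass)[[:space:]]*=' "$1"
}

# Usage:
#   if conf_has_secrets /etc/pgbackrest/pgbackrest.conf; then
#       echo "move secrets to environment variables" >&2
#   fi
```

Options such as `repo2-cipher-type` do not match the pattern, so the configuration shown above passes the check.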
2. Set Environment Variables for Secrets
For security, pass your S3 keys and encryption passphrases through environment variables, set via a systemd override or a secured environment file, rather than storing them in the config file.
export PGBACKREST_REPO2_S3_KEY="your-access-key"
export PGBACKREST_REPO2_S3_KEY_SECRET="your-secret-key"
export PGBACKREST_REPO1_CIPHER_PASS="your-secure-passphrase"
export PGBACKREST_REPO2_CIPHER_PASS="your-secure-passphrase"
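One way to avoid typing the exports by hand (and leaving them in shell history) is an owner-only environment file. This is a sketch under assumptions: `ENV_FILE` is a hypothetical variable, and it defaults to a temporary directory so the snippet can be tried without touching `/etc`.

```shell
# Sketch: keep the secrets in an owner-only environment file.
# ENV_FILE is a hypothetical path chosen for illustration; point it at a
# permanent location once you are happy with the result.
ENV_FILE="${ENV_FILE:-$(mktemp -d)/pgbackrest.env}"

umask 077   # files created from here on get no group/other permissions
cat > "$ENV_FILE" <<'EOF'
PGBACKREST_REPO2_S3_KEY=your-access-key
PGBACKREST_REPO2_S3_KEY_SECRET=your-secret-key
PGBACKREST_REPO1_CIPHER_PASS=your-secure-passphrase
PGBACKREST_REPO2_CIPHER_PASS=your-secure-passphrase
EOF
chmod 600 "$ENV_FILE"

# Load the file and export every variable it defines into this shell.
set -a
. "$ENV_FILE"
set +a
```

Two caveats apply. With default settings, `sudo` resets the environment, so variables exported in your own shell are generally not visible to `sudo -u postgres pgbackrest ...`. The `archive_command` also runs inside the PostgreSQL service, so the same variables have to reach that process, for example through an `EnvironmentFile=` line in a systemd drop-in for the PostgreSQL unit (the exact unit name depends on your distribution).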
3. Configure PostgreSQL
Edit your postgresql.conf to enable archiving. pgBackRest needs to ship Write-Ahead Logs (WAL) to the repository to allow for consistency and PITR.
# postgresql.conf
archive_mode = on
archive_command = 'pgbackrest --stanza=main archive-push %p'
archive_timeout = 60
Restart PostgreSQL to apply changes:
sudo systemctl restart postgresql
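After the restart, it is worth confirming that the settings are actually live. A minimal sketch, assuming the `postgres` OS user can connect locally with `psql` via peer authentication. `check_archiving` is a hypothetical helper name.

```shell
# check_archiving
# Hypothetical helper: prints the live archiving settings and succeeds only
# if archive_mode is "on". Assumes local peer authentication for postgres.
check_archiving() {
    archive_mode_value=$(sudo -u postgres psql -Atc "SHOW archive_mode") || return 1
    archive_command_value=$(sudo -u postgres psql -Atc "SHOW archive_command") || return 1
    echo "archive_mode=$archive_mode_value"
    echo "archive_command=$archive_command_value"
    [ "$archive_mode_value" = "on" ]
}

# Usage:
#   check_archiving || echo "archiving is not enabled" >&2
```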
Step 2: Initialize the Stanza
A "stanza" is the configuration definition for a specific database cluster. We need to create it and verify the configuration.
- Create the Stanza:
sudo -u postgres pgbackrest --stanza=main stanza-create
- Check Configuration: This command verifies that pgBackRest can read the config, access the database, and write to the S3 bucket.
sudo -u postgres pgbackrest --stanza=main check
Output should end with:
check command end: completed successfully
Step 3: Running Backups
Now that everything is configured, let's run our first backups.
1. Full Backup
Always start with a full backup. This copies the entire database.
sudo -u postgres pgbackrest --stanza=main --type=full backup
2. Differential Backup
Backs up all changes since the last full backup. Faster and smaller than a full backup.
sudo -u postgres pgbackrest --stanza=main --type=diff backup
3. Incremental Backup
Backs up changes since the last backup of any type (full, differential, or incremental). Usually the fastest and smallest of the three.
sudo -u postgres pgbackrest --stanza=main --type=incr backup
4. View Backup Status
Check the status of your backups and available recovery points.
sudo -u postgres pgbackrest info
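In practice the three backup types are usually combined on a schedule rather than run by hand. The sketch below picks a type from the day of the week. The schedule itself (full on Sunday, differential on Wednesday, incremental otherwise) is an assumption for illustration, not a pgBackRest default, and `backup_type_for_day` is a hypothetical helper.

```shell
# backup_type_for_day DAY
# Hypothetical helper: maps an ISO weekday number (date +%u: 1 = Monday,
# 7 = Sunday) to a pgBackRest backup type. The schedule is illustrative only.
backup_type_for_day() {
    case "$1" in
        7) echo full ;;   # Sunday: new full backup
        3) echo diff ;;   # Wednesday: differential against the last full
        *) echo incr ;;   # other days: incremental against the last backup
    esac
}

# Usage, e.g. from a nightly cron job running as the postgres user:
#   pgbackrest --stanza=main --type="$(backup_type_for_day "$(date +%u)")" backup
```

With the retention settings from the configuration above (three full backups per repository) and one full backup per week, this keeps roughly three weeks of history.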
Step 4: Disaster Recovery (Restore)
The true test of a backup strategy is the restore process.
Scenario 1: Restore to Latest
If your database is corrupted and you just want the latest state:
- Stop PostgreSQL:
sudo systemctl stop postgresql
- Restore:
# --delta forces restore even if files exist, overwriting changed files
sudo -u postgres pgbackrest --stanza=main --delta restore
- Start PostgreSQL:
sudo systemctl start postgresql
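Because a restore rewrites the data directory, a small pre-flight guard can protect against running it while the server is still up. This is a sketch only: `cluster_is_stopped` is a hypothetical helper that relies on the `postmaster.pid` file PostgreSQL keeps in the data directory while running. A stale file can remain after a crash, so treat a failure as a prompt to investigate rather than as proof.

```shell
# cluster_is_stopped PGDATA
# Hypothetical helper: succeeds if no postmaster.pid exists in PGDATA.
# Run it as root or postgres; other users cannot look inside the data
# directory and would always see "stopped".
cluster_is_stopped() {
    [ ! -e "$1/postmaster.pid" ]
}

# Usage:
#   PGDATA=/var/lib/postgresql/16/main
#   if cluster_is_stopped "$PGDATA"; then
#       sudo -u postgres pgbackrest --stanza=main --delta restore
#   else
#       echo "PostgreSQL still appears to be running; stop it first" >&2
#   fi
```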
Scenario 2: Point-in-Time Recovery (PITR)
Someone accidentally dropped a table at 2:30 PM. You need to go back to 2:29 PM.
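- Stop PostgreSQL:
sudo systemctl stop postgresql
- Restore to Timestamp:
sudo -u postgres pgbackrest --stanza=main --type=time --target="2023-11-23 14:29:00" \
    --delta restore
- Start PostgreSQL:
sudo systemctl start postgresql
PostgreSQL will replay WAL up to the exact second specified. By default, pgBackRest sets the recovery target action to pause, so the server stays in recovery (read-only) once the target is reached. Add --target-action=promote to the restore command if you want it to open for read/write connections automatically.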
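A target without a UTC offset is interpreted in the server's configured time zone, which is easy to get wrong during an incident. One option is to insist on an explicit offset before passing the value to `--target`. The check below is a sketch, and `is_valid_target` is a hypothetical helper that validates only the shape of the string, not whether the date exists.

```shell
# is_valid_target STRING
# Hypothetical helper: succeeds if STRING looks like
# "YYYY-MM-DD HH:MM:SS" followed by a numeric UTC offset (+00, -0530, +05:30).
is_valid_target() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}[+-][0-9]{2}(:?[0-9]{2})?$'
}

# Usage:
#   target="2023-11-23 14:29:00+00"
#   if is_valid_target "$target"; then
#       sudo -u postgres pgbackrest --stanza=main --type=time \
#           --target="$target" --delta restore
#   fi
```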
Conclusion
You now have a production-ready backup system. With pgBackRest, you're not just saving files; you're ensuring business continuity. Remember to:
- Monitor your backup logs.
- Test your restores regularly (e.g., on a staging server).
- Keep your S3 credentials secure.
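For the first reminder, monitoring can start as simply as counting error lines in the pgBackRest log files. This is a sketch under assumptions: it takes the default log directory `/var/log/pgbackrest` and a log named after the stanza and command, and `count_log_errors` is a hypothetical helper. Adjust the path if you changed `log-path`.

```shell
# count_log_errors FILE
# Hypothetical helper: prints the number of lines in FILE containing "ERROR:".
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_log_errors() {
    grep -c 'ERROR:' "$1" || true
}

# Usage (assumes the default log location and the "main" stanza):
#   errors=$(count_log_errors /var/log/pgbackrest/main-backup.log)
#   [ "$errors" -eq 0 ] || echo "pgBackRest reported $errors error(s)" >&2
```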