Postgres Sync to Development

🧩 Overall Architecture

  • Client Server (production):
    • Runs PostgreSQL
    • Has pgBackRest configured
    • Pushes backups to S3 (preferred) or exposes backup directory for sync
  • Your Development Server:
    • Pulls backup data from S3 or via rsync/scp
    • Restores to a clean local PostgreSQL instance
    • Discards all dev changes

✅ Part 1: Setup on Client Production Server

1.1 Install pgBackRest

```bash
sudo apt update
sudo apt install pgbackrest -y
```

1.2 Create pgBackRest Config File

Edit /etc/pgbackrest/pgbackrest.conf:

```ini
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
compress-type=zstd
log-level-console=info
process-max=4

[client_a]
pg1-path=/var/lib/postgresql/15/main
```

Replace client_a with your client-specific label, and adjust the data path if your PostgreSQL version differs.

1.3 Create and Secure Backup Folder

```bash
sudo mkdir -p /var/lib/pgbackrest
sudo chown postgres:postgres /var/lib/pgbackrest
```

1.4 Enable WAL Archiving (Add to postgresql.conf)

```ini
archive_mode = on
archive_command = 'pgbackrest --stanza=client_a archive-push %p'
wal_level = replica
```

Restart PostgreSQL (changing archive_mode requires a full restart, not just a reload):

```bash
sudo systemctl restart postgresql
```

1.5 Create Stanza and Run First Full Backup

```bash
sudo -u postgres pgbackrest --stanza=client_a stanza-create
sudo -u postgres pgbackrest --stanza=client_a --type=full backup
```

1.6 Automate Incremental Backups (via Cron)

```bash
crontab -e -u postgres
```

Add:

```cron
0 2 * * 0 pgbackrest --stanza=client_a backup
```

This runs every Sunday at 02:00.

✅ Part 2: Configure S3 Backup (Optional but Recommended)

2.1 Install S3 Tools (or use IAM Role if on AWS)

```bash
sudo apt install awscli
```

Set up credentials:

```bash
aws configure
```

Or use a MinIO bucket.
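For MinIO, pgBackRest uses the same repo1-* settings; here is a hypothetical fragment (the endpoint, bucket name, and disabled TLS verification are assumptions for a self-hosted test MinIO, not recommendations for production):

```ini
[global]
repo1-type=s3
repo1-s3-bucket=pgbackrest
# Assumed internal MinIO endpoint; replace with your own
repo1-s3-endpoint=minio.internal:9000
repo1-s3-region=us-east-1
# MinIO typically serves path-style URLs rather than virtual-host style
repo1-s3-uri-style=path
# Only for self-signed or plain-HTTP test setups
repo1-s3-verify-ssl=n
repo1-s3-key=MINIO_ACCESS_KEY
repo1-s3-key-secret=MINIO_SECRET_KEY
```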

2.2 Edit /etc/pgbackrest/pgbackrest.conf

```ini
[global]
repo1-type=s3
repo1-s3-bucket=your-s3-bucket-name
repo1-s3-endpoint=s3.amazonaws.com
# repo1-s3-region is required for S3 repositories; set it to your bucket's region
repo1-s3-region=us-east-1
repo1-s3-key=ACCESS_KEY
repo1-s3-key-secret=SECRET_KEY
compress-type=zstd
repo1-retention-full=2

[client_a]
pg1-path=/var/lib/postgresql/15/main
```

Restart PostgreSQL and re-run stanza-create and backup.

✅ Part 3: Setup on Your Development Server

3.1 Install pgBackRest

```bash
sudo apt update
sudo apt install pgbackrest -y
```

3.2 Sync Backup Repo

Option A: From S3

```bash
aws s3 sync s3://your-s3-bucket /var/lib/pgbackrest
```

Option B: From SCP (client push)

From client:

```bash
scp -r /var/lib/pgbackrest devuser@your-dev-ip:/var/lib/pgbackrest
```

3.3 Drop and Restore Local DB

Stop PostgreSQL:

```bash
sudo systemctl stop postgresql
```

Drop DB folder:

```bash
sudo rm -rf /var/lib/postgresql/15/main/*
```

Restore (run as the postgres user; the dev server also needs a matching [client_a] stanza in its own /etc/pgbackrest/pgbackrest.conf pointing at the synced repo):

```bash
sudo -u postgres pgbackrest --stanza=client_a restore
```

Start PostgreSQL:

```bash
sudo systemctl start postgresql
```

Now your dev DB is fully synced with client production.

3.4 (Optional) Add Script for Weekly Restore

```bash
#!/bin/bash
set -e

echo "[*] Syncing from S3..."
aws s3 sync s3://your-s3-bucket /var/lib/pgbackrest

echo "[*] Dropping DB..."
sudo systemctl stop postgresql
sudo rm -rf /var/lib/postgresql/15/main/*

echo "[*] Restoring from backup..."
sudo -u postgres pgbackrest --stanza=client_a restore

sudo systemctl start postgresql
echo "[✓] Database restored from latest client_a backup."
```

Make it executable:

```bash
chmod +x /opt/dbsync/restore_client_a.sh
```

Then schedule in cron weekly.
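For example, a root crontab entry along these lines would do it (the day, time, and log path are assumptions, using the script path from above):

```cron
# Hypothetical schedule: restore every Monday at 03:00, logging output
0 3 * * 1 /opt/dbsync/restore_client_a.sh >> /var/log/dbsync.log 2>&1
```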

🧠 Odoo-Specific Enhancements

After restore:

```bash
./odoo-bin -d client_a_db \
  --without-demo=all \
  --dev=all \
  --log-level=debug_sql
```

You can even automate this with a Makefile, Fabric, or Docker entrypoint.

🧪 Summary

| Component | Client Prod Server | Your Dev Server |
|---|---|---|
| Backup Tool | pgBackRest | pgBackRest |
| Backup Storage | Local or S3 | Pull from S3 or receive via SCP |
| Restore Frequency | Weekly (cron) | Weekly (script/cron) |
| Dev Workflow | Clean restore → no local changes | Full restore → Odoo --without-demo |

Would you like a working docker-compose.yml for PostgreSQL with pgbackrest built in and ready for restore? Or support for multiple clients from one config?

You said:

will it fetch whole db every week or only the incremental data to reduce the data download

ChatGPT said:

Great follow-up. Here's a clear breakdown of how much data is transferred weekly when using pgBackRest and how to minimize bandwidth with incremental backups:

🧠 pgBackRest Backup Types Recap

| Backup Type | Description | Size |
|---|---|---|
| Full | Full copy of DB cluster | ✅ Required as base |
| Differential | Only changed files since last full backup | 🟡 Smaller |
| Incremental | Only changed files since last full or incremental | 🟢 Smallest |

✅ What Happens Each Week?

Client Side (Production Server):

  • Weekly cron runs pgbackrest --stanza=client_a backup
  • It creates an incremental or differential backup (based on config or parameters).
  • Only changed PostgreSQL files (heap, index, WAL) are backed up.

✅ You do not send the entire DB again. Only changes go into the backup.

Your Dev Side (Restoration):

Here is the important part:

πŸ” You still need the full + incrementals to restore.

  • So on the first run, yes, you download the full backup (maybe 10–50 GB compressed).
  • After that:
    • You only sync new delta backups from S3 or SCP (e.g., 500 MB–5 GB).
    • You restore from the most recent full + chain of incrementals.

✅ How to Sync Only Changed Files:

```bash
aws s3 sync s3://your-client-bucket /var/lib/pgbackrest
```

or

```bash
rsync -av --delete client:/var/lib/pgbackrest/ /var/lib/pgbackrest/
```

These tools compare file size and modification time (rsync can additionally compare checksums with -c), so only new or changed files are downloaded.

🚀 How to Maximize Efficiency

Client Config Example

/etc/pgbackrest/pgbackrest.conf:

```ini
[global]
repo1-retention-full=2
repo1-retention-diff=6
start-fast=y
compress-type=zstd
process-max=4
repo1-path=/var/lib/pgbackrest

[client_a]
pg1-path=/var/lib/postgresql/15/main
```

Result:

  • Full backup on Sunday
  • Incrementals Mon–Sat
  • Very low bandwidth/day
  • Weekly restore needs only 1 full + 6 incrementals
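A crontab for the postgres user matching that schedule could look like this (the 02:00 run time is just an example, and backup types are forced explicitly with --type):

```cron
# Full backup Sunday at 02:00, incrementals Monday-Saturday at 02:00
0 2 * * 0   pgbackrest --stanza=client_a --type=full backup
0 2 * * 1-6 pgbackrest --stanza=client_a --type=incr backup
```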

Your Weekly Restore Script

```bash
aws s3 sync s3://client-a-bucket /var/lib/pgbackrest
sudo systemctl stop postgresql
sudo rm -rf /var/lib/postgresql/15/main/*
sudo -u postgres pgbackrest --stanza=client_a restore
sudo systemctl start postgresql
```

✅ This reconstructs the DB from the locally synced backup chain (base files + WAL); only the new backup files had to cross the network.

📦 Summary

| Question | Answer |
|---|---|
| Does it fetch the full DB weekly? | ❌ Not after the first time |
| How much data weekly? | Only incremental delta files (~500 MB–5 GB) |
| Do I need to keep old backups? | Only the latest full + chain of incrementals |
| Is bandwidth usage optimized? | ✅ Yes, using aws s3 sync or rsync |
| Is the local dev DB overwritten? | ✅ Fully, unless you use --delta restore |

Would you like a real-life example using:

  • MinIO/S3 bucket + daily cron on client
  • Automatic restore with health-check on dev

I can also provide a Makefile or bash pipeline.

You said:

sudo rm -rf /var/lib/postgresql/15/main/* this will delete my db data, if there are multiple dbs from different customers then ?? and how only the changed files are download, some concept clarification is required.

ChatGPT said:

Excellent questions. Let's clarify both your concern about multiple databases and the concept of how pgBackRest handles only changed files.

❗️ 1. Concern: rm -rf /var/lib/postgresql/15/main/* Deletes All Databases?

✅ Yes: it deletes all databases in that PostgreSQL cluster.

PostgreSQL stores all databases for one cluster in a single data directory, typically:

```
/var/lib/postgresql/15/main/
```

So if you're running multiple clients' databases in a single PostgreSQL cluster, this command wipes all of them.

🧩 Solution: One Cluster per Client

To manage this cleanly, structure your dev server like this:

| Client | Cluster Port | Data Directory |
|---|---|---|
| Client A | 5433 | /var/lib/postgresql/15/client_a/ |
| Client B | 5434 | /var/lib/postgresql/15/client_b/ |
| Client C | 5435 | /var/lib/postgresql/15/client_c/ |

Then your restore becomes:

```bash
sudo systemctl stop postgresql@15-client_a
sudo rm -rf /var/lib/postgresql/15/client_a/*
sudo -u postgres pgbackrest --stanza=client_a restore
sudo systemctl start postgresql@15-client_a
```

Each client's data is isolated, and restoring one won't affect the others.

πŸ” 2. Concept: How pgBackRest Downloads Only Changed Files

Let's demystify how incremental backup + restore works.

🔹 PostgreSQL Storage

  • Each table/index is stored in a binary file (e.g., base/16384/24576)
  • When a record is updated, only that file is modified.
  • Postgres also writes WAL (Write Ahead Logs) to track changes.

🔹 What pgBackRest Does

➤ During Backup:

  • Scans all files in PGDATA (/var/lib/postgresql/15/main)
  • Checks modification time, size, checksums
  • Uploads only new or changed files to S3 or local repo
  • Keeps track of file history per backup in a manifest

🔄 Incremental backup = only changed files + required WAL logs

➤ During Restore:

  • Reads the manifest of full + incrementals
  • Reconstructs the full data directory
    • Uses unchanged files from previous full
    • Merges in changed files from incrementals
  • Downloads only files needed

✅ Think of it like a binary-level rsync for PostgreSQL, managed safely.
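The "only changed files" idea can be sketched in plain shell: keep a checksum manifest from the last backup, then compare the current files against it. This is just an analogy using sha256sum, not pgBackRest's actual implementation.

```shell
# Toy model of incremental backup: files whose checksum no longer matches
# the previous manifest are the ones an incremental would re-upload.
dir=$(mktemp -d)
printf 'alpha' > "$dir/a"
printf 'beta'  > "$dir/b"

# "Previous backup": record a checksum manifest of every data file.
(cd "$dir" && sha256sum a b > manifest)

# The database then modifies only one file.
printf 'beta-v2' > "$dir/b"

# "Next backup": verify against the manifest; files reported FAILED are
# the changed ones that would go into the incremental.
changed=$(cd "$dir" && sha256sum -c manifest 2>/dev/null \
          | awk -F': ' '$2 == "FAILED" {print $1}')
echo "changed: $changed"   # prints: changed: b
rm -rf "$dir"
```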

🧠 Why This Matters

  • Saves time & bandwidth: if only 5% of data changes weekly, only that 5% is uploaded and synced.
  • Ensures safe, consistent restore: uses PostgreSQL-safe logic and checksums.
  • You don't need to track which tables changed; pgBackRest handles it at the file level.

✅ TL;DR: What You Should Do

  1. Separate cluster per client on dev machine.
  2. Configure pgBackRest with separate stanza + data dir per client.
  3. On client prod server:
    • Set up full + incremental backup to S3 or SSH target
  4. On dev server:
    • Weekly sync with aws s3 sync or rsync
    • rm -rf only inside that client's isolated data dir
    • Run pgbackrest restore for that stanza
  5. Optional:
    • Use pg_ctlcluster or systemd templates for per-port multi-cluster control.

Would you like me to generate:

  • A sample pgbackrest.conf layout for multiple clients
  • Systemd overrides for cluster isolation
  • A sync+restore bash script with client parameter?

Let me know how deep you want to go!

You said:

A sample pgbackrest.conf layout for multiple clients Systemd overrides for cluster isolation A sync+restore bash script with client parameter?

ChatGPT said:

Great! Let's walk through a full working setup tailored for multi-client PostgreSQL clusters using pgBackRest, with the following components:

🧾 1. pgbackrest.conf: Multi-Client Configuration

Create a single /etc/pgbackrest/pgbackrest.conf file that contains separate stanzas per client:

```ini
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
repo1-retention-diff=7
start-fast=y
log-level-console=info
log-level-file=debug
process-max=4
compress-level=3

# S3 Storage Example
repo1-type=s3
repo1-s3-bucket=pgbackrest-client-backups
repo1-s3-region=us-east-1
repo1-s3-endpoint=s3.amazonaws.com
repo1-s3-verify-ssl=n
repo1-s3-key=YOUR_S3_ACCESS_KEY
repo1-s3-key-secret=YOUR_S3_SECRET_KEY

# Client A
[client_a]
pg1-path=/var/lib/postgresql/15/client_a
pg1-port=5433
pg1-user=postgres

# Client B
[client_b]
pg1-path=/var/lib/postgresql/15/client_b
pg1-port=5434
pg1-user=postgres

# Client C
[client_c]
pg1-path=/var/lib/postgresql/15/client_c
pg1-port=5435
pg1-user=postgres
```
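Because every stanza follows the same naming convention, helper scripts can read per-client settings straight out of this file. A minimal sketch: `port_for` is a hypothetical helper, and the stanza layout is inlined as a string here only so the example is self-contained.

```shell
# Inline copy of the stanza layout above, so the demo runs standalone.
conf='[client_a]
pg1-path=/var/lib/postgresql/15/client_a
pg1-port=5433
[client_b]
pg1-path=/var/lib/postgresql/15/client_b
pg1-port=5434'

# Print the pg1-port defined inside the given stanza of $conf.
port_for() {
  printf '%s\n' "$conf" | awk -v s="[$1]" '
    $0 == s { in_s = 1; next }      # enter the requested stanza
    /^\[/   { in_s = 0 }            # any other header ends it
    in_s && sub(/^pg1-port=/, "") { print; exit }'
}

port_for client_b   # prints 5434
```

In a real script you would replace the inline string with `awk ... /etc/pgbackrest/pgbackrest.conf`.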

🛠 2. Systemd Unit Override for Cluster Isolation

Use separate PostgreSQL clusters per client. For example:

```bash
# The data directory is set via pg_createcluster's -d/--datadir option
sudo pg_createcluster 15 client_a -p 5433 -d /var/lib/postgresql/15/client_a
sudo pg_createcluster 15 client_b -p 5434 -d /var/lib/postgresql/15/client_b
```

Start/Stop per client:

```bash
sudo systemctl start postgresql@15-client_a
sudo systemctl stop postgresql@15-client_b
```

💡 Optional systemd override:

```bash
sudo systemctl edit postgresql@15-client_a
```

Add:

```ini
[Service]
Environment=PGDATA=/var/lib/postgresql/15/client_a
```

πŸ” 3. Sync + Restore Bash Script (Client-Based)

Save as sync_restore_client.sh and make it executable.

```bash
#!/bin/bash
set -e

CLIENT=$1
if [[ -z "$CLIENT" ]]; then
  echo "Usage: $0 client_a|client_b|client_c"
  exit 1
fi

STANZA=$CLIENT
PGDATA="/var/lib/postgresql/15/${CLIENT}"

echo "▶️ Stopping PostgreSQL cluster for $CLIENT..."
sudo systemctl stop "postgresql@15-$CLIENT"

echo "🔄 Running pgBackRest delta restore for $STANZA (data dir: $PGDATA)..."
# --delta compares the existing data directory against the backup manifest
# and rewrites only the files that differ, so the directory is not wiped first.
sudo -u postgres pgbackrest --stanza="$STANZA" restore --delta

echo "🚀 Starting PostgreSQL cluster again..."
sudo systemctl start "postgresql@15-$CLIENT"

echo "✅ Restore complete for $CLIENT"
```

To run:

```bash
./sync_restore_client.sh client_b
```

📦 4. Automating Weekly Pull from S3 (Optional)

```bash
#!/bin/bash
# AWS CLI pull + restore
aws s3 sync s3://pgbackrest-client-backups /var/lib/pgbackrest

# Loop through clients
for CLIENT in client_a client_b client_c; do
  ./sync_restore_client.sh "$CLIENT"
done
```

✅ Summary

| Component | Description |
|---|---|
| pgbackrest.conf | Defines a separate stanza for each client |
| PostgreSQL Clusters | Each client has an isolated port + data dir |
| Systemd Service | Allows start/stop per cluster (15-client_a) |
| Sync Script | Handles S3 pull, restore, and restart |
| Delta Restore | Downloads and restores only changed files |

Would you like me to generate:

  • An S3 bucket policy for secure backups?
  • A cronjob template to automate weekly restore?