Postgres Sync to Development

🧩 Overall Architecture

  • Client Server (production):
    • Runs PostgreSQL
    • Has pgBackRest configured
    • Pushes backups to S3 (preferred) or exposes backup directory for sync
  • Your Development Server:
    • Pulls backup data from S3 or via rsync/scp
    • Restores to a clean local PostgreSQL instance
    • Discards all dev changes

✅ Part 1: Setup on Client Production Server

1.1 Install pgBackRest

```bash
sudo apt update
sudo apt install pgbackrest -y
```

1.2 Create pgBackRest Config File

Edit /etc/pgbackrest/pgbackrest.conf:

```ini
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
compress-type=zstd
log-level-console=info
process-max=4

[client_a]
pg1-path=/var/lib/postgresql/15/main
```

Replace client_a with your client-specific label, and adjust the data path if your PostgreSQL version differs.

1.3 Create and Secure Backup Folder

```bash
sudo mkdir -p /var/lib/pgbackrest
sudo chown postgres:postgres /var/lib/pgbackrest
```

1.4 Enable WAL Archiving (Add to postgresql.conf)

```ini
archive_mode = on
archive_command = 'pgbackrest --stanza=client_a archive-push %p'
wal_level = replica
```

Restart PostgreSQL (changing archive_mode requires a full restart, not just a reload):

```bash
sudo systemctl restart postgresql
```

1.5 Create Stanza and Run First Full Backup

```bash
sudo -u postgres pgbackrest --stanza=client_a stanza-create
sudo -u postgres pgbackrest --stanza=client_a --type=full backup
```

1.6 Automate Incremental Backups (via Cron)

```bash
crontab -e -u postgres
```

Add:

```cron
0 2 * * 0 pgbackrest --stanza=client_a backup
```

This runs every Sunday at 02:00.

✅ Part 2: Configure S3 Backup (Optional but Recommended)

2.1 Install S3 Tools (or use IAM Role if on AWS)

```bash
sudo apt install awscli
```

Set up credentials:

```bash
aws configure
```

Or use a MinIO bucket.
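For MinIO, pgBackRest uses the same repo1-* settings; here is a hypothetical fragment (the endpoint, bucket name, and disabled TLS verification are assumptions for a self-hosted test MinIO, not recommendations for production):

```ini
[global]
repo1-type=s3
repo1-s3-bucket=pgbackrest
# Assumed internal MinIO endpoint; replace with your own
repo1-s3-endpoint=minio.internal:9000
repo1-s3-region=us-east-1
# MinIO typically serves path-style URLs rather than virtual-host style
repo1-s3-uri-style=path
# Only for self-signed or plain-HTTP test setups
repo1-s3-verify-ssl=n
repo1-s3-key=MINIO_ACCESS_KEY
repo1-s3-key-secret=MINIO_SECRET_KEY
```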

2.2 Edit /etc/pgbackrest/pgbackrest.conf

```ini
[global]
repo1-type=s3
repo1-s3-bucket=your-s3-bucket-name
repo1-s3-endpoint=s3.amazonaws.com
# repo1-s3-region is required for S3 repositories; set it to your bucket's region
repo1-s3-region=us-east-1
repo1-s3-key=ACCESS_KEY
repo1-s3-key-secret=SECRET_KEY
compress-type=zstd
repo1-retention-full=2

[client_a]
pg1-path=/var/lib/postgresql/15/main
```

Restart PostgreSQL and re-run stanza-create and backup.

✅ Part 3: Setup on Your Development Server

3.1 Install pgBackRest

```bash
sudo apt update
sudo apt install pgbackrest -y
```

3.2 Sync Backup Repo

Option A: From S3

```bash
aws s3 sync s3://your-s3-bucket /var/lib/pgbackrest
```

Option B: From SCP (client push)

From client:

```bash
scp -r /var/lib/pgbackrest devuser@your-dev-ip:/var/lib/pgbackrest
```

3.3 Drop and Restore Local DB

Stop PostgreSQL:

```bash
sudo systemctl stop postgresql
```

Drop DB folder:

```bash
sudo rm -rf /var/lib/postgresql/15/main/*
```

Restore (run as the postgres user; the dev server also needs a matching [client_a] stanza in its own /etc/pgbackrest/pgbackrest.conf pointing at the synced repo):

```bash
sudo -u postgres pgbackrest --stanza=client_a restore
```

Start PostgreSQL:

```bash
sudo systemctl start postgresql
```

Now your dev DB is fully synced with client production.

3.4 (Optional) Add Script for Weekly Restore

```bash
#!/bin/bash
set -e

echo "[*] Syncing from S3..."
aws s3 sync s3://your-s3-bucket /var/lib/pgbackrest

echo "[*] Dropping DB..."
sudo systemctl stop postgresql
sudo rm -rf /var/lib/postgresql/15/main/*

echo "[*] Restoring from backup..."
sudo -u postgres pgbackrest --stanza=client_a restore

sudo systemctl start postgresql
echo "[✓] Database restored from latest client_a backup."
```

Make it executable:

```bash
chmod +x /opt/dbsync/restore_client_a.sh
```

Then schedule in cron weekly.
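For example, a root crontab entry along these lines would do it (the day, time, and log path are assumptions, using the script path from above):

```cron
# Hypothetical schedule: restore every Monday at 03:00, logging output
0 3 * * 1 /opt/dbsync/restore_client_a.sh >> /var/log/dbsync.log 2>&1
```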

🧠 Odoo-Specific Enhancements

After restore:

```bash
./odoo-bin -d client_a_db \
  --without-demo=all \
  --dev=all \
  --log-level=debug_sql
```

You can even automate this with a Makefile, Fabric, or Docker entrypoint.

🧪 Summary

| Component | Client Prod Server | Your Dev Server |
|---|---|---|
| Backup Tool | pgBackRest | pgBackRest |
| Backup Storage | Local or S3 | Pull from S3 or receive via SCP |
| Restore Frequency | Weekly (cron) | Weekly (script/cron) |
| Dev Workflow | Clean restore → no local changes | Full restore → Odoo --without-demo |

Would you like a working docker-compose.yml for PostgreSQL with pgbackrest built in and ready for restore? Or support for multiple clients from one config?

You said:

will it fetch whole db every week or only the incremental data to reduce the data download

ChatGPT said:

Great follow-up. Here's a clear breakdown of how much data is transferred weekly when using pgBackRest and how to minimize bandwidth with incremental backups:

🧠 pgBackRest Backup Types Recap

| Backup Type | Description | Size |
|---|---|---|
| Full | Full copy of DB cluster | ✅ Required as base |
| Differential | Only changed files since last full backup | 🟡 Smaller |
| Incremental | Only changed files since last full or incremental | 🟢 Smallest |

✅ What Happens Each Week?

Client Side (Production Server):

  • Weekly cron runs pgbackrest --stanza=client_a backup
  • It creates an incremental or differential backup (based on config or parameters).
  • Only changed PostgreSQL files (heap, index, WAL) are backed up.

✅ You do not send the entire DB again. Only changes go into the backup.

Your Dev Side (Restoration):

Here is the important part:

πŸ” You still need the full + incrementals to restore.

  • So on the first run, yes, you download the full backup (maybe 10–50 GB compressed).
  • After that:
    • You only sync new delta backups from S3 or SCP (e.g., 500 MB–5 GB).
    • You restore from the most recent full + chain of incrementals.

✅ How to Sync Only Changed Files:

```bash
aws s3 sync s3://your-client-bucket /var/lib/pgbackrest
```

or

```bash
rsync -av --delete client:/var/lib/pgbackrest/ /var/lib/pgbackrest/
```

These tools compare file size and modification time (rsync can additionally compare checksums with -c), so only new or changed files are downloaded.

🚀 How to Maximize Efficiency

Client Config Example

/etc/pgbackrest/pgbackrest.conf:

```ini
[global]
repo1-retention-full=2
repo1-retention-diff=6
start-fast=y
compress-type=zstd
process-max=4
repo1-path=/var/lib/pgbackrest

[client_a]
pg1-path=/var/lib/postgresql/15/main
```

Result:

  • Full backup on Sunday
  • Incrementals Mon–Sat
  • Very low bandwidth/day
  • Weekly restore needs only 1 full + 6 incrementals
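A crontab for the postgres user matching that schedule could look like this (the 02:00 run time is just an example, and backup types are forced explicitly with --type):

```cron
# Full backup Sunday at 02:00, incrementals Monday-Saturday at 02:00
0 2 * * 0   pgbackrest --stanza=client_a --type=full backup
0 2 * * 1-6 pgbackrest --stanza=client_a --type=incr backup
```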

Your Weekly Restore Script

```bash
aws s3 sync s3://client-a-bucket /var/lib/pgbackrest
sudo systemctl stop postgresql
sudo rm -rf /var/lib/postgresql/15/main/*
sudo -u postgres pgbackrest --stanza=client_a restore
sudo systemctl start postgresql
```

✅ This reconstructs the DB from the locally synced backup chain (base files + WAL); only the new backup files had to cross the network.

📦 Summary

| Question | Answer |
|---|---|
| Does it fetch the full DB weekly? | ❌ Not after the first time |
| How much data weekly? | Only incremental delta files (~500 MB–5 GB) |
| Do I need to keep old backups? | Only the latest full + chain of incrementals |
| Is bandwidth usage optimized? | ✅ Yes, using aws s3 sync or rsync |
| Is the local dev DB overwritten? | ✅ Fully, unless you use --delta restore |

Would you like a real-life example using:

  • MinIO/S3 bucket + daily cron on client
  • Automatic restore with health-check on dev

I can also provide a Makefile or bash pipeline.

You said:

sudo rm -rf /var/lib/postgresql/15/main/* this will delete my db data, if there are multiple dbs from different customers then ?? and how only the changed files are download, some concept clarification is required.

ChatGPT said:

Excellent questions. Let's clarify both your concern about multiple databases and the concept of how pgBackRest handles only changed files.

❗️ 1. Concern: rm -rf /var/lib/postgresql/15/main/* Deletes All Databases?

✅ Yes: it deletes all databases in that PostgreSQL cluster.

PostgreSQL stores all databases for one cluster in a single data directory, typically:

```
/var/lib/postgresql/15/main/
```

So if you're running multiple clients' databases in a single PostgreSQL cluster, this command wipes all of them.

🧩 Solution: One Cluster per Client

To manage this cleanly, structure your dev server like this:

| Client | Cluster Port | Data Directory |
|---|---|---|
| Client A | 5433 | /var/lib/postgresql/15/client_a/ |
| Client B | 5434 | /var/lib/postgresql/15/client_b/ |
| Client C | 5435 | /var/lib/postgresql/15/client_c/ |

Then your restore becomes:

```bash
sudo systemctl stop postgresql@15-client_a
sudo rm -rf /var/lib/postgresql/15/client_a/*
sudo -u postgres pgbackrest --stanza=client_a restore
sudo systemctl start postgresql@15-client_a
```

Each client's data is isolated, and restoring one won't affect the others.

πŸ” 2. Concept: How pgBackRest Downloads Only Changed Files

Let's demystify how incremental backup + restore works.

🔹 PostgreSQL Storage

  • Each table/index is stored in a binary file (e.g., base/16384/24576)
  • When a record is updated, only that file is modified.
  • Postgres also writes WAL (Write Ahead Logs) to track changes.

🔹 What pgBackRest Does

➤ During Backup:

  • Scans all files in PGDATA (/var/lib/postgresql/15/main)
  • Checks modification time, size, checksums
  • Uploads only new or changed files to S3 or local repo
  • Keeps track of file history per backup in a manifest

🔄 Incremental backup = only changed files + required WAL logs

➤ During Restore:

  • Reads the manifest of full + incrementals
  • Reconstructs the full data directory
    • Uses unchanged files from previous full
    • Merges in changed files from incrementals
  • Downloads only files needed

✅ Think of it like a binary-level rsync for PostgreSQL, managed safely.
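The "only changed files" idea can be sketched in plain shell: keep a checksum manifest from the last backup, then compare the current files against it. This is just an analogy using sha256sum, not pgBackRest's actual implementation.

```shell
# Toy model of incremental backup: files whose checksum no longer matches
# the previous manifest are the ones an incremental would re-upload.
dir=$(mktemp -d)
printf 'alpha' > "$dir/a"
printf 'beta'  > "$dir/b"

# "Previous backup": record a checksum manifest of every data file.
(cd "$dir" && sha256sum a b > manifest)

# The database then modifies only one file.
printf 'beta-v2' > "$dir/b"

# "Next backup": verify against the manifest; files reported FAILED are
# the changed ones that would go into the incremental.
changed=$(cd "$dir" && sha256sum -c manifest 2>/dev/null \
          | awk -F': ' '$2 == "FAILED" {print $1}')
echo "changed: $changed"   # prints: changed: b
rm -rf "$dir"
```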

🧠 Why This Matters

  • Saves time & bandwidth: if only 5% of data changes weekly, only that 5% is uploaded and synced.
  • Ensures safe, consistent restore: uses PostgreSQL-safe logic and checksums.
  • You don't need to track which tables changed; pgBackRest handles it at the file level.

✅ TL;DR: What You Should Do

  1. Separate cluster per client on dev machine.
  2. Configure pgBackRest with separate stanza + data dir per client.
  3. On client prod server:
    • Set up full + incremental backup to S3 or SSH target
  4. On dev server:
    • Weekly sync with aws s3 sync or rsync
    • rm -rf only inside that client's isolated data dir
    • Run pgbackrest restore for that stanza
  5. Optional:
    • Use pg_ctlcluster or systemd templates for per-port multi-cluster control.

Would you like me to generate:

  • A sample pgbackrest.conf layout for multiple clients
  • Systemd overrides for cluster isolation
  • A sync+restore bash script with client parameter?

Let me know how deep you want to go!

You said:

A sample pgbackrest.conf layout for multiple clients Systemd overrides for cluster isolation A sync+restore bash script with client parameter?

ChatGPT said:

Great! Let's walk through a full working setup tailored for multi-client PostgreSQL clusters using pgBackRest, with the following components:

🧾 1. pgbackrest.conf: Multi-Client Configuration

Create a single /etc/pgbackrest/pgbackrest.conf file that contains separate stanzas per client:

```ini
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
repo1-retention-diff=7
start-fast=y
log-level-console=info
log-level-file=debug
process-max=4
compress-level=3

# S3 Storage Example
repo1-type=s3
repo1-s3-bucket=pgbackrest-client-backups
repo1-s3-region=us-east-1
repo1-s3-endpoint=s3.amazonaws.com
repo1-s3-verify-ssl=n
repo1-s3-key=YOUR_S3_ACCESS_KEY
repo1-s3-key-secret=YOUR_S3_SECRET_KEY

# Client A
[client_a]
pg1-path=/var/lib/postgresql/15/client_a
pg1-port=5433
pg1-user=postgres

# Client B
[client_b]
pg1-path=/var/lib/postgresql/15/client_b
pg1-port=5434
pg1-user=postgres

# Client C
[client_c]
pg1-path=/var/lib/postgresql/15/client_c
pg1-port=5435
pg1-user=postgres
```
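Because every stanza follows the same naming convention, helper scripts can read per-client settings straight out of this file. A minimal sketch: `port_for` is a hypothetical helper, and the stanza layout is inlined as a string here only so the example is self-contained.

```shell
# Inline copy of the stanza layout above, so the demo runs standalone.
conf='[client_a]
pg1-path=/var/lib/postgresql/15/client_a
pg1-port=5433
[client_b]
pg1-path=/var/lib/postgresql/15/client_b
pg1-port=5434'

# Print the pg1-port defined inside the given stanza of $conf.
port_for() {
  printf '%s\n' "$conf" | awk -v s="[$1]" '
    $0 == s { in_s = 1; next }      # enter the requested stanza
    /^\[/   { in_s = 0 }            # any other header ends it
    in_s && sub(/^pg1-port=/, "") { print; exit }'
}

port_for client_b   # prints 5434
```

In a real script you would replace the inline string with `awk ... /etc/pgbackrest/pgbackrest.conf`.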

🛠 2. Systemd Unit Override for Cluster Isolation

Use separate PostgreSQL clusters per client. For example:

```bash
# The data directory is set via pg_createcluster's -d/--datadir option
sudo pg_createcluster 15 client_a -p 5433 -d /var/lib/postgresql/15/client_a
sudo pg_createcluster 15 client_b -p 5434 -d /var/lib/postgresql/15/client_b
```

Start/Stop per client:

```bash
sudo systemctl start postgresql@15-client_a
sudo systemctl stop postgresql@15-client_b
```

💡 Optional systemd override:

```bash
sudo systemctl edit postgresql@15-client_a
```

Add:

```ini
[Service]
Environment=PGDATA=/var/lib/postgresql/15/client_a
```

πŸ” 3. Sync + Restore Bash Script (Client-Based)

Save as sync_restore_client.sh and make it executable.

```bash
#!/bin/bash
set -e

CLIENT=$1
if [[ -z "$CLIENT" ]]; then
  echo "Usage: $0 client_a|client_b|client_c"
  exit 1
fi

STANZA=$CLIENT
PGDATA="/var/lib/postgresql/15/${CLIENT}"

echo "▶️ Stopping PostgreSQL cluster for $CLIENT..."
sudo systemctl stop "postgresql@15-$CLIENT"

echo "🔄 Running pgBackRest delta restore for $STANZA (data dir: $PGDATA)..."
# --delta compares the existing data directory against the backup manifest
# and rewrites only the files that differ, so the directory is not wiped first.
sudo -u postgres pgbackrest --stanza="$STANZA" restore --delta

echo "🚀 Starting PostgreSQL cluster again..."
sudo systemctl start "postgresql@15-$CLIENT"

echo "✅ Restore complete for $CLIENT"
```

To run:

```bash
./sync_restore_client.sh client_b
```

📦 4. Automating Weekly Pull from S3 (Optional)

```bash
#!/bin/bash
# AWS CLI pull + restore
aws s3 sync s3://pgbackrest-client-backups /var/lib/pgbackrest

# Loop through clients
for CLIENT in client_a client_b client_c; do
  ./sync_restore_client.sh "$CLIENT"
done
```

✅ Summary

| Component | Description |
|---|---|
| pgbackrest.conf | Defines a separate stanza for each client |
| PostgreSQL Clusters | Each client has an isolated port + data dir |
| Systemd Service | Allows start/stop per cluster (15-client_a) |
| Sync Script | Handles S3 pull, restore, and restart |
| Delta Restore | Downloads and restores only changed files |

Would you like me to generate:

  • An S3 bucket policy for secure backups?
  • A cronjob template to automate weekly restore?