
Backup & Restore

PostgreSQL is the only thing you need to back up. Meilisearch rebuilds from PostgreSQL via the backfill API, Valkey is an ephemeral cache, and Redpanda events are replayable. Back up PostgreSQL regularly or you’ll lose tenant data on disk failure.

Component          Backup Required   Method
PostgreSQL         Yes               pg_dump
Meilisearch        No                Rebuilt from PostgreSQL via the backfill API
Valkey (Redis)     No                Ephemeral cache, repopulated on restart
Redpanda (Kafka)   Optional          Events are replayable; back up only if retention matters
Create a backup with pg_dump:
export DATABASE_URL="postgresql://shoehorn_user:password@host:5432/shoehorn"
pg_dump "$DATABASE_URL" \
--format=custom \
--no-owner \
--no-privileges \
--file="shoehorn_$(date +%Y%m%d_%H%M%S).dump"

Use --format=custom for compression and selective restore, --no-owner for portability across database users, and --no-privileges to skip GRANTs that won’t apply to the target cluster.
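
To confirm a dump is readable before you actually need it, pg_restore can print the archive's table of contents without touching any database; the filename below is just an example produced by the command above:

# List the archive's table of contents; fails fast if the file is truncated or corrupt
pg_restore --list shoehorn_20260324_120000.dump | head -n 20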

Restore a backup with pg_restore:
export DATABASE_URL="postgresql://shoehorn_user:password@host:5432/shoehorn"
pg_restore \
  --dbname="$DATABASE_URL" \
  --clean --if-exists \
  --no-owner --no-privileges \
  shoehorn_20260324_120000.dump

--clean --if-exists drops existing objects before recreating them, so the restore works against a database that already contains the schema. On a first restore into an empty database, expect "does not exist, skipping" notices from the DROP statements; they're safe to ignore.
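
If you'd rather not overwrite the live database, you can restore into a throwaway copy first and inspect it. A minimal sketch, assuming your role may create databases and that shoehorn_restored doesn't already exist:

# Create an empty database and restore into it instead of the live one
psql "$DATABASE_URL" -c "CREATE DATABASE shoehorn_restored;"
pg_restore --dbname="postgresql://shoehorn_user:password@host:5432/shoehorn_restored" \
  --no-owner --no-privileges shoehorn_20260324_120000.dump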

To automate daily backups on Kubernetes, run pg_dump from a CronJob with a PersistentVolumeClaim for storage and a 14-day retention sweep:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: shoehorn-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: postgres:18-alpine
              command:
                - /bin/sh
                - -c
                - |
                  pg_dump "$DATABASE_URL" \
                    --format=custom --no-owner --no-privileges \
                    --file="/backups/shoehorn_$(date +%Y%m%d_%H%M%S).dump"
                  find /backups -name "shoehorn_*.dump" -mtime +14 -delete
              env:
                - name: DATABASE_URL
                  valueFrom:
                    secretKeyRef:
                      name: database-credentials
                      key: backup_url
              volumeMounts:
                - name: backups
                  mountPath: /backups
          restartPolicy: OnFailure
          volumes:
            - name: backups
              persistentVolumeClaim:
                claimName: shoehorn-backups
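
Once the CronJob is applied, the easiest way to confirm it works end to end is to trigger a run by hand; the manual job name below is arbitrary:

kubectl get cronjob shoehorn-backup
kubectl create job --from=cronjob/shoehorn-backup shoehorn-backup-manual
kubectl logs job/shoehorn-backup-manual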
Or schedule the same dump with plain cron on the host. A crontab entry must fit on a single line:
# crontab -e
0 2 * * * DATABASE_URL="postgresql://shoehorn_user:pass@host:5432/shoehorn" pg_dump "$DATABASE_URL" --format=custom --no-owner --no-privileges --file=/var/backups/shoehorn/shoehorn_$(date +\%Y\%m\%d_\%H\%M\%S).dump >> /var/log/shoehorn-backup.log 2>&1
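
Because that entry gets unwieldy, many setups move the dump into a small wrapper script and call it from cron. A rough sketch; the script path, connection string, and backup directory are placeholders to adjust:

#!/bin/sh
# /usr/local/bin/shoehorn-backup.sh -- hypothetical wrapper; edit connection string and paths
set -eu
export DATABASE_URL="postgresql://shoehorn_user:pass@host:5432/shoehorn"
BACKUP_DIR=/var/backups/shoehorn
pg_dump "$DATABASE_URL" --format=custom --no-owner --no-privileges \
  --file="$BACKUP_DIR/shoehorn_$(date +%Y%m%d_%H%M%S).dump"
# Prune dumps older than 14 days, matching the CronJob retention above
find "$BACKUP_DIR" -name "shoehorn_*.dump" -mtime +14 -delete

The crontab entry then shrinks to:

0 2 * * * /usr/local/bin/shoehorn-backup.sh >> /var/log/shoehorn-backup.log 2>&1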

After restoring a backup:

  1. Re-run migrations (if upgrading): docker run shoehorn-api --migrate-only.
  2. Rebuild the search index: call the backfill API. Restarting the API does not re-index. You need this call.
    curl -X POST https://your-domain/api/v1/admin/backfill \
    -H "Authorization: Bearer <admin-token>"
    See Search Backfill for status checks and selective re-indexing.
  3. Clear cache: restart Valkey or wait for TTL expiry.
  4. Verify RLS policies are intact: psql "$DATABASE_URL" -c "SELECT schemaname, tablename, rowsecurity FROM pg_tables WHERE schemaname='public' AND rowsecurity=true;". Every tenant table should be listed.
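
For a closer look than step 4 gives you, the pg_policies catalog view shows whether the policies themselves came back with the schema; which policies you should expect to see is specific to your deployment:

psql "$DATABASE_URL" -c \
  "SELECT tablename, policyname, cmd FROM pg_policies WHERE schemaname = 'public' ORDER BY tablename;"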
Backup checklist:

  • Backups running on schedule (check logs)
  • Backups stored off-host (cloud storage, NAS, different server); see the copy sketch after this list
  • Test restore quarterly (restore to staging, verify data)
  • DATABASE_URL and credentials in a secret manager, never in the backup directory
  • RLS policies verified after restore
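
For the off-host item, any copy mechanism works. A minimal sketch using the AWS CLI; the bucket name is an assumption, and the directory matches the cron example above:

# Copy the newest dump to object storage after each pg_dump run
LATEST=$(ls -t /var/backups/shoehorn/shoehorn_*.dump | head -n 1)
aws s3 cp "$LATEST" "s3://shoehorn-backups/$(basename "$LATEST")"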

If pg_restore fails with ownership or permission errors, the backup was created with a different database user. Restore with --no-owner:

pg_restore --dbname="$DATABASE_URL" --no-owner --no-privileges backup.dump

For databases over 10 GB:

  • Parallel dump: pg_dump --jobs=4 (requires directory format); see the sketch after this list.
  • WAL archiving: continuous backup via PostgreSQL WAL for point-in-time recovery.
  • pg_basebackup: physical backup for very large databases.
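
As a starting point for the parallel option, a hedged sketch; directory-format output is a directory rather than a single file, and four workers is an arbitrary choice:

# Directory-format dump with 4 parallel workers (--jobs requires --format=directory)
pg_dump "$DATABASE_URL" --format=directory --jobs=4 \
  --no-owner --no-privileges --file="shoehorn_$(date +%Y%m%d)"
# pg_restore can also run in parallel from a custom- or directory-format archive
pg_restore --dbname="$DATABASE_URL" --jobs=4 \
  --clean --if-exists --no-owner --no-privileges shoehorn_20260324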