Upgrading Shoehorn

Upgrade Process

1. Review Release Notes

Before upgrading, check the release notes for:

Breaking changes
New required environment variables
Database migration requirements
Deprecated features

2. Back Up the Database

# Using pg_dump
pg_dump -h <host> -U shoehorn_user -d shoehorn > backup_$(date +%Y%m%d).sql

# Or using kubectl (the redirect runs locally, so backup.sql is written on your machine)
kubectl exec -n shoehorn shoehorn-postgresql-0 -- \
  pg_dump -U shoehorn_user shoehorn > backup.sql
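A redirected dump can end up empty or truncated without any obvious error, because the shell creates the output file before pg_dump runs. The sketch below is one way to sanity-check the file before upgrading. It assumes a plain-format dump, which ends with a "PostgreSQL database dump complete" comment; verify_dump is a hypothetical helper name, not part of Shoehorn or PostgreSQL.

```shell
# Sanity-check a plain-format pg_dump file before relying on it as a backup.
# verify_dump is a hypothetical helper, not part of Shoehorn or PostgreSQL.
verify_dump() {
  # $1: path to the .sql file written by pg_dump
  if [ ! -s "$1" ]; then
    echo "backup missing or empty: $1" >&2
    return 1
  fi
  # Plain-format dumps end with this marker; a truncated dump lacks it.
  if ! tail -n 10 "$1" | grep -q 'PostgreSQL database dump complete'; then
    echo "backup looks truncated: $1" >&2
    return 1
  fi
  echo "backup looks complete: $1"
}
```

Run it as verify_dump backup.sql. It only checks that the dump finished writing; it does not prove the dump restores cleanly, so a test restore into a scratch database is still the stronger check.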

3. Update Helm Values

Update the image tag in your values file:

image:
  tag: "0.7.0"  # New version

4. Run Upgrade

helm upgrade shoehorn oci://ghcr.io/shoehorn-dev/helm-charts/shoehorn \
  --namespace shoehorn \
  -f values.yaml

Helm will:

Apply new container images
Run database migrations automatically (init container)
Rolling-restart services with zero downtime

PostgreSQL is not restarted by helm upgrade. See PostgreSQL upgrades below.

5. Verify

# Check rollout status
kubectl rollout status -n shoehorn deploy/api

# Check API health (run the port-forward in the background, or in a second terminal)
kubectl port-forward -n shoehorn svc/shoehorn-api 8080:8080 &
curl http://localhost:8080/healthz

# Check logs for migration errors
kubectl logs -n shoehorn -l app=api --tail=100
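Right after a rollout the API may need a few seconds before /healthz answers, so a single curl can fail spuriously. A minimal polling sketch, assuming the port-forward from the health check is active; wait_for_health and its default limits are illustrative, not part of Shoehorn.

```shell
# Poll a health endpoint until it returns a non-error status or attempts run out.
# wait_for_health is a hypothetical helper; the limits are illustrative defaults.
wait_for_health() {
  # $1: URL, $2: max attempts (default 30), $3: seconds between attempts (default 2)
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP status >= 400
    if curl -fs --max-time 5 -o /dev/null "$url" 2>/dev/null; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts attempts: $url" >&2
  return 1
}
```

For example: wait_for_health http://localhost:8080/healthz. A non-zero exit is a cue to read the logs before declaring the upgrade good.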

PostgreSQL upgrades

The PostgreSQL StatefulSet uses updateStrategy: OnDelete and pins its image in values.yaml (e.g. shoehorned/shoehorn-postgres:v18.3-pgaudit-1.0). helm upgrade won’t restart the pod even if the chart bumps the tag. Roll it deliberately: check the image the StatefulSet now specifies, then delete the pod so it is recreated from that template:

kubectl get sts -n shoehorn shoehorn-postgresql -o jsonpath='{.spec.template.spec.containers[0].image}'
kubectl delete pod -n shoehorn shoehorn-postgresql-0

The PVC stays attached. Data is preserved.
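Because OnDelete leaves the old pod running, the pod can lag behind the StatefulSet template after a chart upgrade. The sketch below compares the two images so you only delete the pod when there is something to pick up. It reuses the resource names from the commands above and, like them, assumes PostgreSQL is the first container; pg_image_drift is a hypothetical helper name.

```shell
# Report whether the running PostgreSQL pod lags behind the StatefulSet template.
# pg_image_drift is a hypothetical helper; resource names follow the commands above.
pg_image_drift() {
  local ns=shoehorn sts=shoehorn-postgresql want have
  # Image the StatefulSet would use for a newly created pod
  want=$(kubectl get sts -n "$ns" "$sts" \
    -o jsonpath='{.spec.template.spec.containers[0].image}') || return 2
  # Image the existing pod is actually running
  have=$(kubectl get pod -n "$ns" "${sts}-0" \
    -o jsonpath='{.spec.containers[0].image}') || return 2
  if [ "$want" = "$have" ]; then
    echo "up to date: $have"
    return 0
  fi
  echo "pod runs $have, StatefulSet wants $want"
  return 1
}
```

An exit status of 1 means the pod still runs the old image and deleting it will roll it; 2 means kubectl itself failed.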

The StatefulSet, PVC, and Service carry helm.sh/resource-policy: keep. helm uninstall leaves them in place. Delete them by hand for a clean slate.

For an external database, set postgresql.enabled: false and point DATABASE_URL and MIGRATION_DATABASE_URL at your server.

Enabling Redpanda persistence

If you previously ran Redpanda without persistence (redpanda.persistence.enabled: false, data on an emptyDir) and switch it on, a plain helm upgrade fails. A StatefulSet’s volume claim template can’t be added after creation, so the API server rejects the change.

Delete the StatefulSet first, then upgrade. The emptyDir only held in-flight events, so this loses nothing a normal restart wouldn’t:

kubectl delete statefulset shoehorn-redpanda -n shoehorn
helm upgrade shoehorn ... --values custom-values.yaml --wait

Redpanda comes back on a PVC and keeps its data across restarts from then on. The api, worker, crawler, and forge pods get connection errors during the recreate and reconnect once it’s ready.
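Before deleting anything, it can help to confirm that the existing StatefulSet really has no volume claim template, since a StatefulSet that already has one upgrades normally. A sketch under that assumption; redpanda_upgrade_path is a hypothetical helper name, and it relies on kubectl printing an empty string for a missing jsonpath field, which is its default behaviour.

```shell
# Decide whether the Redpanda StatefulSet must be deleted before the upgrade.
# redpanda_upgrade_path is a hypothetical helper, not part of the chart.
redpanda_upgrade_path() {
  local names
  if ! names=$(kubectl get statefulset shoehorn-redpanda -n shoehorn \
      -o jsonpath='{.spec.volumeClaimTemplates[*].metadata.name}'); then
    echo "could not read the StatefulSet" >&2
    return 2
  fi
  if [ -n "$names" ]; then
    # A claim template already exists, so Helm can update in place.
    echo "plain-upgrade"
  else
    # No claim template: the API server would reject adding one.
    echo "delete-then-upgrade"
  fi
}
```

The function only prints a recommendation; the delete and upgrade commands stay a deliberate manual step.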

Meilisearch upgrades

Meilisearch won’t start when its data directory was written by an older version. When a platform release bumps the bundled search engine, the Meilisearch pod boots the new binary on the old volume and crashloops with a “database version is incompatible” error until you migrate the data.

The chart migrates in place with the --experimental-dumpless-upgrade flag, passed through meilisearch.extraArgs. The migration rewrites the data directory and a plain image downgrade won’t undo it, so back up first.

1. Back up. Trigger a Meilisearch snapshot, and snapshot the PVC (data-shoehorn-meilisearch-0) if your storage class supports VolumeSnapshots.

kubectl -n shoehorn port-forward svc/shoehorn-meilisearch 7700:7700 &
curl -X POST http://127.0.0.1:7700/snapshots -H "Authorization: Bearer $MEILI_MASTER_KEY"
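The snapshot request is asynchronous: Meilisearch answers with a task and writes the snapshot in the background, so it is worth waiting for the task to reach succeeded before upgrading. A minimal sketch, assuming the port-forward above is running and MEILI_MASTER_KEY is set. MEILI_URL and both function names are hypothetical, and the sed-based JSON parsing is a shortcut that jq would handle more robustly.

```shell
# Extract the task uid from the JSON that POST /snapshots returns (read from stdin).
meili_task_uid() {
  sed -n 's/.*"taskUid":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Poll GET /tasks/<uid> until the task succeeds, fails, or attempts run out.
# $1: task uid, $2: max attempts (default 60), $3: seconds between polls (default 2)
meili_wait_task() {
  local uid="$1" attempts="${2:-60}" delay="${3:-2}" i=1 body status
  local base="${MEILI_URL:-http://127.0.0.1:7700}"   # MEILI_URL is a hypothetical override
  while [ "$i" -le "$attempts" ]; do
    body=$(curl -fs "$base/tasks/$uid" \
      -H "Authorization: Bearer $MEILI_MASTER_KEY") || body=
    status=$(printf '%s' "$body" |
      sed -n 's/.*"status":[[:space:]]*"\([^"]*\)".*/\1/p')
    case "$status" in
      succeeded) echo "succeeded"; return 0 ;;
      failed|canceled) echo "$status"; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timeout"
  return 1
}

# Usage (not executed here):
#   uid=$(curl -fs -X POST http://127.0.0.1:7700/snapshots \
#     -H "Authorization: Bearer $MEILI_MASTER_KEY" | meili_task_uid)
#   meili_wait_task "$uid"
```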

2. Turn on the flag and raise the startup budget. A large index can take minutes to migrate, and the default startup probe kills the pod after about 150s.

meilisearch:
  extraArgs:
    - --experimental-dumpless-upgrade
  startupProbe:
    httpGet: { path: /health, port: 7700 }
    initialDelaySeconds: 10
    periodSeconds: 5
    failureThreshold: 180   # ~15 min, revert after the upgrade
    timeoutSeconds: 3
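
The "~15 min" figure follows from the probe settings: the kubelet waits initialDelaySeconds, then tolerates failureThreshold failed probes spaced periodSeconds apart. A rough worked calculation, useful when sizing the budget for a larger index; probe_budget is a hypothetical helper, and the result is an approximation that ignores per-probe timeouts.

```shell
# Approximate time a startup probe allows before the container is killed.
# probe_budget is a hypothetical helper for the arithmetic only.
probe_budget() {
  # $1: initialDelaySeconds, $2: periodSeconds, $3: failureThreshold
  echo $(( $1 + $2 * $3 ))
}

probe_budget 10 5 180   # prints 910 (seconds, roughly 15 minutes)
```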

3. Upgrade and watch the pod come back on the new version.

helm upgrade shoehorn oci://ghcr.io/shoehorn-dev/helm-charts/shoehorn \
  --namespace shoehorn -f values.yaml --wait

kubectl -n shoehorn logs -f shoehorn-meilisearch-0
kubectl -n shoehorn get pod shoehorn-meilisearch-0 -w

Run a search in the app to confirm results come back.
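
To confirm which engine version is actually serving, you can also ask Meilisearch directly through its /version endpoint. A small sketch, assuming a port-forward to svc/shoehorn-meilisearch on 7700 and MEILI_MASTER_KEY in the environment; MEILI_URL and meili_version are hypothetical names, and the sed parsing is a shortcut for reading the pkgVersion field.

```shell
# Print the version reported by the running Meilisearch instance.
# meili_version is a hypothetical helper; MEILI_URL is a hypothetical override.
meili_version() {
  local base="${MEILI_URL:-http://127.0.0.1:7700}"
  curl -fs "$base/version" -H "Authorization: Bearer $MEILI_MASTER_KEY" |
    sed -n 's/.*"pkgVersion":[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

If the printed version matches the release you upgraded to and the pod is Ready, the in-place migration has finished.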

4. Clear the flag. Once the pod is healthy the migration is done. Remove meilisearch.extraArgs, revert the startupProbe override, and upgrade again to keep the flag from running on every restart.

If the pod won’t come up, roll back with helm rollback and restore Meilisearch from the snapshot. For the full step-by-step, see Upgrading Meilisearch in the chart README.

Rollback

If issues occur after upgrading:

# Roll back to the previous Helm release
helm rollback shoehorn -n shoehorn

# Or list the revisions and roll back to a specific one
helm history shoehorn -n shoehorn
helm rollback shoehorn <revision-number> -n shoehorn

Database migrations are forward-only: helm rollback reverts the release but does not undo schema changes. If a migration must be reverted, apply the corresponding down migration manually:

kubectl exec -n shoehorn deploy/api -- \
  ./migrate -path /migrations -database "$MIGRATION_DATABASE_URL" down 1

K8s Agent Upgrades

The K8s agent can be upgraded independently of the main platform:

helm upgrade shoehorn-k8s-agent oci://ghcr.io/shoehorn-dev/helm-charts/shoehorn-k8s-agent \
  --namespace shoehorn-agent \
  -f agent-values.yaml

The agent is backward-compatible with older API versions. Upgrade the platform first, then agents.

Things to watch during an agent upgrade:

Leader re-election: The rolling restart triggers a new leader election. There is a brief gap (a few seconds) between the old leader stopping and the new leader starting. No data is lost, because events are buffered in the Kubernetes informer cache.
HA deployments: With a PodDisruptionBudget in place, at least one replica stays available throughout the upgrade. A follower takes over as leader within seconds.
Configuration validation: If the new agent version introduces stricter configuration parsing, invalid environment variables that were previously silently accepted may cause startup failures. Check pod logs if a new version does not become ready.

See the shoehorn-dev/helm-charts repository for the chart changelog and values reference.