Upgrading Shoehorn
Upgrade Process
1. Review Release Notes
Before upgrading, check the release notes for:
- Breaking changes
- New required environment variables
- Database migration requirements
- Deprecated features
2. Backup Database

    # Using pg_dump
    pg_dump -h <host> -U shoehorn_user -d shoehorn > backup_$(date +%Y%m%d).sql

    # Or using kubectl
    kubectl exec -n shoehorn deploy/postgresql -- \
      pg_dump -U shoehorn_user shoehorn > backup.sql

3. Update Helm Values
Update the image tag in your values file:

    image:
      tag: "0.7.0" # New version

4. Run Upgrade

    helm upgrade shoehorn oci://ghcr.io/shoehorn-dev/helm-charts/shoehorn \
      --namespace shoehorn \
      -f values.yaml

Helm will:
- Apply new container images
- Run database migrations automatically (init container)
- Rolling-restart services with zero downtime
PostgreSQL is not restarted by helm upgrade. See PostgreSQL upgrades below.
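Because migrations run automatically during the upgrade, it may be worth confirming that the backup from step 2 is usable before running step 4. The sketch below is one way to gate the upgrade on that check. `backup_looks_valid` and `upgrade_if_backed_up` are hypothetical helper names, and the header check assumes a plain-format pg_dump file like the ones produced in step 2.

```shell
# Sketch only: these helpers are illustrative and not part of Shoehorn.

# Succeeds when the file is non-empty and carries the comment header
# that pg_dump writes at the top of a plain-format dump.
backup_looks_valid() {
  backup_file="$1"
  [ -s "$backup_file" ] || return 1
  head -n 5 "$backup_file" | grep -q "PostgreSQL database dump"
}

# Runs the helm upgrade from step 4 only if the backup check passes.
upgrade_if_backed_up() {
  if ! backup_looks_valid "$1"; then
    echo "backup $1 is missing, empty, or not a pg_dump file" >&2
    return 1
  fi
  helm upgrade shoehorn oci://ghcr.io/shoehorn-dev/helm-charts/shoehorn \
    --namespace shoehorn \
    -f values.yaml
}
```

Calling `upgrade_if_backed_up backup.sql` runs the same command as step 4, but only when the file passes the check. A non-empty file with the right header does not prove the dump is complete, so treat this as a guard against obvious mistakes only.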
5. Verify

    # Check rollout status
    kubectl rollout status -n shoehorn deploy/api

    # Check API health (port-forward runs in the background)
    kubectl port-forward -n shoehorn svc/shoehorn-api 8080:8080 &
    curl http://localhost:8080/healthz

    # Check logs for migration errors
    kubectl logs -n shoehorn -l app=api --tail=100

PostgreSQL upgrades
The postgres StatefulSet uses updateStrategy: OnDelete and pins its image in values.yaml (e.g. shoehorned/shoehorn-postgres:v18.3-pgaudit-1.0). helm upgrade won’t restart the pod even if the chart bumps the tag. Roll it deliberately: check the image the StatefulSet template now carries, then delete the pod so it is recreated from that template.

    kubectl get sts -n shoehorn shoehorn-postgresql -o jsonpath='{.spec.template.spec.containers[0].image}'
    kubectl delete pod -n shoehorn shoehorn-postgresql-0

The PVC stays attached. Data is preserved.
The StatefulSet, PVC, and Service carry helm.sh/resource-policy: keep. helm uninstall leaves them in place. Delete them by hand for a clean slate.
For an external database, set postgresql.enabled: false and point DATABASE_URL and MIGRATION_DATABASE_URL at your server.
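For the bundled database, one way to tell whether the pod still needs a manual roll is to compare the image in the StatefulSet template with the image the pod is running. The following is a rough sketch: the function names are hypothetical, and it assumes PostgreSQL is the first container in both the template and the pod.

```shell
# Sketch only: helper names are illustrative.
# Assumes PostgreSQL is container index 0 in both the template and the pod.

# Image the StatefulSet template currently asks for.
postgres_image_wanted() {
  kubectl get sts -n shoehorn shoehorn-postgresql \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}

# Image the running pod was created with.
postgres_image_running() {
  kubectl get pod -n shoehorn shoehorn-postgresql-0 \
    -o jsonpath='{.spec.containers[0].image}'
}

# Succeeds when the two differ, i.e. the pod has not picked up the new tag.
postgres_needs_roll() {
  [ "$(postgres_image_wanted)" != "$(postgres_image_running)" ]
}
```

If `postgres_needs_roll` succeeds, deleting the pod as shown above lets the StatefulSet recreate it on the new image.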
Enabling Redpanda persistence
If you previously ran Redpanda without persistence (redpanda.persistence.enabled: false, data on an emptyDir) and switch it on, a plain helm upgrade fails. A StatefulSet’s volume claim template can’t be added after creation, so the API server rejects the change.
Delete the StatefulSet first, then upgrade. The emptyDir only held in-flight events, so this loses nothing a normal restart wouldn’t:

    kubectl delete statefulset shoehorn-redpanda -n shoehorn
    helm upgrade shoehorn ... --values custom-values.yaml --wait

Redpanda comes back on a PVC and keeps its data across restarts from then on. The api, worker, crawler, and forge pods get connection errors during the recreate and reconnect once it’s ready.
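The delete is only needed when the existing StatefulSet has no volume claim template. A minimal sketch of that check follows; the function names are hypothetical, and it relies on kubectl printing an empty string for a missing jsonpath field, which is its default behavior.

```shell
# Sketch only: helper names are illustrative.

# Succeeds when the existing StatefulSet already declares at least one
# volume claim template, meaning persistence was on when it was created.
redpanda_has_pvc_template() {
  names=$(kubectl get statefulset shoehorn-redpanda -n shoehorn \
    -o jsonpath='{.spec.volumeClaimTemplates[*].metadata.name}')
  [ -n "$names" ]
}

# Deletes the StatefulSet only when it was created without persistence.
# The helm upgrade itself is left to the caller.
delete_redpanda_if_ephemeral() {
  if redpanda_has_pvc_template; then
    echo "shoehorn-redpanda already has a volume claim template; nothing to delete"
  else
    kubectl delete statefulset shoehorn-redpanda -n shoehorn
  fi
}
```

Running `delete_redpanda_if_ephemeral` before the upgrade avoids deleting a StatefulSet that is already persistent.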
Meilisearch upgrades
Meilisearch won’t start when its data directory was written by an older version. When a platform release bumps the bundled search engine, the Meilisearch pod boots the new binary on the old volume and crashloops with a “database version is incompatible” error until you migrate the data.
The chart migrates in place with the --experimental-dumpless-upgrade flag, passed through meilisearch.extraArgs. The migration rewrites the data directory and a plain image downgrade won’t undo it, so back up first.
1. Back up. Trigger a Meilisearch snapshot, and snapshot the PVC (data-shoehorn-meilisearch-0) if your storage class supports VolumeSnapshots.

    kubectl -n shoehorn port-forward svc/shoehorn-meilisearch 7700:7700 &
    curl -X POST http://127.0.0.1:7700/snapshots -H "Authorization: Bearer $MEILI_MASTER_KEY"

2. Turn on the flag and raise the startup budget. A large index can take minutes to migrate, and the default startup probe kills the pod after about 150s.

    meilisearch:
      extraArgs:
        - --experimental-dumpless-upgrade
      startupProbe:
        httpGet: { path: /health, port: 7700 }
        initialDelaySeconds: 10
        periodSeconds: 5
        failureThreshold: 180 # ~15 min, revert after the upgrade
        timeoutSeconds: 3

3. Upgrade and watch the pod come back on the new version.

    helm upgrade shoehorn oci://ghcr.io/shoehorn-dev/helm-charts/shoehorn \
      --namespace shoehorn -f values.yaml --wait

    kubectl -n shoehorn logs -f shoehorn-meilisearch-0
    kubectl -n shoehorn get pod shoehorn-meilisearch-0 -w

Run a search in the app to confirm results come back.
4. Clear the flag. Once the pod is healthy the migration is done. Remove meilisearch.extraArgs, revert the startupProbe override, and upgrade again to keep the flag from running on every restart.
If the pod won’t come up, roll back with helm rollback and restore Meilisearch from the snapshot. For the full step-by-step, see Upgrading Meilisearch in the chart README.
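When sizing the startup probe override in step 2, the time the kubelet allows before it gives up is roughly initialDelaySeconds plus periodSeconds times failureThreshold. The helper below is a hypothetical convenience for that arithmetic, not something the chart provides, and it ignores the small extra time that probe timeouts can add.

```shell
# Sketch only: approximate worst-case startup budget in seconds.
# Arguments: initialDelaySeconds periodSeconds failureThreshold
startup_budget_seconds() {
  echo $(( $1 + $2 * $3 ))
}

# Values from the override in step 2: 10 + 5 * 180
startup_budget_seconds 10 5 180   # prints 910
```

910 seconds is a little over 15 minutes, which matches the comment on failureThreshold. For a larger index, raising failureThreshold is the simplest way to extend the budget.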
Rollback
If issues occur after upgrading:

    # Rollback to previous Helm release
    helm rollback shoehorn -n shoehorn

    # Or rollback to a specific revision
    helm history shoehorn -n shoehorn
    helm rollback shoehorn <revision-number> -n shoehorn

Database migrations are forward-only. If a migration must be reverted, apply the corresponding down migration manually:

    kubectl exec -n shoehorn deploy/api -- \
      ./migrate -path /migrations -database "$MIGRATION_DATABASE_URL" down 1

K8s Agent Upgrades
The K8s agent can be upgraded independently of the main platform:

    helm upgrade shoehorn-k8s-agent oci://ghcr.io/shoehorn-dev/helm-charts/shoehorn-k8s-agent \
      --namespace shoehorn-agent \
      -f agent-values.yaml

The agent is backward-compatible with older API versions. Upgrade the platform first, then agents.
Things to watch during an agent upgrade:
- Leader re-election: The rolling restart triggers a new leader election. There is a brief gap (a few seconds) between the old leader stopping and the new leader starting. No data is lost — events are buffered in the Kubernetes informer cache.
- HA deployments: With a PodDisruptionBudget in place, at least one replica stays available throughout the upgrade. A follower takes over leader election within seconds.
- Configuration validation: If the new agent version introduces stricter configuration parsing, invalid environment variables that were previously silently accepted may cause startup failures. Check pod logs if a new version does not become ready.
See the shoehorn-dev/helm-charts repository for the chart changelog and values reference.