Sovereign DoD D31 — CNPG-backed apps must replicate across the
Sovereign's regions when the operator opts in. PR #1562 wired this
into bp-wordpress-tenant at chart level. This change extends the same
toggle across BOTH user-facing paths:
1. Marketplace tenant flow (sme_tenant_gitops.go)
- smeTenantTemplateData gains EnableHotStandby/PrimaryRegion/
ReplicaRegion. renderSMETenantOverlay reads them from the
catalyst-api Pod env (SOVEREIGN_ENABLE_HOT_STANDBY +
SOVEREIGN_PRIMARY_REGION + SOVEREIGN_REPLICA_REGION).
- bp-wordpress-tenant HelmRelease emits pg.activeHotStandby.*
when the trio is valid; bp-wordpress-tenant chart 0.2.0+
(PR #1562) renders the primary + replica Cluster CR pair.
- Defence-in-depth: degenerate inputs (empty/identical regions)
fall back to the single-Cluster shape rather than emitting a
HelmRelease that the chart's validateActiveHotStandbyRegions helper
would reject at template time.
2. Sandbox plane (sandbox.db.provision)
- Env struct + NewEnvFromOS read the same Sovereign-level trio.
- sandbox.db.provision emits a primary + replica Cluster CR pair when
hotStandbyActive() returns true; the same shape bp-cnpg-pair renders for
marketplace apps + bp-wordpress-tenant cnpg-cluster.yaml: WAL
streaming via spec.managed.services.additional annotated
service.cilium.io/global=true, nodeAffinity pinning each side
to its declared region, replica.enabled=true with externalCluster
resolving the primary through the ClusterMesh-global Service alias.
- Best-effort rollback if the replica Create fails so the operator
never sees an orphan primary.
3. Plumbing (one knob, both paths)
- catalyst chart: values.sovereign.{enableHotStandby,primaryRegion,
replicaRegion} -> sovereign-fqdn ConfigMap keys -> catalyst-api env.
- sandbox chart: cnpg.activeHotStandby.{enabled,primaryRegion,
replicaRegion} -> controller env -> per-Sandbox MCP Pod env.
- Bootstrap-kit slot 13 + slot 19a wire SOVEREIGN_ENABLE_HOT_STANDBY/
SOVEREIGN_PRIMARY_REGION/SOVEREIGN_REPLICA_REGION envsubst
placeholders to BOTH chart paths so the operator flips one knob
on the per-Sovereign overlay and gets HA across the marketplace
tenant install AND the sandbox.db plane.
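For reviewers, the best-effort rollback described for the sandbox plane
(item 2) can be pictured roughly as below. This is a minimal sketch, not
the code in this change: clusterAPI, fakeAPI and provisionPair are
hypothetical names, the real client talks to the Kubernetes API and
would normally take a context, and the actual error handling may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterAPI is a hypothetical stand-in for whatever client the sandbox
// controller uses to create and delete CNPG Cluster CRs.
type clusterAPI interface {
	Create(name string) error
	Delete(name string) error
}

// provisionPair creates the primary first, then the replica. If the
// replica Create fails, it tries to delete the primary so that no orphan
// primary is left behind.
func provisionPair(api clusterAPI, primary, replica string) error {
	if err := api.Create(primary); err != nil {
		return fmt.Errorf("create primary %q: %w", primary, err)
	}
	if err := api.Create(replica); err != nil {
		createErr := fmt.Errorf("create replica %q: %w", replica, err)
		// Best-effort: if the rollback itself fails, report both errors.
		if delErr := api.Delete(primary); delErr != nil {
			return errors.Join(createErr,
				fmt.Errorf("rollback primary %q: %w", primary, delErr))
		}
		return createErr
	}
	return nil
}

// fakeAPI is an in-memory clusterAPI used only to exercise the sketch.
type fakeAPI struct {
	failCreate map[string]bool // names whose Create should fail
	live       map[string]bool // names that currently exist
}

func (f *fakeAPI) Create(name string) error {
	if f.failCreate[name] {
		return errors.New("create rejected")
	}
	f.live[name] = true
	return nil
}

func (f *fakeAPI) Delete(name string) error {
	delete(f.live, name)
	return nil
}
```

The ordering is the point of the sketch: the primary is only removed
when the replica Create has already failed, and the original Create
error is always returned so the caller sees why provisioning stopped.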
Default empty/false: every Sovereign that has not opted in keeps
rendering single-Cluster CNPG (zero regression).
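The default-off and defence-in-depth behaviour can be summarised in a
few lines of Go. The sketch below is illustrative only: hotStandbyEnv,
loadHotStandbyEnv, loadFromOS and active are hypothetical names (the
change itself uses Env, NewEnvFromOS and hotStandbyActive), and the
exact parsing of the toggle is an assumption. Only the three
SOVEREIGN_* variable names are taken from this change.

```go
package main

import (
	"os"
	"strconv"
	"strings"
)

// hotStandbyEnv holds the Sovereign-level trio (illustrative type).
type hotStandbyEnv struct {
	Enabled       bool
	PrimaryRegion string
	ReplicaRegion string
}

// loadHotStandbyEnv reads the trio through getenv so it can be tested
// without touching the process environment. An unset or unparsable
// toggle is treated as false (assumption), which keeps the default off.
func loadHotStandbyEnv(getenv func(string) string) hotStandbyEnv {
	raw := strings.TrimSpace(getenv("SOVEREIGN_ENABLE_HOT_STANDBY"))
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		enabled = false
	}
	return hotStandbyEnv{
		Enabled:       enabled,
		PrimaryRegion: strings.TrimSpace(getenv("SOVEREIGN_PRIMARY_REGION")),
		ReplicaRegion: strings.TrimSpace(getenv("SOVEREIGN_REPLICA_REGION")),
	}
}

// loadFromOS wires the loader to the real process environment.
func loadFromOS() hotStandbyEnv {
	return loadHotStandbyEnv(os.Getenv)
}

// active reports whether the primary + replica pair should be rendered.
// Degenerate inputs (toggle off, an empty region, identical regions)
// fall back to the single-Cluster shape.
func (e hotStandbyEnv) active() bool {
	return e.Enabled &&
		e.PrimaryRegion != "" &&
		e.ReplicaRegion != "" &&
		e.PrimaryRegion != e.ReplicaRegion
}
```

Under these assumptions a Sovereign that sets none of the three
variables gets active() == false and therefore keeps rendering a single
Cluster, which matches the zero-regression default stated above.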
gitlab-tenant + nextcloud-tenant charts: NOT shipped in this repo
today, so they are out of scope. When they land they can copy the
same value contract (pg.activeHotStandby.*) and the gitops writer
wiring already handles them — no chart-bump or controller change
required.
Tests
- sme_tenant_active_hot_standby_test.go: 8 cases (off, on-happy-path,
degenerate matrix incl. empty primary, empty replica, identical
regions, toggle off with regions).
- sandbox_db_hot_standby_test.go: 11 cases covering hotStandbyActive
matrix + replicaClusterName/replicationServiceName suffix rules +
full primary + replica CR shapes (nodeAffinity, switchover, managed
service, externalClusters).
- platform/wordpress-tenant/chart/tests/active-hot-standby-render.sh
still passes (5/5 gates green).
- catalyst-api SMETenant suite GREEN.
- sandbox-controller suite GREEN.
- helm template clean for sandbox chart (HA + default-off) and
catalyst chart (sovereign-fqdn-configmap + api-deployment).
Hard rules respected: READ-ONLY clusters, no Chart.yaml bump on
bp-catalyst-platform (envsubst-only wiring change in slot 13), no
host-cluster touch outside the chart-level seam.
Refs DoD D31.
Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>