fix(multi): Family G — 6 singletons (C8-001/C8-005/C9-006/C10-002/C10-003/C7-007) (#1601)

Wave 2 Family G batched ship. C7-004 (sso/wiki/workflows/storybook +
registry/api HTTPRoutes) intentionally skipped — sso/wiki/storybook
have no shipped backend; registry (harbor) + api (catalyst-api) HTTPRoutes
already exist and 404 is a runtime/HR-readiness symptom, not a missing
route. Flagged for architect-led ticket rather than silent route-alias
synthesis.

C9-006 — hcloud-volumes StorageClass missing on fresh prov
  Root cause: platform/hcloud-csi/chart/ existed but was never wired
  into bootstrap-kit, so fresh Sovereigns defaulted PVCs to local-path
  (rancher.io/local-path) — node-pinned, can't survive Pod reschedule.
  Fix: new slot 17a-bp-hcloud-csi.yaml + chart 1.0.0→1.1.0 bump that
  adds templates/hcloud-token-secret.yaml so the controller can
  authenticate to Hetzner. Mirrors bp-hcloud-ccm (slot 55) +
  bp-cluster-autoscaler-hcloud (slot 50) wiring.

C10-002 — /fleet/applications returns 0 items despite 21 sovereigns
  Root cause: collectFleetSovereigns filtered AdoptedAt!=nil (mirrored
  ListDeployments). On a steady-state fleet every Sovereign is adopted,
  so the dashboard rendered empty despite hundreds of succeeded jobs.
  Fix: remove the adopted-filter from collectFleetSovereigns (the
  fleet view's whole purpose is to enumerate every provisioned
  Sovereign). ListDeployments still applies the filter — it backs the
  provisioner's in-flight tab, a different surface. Adopted rows
  surface with Health=green when otherwise unknown.

C10-003 — per-region install-* Jobs stuck "pending" despite ready
  Root cause: lastState dedup in helmwatch_bridge — secondary
  watchers attaching AFTER an HR already settled at Installed never
  observed a state transition, so the seed value (HelmStatePending)
  never converged. Fix: at markPhase1Done(OutcomeReady), backfill
  every secondary watcher's informer snapshot into the shared
  jobs.Bridge via the idempotent SeedJobsFromInformerList path.
  Runs INLINE (not goroutine) — runPhase1Watch defers
  stopSecondaries() which clears dep.secondaryWatchers as soon as
  markPhase1Done returns, so a goroutine would race the cleanup.
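
A toy model of the C10-003 fix, under stated assumptions: the `bridge` type, its methods, and `finishPhase1` are hypothetical and do not reproduce the real helmwatch_bridge or jobs.Bridge logic. The failure sequence in `main` is one plausible way a last-state dedup could leave a row pinned to "pending"; the point of the sketch is that a snapshot reseed converges the row, and that running it inline finishes before the deferred cleanup empties the watcher map.

```go
package main

import "fmt"

const (
	statePending   = "pending"
	stateInstalled = "installed"
)

// bridge is a hypothetical miniature of a jobs bridge with last-state dedup.
type bridge struct {
	lastState map[string]string // last state observed per job
	jobs      map[string]string // status shown to operators
}

func newBridge() *bridge {
	return &bridge{lastState: map[string]string{}, jobs: map[string]string{}}
}

// onEvent applies an event only when it is a state transition.
func (b *bridge) onEvent(job, state string) {
	if b.lastState[job] == state {
		return // same state as last time: suppressed by dedup
	}
	b.lastState[job] = state
	b.jobs[job] = state
}

// seedFromSnapshot overwrites job rows with the snapshot's current state.
// Calling it repeatedly with the same snapshot yields the same result.
func (b *bridge) seedFromSnapshot(snapshot map[string]string) {
	for job, state := range snapshot {
		b.lastState[job] = state
		b.jobs[job] = state
	}
}

// finishPhase1 backfills INLINE; the deferred cleanup only runs after the
// loop below has finished, so the watcher map is still populated.
func finishPhase1(b *bridge, watchers map[string]map[string]string) {
	defer func() {
		for region := range watchers {
			delete(watchers, region)
		}
	}()
	for _, snapshot := range watchers {
		b.seedFromSnapshot(snapshot)
	}
}

func main() {
	b := newBridge()
	const job = "install-nbg1-1:cilium"

	// Assumed failure sequence: the row holds the pending seed while the
	// dedup map already records "installed", so the event is dropped.
	b.lastState[job] = stateInstalled
	b.jobs[job] = statePending
	b.onEvent(job, stateInstalled)
	fmt.Println("before backfill:", b.jobs[job])

	watchers := map[string]map[string]string{"nbg1-1": {job: stateInstalled}}
	finishPhase1(b, watchers)
	fmt.Println("after backfill:", b.jobs[job], "watchers left:", len(watchers))
}
```

Had the backfill loop been started with `go`, the deferred cleanup could run first and the loop would iterate over an empty map, which is the race the commit message describes.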

C7-007 — legacy sovereign-wildcard-tls Cert+Secret pair orphaned
  Root cause: PR O moved the Cilium Gateway listener's
  certificateRefs to the dashed-suffix per-zone Secret but left the
  legacy bare-name Certificate template behind, so cert-manager
  kept renewing an orphan. Fix: (a) rename the Certificate +
  Secret to the dashed-suffix shape (single-source-of-truth), and
  (b) add a one-shot Job (legacy-cert-cleanup) that deletes the
  pre-PR-O Cert+Secret pair via alpine/k8s, idempotent for fresh
  provs. Removable from kustomization.yaml once every live prov
  has reconciled past it.

C8-001 — D22 Settings em-dash placeholders on chroot Sovereign
  Root cause: SettingsPage read Capacity / CP size / Pool subdomain /
  BYO domain from useWizardStore() (zustand+persist localStorage).
  The chroot Sovereign console runs on a fresh browser session
  post-handover with empty localStorage, so the four fields rendered
  em-dashes. The data IS persisted on the deployment record
  (RedactedRequest) — gap was that Deployment.State() never surfaced
  it. Fix: lift controlPlaneSize / sovereignPoolDomain /
  sovereignSubdomain / sovereignDomainMode / sovereignByoDomain /
  regionControlPlaneSizes / orgName / orgEmail to the State() map +
  extend DeploymentSnapshot TS type + SettingsPage reads
  snapshot-first with wizard store as fallback (mothership wizard-
  in-flight case).
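
The snapshot-first read with a wizard-store fallback can be illustrated with a short Go sketch. The real SettingsPage is TypeScript, so this only mirrors the lookup order; `settingValue`, the map shapes, and the example values are assumptions made for illustration.

```go
package main

import "fmt"

// placeholder is the em-dash rendered when no source has a value.
const placeholder = "\u2014"

// settingValue prefers the server-side deployment snapshot, falls back to
// the browser-side wizard store, and finally to the placeholder.
func settingValue(snapshot map[string]any, wizard map[string]string, key string) string {
	if v, ok := snapshot[key].(string); ok && v != "" {
		return v
	}
	if v := wizard[key]; v != "" {
		return v
	}
	return placeholder
}

func main() {
	// Hypothetical values, for illustration only.
	snapshot := map[string]any{"controlPlaneSize": "cx32"}
	wizard := map[string]string{"sovereignSubdomain": "demo"}

	fmt.Println(settingValue(snapshot, wizard, "controlPlaneSize"))   // from the snapshot
	fmt.Println(settingValue(snapshot, wizard, "sovereignSubdomain")) // from the wizard store
	fmt.Println(settingValue(snapshot, nil, "sovereignByoDomain"))    // placeholder
}
```

On a fresh chroot session the wizard map is empty, so only fields present in the snapshot avoid the placeholder, which is why the fields had to be lifted into State().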

C8-005 — D20 Jobs page missing region filter dropdown
  Root cause: multi-region Sovereigns expose install-<region>:<chart>
  Jobs but JobsTable offered only status / app / parent filters,
  forcing operators to type the region key into the free-text search.
  Fix: new regionFromJob(job) pure helper parses the canonical
  <region>:<chart> appId (fallback: install-<region>:<chart> jobName).
  Dropdown is visible only when 2+ regions appear in the current job
  set, so single-region Sovereigns never see a one-option, no-op
  dropdown. Options are sorted lexically. Test coverage: 4 helper
  cases + 3 dropdown cases in JobsTable.test.tsx.
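
The parsing rule described for regionFromJob can be sketched in Go as below. The shipped helper is TypeScript; the `job` struct, its field names, and `regionOptions` are hypothetical, and the sketch assumes a colon never appears in a bare chart name.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// job is a hypothetical, minimal stand-in for a Jobs-table row.
type job struct {
	AppID   string // "<region>:<chart>" or bare "<chart>"
	JobName string // "install-<region>:<chart>" or "install-<chart>"
}

// regionFromJob returns the region key, or "" when the job is not
// region-scoped. The appId is tried first, then the jobName.
func regionFromJob(j job) string {
	if region, _, ok := strings.Cut(j.AppID, ":"); ok && region != "" {
		return region
	}
	name := strings.TrimPrefix(j.JobName, "install-")
	if region, _, ok := strings.Cut(name, ":"); ok && region != "" {
		return region
	}
	return ""
}

// regionOptions returns the sorted dropdown options, or nil when fewer
// than two regions are present and the dropdown should stay hidden.
func regionOptions(jobs []job) []string {
	seen := map[string]struct{}{}
	for _, j := range jobs {
		if r := regionFromJob(j); r != "" {
			seen[r] = struct{}{}
		}
	}
	if len(seen) < 2 {
		return nil
	}
	regions := make([]string, 0, len(seen))
	for r := range seen {
		regions = append(regions, r)
	}
	sort.Strings(regions)
	return regions
}

func main() {
	jobs := []job{
		{AppID: "sin-2:cilium", JobName: "install-sin-2:cilium"},
		{AppID: "cilium", JobName: "install-nbg1-1:cilium"},
		{AppID: "cilium", JobName: "install-cilium"},
	}
	fmt.Println(regionOptions(jobs))
}
```

Returning nil for fewer than two regions keeps the visibility decision and the option list in one place, so the caller only has to check for an empty result.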

Architect-first compliance:
  • bp-hcloud-csi wiring mirrors bp-hcloud-ccm (slot 55) pattern
  • legacy-cert-cleanup uses alpine/k8s (NOT bitnami/kubectl — see
    self-sovereign-cutover/values.yaml:252 Bitnami-deprecation note)
  • alpine/k8s image pulled via harbor.openova.io/proxy-dockerhub
    (mirror-everything rule)
  • regionFromJob mirrors helmwatch_bridge.go componentID encoding
    (3 input shapes: bare, region-prefixed, install-region-prefixed)
  • State() snapshot additions stay slim — only the 4 founder-flagged
    fields + a few zero-cost adjacents

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
e3mrah 2026-05-17 22:20:29 +04:00 committed by GitHub
parent 2d9b2f84bd
commit aa60cfb84e
GPG Key ID: B5690EEEBB952194
17 changed files with 813 additions and 44 deletions

View File

@ -0,0 +1,132 @@
# bp-hcloud-csi — Catalyst bootstrap-kit Blueprint #17a
# (Tier 3.5 — Storage and Data). Pairs with bp-hcloud-ccm (slot 55)
# and bp-cluster-autoscaler-hcloud (slot 50) — the full Hetzner-cloud-
# direct trio.
#
# Wires the Hetzner Cloud CSI driver into the cluster so the canonical
# `hcloud-volumes` StorageClass exists (and is the default StorageClass).
# Without this:
# - PVCs default to `local-path` (rancher.io/local-path): node-pinned
# hostPath volumes whose data does not follow a Pod that is
# rescheduled to a different node. Multi-node stateful workloads
# (CNPG primary/replica, Harbor blob backend on Hetzner-direct,
# Velero PVC backups) require a CSI-managed networked volume.
# - Operator-facing UI shows `provisioner=rancher.io/local-path` for
# every PVC, breaking the docs/SOVEREIGN-MULTI-REGION-DOD.md C9
# gate which expects `hcloud-volumes default=true`.
#
# 2026-05-17 t143 (C9-006): added to the bootstrap-kit Kustomization
# (clusters/_template/bootstrap-kit/kustomization.yaml). Previously the
# chart existed at platform/hcloud-csi but was not wired into any
# bootstrap-kit slot, so fresh Sovereigns shipped without
# hcloud-volumes despite the Hetzner CSI driver being available in
# the catalog. The Blueprint Release pipeline auto-builds
# bp-hcloud-csi:1.1.0 from the same push that ships this slot.
#
# Wrapper chart: platform/hcloud-csi/chart/ — umbrella over upstream
# hetznercloud/csi-driver chart 2.13.0 (appVersion 2.13.0). Catalyst
# overlay templates render: (a) the `hcloud-volumes` StorageClass with
# the `storageclass.kubernetes.io/is-default-class=true` annotation
# (when defaultStorageClass=true, default in this slot), and
# (b) a chart-local `hcloud-csi-token` Secret rendered from
# `.Values.hetznerToken` via the same valuesFrom seam bp-hcloud-ccm
# uses.
#
# Reconciled by: Flux on the new Sovereign's k3s control plane.
#
# Hetzner-token wiring (mirrors bp-hcloud-ccm at slot 55 +
# bp-cluster-autoscaler-hcloud at slot 50):
# - cloud-init writes `flux-system/cloud-credentials` Secret with the
# `hcloud-token` key (see infra/hetzner/cloudinit-control-plane.tftpl
# §"cloud-credentials-secret").
# - This HelmRelease lifts the `hcloud-token` value into the umbrella
# chart's `hetznerToken` value via Flux `valuesFrom`. The umbrella
# chart's templates/hcloud-token-secret.yaml synthesises the
# namespace-local `hcloud-csi/hcloud-csi-token` Secret the upstream
# subchart's `controller.hcloudToken.existingSecret` binding
# resolves at controller startup.
#
# dependsOn: (none) — hcloud-csi is independent of every other
# bootstrap-kit blueprint at install time. The cloud-credentials Secret
# is provisioned by cloud-init BEFORE Flux installs anything.
#
# Placement in the bootstrap-kit ordering (slot 17a):
# - AFTER 17-valkey (no dependency, just sequencing)
# - BEFORE 18-seaweedfs / 19-harbor — both will consume hcloud-volumes
# for their PVCs on Hetzner-direct Sovereigns once flipped via the
# per-Sovereign overlay (today they still default to local-path so
# this ordering is forward-looking, not strict). The 17a suffix
# mirrors the established 01a/05a/06a convention for inserting
# new slots without renumbering the whole bootstrap kit.
---
apiVersion: v1
kind: Namespace
metadata:
  name: hcloud-csi
  labels:
    catalyst.openova.io/sovereign: ${SOVEREIGN_FQDN}
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: bp-hcloud-csi
  namespace: flux-system
spec:
  type: oci
  interval: 15m
  url: oci://ghcr.io/openova-io
  secretRef:
    name: ghcr-pull
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: bp-hcloud-csi
  namespace: flux-system
  labels:
    catalyst.openova.io/slot: "17a"
spec:
  interval: 15m
  releaseName: hcloud-csi
  targetNamespace: hcloud-csi
  chart:
    spec:
      chart: bp-hcloud-csi
      version: 1.1.0
      sourceRef:
        kind: HelmRepository
        name: bp-hcloud-csi
        namespace: flux-system
  # Event-driven install: hcloud-csi controller + node DaemonSet are
  # standard CSI workloads — Helm install completes when manifests
  # apply. The driver's Hetzner-API connectivity check is a runtime
  # concern, not a Helm-wait concern. disableWait keeps Flux's Ready
  # signal aligned with manifest apply (matches the bp-hcloud-ccm
  # pattern at slot 55).
  install:
    timeout: 15m
    disableWait: true
    remediation:
      retries: 3
  upgrade:
    timeout: 15m
    disableWait: true
    remediation:
      retries: 3
  # ── Hetzner-token wiring ─────────────────────────────────────────────
  # Pulls the `hcloud-token` key from the canonical
  # `flux-system/cloud-credentials` Secret cloud-init writes at Phase 0.
  # Flux dereferences `valuesFrom` at HelmRelease apply time, so the
  # plaintext payload never appears in this committed manifest.
  valuesFrom:
    - kind: Secret
      name: cloud-credentials
      valuesKey: hcloud-token
      targetPath: hetznerToken
  # Enable the chart + flip hcloud-volumes to the cluster default.
  # On a fresh Sovereign there are no pre-existing PVCs bound to
  # `local-path`, so flipping the default at install time is safe.
  values:
    enabled: true
    defaultStorageClass: true

View File

@ -24,6 +24,15 @@ resources:
- 15a-external-secrets-stores.yaml
- 16-cnpg.yaml
- 17-valkey.yaml
# bp-hcloud-csi (slot 17a) — Hetzner Cloud CSI driver + the canonical
# `hcloud-volumes` StorageClass (annotated as default). Pairs with
# bp-hcloud-ccm (slot 55) + bp-cluster-autoscaler-hcloud (slot 50) as
# the Hetzner-cloud-direct trio. Without this slot, fresh Sovereigns
# default PVCs to `local-path` (rancher.io/local-path) which is
# node-pinned and cannot survive a Pod rescheduled to a different
# node — breaks docs/SOVEREIGN-MULTI-REGION-DOD.md C9 (operator
# expects `hcloud-volumes default=true`). Caught on t10 2026-05-17.
- 17a-bp-hcloud-csi.yaml
- 18-seaweedfs.yaml
- 19-harbor.yaml
# 06a — Post-handover Self-Sovereignty Cutover (issue #791). Filename

View File

@ -91,7 +91,15 @@ metadata:
rules:
- apiGroups: [""]
resources: ["secrets"]
resourceNames: ["sovereign-wildcard-tls"]
# 2026-05-17 t143 dual-cert collision cleanup: the per-zone Secret
# the Cilium Gateway now references is named
# `sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}`
# (see clusters/_template/sovereign-tls/cilium-gateway.yaml:44 +
# clusters/_template/sovereign-tls/cilium-gateway-cert.yaml). The
# legacy `sovereign-wildcard-tls` (no dashed suffix) is no longer
# produced anywhere — drop it from the resourceNames allowlist so
# this Role grants the minimum needed for the live Secret name.
resourceNames: ["sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}"]
verbs: ["get", "watch", "list"]
- apiGroups: ["apps"]
resources: ["daemonsets"]
@ -209,7 +217,14 @@ spec:
set -eu
SECRET_NS=kube-system
SECRET_NAME=sovereign-wildcard-tls
# 2026-05-17 t143 dual-cert collision cleanup: the canonical
# SDS Secret the Cilium Gateway now references is the
# per-zone `sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}`.
# Cloud-init substitutes SOVEREIGN_FQDN_DASHED via Flux
# postBuild.substitute, so the literal cluster value lands
# here at apply time (verified in
# infra/hetzner/cloudinit-control-plane.tftpl §SOVEREIGN_FQDN_DASHED).
SECRET_NAME=sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}
DS_NS=kube-system
DS_NAME=cilium-envoy

View File

@ -19,17 +19,21 @@
# - gitea.<fqdn> → 5 reprovs/week
# ... × 12 hostnames = 60 effective reprov-slots/week
#
# Coexistence: the `sovereign-wildcard-tls` Secret name was the single
# point of integration with the Cilium Gateway listener
# (cilium-gateway.yaml). With per-name certs we still write ONE Secret
# of that name BUT it's now a SAN-Certificate containing ALL N
# hostnames as SubjectAltNames — cert-manager bundles them into one
# Order with N identifiers. LE counts a SAN cert as ONE issuance
# against EACH identifier's bucket, but only ONE issuance overall.
# So our 168h budget becomes:
# min(5/168h per hostname bucket) — typically reprovs share the same
# bucket per name, but adding a NEW hostname creates a FRESH bucket
# and resets that hostname's count to 0.
# 2026-05-17 t143 dual-cert collision cleanup
# -------------------------------------------
# Previously this Certificate was named `sovereign-wildcard-tls` and
# wrote a Secret of the same name. After PR O (2026-05-17) moved the
# Cilium Gateway listener's certificateRefs to the per-zone Secret
# `sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}` (see
# clusters/_template/sovereign-tls/cilium-gateway.yaml:44), the legacy
# Secret stopped being referenced by anything — but the Certificate
# kept renewing, burning LE budget for no production value and showing
# up in audits as an orphan TLS Secret on every Sovereign.
#
# Single-source-of-truth fix: this Certificate now writes to the SAME
# dashed-suffix Secret the Gateway already references. One Cert, one
# Secret, one LE issuance per renewal. No more dual-cert collision
# and no extra LE budget consumed.
#
# This pattern is the standard production approach (see Cloudflare,
# Vercel, Render). Wildcards are reserved for the limited cases where
@ -38,13 +42,17 @@
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: sovereign-wildcard-tls # name kept for backwards-compat with Gateway listener ref
# Match the Secret name the Gateway listener references
# (clusters/_template/sovereign-tls/cilium-gateway.yaml:44). Cloud-init
# substitutes SOVEREIGN_FQDN_DASHED = SOVEREIGN_FQDN with `.` → `-`
# (infra/hetzner/cloudinit-control-plane.tftpl §SOVEREIGN_FQDN_DASHED).
name: sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}
namespace: kube-system
labels:
catalyst.openova.io/sovereign: ${SOVEREIGN_FQDN}
catalyst.openova.io/component: cilium-gateway
spec:
secretName: sovereign-wildcard-tls
secretName: sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}
issuerRef:
name: ${WILDCARD_CERT_ISSUER}
kind: ClusterIssuer
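
The dashed-suffix naming used above (`.` replaced with `-` in the Sovereign FQDN) is easy to check with a small Go sketch. In the actual pipeline the substitution is done by cloud-init and Flux, not by Go code; `dashedFQDN`, `wildcardSecretName`, and the example FQDN are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// dashedFQDN mirrors the SOVEREIGN_FQDN_DASHED rule: every "." becomes "-".
func dashedFQDN(fqdn string) string {
	return strings.ReplaceAll(fqdn, ".", "-")
}

// wildcardSecretName builds the per-zone Secret name the Gateway references.
func wildcardSecretName(fqdn string) string {
	return "sovereign-wildcard-tls-" + dashedFQDN(fqdn)
}

func main() {
	// Hypothetical FQDN, for illustration only.
	fmt.Println(wildcardSecretName("otech.example.com"))
	// prints "sovereign-wildcard-tls-otech-example-com"
}
```

Because the Certificate name, its secretName, and the Gateway's certificateRefs all derive from the same rule, a single helper like this is enough to predict the live Secret name for any Sovereign.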

View File

@ -11,3 +11,10 @@ resources:
# the cert appearing. See file header for full root cause + design
# rationale (qa-loop bounded-cycle Provision #7).
- cilium-envoy-tls-restart-job.yaml
# C7-007 (2026-05-17 t143) — one-shot cleanup of the pre-PR-O legacy
# `sovereign-wildcard-tls` Certificate + Secret pair. Idempotent
# (`--ignore-not-found`), runs once per Flux reconciliation
# generation. Fresh Sovereigns succeed as a no-op; pre-PR-O
# Sovereigns delete the orphan resources. Removable from the list
# once every live prov has reconciled past it.
- legacy-cert-cleanup-job.yaml

View File

@ -0,0 +1,151 @@
# C7-007 (2026-05-17 t143) — one-shot cleanup Job for the legacy
# `sovereign-wildcard-tls` Certificate + Secret pair.
#
# Background
# ----------
# Pre-PR-O Sovereigns rendered a Certificate named `sovereign-wildcard-tls`
# with a Secret of the same name. PR O moved the Cilium Gateway listener
# to the per-zone `sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}` Secret,
# but the legacy Certificate kept renewing on cert-manager's default
# schedule. Result: every audit on a pre-PR-O Sovereign showed an orphan
# TLS Secret in kube-system, cert-manager wasted LE budget renewing a
# Secret nothing consumed, and operators had to `kubectl delete` it again
# whenever a Flux reconciliation re-asserted the legacy resource. Flux no
# longer re-asserts it: PR O's `cilium-gateway-cert.yaml` now produces
# ONLY the dashed-suffix shape.
#
# What this Job does
# ------------------
# Idempotent delete of:
# 1. `kube-system/sovereign-wildcard-tls` Certificate (cert-manager.io/v1)
# 2. `kube-system/sovereign-wildcard-tls` Secret (kubernetes.io/tls)
#
# Each delete is `--ignore-not-found` so a fresh Sovereign that never
# carried the legacy shape reports "no-op" and Succeeds. The Job runs
# ONCE per Flux reconciliation generation (the helm.sh/hook
# annotations on the bp-self-sovereign-cutover chart aren't applicable
# here because this lives in the per-Sovereign overlay, not a Helm
# chart — Flux's Kustomization re-applies idempotently).
#
# Image
# -----
# Uses the canonical OpenOva-mirrored alpine/k8s image (mothership
# Harbor proxy-cache for Docker Hub, per CLAUDE.md mirror rule).
# Bitnami/kubectl was deprecated 2025-08; alpine/k8s is the standard
# replacement (see platform/self-sovereign-cutover/chart/values.yaml:252
# for the canonical reasoning, captured live on otech103 2026-05-04).
#
# Why a Job and not a Helm hook
# -----------------------------
# This file lives in `clusters/_template/sovereign-tls/` — a per-Sovereign
# Kustomize overlay reconciled by Flux, NOT a Helm chart. Helm hooks
# only run as part of a Helm release; this is a single one-shot K8s Job.
# Flux's Kustomization reconciliation drives idempotent re-apply.
#
# Removal plan
# ------------
# Once every live Sovereign has reconciled past this Job (verified via
# `kubectl get jobs -n kube-system | grep legacy-cert-cleanup` showing
# Complete on every prov), this file may be deleted from
# clusters/_template/sovereign-tls/kustomization.yaml.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: legacy-cert-cleanup
  namespace: kube-system
  labels:
    catalyst.openova.io/component: legacy-cert-cleanup
    catalyst.openova.io/sovereign: ${SOVEREIGN_FQDN}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: legacy-cert-cleanup
  namespace: kube-system
  labels:
    catalyst.openova.io/component: legacy-cert-cleanup
rules:
  # Legacy Secret to delete. Only the specific name — RBAC stays
  # least-privilege.
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["sovereign-wildcard-tls"]
    verbs: ["get", "delete"]
  # cert-manager Certificate to delete. Only the specific name.
  - apiGroups: ["cert-manager.io"]
    resources: ["certificates"]
    resourceNames: ["sovereign-wildcard-tls"]
    verbs: ["get", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: legacy-cert-cleanup
  namespace: kube-system
  labels:
    catalyst.openova.io/component: legacy-cert-cleanup
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: legacy-cert-cleanup
subjects:
  - kind: ServiceAccount
    name: legacy-cert-cleanup
    namespace: kube-system
---
apiVersion: batch/v1
kind: Job
metadata:
  name: legacy-cert-cleanup
  namespace: kube-system
  labels:
    catalyst.openova.io/component: legacy-cert-cleanup
    catalyst.openova.io/sovereign: ${SOVEREIGN_FQDN}
spec:
  # Keep the Job around 5 minutes after completion so an operator can
  # `kubectl logs job/legacy-cert-cleanup -n kube-system` to confirm
  # what was (or wasn't) cleaned up. After TTL the GC reclaims.
  ttlSecondsAfterFinished: 300
  backoffLimit: 2
  template:
    metadata:
      labels:
        catalyst.openova.io/component: legacy-cert-cleanup
    spec:
      serviceAccountName: legacy-cert-cleanup
      restartPolicy: OnFailure
      containers:
        - name: cleanup
          # Pinned via Harbor proxy-cache. See CLAUDE.md mirror-everything
          # rule + values.yaml:252 in self-sovereign-cutover for the
          # Bitnami→alpine/k8s decision history.
          image: harbor.openova.io/proxy-dockerhub/alpine/k8s:1.31.1
          imagePullPolicy: IfNotPresent
          command: ["/bin/sh", "-c"]
          args:
            - |
              set -eu
              echo "[legacy-cert-cleanup] starting on ${SOVEREIGN_FQDN}"
              # The dashed-suffix Secret (the live one PR O introduced)
              # MUST remain — only delete the bare-name legacy pair.
              echo "[legacy-cert-cleanup] removing legacy Certificate sovereign-wildcard-tls"
              kubectl -n kube-system delete certificate.cert-manager.io sovereign-wildcard-tls --ignore-not-found=true --wait=false
              echo "[legacy-cert-cleanup] removing legacy Secret sovereign-wildcard-tls"
              kubectl -n kube-system delete secret sovereign-wildcard-tls --ignore-not-found=true --wait=false
              echo "[legacy-cert-cleanup] complete"
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 65532
            capabilities:
              drop: ["ALL"]
          resources:
            requests:
              cpu: "10m"
              memory: "32Mi"
            limits:
              cpu: "100m"
              memory: "64Mi"

View File

@ -1,6 +1,13 @@
apiVersion: v2
name: bp-hcloud-csi
version: 1.0.0
# 1.1.0 (2026-05-17 t143 C9-006): add templates/hcloud-token-secret.yaml
# so the chart self-renders the `hcloud-csi-token` Secret from
# `.Values.hetznerToken` (populated via Flux valuesFrom from
# flux-system/cloud-credentials). Without this Secret the controller
# pods cannot authenticate to the Hetzner API; the StorageClass exists
# but every PVC fails to provision with a 401 from the CSI driver.
# Mirrors bp-hcloud-ccm 1.0.0 wiring.
version: 1.1.0
description: |
Catalyst-curated Blueprint umbrella chart for the Hetzner Cloud CSI
driver. Provides the hcloud-volumes StorageClass for multi-node stateful

View File

@ -0,0 +1,47 @@
{{/*
Hetzner API token Secret consumed by the hcloud-csi controller.
Rendered into the chart's targetNamespace (`hcloud-csi` by convention)
from a value sourced via Flux `valuesFrom` against the canonical
`flux-system/cloud-credentials` Secret (key `hcloud-token`). Mirrors the
pattern used by bp-hcloud-ccm and bp-cluster-autoscaler-hcloud — see
platform/hcloud-ccm/chart/templates/hcloud-token-secret.yaml for the
matching shape and ADR-0001 §11.3 for the cloud-init seam.
The bp-hcloud-csi subchart's controller looks up the Secret by name
(default `hcloud-csi-token`, key `token`) — see
.Values.hetznerTokenSecretRef + the upstream
hcloud-csi.controller.hcloudToken.existingSecret binding in values.yaml.
The Secret is only rendered when:
- .Values.enabled is true (master gate; the rest of the chart's
rendering is gated on the same value)
- .Values.hetznerToken is non-empty (Flux `valuesFrom` populates
this from cloud-credentials at HelmRelease apply time)
When .Values.hetznerToken is empty Helm skips this template entirely so
a per-Sovereign overlay that switches to an externally-managed
ExternalSecret (Phase 2+) can take over without collision.
2026-05-17 t143 (C9-006): created so the bootstrap-kit slot
17a-bp-hcloud-csi.yaml wires the token in the same shape as
55-bp-hcloud-ccm.yaml does — without this Secret the hcloud-csi
controller cannot authenticate to the Hetzner API, the StorageClass
exists but every PVC fails to provision with a 401 from the CSI driver.
*/}}
{{- if and .Values.enabled .Values.hetznerToken }}
---
apiVersion: v1
kind: Secret
metadata:
  name: {{ .Values.hetznerTokenSecretRef.name | default "hcloud-csi-token" | quote }}
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: bp-hcloud-csi
    app.kubernetes.io/component: hcloud-token
    catalyst.openova.io/blueprint: bp-hcloud-csi
    catalyst.openova.io/blueprint-version: {{ .Chart.Version | quote }}
type: Opaque
stringData:
  {{ .Values.hetznerTokenSecretRef.key | default "token" }}: {{ .Values.hetznerToken | quote }}
{{- end }}
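
The `and .Values.enabled .Values.hetznerToken` gate can be exercised outside Helm with Go's standard `text/template`, which Helm builds on. This is a reduced sketch, not the chart's template: it omits the Sprig helpers (`default`, `quote`) that Helm adds, uses `printf "%q"` in their place, and the `render` function and data shape are assumptions for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// secretTmpl reduces the chart template to its gate and one data key.
const secretTmpl = `{{- if and .Values.enabled .Values.hetznerToken -}}
kind: Secret
stringData:
  token: {{ printf "%q" .Values.hetznerToken }}
{{- end -}}`

// render executes the template against a Helm-like .Values map.
func render(enabled bool, token string) (string, error) {
	t, err := template.New("secret").Parse(secretTmpl)
	if err != nil {
		return "", err
	}
	data := map[string]any{
		"Values": map[string]any{"enabled": enabled, "hetznerToken": token},
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	for _, tc := range []struct {
		enabled bool
		token   string
	}{{true, ""}, {false, "abc"}, {true, "abc"}} {
		out, err := render(tc.enabled, tc.token)
		if err != nil {
			panic(err)
		}
		fmt.Printf("enabled=%v token=%q -> %q\n", tc.enabled, tc.token, out)
	}
}
```

An empty string is falsy in template conditionals, so leaving `hetznerToken` at its committed default of `""` renders nothing, which is what lets an externally managed Secret take over without a name collision.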

View File

@ -19,6 +19,20 @@ hetznerTokenSecretRef:
name: hcloud-csi-token
key: token
# 2026-05-17 t143 (C9-006): Hetzner API token plaintext. Default empty —
# Flux `valuesFrom` populates this at HelmRelease apply time from the
# canonical flux-system/cloud-credentials Secret (key `hcloud-token`)
# cloud-init writes during Phase 0 (mirrors bp-hcloud-ccm wiring at
# clusters/_template/bootstrap-kit/55-bp-hcloud-ccm.yaml). When
# non-empty, templates/hcloud-token-secret.yaml renders the
# `<hetznerTokenSecretRef.name>` Secret in the chart's targetNamespace
# so the subchart's controller can authenticate to the Hetzner API.
#
# Per docs/INVIOLABLE-PRINCIPLES.md #10 (credentials never on CR / Git),
# this stays empty in committed YAML; the live value lands at apply
# time from cloud-credentials and is never persisted to Git.
hetznerToken: ""
# Catalyst-managed StorageClass list. Each entry renders an independent
# StorageClass — operators can add fast-ssd / archive variants per
# Sovereign without editing this chart. Named `catalystStorageClasses`

View File

@ -750,6 +750,56 @@ func (d *Deployment) State() map[string]any {
// blank (legacy record).
"ownerEmail": d.OwnerEmail,
}
// C8-001 (2026-05-17 t143): lift the Sovereign-provisioning request
// fields that the chroot's /sovereign/settings page renders so the
// page works on a fresh chroot session (where the operator's
// browser-side wizard-store is empty). The fields are non-secret
// projections of the wizard submit (control-plane size, pool
// subdomain, BYO domain) — they live on the deployment record's
// RedactedRequest already, the gap was only that State() never
// surfaced them. Founder caught on t136 2026-05-17 — Settings page
// shows four em-dash placeholders for Capacity / CP size / Pool
// subdomain / BYO domain on the chroot Sovereign console because
// the chroot has no localStorage'd wizard store to read from.
if v := d.Request.ControlPlaneSize; v != "" {
out["controlPlaneSize"] = v
}
if v := d.Request.SovereignPoolDomain; v != "" {
out["sovereignPoolDomain"] = v
}
if v := d.Request.SovereignSubdomain; v != "" {
out["sovereignSubdomain"] = v
}
if v := d.Request.SovereignDomainMode; v != "" {
out["sovereignDomainMode"] = v
}
// BYO-domain is encoded on RedactedRequest only when domainMode
// is `byo`; we still emit when present so the chroot Settings page
// can render it. Pool-mode deployments leave this empty.
if v := d.Request.SovereignFQDN; v != "" && d.Request.SovereignDomainMode == "byo" {
out["sovereignByoDomain"] = v
}
// Per-region control-plane sizes (multi-region Sovereigns). The
// Settings page falls back to controlPlaneSize when the array is
// empty; surface both so future per-region renderings need no
// API extension.
if len(d.Request.Regions) > 0 {
sizes := make([]string, 0, len(d.Request.Regions))
for _, r := range d.Request.Regions {
sizes = append(sizes, r.ControlPlaneSize)
}
out["regionControlPlaneSizes"] = sizes
}
// Org-profile fields (non-secret). Same rationale as the sovereign
// fields above — the chroot Settings page would render four
// em-dashes for Name / Billing email / Industry / Headquarters
// otherwise.
if v := d.Request.OrgName; v != "" {
out["orgName"] = v
}
if v := d.Request.OrgEmail; v != "" {
out["orgEmail"] = v
}
if !d.FinishedAt.IsZero() {
out["finishedAt"] = d.FinishedAt.Format(time.RFC3339)
}

View File

@ -382,14 +382,25 @@ func (h *Handler) HandleFleetApplications(w http.ResponseWriter, r *http.Request
// collectFleetSovereigns — every Sovereign known to this catalyst-api
// process. Source: the in-memory deployments map (rehydrated from the
// PVC at startup), filtered to drop adopted-but-still-tracked records
// the same way ListDeployments does. Sorted by FQDN for deterministic
// pagination.
// PVC at startup). Sorted by FQDN for deterministic pagination.
//
// Per ADR-0001 §2.7 — no separate fleet database. The deployments map
// IS the source of truth on this Pod; tenant_registry is the secondary
// source for SME-tier Sovereigns the same map doesn't track (those are
// collapsed into the same shape so the caller sees one fleet view).
//
// 2026-05-17 t143 (C10-002) — adopted Sovereigns INCLUDED.
// Previously this helper filtered out every dep with AdoptedAt != nil
// (mirroring ListDeployments). The result: on a steady-state fleet
// where every Sovereign has completed cutover and been adopted by its
// customer's console, the cross-Sovereign Applications dashboard
// (/fleet/applications) returned `items=[]` despite the fleet
// containing 21 live Sovereigns and 110 succeeded jobs (caught on t10
// 2026-05-17). The fleet view's whole purpose is to enumerate every
// Sovereign mothership has ever provisioned — adopted is the
// steady-state, not a reason to hide. ListDeployments' boundary
// (handover hides the row from the provisioner's "in-flight" tab)
// does NOT apply to the fleet dashboard.
func (h *Handler) collectFleetSovereigns(_ context.Context) []fleetSovereignSummary {
out := make([]fleetSovereignSummary, 0)
seen := make(map[string]bool)
@ -400,14 +411,6 @@ func (h *Handler) collectFleetSovereigns(_ context.Context) []fleetSovereignSumm
return true
}
dep.mu.Lock()
if dep.AdoptedAt != nil {
// Adopted Sovereigns are owned by the customer's
// console.<sovereign-fqdn> — they no longer surface
// in the mothership fleet view (same boundary
// ListDeployments enforces).
dep.mu.Unlock()
return true
}
row := fleetSovereignSummary{
ID: dep.ID,
FQDN: dep.Request.SovereignFQDN,
@ -418,6 +421,14 @@ func (h *Handler) collectFleetSovereigns(_ context.Context) []fleetSovereignSumm
if !dep.StartedAt.IsZero() {
row.CreatedAt = dep.StartedAt.UTC().Format(time.RFC3339)
}
// Adopted Sovereigns report Health=green because cutover
// drove the deployment status to "ready" before the
// AdoptedAt timestamp landed. We surface them with the same
// health vocabulary as in-flight rows so the dashboard's
// per-card badge keeps working.
if dep.AdoptedAt != nil && row.Health == healthUnknown {
row.Health = healthGreen
}
dep.mu.Unlock()
if !seen[row.ID] {

View File

@ -247,9 +247,19 @@ func TestHandleFleetSovereigns_Pagination(t *testing.T) {
}
}
// ── /fleet/sovereigns: adopted excluded ──────────────────────────────
func TestHandleFleetSovereigns_AdoptedExcluded(t *testing.T) {
// ── /fleet/sovereigns: adopted INCLUDED ─────────────────────────────
//
// 2026-05-17 t143 (C10-002): adopted Sovereigns are INCLUDED in the
// fleet view (formerly excluded). Rationale: the fleet view's whole
// purpose is to enumerate every Sovereign mothership has ever
// provisioned — adopted is the steady state, not a reason to hide.
// On a real fleet where every Sovereign has completed cutover (as
// happens after handover), the previous filter returned items=[]
// despite the deployments map carrying dozens of live Sovereigns and
// hundreds of succeeded jobs. The dashboard's empty-state spawned the
// C10-002 ticket. ListDeployments still applies the adopted filter
// (it backs the provisioner's "in-flight" tab, a different surface).
func TestHandleFleetSovereigns_AdoptedIncluded(t *testing.T) {
h := NewWithPDM(silentLogger(), &fakePDM{})
installFleetSovereign(t, h, "sov-live", "live.example.com", "ready")
adopted := installFleetSovereign(t, h, "sov-handed", "handed.example.com", "adopted")
@ -259,8 +269,15 @@ func TestHandleFleetSovereigns_AdoptedExcluded(t *testing.T) {
rec := callUserAccess(t, h, http.MethodGet, "/api/v1/fleet/sovereigns", nil, registerFleetRoutes)
var resp fleetSovereignsResponse
if err := json.Unmarshal(rec.Body.Bytes(), &resp); err != nil {
t.Fatalf("decode fleet sovereigns response: %v", err)
}
if resp.Total != 1 || resp.Sovereigns[0].ID != "sov-live" {
t.Fatalf("expected only sov-live; got %+v", resp.Sovereigns)
if resp.Total != 2 {
t.Fatalf("expected 2 sovereigns (live + adopted); got total=%d body=%+v", resp.Total, resp.Sovereigns)
}
// Sort is by FQDN ascending; handed.example.com < live.example.com
if got := resp.Sovereigns[0].ID; got != "sov-handed" {
t.Fatalf("first sovereign id: got %q want sov-handed (FQDN sort)", got)
}
if got := resp.Sovereigns[1].ID; got != "sov-live" {
t.Fatalf("second sovereign id: got %q want sov-live", got)
}
}

View File

@ -742,6 +742,103 @@ func (h *Handler) markPhase1Done(dep *Deployment, finalStates map[string]string,
// per-region LB IP wait loops (each up to 5 min).
// docs/SOVEREIGN-MULTI-REGION-DOD.md gates D9-D12.
go h.runAutoEstablishClusterMesh(dep)
// C10-003 (2026-05-17 t143): when Phase-1 reaches
// OutcomeReady, the PRIMARY's terminate path persists the
// final per-Job status from its own helmwatch state map.
// Secondary regions' install-* Jobs live on the per-region
// bridge but are wired via separate watcher event streams
// (spawnSecondaryRegionWatchers above), and stale events
// (e.g. a transient HelmStatePending observed during initial
// dep-not-ready cycles, then suppressed by lastState dedup
// before the Installed transition was ever observed) can
// leave their Job rows pinned to "pending" even though
// kubectl reports every HR Ready=True. Founder-flagged on
// t10 2026-05-17 (install-nbg1-1:*, install-sin-2:* stuck
// pending despite deployment status=ready).
//
// Re-seed every secondary watcher from its current
// informer cache so each install-<region>:<chart> Job row
// converges onto the cluster-current HelmState. The seed
// path is idempotent (mergeJob preserves monotonic
// timestamps + non-empty DependsOn; SeedJobsFromInformerList
// matches OnHelmReleaseEvent's Status mapping), so this is
// safe to call multiple times.
//
// CRITICAL: invoke INLINE, not on a goroutine — runPhase1Watch
// holds `defer stopSecondaries()` which clears
// dep.secondaryWatchers as soon as markPhase1Done returns.
// A go-spawned backfill would race the cleanup and observe
// an empty map ~50% of the time. The backfill itself is
// in-memory work (informer snapshot + bridge merge), no
// network I/O — running it on the terminate path's stack
// adds ≤100ms before markPhase1Done's caller resumes.
h.runSecondaryBridgeBackfill(dep)
}
}
// runSecondaryBridgeBackfill walks every secondary watcher attached to
// the deployment, snapshots each one's informer cache, and reseeds the
// shared jobs.Bridge with the cluster-current state. This is the
// recovery path for C10-003: secondary install Jobs stuck "pending"
// after deployment status=ready. The cause is the bridge's lastState
// dedup. Each Job row is seeded as HelmStatePending at initial-list;
// a watcher that attaches AFTER the HR has already settled at
// Installed never observes a distinct transition event, so dedup
// suppresses the update and the status stays pending.
//
// Run INLINE from markPhase1Done — runPhase1Watch's
// `defer stopSecondaries()` clears dep.secondaryWatchers immediately
// after markPhase1Done returns, so a goroutine-spawned backfill would
// race the cleanup. The work is in-memory only (informer snapshot +
// bridge merge); no network I/O justifies a goroutine.
//
// Errors are logged at warn; this is a best-effort convergence helper,
// not a correctness gate.
func (h *Handler) runSecondaryBridgeBackfill(dep *Deployment) {
defer func() {
if r := recover(); r != nil {
h.log.Error("secondary bridge backfill: panic recovered",
"id", dep.ID,
"panic", r,
)
}
}()
dep.mu.Lock()
watchers := make(map[string]*helmwatch.Watcher, len(dep.secondaryWatchers))
for region, w := range dep.secondaryWatchers {
watchers[region] = w
}
bridge := dep.jobsBridge
dep.mu.Unlock()
if bridge == nil || len(watchers) == 0 {
return
}
for region, watcher := range watchers {
if watcher == nil {
continue
}
snap := watcher.SnapshotComponents()
if len(snap) == 0 {
continue
}
seeds := snapshotsToSeedsForRegion(snap, region)
jobsCount, execsSeeded, err := bridge.SeedJobsFromInformerList(seeds)
if err != nil {
h.log.Warn("secondary bridge backfill: reseed failed",
"id", dep.ID,
"region", region,
"snapshotCount", len(snap),
"err", err,
)
continue
}
h.log.Info("secondary bridge backfill: reseeded from informer cache",
"id", dep.ID,
"region", region,
"snapshotCount", len(snap),
"jobsWritten", jobsCount,
"executionsSeeded", execsSeeded,
)
}
}

@@ -33,6 +33,7 @@ import {
compareJobs,
formatDuration,
matchJob,
regionFromJob,
} from './JobsTable'
import { FIXTURE_JOBS } from '@/test/fixtures/jobs.fixture'
import type { Job } from '@/lib/jobs.types'
@@ -326,3 +327,79 @@ describe('JobsTable — render', () => {
expect(screen.getByTestId('jobs-cell-status-bp-vault').textContent?.toLowerCase()).toContain('pending')
})
})
// ── C8-005 (2026-05-17 t143): region filter helpers + dropdown ───────
describe('regionFromJob (C8-005)', () => {
it('returns empty for primary-region rows (no `:` in appId)', () => {
expect(regionFromJob({ jobName: 'Install cilium', appId: 'bp-cilium' })).toBe('')
})
it('extracts region from a `<region>:<chart>` appId', () => {
expect(regionFromJob({ jobName: 'Install cilium', appId: 'fsn1:bp-cilium' })).toBe('fsn1')
})
it('handles hyphenated region keys', () => {
expect(regionFromJob({ jobName: 'Install cilium', appId: 'hel1-2:bp-cilium' })).toBe('hel1-2')
})
it('falls back to parsing `install-<region>:<chart>` jobName when appId is empty', () => {
expect(regionFromJob({ jobName: 'install-nbg1-1:bp-flux', appId: '' })).toBe('nbg1-1')
})
it('returns empty for group/day-2 rows with no parseable region', () => {
expect(regionFromJob({ jobName: 'applications', appId: '' })).toBe('')
})
})
describe('JobsTable region filter (C8-005)', () => {
const baseLeaf = {
type: 'install' as const,
parentId: 'applications',
childIds: [],
dependsOn: [],
status: 'succeeded' as const,
startedAt: '2026-05-17T10:00:00Z',
finishedAt: '2026-05-17T10:01:00Z',
durationMs: 60_000,
}
it('hides the region dropdown on single-region deployments', async () => {
const singleRegion: Job[] = [
{ ...baseLeaf, id: 'bp-cilium', jobName: 'Install Cilium', appId: 'bp-cilium' },
{ ...baseLeaf, id: 'bp-flux', jobName: 'Install Flux', appId: 'bp-flux' },
]
renderTable({ jobs: singleRegion })
await screen.findByTestId('jobs-table')
expect(screen.queryByTestId('jobs-filter-region')).toBeNull()
})
it('shows the region dropdown when 2+ regions appear', async () => {
const multiRegion: Job[] = [
{ ...baseLeaf, id: 'bp-cilium', jobName: 'Install Cilium', appId: 'bp-cilium' },
{ ...baseLeaf, id: 'fsn1:bp-cilium', jobName: 'install-fsn1:bp-cilium', appId: 'fsn1:bp-cilium' },
{ ...baseLeaf, id: 'hel1-2:bp-cilium', jobName: 'install-hel1-2:bp-cilium', appId: 'hel1-2:bp-cilium' },
]
renderTable({ jobs: multiRegion })
await screen.findByTestId('jobs-table')
const sel = screen.getByTestId('jobs-filter-region') as HTMLSelectElement
expect(sel).toBeTruthy()
// Options: All + 2 regions (sorted lexically: fsn1, hel1-2)
const opts = Array.from(sel.querySelectorAll('option')).map((o) => o.textContent)
expect(opts).toEqual(['All', 'fsn1', 'hel1-2'])
})
it('filters rows to the selected region', async () => {
const multiRegion: Job[] = [
{ ...baseLeaf, id: 'bp-cilium', jobName: 'Install Cilium', appId: 'bp-cilium' },
{ ...baseLeaf, id: 'fsn1:bp-cilium', jobName: 'install-fsn1:bp-cilium', appId: 'fsn1:bp-cilium' },
{ ...baseLeaf, id: 'hel1-2:bp-cilium', jobName: 'install-hel1-2:bp-cilium', appId: 'hel1-2:bp-cilium' },
]
renderTable({ jobs: multiRegion })
await screen.findByTestId('jobs-table')
fireEvent.change(screen.getByTestId('jobs-filter-region'), { target: { value: 'fsn1' } })
const rows = screen.getAllByTestId(/^jobs-table-row-/)
expect(rows.length).toBe(1)
expect(screen.queryByTestId('jobs-table-row-bp-cilium')).toBeNull()
expect(screen.queryByTestId('jobs-table-row-hel1-2:bp-cilium')).toBeNull()
})
})
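The contract exercised by the tests above can be restated as a small standalone sketch. `JobLike`, `regionOf`, and `regionOptionsOf` are hypothetical names used only for illustration; they are not part of the PR, which keeps this logic in `regionFromJob` and a `useMemo` inside `JobsTable`.

```typescript
// Hypothetical minimal shape; the real Job type lives in '@/lib/jobs.types'.
interface JobLike {
  jobName: string
  appId: string
}

// Sketch of the parsing contract: prefer `<region>:<chart>` in appId,
// fall back to `install-<region>:<chart>` in jobName, else return ''.
function regionOf(job: JobLike): string {
  const appSep = job.appId.indexOf(':')
  if (appSep > 0) return job.appId.substring(0, appSep)
  const name = job.jobName.startsWith('install-')
    ? job.jobName.slice('install-'.length)
    : job.jobName
  const nameSep = name.indexOf(':')
  return nameSep > 0 ? name.substring(0, nameSep) : ''
}

// Unique non-empty region keys, sorted so the dropdown order is stable.
function regionOptionsOf(jobs: JobLike[]): string[] {
  const keys = jobs.map(regionOf).filter((r) => r !== '')
  return Array.from(new Set(keys)).sort((a, b) => a.localeCompare(b))
}

const sample: JobLike[] = [
  { jobName: 'Install Cilium', appId: 'bp-cilium' }, // primary region
  { jobName: 'install-hel1-2:bp-cilium', appId: 'hel1-2:bp-cilium' },
  { jobName: 'install-fsn1:bp-cilium', appId: 'fsn1:bp-cilium' },
  { jobName: 'install-nbg1-1:bp-flux', appId: '' }, // jobName fallback
]

const options = regionOptionsOf(sample)
console.log(options) // [ 'fsn1', 'hel1-2', 'nbg1-1' ]
```

Primary-region rows contribute no key, so a deployment with only bare `appId` values yields an empty option list and the dropdown stays hidden.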

@@ -76,6 +76,43 @@ export function compareJobs(a: Job, b: Job): number {
return a.id.localeCompare(b.id)
}
/**
* regionFromJob extracts the Hetzner region key from a Job's
* `jobName` / `appId`. Multi-region deployments use a
* `<region>:<chart>` prefix in the AppID, and an `install-<region>:<chart>`
* jobName. The canonical region encoding is documented in
* products/catalyst/bootstrap/api/internal/jobs/helmwatch_bridge.go:503
* (three input shapes: bare chart, region-prefixed, install-region-prefixed).
*
* Returns the empty string for primary-region rows (no `:` separator)
* so the region filter dropdown's "All" option naturally matches them.
* Day-2 mutation rows and groups have empty appId and return ''.
*
* Exported so the unit test in JobsTable.test.tsx can lock in the
* contract.
*/
export function regionFromJob(job: Pick<Job, 'jobName' | 'appId'>): string {
// Prefer the AppID encoding because it's the canonical key the
// backend uses (helmwatch_bridge.go's `componentID` is
// `<region>:<chart>` for secondaries, bare for primary).
if (job.appId) {
const sep = job.appId.indexOf(':')
if (sep > 0) return job.appId.substring(0, sep)
}
// Fallback: parse the jobName when AppID is empty (group rows /
// pre-bridge legacy rows).
if (job.jobName) {
// Strip the canonical `install-` prefix, then check for the
// region separator. Anything before `:` is the region.
const stripped = job.jobName.startsWith('install-')
? job.jobName.slice('install-'.length)
: job.jobName
const sep = stripped.indexOf(':')
if (sep > 0) return stripped.substring(0, sep)
}
return ''
}
/**
* Search predicate matches across jobName / appId / dependsOn /
* status / parentId. Case-insensitive substring match. Exported so
@@ -166,6 +203,10 @@ export function JobsTable({ jobs, appIdFilter, initialParentFilter }: JobsTableP
const [statusFilter, setStatusFilter] = useState<'' | JobStatus>('')
const [appFilter, setAppFilter] = useState<string>('')
const [parentFilter, setParentFilter] = useState<string>('')
// D20 (2026-05-17 t143): region filter dropdown so operators on a
// multi-region Sovereign can scope the table to one region without
// typing the region key into the search box. Empty string = "All".
const [regionFilter, setRegionFilter] = useState<string>('')
// Resolve parent display labels — used in the Parent column + filter.
const parentLabelById = useMemo<Map<string, string>>(() => {
@@ -197,6 +238,19 @@ export function JobsTable({ jobs, appIdFilter, initialParentFilter }: JobsTableP
.sort((a, b) => a.label.localeCompare(b.label))
}, [jobs, parentLabelById])
// D20 (2026-05-17 t143): unique non-empty region keys present in the
// current job set. Sorted lexically so operators see a stable order
// (fsn1, hel1-2, nbg1-1, sin-2). Hidden when only one region (or
// zero) appears — the filter would be a one-option no-op.
const regionOptions = useMemo<string[]>(() => {
const set = new Set<string>()
for (const j of jobs) {
const r = regionFromJob(j)
if (r) set.add(r)
}
return [...set].sort((a, b) => a.localeCompare(b))
}, [jobs])
const visibleJobs = useMemo<Job[]>(() => {
const filtered = jobs.filter((j) => {
// Hide group rows by default — they appear in the canvas as
@@ -209,11 +263,12 @@ export function JobsTable({ jobs, appIdFilter, initialParentFilter }: JobsTableP
if (statusFilter && j.status !== statusFilter) return false
if (appFilter && j.appId !== appFilter) return false
if (parentFilter && j.parentId !== parentFilter) return false
if (regionFilter && regionFromJob(j) !== regionFilter) return false
if (!matchJob(j, search)) return false
return true
})
return [...filtered].sort(compareJobs)
-}, [jobs, search, statusFilter, appFilter, parentFilter, appIdFilter, initialParentFilter])
+}, [jobs, search, statusFilter, appFilter, parentFilter, regionFilter, appIdFilter, initialParentFilter])
return (
<div className="jobs-table-wrap" data-testid="jobs-table-wrap">
@@ -295,6 +350,34 @@ export function JobsTable({ jobs, appIdFilter, initialParentFilter }: JobsTableP
</label>
)}
{/*
D20 region filter: visible only when 2+ regions appear in
the current job set. A single-region Sovereign sees no
dropdown (would be a one-option no-op + visual noise).
Operators on a multi-region cluster get a quick way to
scope the table to fsn1 / hel1-2 / nbg1-1 / sin-2 without
typing the region key into the free-text search.
*/}
{regionOptions.length > 1 ? (
<label className="jobs-filter-label">
<span className="jobs-filter-caption">Region</span>
<select
value={regionFilter}
onChange={(e) => setRegionFilter(e.target.value)}
className="jobs-filter-select"
data-testid="jobs-filter-region"
aria-label="Filter by region"
>
<option value="">All</option>
{regionOptions.map((r) => (
<option key={r} value={r}>
{r}
</option>
))}
</select>
</label>
) : null}
<span
className="jobs-result-count"
data-testid="jobs-result-count"

@@ -131,14 +131,27 @@ export function SettingsPage({ disableStream = false }: SettingsPageProps = {})
const startedAt = snapshot?.startedAt ?? null
const status = snapshot?.status ?? null
-// Pool domain / subdomain are wizard-store fields; they survive the
-// wizard submit because the store is zustand+persist (localStorage).
-const poolDomain = store.sovereignPoolDomain || null
-const poolSubdomain = store.sovereignSubdomain || null
-const domainMode = store.sovereignDomainMode || null
-const byoDomain = store.sovereignByoDomain || null
-const orgName = store.orgName || null
-const orgEmail = store.orgEmail || null
+// C8-001 (2026-05-17 t143): prefer the live snapshot for the
+// Sovereign + DNS fields, fall back to the wizard store. The chroot
+// Sovereign console has a fresh localStorage (the wizard runs on
+// mothership, the chroot session never persists the store), so
+// wizard-store-only fields rendered four em-dashes for Capacity /
+// Pool subdomain / BYO domain / CP size. catalyst-api's
+// Deployment.State() now surfaces these from the persisted
+// RedactedRequest projection — they're the authoritative source on
+// every Sovereign post-handover. The wizard-store fallback covers
+// the mothership wizard-in-flight case where the snapshot may not
+// yet carry the request fields (pre-CreateDeployment).
+const poolDomain = snapshot?.sovereignPoolDomain ?? store.sovereignPoolDomain ?? null
+const poolSubdomain = snapshot?.sovereignSubdomain ?? store.sovereignSubdomain ?? null
+const domainMode = snapshot?.sovereignDomainMode ?? store.sovereignDomainMode ?? null
+const byoDomain = snapshot?.sovereignByoDomain ?? store.sovereignByoDomain ?? null
+const orgName = snapshot?.orgName ?? store.orgName ?? null
+const orgEmail = snapshot?.orgEmail ?? store.orgEmail ?? null
+// OrgIndustry / OrgHeadquarters are wizard-store-only fields today —
+// not persisted on the deployment record. They render the em-dash
+// placeholder on the chroot until a future PR plumbs them through
+// the provisioner.Request payload.
const orgIndustry = store.orgIndustry || null
const orgHeadquarters = store.orgHeadquarters || null
@@ -146,11 +159,23 @@ export function SettingsPage({ disableStream = false }: SettingsPageProps = {})
// since the founder spec is single-region happy path. The full per-
// region table belongs on a future Compute settings sub-page.
const controlPlaneSize = useMemo(() => {
-const arr = store.regionControlPlaneSizes
-if (Array.isArray(arr) && arr.length > 0 && arr[0]) return arr[0]
+// Prefer snapshot (chroot Sovereign source-of-truth). Multi-region
+// arrays surface from snapshot.regionControlPlaneSizes; single
+// region from snapshot.controlPlaneSize. Falls back to wizard
+// store for the mothership wizard-in-flight case.
+const snapArr = snapshot?.regionControlPlaneSizes
+if (Array.isArray(snapArr) && snapArr.length > 0 && snapArr[0]) return snapArr[0]
+if (snapshot?.controlPlaneSize) return snapshot.controlPlaneSize
+const storeArr = store.regionControlPlaneSizes
+if (Array.isArray(storeArr) && storeArr.length > 0 && storeArr[0]) return storeArr[0]
if (store.controlPlaneSize) return store.controlPlaneSize
return null
-}, [store.regionControlPlaneSizes, store.controlPlaneSize])
+}, [
+snapshot?.regionControlPlaneSizes,
+snapshot?.controlPlaneSize,
+store.regionControlPlaneSizes,
+store.controlPlaneSize,
+])
return (
<PortalShell deploymentId={deploymentId} sovereignFQDN={sovereignFQDN} pageTitle="Settings">
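The hunks above switch the field lookups from `store.x || null` to `snapshot?.x ?? store.x ?? null`. The sketch below isolates the difference between the two operators; `snapshotFirst` and `storeOnly` are hypothetical helper names used only for illustration (the PR inlines the expressions), and `pool.example.com` is a placeholder value.

```typescript
// Hypothetical helpers for illustration; not part of the PR.
function snapshotFirst(snap: string | undefined, store: string | undefined): string | null {
  // `??` falls through only on null / undefined.
  return snap ?? store ?? null
}

function storeOnly(store: string | undefined): string | null {
  // `||` falls through on any falsy value, including the empty string.
  return store || null
}

console.log(snapshotFirst('pool.example.com', undefined)) // pool.example.com
console.log(snapshotFirst(undefined, 'pool.example.com')) // pool.example.com
console.log(snapshotFirst('', 'pool.example.com')) // '' (empty string is kept)
console.log(storeOnly('')) // null
```

One consequence worth checking: if the snapshot ever carried an empty string instead of omitting a field (the diff does not say whether it can), the `??` chain would keep the empty string rather than falling back to the wizard store.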

@@ -77,6 +77,25 @@ export interface DeploymentSnapshot {
region?: string
error?: string
numEvents?: number
/**
* C8-001 (2026-05-17 t143): Sovereign-provisioning request fields
* lifted to the snapshot so the chroot's `/sovereign/settings` page
* works without a populated wizard store (chroot localStorage is
* fresh post-handover, so reading Capacity / Pool subdomain / BYO
* domain / CP size from `useWizardStore()` rendered four em-dashes). The
* catalyst-api's `Deployment.State()` surfaces these from the
* persisted RedactedRequest projection; the SettingsPage reads
* snapshot-first with the wizard store as fallback.
*/
controlPlaneSize?: string
regionControlPlaneSizes?: string[]
sovereignPoolDomain?: string
sovereignSubdomain?: string
sovereignDomainMode?: string
/** Present only when domainMode === 'byo'. */
sovereignByoDomain?: string
orgName?: string
orgEmail?: string
/**
* Phase-1 helmwatch ground-truth populated by the catalyst-api when
* its HelmRelease informer terminated. Lifted to the top level by