473a2ba4b9
2189 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
473a2ba4b9 |
deploy: update catalyst images to 52be4d4
|
||
|
|
52be4d4d3a
|
fix(catalyst-api): D16 PR H — resolveChrootClusterID multi-cluster + dashboard alias (#1587)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)
PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:
internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
internal/handler/deployment_handover_export.go:199:6: other declaration of itoa
Rename my helper to `regionSlotIndex` — more descriptive of its actual use
(deriving the per-region slot suffix for the kubeconfig filename).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalyst-api): D16/D17 — 3 bugs caught on t138
Founder caught on t136 (now wiped) that /dashboard cluster grouping still
showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs
shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov.
1. exportSecondaryKubeconfigsToChild was guarded behind the early return of
exportDeploymentToChild's failed POST. The child's ingress + cert + gateway
are still racing to reach reachable state in the seconds after handover
fires, so the first POST gets EOF and the goroutine never fires. Fix: kick
off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its
own goroutine, BEFORE the deployment-record POST.
2. Both exports now retry with exponential backoff (5s → 60s) for up to
5 min total. Most handovers will succeed on attempt 2-4. Was: no retry,
single shot, silent failure.
3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group
(rg) into the top-level router (r), alongside
/api/v1/internal/deployments/import. The previous registration required an
operator session that doesn't exist at handover — mothership POSTs were
401'd silently. Validation is now via safeIDPattern regex on depID +
regionKey (same security model as the deployments/import companion endpoint).
4. HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of
using only the in-cluster client.
Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1
node even when 3 secondary kubeconfigs are registered.
Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalog): D27 — fresh-seed apps default Published+Deployable
Founder caught on t136: marketplace.t136/apps shows blank application grid.
Root cause: catalog seed.go calls migrateAppPublished + migrateAppDeployable
ONLY on the "already populated" path. On a fresh Sovereign install (empty
catalog) seedAllData inserts 27 rows with zero-value bools —
Published=false, Deployable=false. The marketplace storefront filters with
`?published=true`, gets [], renders blank.
Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished
+ seedSystemApps. Both migrations are idempotent (skip rows already true),
so re-runs are safe.
Verified the bug live on t138 (eaaee1ea24184c2a):
http://catalog.sme:8082/catalog/apps returns 27 apps
http://catalog.sme:8082/catalog/apps?published=true returns 0
With this fix the latter returns 27.
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign
Founder caught on t136: console.t136.../app/bp-alloy renders the catalog
grid (AppsPage) instead of AppDetail. Three earlier PRs (#1572 + chart
bumps) flipped the appRoute beforeLoad logic but the actual route-matching
collision was not fixed.
Root cause: appRoute.addChildren registers appDeploymentRoute at
`/$deploymentId` (effective `/app/$deploymentId`, mother-only) BEFORE
consoleLayoutRoute registers consoleAppDetailRoute at `/app/$componentId`.
TanStack Router resolves equally-specific dynamic routes by declaration
order — so on the Sovereign Console URL `/app/bp-alloy` matches
appDeploymentRoute first and renders AppsPage with deploymentId="bp-alloy".
Fix: at routeTree build time, filter appRoute children to exclude every
mother-only `/$deploymentId/*` route when running on Sovereign mode.
DETECTED_MODE.mode is fixed per-page-load so this is a one-time check, no
runtime overhead. With those routes absent, consoleAppDetailRoute is the
only matcher for `/app/<componentId>` on Sovereign Console — AppDetail renders.
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148
Founder-flagged bug fixes from session t136/t138/t139 verify cycle shipped
3 PRs that bumped catalyst chart Chart.yaml to 1.4.148 (
|
||
|
|
8c1ccfae07
|
chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148 (#1586)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)
PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:
internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
internal/handler/deployment_handover_export.go:199:6: other declaration of itoa
Rename my helper to `regionSlotIndex` — more descriptive of its actual use
(deriving the per-region slot suffix for the kubeconfig filename).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalyst-api): D16/D17 — 3 bugs caught on t138
Founder caught on t136 (now wiped) that /dashboard cluster grouping still
showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs
shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov.
1. exportSecondaryKubeconfigsToChild was guarded behind the early return of
exportDeploymentToChild's failed POST. The child's ingress + cert + gateway
are still racing to reach reachable state in the seconds after handover
fires, so the first POST gets EOF and the goroutine never fires. Fix: kick
off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its
own goroutine, BEFORE the deployment-record POST.
2. Both exports now retry with exponential backoff (5s → 60s) for up to
5 min total. Most handovers will succeed on attempt 2-4. Was: no retry,
single shot, silent failure.
3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group
(rg) into the top-level router (r), alongside
/api/v1/internal/deployments/import. The previous registration required an
operator session that doesn't exist at handover — mothership POSTs were
401'd silently. Validation is now via safeIDPattern regex on depID +
regionKey (same security model as the deployments/import companion endpoint).
4. HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of
using only the in-cluster client.
Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1
node even when 3 secondary kubeconfigs are registered.
Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalog): D27 — fresh-seed apps default Published+Deployable
Founder caught on t136: marketplace.t136/apps shows blank application grid.
Root cause: catalog seed.go calls migrateAppPublished + migrateAppDeployable
ONLY on the "already populated" path. On a fresh Sovereign install (empty
catalog) seedAllData inserts 27 rows with zero-value bools —
Published=false, Deployable=false. The marketplace storefront filters with
`?published=true`, gets [], renders blank.
Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished
+ seedSystemApps. Both migrations are idempotent (skip rows already true),
so re-runs are safe.
Verified the bug live on t138 (eaaee1ea24184c2a):
http://catalog.sme:8082/catalog/apps returns 27 apps
http://catalog.sme:8082/catalog/apps?published=true returns 0
With this fix the latter returns 27.
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign
Founder caught on t136: console.t136.../app/bp-alloy renders the catalog
grid (AppsPage) instead of AppDetail. Three earlier PRs (#1572 + chart
bumps) flipped the appRoute beforeLoad logic but the actual route-matching
collision was not fixed.
Root cause: appRoute.addChildren registers appDeploymentRoute at
`/$deploymentId` (effective `/app/$deploymentId`, mother-only) BEFORE
consoleLayoutRoute registers consoleAppDetailRoute at `/app/$componentId`.
TanStack Router resolves equally-specific dynamic routes by declaration
order — so on the Sovereign Console URL `/app/bp-alloy` matches
appDeploymentRoute first and renders AppsPage with deploymentId="bp-alloy".
Fix: at routeTree build time, filter appRoute children to exclude every
mother-only `/$deploymentId/*` route when running on Sovereign mode.
DETECTED_MODE.mode is fixed per-page-load so this is a one-time check, no
runtime overhead. With those routes absent, consoleAppDetailRoute is the
only matcher for `/app/<componentId>` on Sovereign Console — AppDetail renders.
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148
Founder-flagged bug fixes from session t136/t138/t139 verify cycle shipped
3 PRs that bumped catalyst chart Chart.yaml to 1.4.148 (
|
||
|
|
b61e9afabf |
deploy: update catalyst images to 2ab8a0e
|
||
|
|
2ab8a0e653
|
fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign (#1585)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)
PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:
internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
internal/handler/deployment_handover_export.go:199:6: other declaration of itoa
Rename my helper to `regionSlotIndex` — more descriptive of its actual use
(deriving the per-region slot suffix for the kubeconfig filename).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalyst-api): D16/D17 — 3 bugs caught on t138
Founder caught on t136 (now wiped) that /dashboard cluster grouping still
showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs
shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov.
1. exportSecondaryKubeconfigsToChild was guarded behind the early return of
exportDeploymentToChild's failed POST. The child's ingress + cert + gateway
are still racing to reach reachable state in the seconds after handover
fires, so the first POST gets EOF and the goroutine never fires. Fix: kick
off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its
own goroutine, BEFORE the deployment-record POST.
2. Both exports now retry with exponential backoff (5s → 60s) for up to
5 min total. Most handovers will succeed on attempt 2-4. Was: no retry,
single shot, silent failure.
3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group
(rg) into the top-level router (r), alongside
/api/v1/internal/deployments/import. The previous registration required an
operator session that doesn't exist at handover — mothership POSTs were
401'd silently. Validation is now via safeIDPattern regex on depID +
regionKey (same security model as the deployments/import companion endpoint).
4. HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of
using only the in-cluster client.
Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1
node even when 3 secondary kubeconfigs are registered.
Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalog): D27 — fresh-seed apps default Published+Deployable
Founder caught on t136: marketplace.t136/apps shows blank application grid.
Root cause: catalog seed.go calls migrateAppPublished + migrateAppDeployable
ONLY on the "already populated" path. On a fresh Sovereign install (empty
catalog) seedAllData inserts 27 rows with zero-value bools —
Published=false, Deployable=false. The marketplace storefront filters with
`?published=true`, gets [], renders blank.
Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished
+ seedSystemApps. Both migrations are idempotent (skip rows already true),
so re-runs are safe.
Verified the bug live on t138 (eaaee1ea24184c2a):
http://catalog.sme:8082/catalog/apps returns 27 apps
http://catalog.sme:8082/catalog/apps?published=true returns 0
With this fix the latter returns 27.
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign
Founder caught on t136: console.t136.../app/bp-alloy renders the catalog
grid (AppsPage) instead of AppDetail. Three earlier PRs (#1572 + chart
bumps) flipped the appRoute beforeLoad logic but the actual route-matching
collision was not fixed.
Root cause: appRoute.addChildren registers appDeploymentRoute at
`/$deploymentId` (effective `/app/$deploymentId`, mother-only) BEFORE
consoleLayoutRoute registers consoleAppDetailRoute at `/app/$componentId`.
TanStack Router resolves equally-specific dynamic routes by declaration
order — so on the Sovereign Console URL `/app/bp-alloy` matches
appDeploymentRoute first and renders AppsPage with deploymentId="bp-alloy".
Fix: at routeTree build time, filter appRoute children to exclude every
mother-only `/$deploymentId/*` route when running on Sovereign mode.
DETECTED_MODE.mode is fixed per-page-load so this is a one-time check, no
runtime overhead. With those routes absent, consoleAppDetailRoute is the
only matcher for `/app/<componentId>` on Sovereign Console — AppDetail renders.
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d985f27c8b |
deploy: update sme service images to 964dc15 + bump chart to 1.4.148
|
||
|
|
964dc15570
|
fix(catalog): D27 — fresh-seed apps default Published+Deployable (#1584)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)
PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:
internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
internal/handler/deployment_handover_export.go:199:6: other declaration of itoa
Rename my helper to `regionSlotIndex` — more descriptive of its actual use
(deriving the per-region slot suffix for the kubeconfig filename).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalyst-api): D16/D17 — 3 bugs caught on t138
Founder caught on t136 (now wiped) that /dashboard cluster grouping still
showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs
shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov.
1. exportSecondaryKubeconfigsToChild was guarded behind the early return of
exportDeploymentToChild's failed POST. The child's ingress + cert + gateway
are still racing to reach reachable state in the seconds after handover
fires, so the first POST gets EOF and the goroutine never fires. Fix: kick
off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its
own goroutine, BEFORE the deployment-record POST.
2. Both exports now retry with exponential backoff (5s → 60s) for up to
5 min total. Most handovers will succeed on attempt 2-4. Was: no retry,
single shot, silent failure.
3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group
(rg) into the top-level router (r), alongside
/api/v1/internal/deployments/import. The previous registration required an
operator session that doesn't exist at handover — mothership POSTs were
401'd silently. Validation is now via safeIDPattern regex on depID +
regionKey (same security model as the deployments/import companion endpoint).
4. HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of
using only the in-cluster client.
Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1
node even when 3 secondary kubeconfigs are registered.
Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalog): D27 — fresh-seed apps default Published+Deployable
Founder caught on t136: marketplace.t136/apps shows blank application grid.
Root cause: catalog seed.go calls migrateAppPublished + migrateAppDeployable
ONLY on the "already populated" path. On a fresh Sovereign install (empty
catalog) seedAllData inserts 27 rows with zero-value bools —
Published=false, Deployable=false. The marketplace storefront filters with
`?published=true`, gets [], renders blank.
Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished
+ seedSystemApps. Both migrations are idempotent (skip rows already true),
so re-runs are safe.
Verified the bug live on t138 (eaaee1ea24184c2a):
http://catalog.sme:8082/catalog/apps returns 27 apps
http://catalog.sme:8082/catalog/apps?published=true returns 0
With this fix the latter returns 27.
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
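Editor's note on D27: the zero-value bool problem and the idempotent migration described above can be sketched in Go as follows. This is an illustration only, not the code in seed.go. The `App` struct and the `seedAll`, `migratePublished` and `countPublished` helpers are hypothetical stand-ins that work on an in-memory slice rather than the real catalog store.

```go
package main

import "fmt"

// App is a hypothetical stand-in for a catalog row.
type App struct {
	Name       string
	Published  bool
	Deployable bool
}

// seedAll mimics a fresh seed: the bool fields are never set, so they
// keep Go's zero value (false).
func seedAll(names []string) []App {
	apps := make([]App, 0, len(names))
	for _, n := range names {
		apps = append(apps, App{Name: n})
	}
	return apps
}

// migratePublished flips Published on rows that are still false and
// returns how many rows it changed. Rows already true are skipped, so a
// second run changes nothing.
func migratePublished(apps []App) int {
	changed := 0
	for i := range apps {
		if !apps[i].Published {
			apps[i].Published = true
			changed++
		}
	}
	return changed
}

// countPublished plays the role of the storefront's ?published=true filter.
func countPublished(apps []App) int {
	n := 0
	for _, a := range apps {
		if a.Published {
			n++
		}
	}
	return n
}

func main() {
	apps := seedAll([]string{"app-a", "app-b", "app-c"})
	fmt.Println("published after fresh seed:", countPublished(apps))
	fmt.Println("rows changed by first migration:", migratePublished(apps))
	fmt.Println("rows changed by second migration:", migratePublished(apps))
	fmt.Println("published after migration:", countPublished(apps))
}
```

The same pattern would apply to the Deployable flag; it is omitted here to keep the sketch short.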
||
|
|
f7ea19000e |
deploy: update catalyst images to 9fc2850
|
||
|
|
9fc2850504
|
fix(catalyst-api): D16/D17 — 3 bugs caught on t138 fresh prov (#1583)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)
PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:
internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
internal/handler/deployment_handover_export.go:199:6: other declaration of itoa
Rename my helper to `regionSlotIndex` — more descriptive of its actual use
(deriving the per-region slot suffix for the kubeconfig filename).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalyst-api): D16/D17 — 3 bugs caught on t138
Founder caught on t136 (now wiped) that /dashboard cluster grouping still
showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs
shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov.
1. exportSecondaryKubeconfigsToChild was guarded behind the early return of
exportDeploymentToChild's failed POST. The child's ingress + cert + gateway
are still racing to reach reachable state in the seconds after handover
fires, so the first POST gets EOF and the goroutine never fires. Fix: kick
off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its
own goroutine, BEFORE the deployment-record POST.
2. Both exports now retry with exponential backoff (5s → 60s) for up to
5 min total. Most handovers will succeed on attempt 2-4. Was: no retry,
single shot, silent failure.
3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group
(rg) into the top-level router (r), alongside
/api/v1/internal/deployments/import. The previous registration required an
operator session that doesn't exist at handover — mothership POSTs were
401'd silently. Validation is now via safeIDPattern regex on depID +
regionKey (same security model as the deployments/import companion endpoint).
4. HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of
using only the in-cluster client.
Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1
node even when 3 secondary kubeconfigs are registered.
Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
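Editor's note on the retry in item 2: a capped exponential backoff with an overall time budget might look roughly like the sketch below. It is not the catalyst-api implementation. `retryWithBackoff` and `demoRetry` are hypothetical names, and the demo uses millisecond delays so it finishes quickly, where the commit describes 5s growing to 60s within a 5 min budget.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryWithBackoff calls op until it succeeds or the overall budget
// expires. The wait doubles after each failure and is capped at maxDelay.
// It returns the number of attempts made.
func retryWithBackoff(ctx context.Context, initial, maxDelay, budget time.Duration, op func() error) (int, error) {
	ctx, cancel := context.WithTimeout(ctx, budget)
	defer cancel()

	delay := initial
	for attempt := 1; ; attempt++ {
		err := op()
		if err == nil {
			return attempt, nil
		}
		select {
		case <-ctx.Done():
			// Budget exhausted: report the last error instead of retrying.
			return attempt, fmt.Errorf("gave up after %d attempts: %w", attempt, err)
		case <-time.After(delay):
		}
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
}

// demoRetry simulates a child endpoint that is unreachable for the first
// two calls and reachable on the third.
func demoRetry() (int, error) {
	calls := 0
	op := func() error {
		calls++
		if calls < 3 {
			return errors.New("EOF")
		}
		return nil
	}
	return retryWithBackoff(context.Background(), time.Millisecond, 8*time.Millisecond, time.Second, op)
}

func main() {
	attempts, err := demoRetry()
	fmt.Println("attempts:", attempts, "err:", err)
}
```

With the values from the commit message the call would be `retryWithBackoff(ctx, 5*time.Second, 60*time.Second, 5*time.Minute, op)`.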
||
|
|
ccbe51e3e4 |
deploy: update catalyst images to 9237c1e
|
||
|
|
9237c1e6ee
|
fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go) (#1582)
PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:
internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
internal/handler/deployment_handover_export.go:199:6: other declaration of itoa
Rename my helper to `regionSlotIndex` — more descriptive of its actual use
(deriving the per-region slot suffix for the kubeconfig filename).
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
ce4ef6ba98
|
feat(handover): export secondary kubeconfigs to chroot at handover (D16 PR B) (#1581)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)
PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".
Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.
The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)
When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:
1. Burning a Dynadot API credit on a flip that would be idempotent.
2. The D30 blocker — current Dynadot creds return pdm-status-401
even when the desired NS state already exists. Caught on t132
2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
parentDomains attempt.
Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.
This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
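Editor's note on D30: the short-circuit check described above could be sketched as below, using `net.DefaultResolver.LookupNS` with a 5s timeout as the message states. The function names (`nsAlreadyMatchesSketch`, `sameNSSet`, `normalizeNS`) are hypothetical, and the exact-set comparison is an assumption; the real helper may compare differently.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"sort"
	"strings"
	"time"
)

// normalizeNS lowercases host names, strips the trailing dot that DNS
// answers carry, and sorts the result so two sets can be compared.
func normalizeNS(hosts []string) []string {
	out := make([]string, 0, len(hosts))
	for _, h := range hosts {
		out = append(out, strings.ToLower(strings.TrimSuffix(h, ".")))
	}
	sort.Strings(out)
	return out
}

// sameNSSet reports whether got and want hold exactly the same names.
// An empty want never matches, and a partial match counts as a mismatch.
func sameNSSet(got, want []string) bool {
	g, w := normalizeNS(got), normalizeNS(want)
	if len(w) == 0 || len(g) != len(w) {
		return false
	}
	for i := range g {
		if g[i] != w[i] {
			return false
		}
	}
	return true
}

// nsAlreadyMatchesSketch returns false on any lookup error so the caller
// falls through to the full registrar pipeline.
func nsAlreadyMatchesSketch(ctx context.Context, domain string, want []string) bool {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	records, err := net.DefaultResolver.LookupNS(ctx, domain)
	if err != nil {
		return false
	}
	got := make([]string, 0, len(records))
	for _, r := range records {
		got = append(got, r.Host)
	}
	return sameNSSet(got, want)
}

func main() {
	// Offline demonstration of the comparison only; no DNS query is made.
	want := []string{"ns1.example.org", "ns2.example.org"}
	fmt.Println(sameNSSet([]string{"NS2.example.org.", "ns1.example.org."}, want))
	fmt.Println(sameNSSet([]string{"ns1.example.org."}, want))
}
```

Returning false on error keeps the check conservative: a domain is skipped only when the lookup positively confirms the expected records.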
* fix(handover): D21 owner seed uses catalyst-system namespace
PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).
Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.
Verified the CRD shape on t134 2026-05-17:
$ kubectl api-resources --api-group=access.openova.io
useraccesses access.openova.io/v1alpha1 true UserAccess
^^^^
NAMESPACED
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses tierRoleRef not wildcard app
PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}]
but the useraccess XRD schema rejects `app: "*"` (pattern
^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged
"spec.applications[0].app: Invalid value: \"*\"" on every handover.
The XRD has a `tierRoleRef` field (pattern
^openova:tier-(viewer|developer|operator|admin|owner)$) that's the
canonical owner-tier semantic — when set, useraccess-controller binds
the named ClusterRole on the target via RoleBinding/ClusterRoleBinding.
`openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's
tier-clusterroles.yaml.
Drop the applications[] block + use tierRoleRef = openova:tier-owner.
Verified live on t135 2026-05-17 — error log showed exact pattern
mismatch before this fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A)
D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to
have all 3 regions' kubeconfigs registered so dashboard handler's
per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each.
Today the chroot only auto-registers its own in-cluster apiserver via
FactoryFromEnv's chroot self-registration branch. Secondary
kubeconfigs live on the mothership PVC + aren't replicated.
This handler bridges the gap:
- Accepts JSON {deploymentId, regionKey, kubeconfigYaml}
- Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in
depth — filename composed from these)
- Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml
(canonical FactoryFromEnv path so restart re-registers)
- Calls k8sCache.AddCluster — idempotent per Factory contract
PR B (next): mothership-side handover hook iterates secondary regions
and POSTs each kubeconfig to the chroot.
PR C (next): dashboard.go fan-out across all registered cluster IDs
when group_by includes cluster/region.
Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are written 0o600.
Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
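Editor's note on D16 PR A: the validate-then-write step listed above might be sketched as follows. The regex and the 0o600 mode come from the message; `kubeconfigPath`, `writeKubeconfig`, the sample IDs and the temporary directory are hypothetical, and the k8sCache.AddCluster call is left out because its signature is not shown here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"regexp"
)

// safeID is the pattern quoted in the commit message. It rejects slashes
// and dots, so the IDs cannot escape the target directory.
var safeID = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{0,62}$`)

// kubeconfigPath validates both IDs before composing <depID>-<region>.yaml.
func kubeconfigPath(baseDir, depID, regionKey string) (string, error) {
	if !safeID.MatchString(depID) || !safeID.MatchString(regionKey) {
		return "", errors.New("invalid deploymentId or regionKey")
	}
	return filepath.Join(baseDir, depID+"-"+regionKey+".yaml"), nil
}

// writeKubeconfig writes the bytes with owner-only permissions. The
// kubeconfig content is never logged or included in an error.
func writeKubeconfig(baseDir, depID, regionKey string, kubeconfig []byte) (string, error) {
	path, err := kubeconfigPath(baseDir, depID, regionKey)
	if err != nil {
		return "", err
	}
	if err := os.WriteFile(path, kubeconfig, 0o600); err != nil {
		return "", err
	}
	return path, nil
}

func main() {
	// A temp dir stands in for /var/lib/catalyst/kubeconfigs.
	dir, err := os.MkdirTemp("", "kubeconfigs")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path, err := writeKubeconfig(dir, "dep123", "fsn1-1", []byte("apiVersion: v1\nkind: Config\n"))
	fmt.Println(filepath.Base(path), err)

	_, err = writeKubeconfig(dir, "../etc", "fsn1-1", nil)
	fmt.Println(err)
}
```

Note that `os.WriteFile` applies the mode only when it creates the file; an existing file keeps its current permissions.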
* feat(dashboard): multi-cluster fan-out when group_by=cluster|region (D16 PR C)
When group_by includes "cluster" or "region", enumerate ALL registered
k8sCache clusters (primary + secondaries synced via PR #1579's POST
/api/v1/sovereign/secondary-kubeconfig endpoint) and concatenate
podRows from each before aggregation.
Layer-1=Cluster on /dashboard now renders 3 bubbles on a 3-region
Sovereign (was 1 bubble before).
For group_by that ONLY contains {namespace,family,application,vcluster,
sovereign} the primary clusterID's pods are sufficient and faster — no
fan-out cost.
PR B (mothership-side handover hook to POST each secondary kubeconfig)
will complete the chain. Until then, secondaries don't appear in
k8sCache.Clusters() so this fan-out is a no-op on existing provs — but
the code is in place for when PR B lands.
Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
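Editor's note on D16 PR C: the decision of when to fan out can be sketched as below. `needsFanOut` and `clustersToQuery` are hypothetical names, and the cluster IDs are plain strings here; the real handler works against k8sCache and concatenates pod rows, which this sketch does not attempt.

```go
package main

import "fmt"

// needsFanOut reports whether the group_by dimensions include one that
// requires data from every registered cluster.
func needsFanOut(groupBy []string) bool {
	for _, g := range groupBy {
		if g == "cluster" || g == "region" {
			return true
		}
	}
	return false
}

// clustersToQuery returns every registered cluster when fan-out is
// needed, otherwise only the primary, which avoids the extra listing cost.
func clustersToQuery(groupBy []string, primary string, registered []string) []string {
	if needsFanOut(groupBy) && len(registered) > 0 {
		return registered
	}
	return []string{primary}
}

func main() {
	registered := []string{"primary", "secondary-1", "secondary-2"}
	fmt.Println(clustersToQuery([]string{"cluster", "namespace"}, "primary", registered))
	fmt.Println(clustersToQuery([]string{"namespace", "family"}, "primary", registered))
}
```

When no secondaries are registered, both paths return a single cluster, which matches the message's remark that the fan-out is a no-op until PR B lands.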
* feat(handover): export secondary kubeconfigs to chroot at handover (D16 PR B)
Closes the D16 multi-cluster fan-out chain:
- PR #1579 (PR A): chroot endpoint accepts kubeconfigs
- PR #1580 (PR C): dashboard handler fans out across registered clusters
- This PR (PR B): mothership-side hook iterates secondary regions at
handover, reads each region's kubeconfig from the mothership PVC,
and POSTs to the chroot's endpoint
After handover-fire, exportSecondaryKubeconfigsToChild fires as a
goroutine (alongside exportDeploymentToChild). Best-effort per region:
a failure on region N doesn't abort N+1.
The chroot's k8sCache.Factory.AddCluster runs on every POST so
dashboard /api/v1/dashboard/treemap?group_by=cluster|region now
enumerates pods from all N regions and Layer-1=Cluster renders N
bubbles on an N-region Sovereign.
regionKeysForExport derives the filename convention `<region>-<slot>`
from dep.Request.Regions[1:] (primary is auto-registered by the
chroot's FactoryFromEnv self-registration so we skip index 0).
Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are read with stdlib os.ReadFile.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
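Editor's note on D16 PR B: the key derivation and the best-effort loop described above might look like the sketch below. The message gives only the `<region>-<slot>` convention and the fact that index 0 is skipped; numbering slots from 1 in list order is an assumption made for illustration, as are the names `secondaryRegionKeys`, `exportAll` and `demoExport` and the sample region names.

```go
package main

import (
	"errors"
	"fmt"
)

// secondaryRegionKeys skips the primary at index 0 and builds one
// "<region>-<slot>" key per remaining region.
// Assumption: the slot is the 1-based position among the secondaries.
func secondaryRegionKeys(regions []string) []string {
	if len(regions) < 2 {
		return nil
	}
	keys := make([]string, 0, len(regions)-1)
	for i, r := range regions[1:] {
		keys = append(keys, fmt.Sprintf("%s-%d", r, i+1))
	}
	return keys
}

// exportAll posts every key best-effort: a failure is recorded and the
// loop moves on to the next region instead of aborting.
func exportAll(keys []string, post func(key string) error) map[string]error {
	failed := map[string]error{}
	for _, k := range keys {
		if err := post(k); err != nil {
			failed[k] = err
		}
	}
	return failed
}

// demoExport simulates one failing region and returns how many posts
// were attempted and how many failed.
func demoExport() (attempted, failed int) {
	keys := secondaryRegionKeys([]string{"fsn1", "nbg1", "hel1"})
	post := func(key string) error {
		attempted++
		if key == "nbg1-1" {
			return errors.New("connection refused")
		}
		return nil
	}
	return attempted, len(exportAll(keys, post))
}

func main() {
	fmt.Println(secondaryRegionKeys([]string{"fsn1", "nbg1", "hel1"}))
	attempted, failed := demoExport()
	fmt.Println("attempted:", attempted, "failed:", failed)
}
```

Collecting failures in a map rather than returning on the first error is what lets region N+1 proceed after region N fails.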
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
b07e5206a1 |
deploy: update catalyst images to d92f734
|
||
|
|
d92f734374
|
feat(dashboard): multi-cluster fan-out when group_by=cluster|region (D16 PR C) (#1580)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)
PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".
Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.
The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)
When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:
1. Burning a Dynadot API credit on a flip that would be idempotent.
2. The D30 blocker — current Dynadot creds return pdm-status-401
even when the desired NS state already exists. Caught on t132
2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
parentDomains attempt.
Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.
This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses catalyst-system namespace
PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).
Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.
Verified the CRD shape on t134 2026-05-17:
$ kubectl api-resources --api-group=access.openova.io
useraccesses access.openova.io/v1alpha1 true UserAccess
^^^^
NAMESPACED
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses tierRoleRef not wildcard app
PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}]
but the useraccess XRD schema rejects `app: "*"` (pattern
^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged
"spec.applications[0].app: Invalid value: \"*\"" on every handover.
The XRD has a `tierRoleRef` field (pattern
^openova:tier-(viewer|developer|operator|admin|owner)$) that's the
canonical owner-tier semantic — when set, useraccess-controller binds
the named ClusterRole on the target via RoleBinding/ClusterRoleBinding.
`openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's
tier-clusterroles.yaml.
Drop the applications[] block + use tierRoleRef = openova:tier-owner.
Verified live on t135 2026-05-17 — error log showed exact pattern
mismatch before this fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
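The schema mismatch described above can be reproduced outside the cluster with the two patterns quoted in the commit message. This is only a sketch: the variable names are made up for illustration, and the real validation is done by the apiserver against the XRD schema, not by application code like this.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Pattern quoted in the commit message for spec.applications[].app.
	appNamePattern = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{0,62}$`)
	// Pattern quoted in the commit message for tierRoleRef.
	tierRoleRefPattern = regexp.MustCompile(`^openova:tier-(viewer|developer|operator|admin|owner)$`)
)

func main() {
	// The wildcard is rejected because "*" is outside [a-z0-9].
	fmt.Println(appNamePattern.MatchString("*"))
	// An ordinary component name is accepted.
	fmt.Println(appNamePattern.MatchString("bp-cnpg"))
	// The owner tier reference satisfies the tierRoleRef pattern.
	fmt.Println(tierRoleRefPattern.MatchString("openova:tier-owner"))
}
```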
* feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A)
D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to
have all 3 regions' kubeconfigs registered so dashboard handler's
per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each.
Today the chroot only auto-registers its own in-cluster apiserver via
FactoryFromEnv's chroot self-registration branch. Secondary
kubeconfigs live on the mothership PVC + aren't replicated.
This handler bridges the gap:
- Accepts JSON {deploymentId, regionKey, kubeconfigYaml}
- Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in
depth — filename composed from these)
- Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml
(canonical FactoryFromEnv path so restart re-registers)
- Calls k8sCache.AddCluster — idempotent per Factory contract
PR B (next): mothership-side handover hook iterates secondary regions
and POSTs each kubeconfig to the chroot.
PR C (next): dashboard.go fan-out across all registered cluster IDs
when group_by includes cluster/region.
Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are written 0o600.
Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(dashboard): multi-cluster fan-out when group_by=cluster|region (D16 PR C)
When group_by includes "cluster" or "region", enumerate ALL registered
k8sCache clusters (primary + secondaries synced via PR #1579's POST
/api/v1/sovereign/secondary-kubeconfig endpoint) and concatenate
podRows from each before aggregation.
Layer-1=Cluster on /dashboard now renders 3 bubbles on a 3-region
Sovereign (was 1 bubble before).
For group_by that ONLY contains {namespace,family,application,vcluster,
sovereign} the primary clusterID's pods are sufficient and faster — no
fan-out cost.
PR B (mothership-side handover hook to POST each secondary kubeconfig)
will complete the chain. Until then, secondaries don't appear in
k8sCache.Clusters() so this fan-out is a no-op on existing provs — but
the code is in place for when PR B lands.
Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
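The fan-out decision above could look roughly like the following. This is a sketch only: `podRow`, `needsFanOut`, `collectPodRows`, and the injected `list` function are hypothetical stand-ins for the real dashboard handler and the per-cluster k8sCache listing, whose signatures are not shown in the commit message.

```go
package main

import "fmt"

// podRow is a simplified stand-in for the dashboard's per-pod row.
type podRow struct {
	ClusterID string
	Namespace string
	Name      string
}

// needsFanOut reports whether the group_by dimensions require pods from
// every registered cluster rather than only the primary one.
func needsFanOut(groupBy []string) bool {
	for _, dim := range groupBy {
		if dim == "cluster" || dim == "region" {
			return true
		}
	}
	return false
}

// collectPodRows lists pods from the primary cluster only, unless the
// grouping needs fan-out, in which case rows from all clusters are
// concatenated before aggregation.
func collectPodRows(groupBy []string, primary string, clusters []string, list func(clusterID string) []podRow) []podRow {
	if !needsFanOut(groupBy) {
		return list(primary)
	}
	var rows []podRow
	for _, id := range clusters {
		rows = append(rows, list(id)...)
	}
	return rows
}

func main() {
	// Fake per-cluster data standing in for the cache.
	pods := map[string][]podRow{
		"primary": {{ClusterID: "primary", Namespace: "default", Name: "api-0"}},
		"second":  {{ClusterID: "second", Namespace: "default", Name: "api-0"}},
		"third":   {{ClusterID: "third", Namespace: "default", Name: "api-0"}},
	}
	list := func(id string) []podRow { return pods[id] }
	clusters := []string{"primary", "second", "third"}

	fmt.Println(len(collectPodRows([]string{"namespace"}, "primary", clusters, list)))
	fmt.Println(len(collectPodRows([]string{"cluster", "namespace"}, "primary", clusters, list)))
}
```

When no secondary is registered yet, the cluster list holds only the primary, so the fan-out branch returns the same rows as the fast path.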
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
bcab6430cb
|
feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A) (#1579)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)
PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".
Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.
The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)
When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:
1. Burning a Dynadot API credit on a flip that would be idempotent.
2. The D30 blocker — current Dynadot creds return pdm-status-401
even when the desired NS state already exists. Caught on t132
2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
parentDomains attempt.
Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.
This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses catalyst-system namespace
PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).
Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.
Verified the CRD shape on t134 2026-05-17:
$ kubectl api-resources --api-group=access.openova.io
useraccesses access.openova.io/v1alpha1 true UserAccess
^^^^
NAMESPACED
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses tierRoleRef not wildcard app
PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}]
but the useraccess XRD schema rejects `app: "*"` (pattern
^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged
"spec.applications[0].app: Invalid value: \"*\"" on every handover.
The XRD has a `tierRoleRef` field (pattern
^openova:tier-(viewer|developer|operator|admin|owner)$) that's the
canonical owner-tier semantic — when set, useraccess-controller binds
the named ClusterRole on the target via RoleBinding/ClusterRoleBinding.
`openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's
tier-clusterroles.yaml.
Drop the applications[] block + use tierRoleRef = openova:tier-owner.
Verified live on t135 2026-05-17 — error log showed exact pattern
mismatch before this fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A)
D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to
have all 3 regions' kubeconfigs registered so dashboard handler's
per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each.
Today the chroot only auto-registers its own in-cluster apiserver via
FactoryFromEnv's chroot self-registration branch. Secondary
kubeconfigs live on the mothership PVC + aren't replicated.
This handler bridges the gap:
- Accepts JSON {deploymentId, regionKey, kubeconfigYaml}
- Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in
depth — filename composed from these)
- Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml
(canonical FactoryFromEnv path so restart re-registers)
- Calls k8sCache.AddCluster — idempotent per Factory contract
PR B (next): mothership-side handover hook iterates secondary regions
and POSTs each kubeconfig to the chroot.
PR C (next): dashboard.go fan-out across all registered cluster IDs
when group_by includes cluster/region.
Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are written 0o600.
Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6e329e27ae |
deploy: update catalyst images to 4f62dd2
|
||
|
|
4f62dd21b3
|
fix(handover): D21 owner seed uses tierRoleRef not wildcard app (#1578)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)
PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".
Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.
The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)
When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:
1. Burning a Dynadot API credit on a flip that would be idempotent.
2. The D30 blocker — current Dynadot creds return pdm-status-401
even when the desired NS state already exists. Caught on t132
2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
parentDomains attempt.
Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.
This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses catalyst-system namespace
PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).
Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.
Verified the CRD shape on t134 2026-05-17:
$ kubectl api-resources --api-group=access.openova.io
useraccesses access.openova.io/v1alpha1 true UserAccess
^^^^
NAMESPACED
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses tierRoleRef not wildcard app
PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}]
but the useraccess XRD schema rejects `app: "*"` (pattern
^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged
"spec.applications[0].app: Invalid value: \"*\"" on every handover.
The XRD has a `tierRoleRef` field (pattern
^openova:tier-(viewer|developer|operator|admin|owner)$) that's the
canonical owner-tier semantic — when set, useraccess-controller binds
the named ClusterRole on the target via RoleBinding/ClusterRoleBinding.
`openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's
tier-clusterroles.yaml.
Drop the applications[] block + use tierRoleRef = openova:tier-owner.
Verified live on t135 2026-05-17 — error log showed exact pattern
mismatch before this fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6466f97f6c |
deploy: update catalyst images to ea30ded
|
||
|
|
ea30ded120
|
fix(handover): D21 owner seed uses catalyst-system namespace (#1577)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)
PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".
Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.
The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)
When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:
1. Burning a Dynadot API credit on a flip that would be idempotent.
2. The D30 blocker — current Dynadot creds return pdm-status-401
even when the desired NS state already exists. Caught on t132
2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
parentDomains attempt.
Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.
This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses catalyst-system namespace
PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).
Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.
Verified the CRD shape on t134 2026-05-17:
$ kubectl api-resources --api-group=access.openova.io
useraccesses access.openova.io/v1alpha1 true UserAccess
^^^^
NAMESPACED
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
18b5fa1466 |
deploy: update catalyst images to 33ed484
|
||
|
|
33ed484e04
|
fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30) (#1576)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)
PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".
Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.
The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)
When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:
1. Burning a Dynadot API credit on a flip that would be idempotent.
2. The D30 blocker — current Dynadot creds return pdm-status-401
even when the desired NS state already exists. Caught on t132
2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
parentDomains attempt.
Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.
This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
a65a024114 |
deploy: update catalyst images to c148ec6
|
||
|
|
c148ec6a34
|
fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22) (#1575)
PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".
Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.
The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c5f777056f |
deploy: update catalyst images to 3568b72
|
||
|
|
3568b72b5e
|
fix(cloud): hide non-active 0/0 chips (D15) (#1574)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)
Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders).
Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud-init substitute placeholders (mirrors regionsJson pattern).
Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator).
Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(slot-13): add D22 sovereign-side identity placeholders
Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`.
This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)
Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit Kustomization's postBuild.substitute env, which the slot-13 placeholders (#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.
Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment populates the deployment record (#1567 + #1568 fallback).
SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes the dependency cycle (hcloud_server.cp doesn't exist at cloudinit render time). Separate PR will source it via metadata-service or post-create ConfigMap patch.
Next-prov (t133+) Sovereign Console Settings page now renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b)
PR #1552 stripped the `/app` prefix on Sovereign mode to make `/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match. But consoleAppDetailRoute is registered at `/app/$componentId` under consoleLayoutRoute — no chroot route matches `/<componentId>` directly, so stripping leaves an empty render path. Playwright walkthrough on t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render body_len=9 (empty).
Invert the logic: only redirect mothership-only sub-paths (/dashboard Fleet view, /install wizard, /sre, /sec, /blueprints) which have no Sovereign Console equivalent. For everything else (component names like `/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match pick consoleAppDetailRoute / consoleAppsRoute.
Caught live on t132 via Playwright walker3.js — agent a4825c5a.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): re-mint handover JWT on every GetDeployment (D0)
D0 Playwright walkthrough on t132 2026-05-17 caught: handoverURL persisted at handover-fire time carries a JWT that expires per DefaultTTL (5min). Operators who click /jobs hours later get the stale token → Sovereign-side /auth/handover rejects with raw JSON {"error":"invalid token"} — no UI fallback, no /auth/handover-error, auto-redirect to /dashboard never fires.
Re-mint the JWT on every GetDeployment when deployment is ready + handover-fired so the URL returned to the wizard is always freshly-signed. Best-effort: on mint failure, leave the existing URL in place so a transient signer error doesn't break polling. Helper is idempotent + locked.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cloud): hide non-active 0/0 chips (D15)
Playwright walkthrough on t132 2026-05-17 caught D15 PARTIAL: 15 chips are correct but Bucket+Volume show 0/0. Founder rule (DoD D15): "No kind chip shows 0/0 for a resource that actually exists in the cluster". Bucket+Volume genuinely don't exist on this Sovereign so showing 0/0 is noise.
Hide chips with count exactly 0 unless they're the active selection (operator who navigated to an empty kind keeps context).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
44e612f39d |
deploy: update catalyst images to 58dbb92
|
||
|
|
58dbb92f4f
|
fix(handover): re-mint handover JWT on every GetDeployment (D0) (#1573)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)
Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders).
Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud-init substitute placeholders (mirrors regionsJson pattern).
Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator).
Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(slot-13): add D22 sovereign-side identity placeholders
Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`.
This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)
Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit Kustomization's postBuild.substitute env, which the slot-13 placeholders (#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.
Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment populates the deployment record (#1567 + #1568 fallback).
SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes the dependency cycle (hcloud_server.cp doesn't exist at cloudinit render time). Separate PR will source it via metadata-service or post-create ConfigMap patch.
Next-prov (t133+) Sovereign Console Settings page now renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b)
PR #1552 stripped the `/app` prefix on Sovereign mode to make `/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match. But consoleAppDetailRoute is registered at `/app/$componentId` under consoleLayoutRoute — no chroot route matches `/<componentId>` directly, so stripping leaves an empty render path. Playwright walkthrough on t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render body_len=9 (empty).
Invert the logic: only redirect mothership-only sub-paths (/dashboard Fleet view, /install wizard, /sre, /sec, /blueprints) which have no Sovereign Console equivalent. For everything else (component names like `/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match pick consoleAppDetailRoute / consoleAppsRoute.
Caught live on t132 via Playwright walker3.js — agent a4825c5a.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): re-mint handover JWT on every GetDeployment (D0)
D0 Playwright walkthrough on t132 2026-05-17 caught: handoverURL persisted at handover-fire time carries a JWT that expires per DefaultTTL (5min). Operators who click /jobs hours later get the stale token → Sovereign-side /auth/handover rejects with raw JSON {"error":"invalid token"} — no UI fallback, no /auth/handover-error, auto-redirect to /dashboard never fires.
Re-mint the JWT on every GetDeployment when deployment is ready + handover-fired so the URL returned to the wizard is always freshly-signed. Best-effort: on mint failure, leave the existing URL in place so a transient signer error doesn't break polling. Helper is idempotent + locked.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
dea683f5e4 |
deploy: update catalyst images to 9e1e422
|
||
|
|
9e1e4224d8
|
fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b) (#1572)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)
Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders).
Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud-init substitute placeholders (mirrors regionsJson pattern).
Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator).
Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(slot-13): add D22 sovereign-side identity placeholders
Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`.
This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)
Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit Kustomization's postBuild.substitute env, which the slot-13 placeholders (#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.
Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment populates the deployment record (#1567 + #1568 fallback).
SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes the dependency cycle (hcloud_server.cp doesn't exist at cloudinit render time). Separate PR will source it via metadata-service or post-create ConfigMap patch.
Next-prov (t133+) Sovereign Console Settings page now renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b)
PR #1552 stripped the `/app` prefix on Sovereign mode to make `/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match. But consoleAppDetailRoute is registered at `/app/$componentId` under consoleLayoutRoute — no chroot route matches `/<componentId>` directly, so stripping leaves an empty render path. Playwright walkthrough on t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render body_len=9 (empty).
Invert the logic: only redirect mothership-only sub-paths (/dashboard Fleet view, /install wizard, /sre, /sec, /blueprints) which have no Sovereign Console equivalent. For everything else (component names like `/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match pick consoleAppDetailRoute / consoleAppsRoute.
Caught live on t132 via Playwright walker3.js — agent a4825c5a.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
4cc880cafd |
deploy: update catalyst images to 5793958
|
||
|
|
57939585c0
|
feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22) (#1571)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)
Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders).
Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud-init substitute placeholders (mirrors regionsJson pattern).
Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator).
Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(slot-13): add D22 sovereign-side identity placeholders
Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`.
This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)
Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit Kustomization's postBuild.substitute env, which the slot-13 placeholders (#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.
Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment populates the deployment record (#1567 + #1568 fallback).
SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes the dependency cycle (hcloud_server.cp doesn't exist at cloudinit render time). Separate PR will source it via metadata-service or post-create ConfigMap patch.
Next-prov (t133+) Sovereign Console Settings page now renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
700d28967f
|
chore(slot-13): add D22 sovereign-side identity placeholders (#1570)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)
Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders).
Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud-init substitute placeholders (mirrors regionsJson pattern).
Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator).
Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(slot-13): add D22 sovereign-side identity placeholders
Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`.
This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
df193d340e |
deploy: update catalyst images to 9cbcd23
|
||
|
|
9cbcd230da
|
feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22) (#1569)
Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders).
Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud-init substitute placeholders (mirrors regionsJson pattern).
Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator).
Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
0e0280bbe0 |
deploy: update catalyst images to 6618392
|
||
|
|
6618392407
|
fix(chroot): GetDeployment falls back to chrootEnsureDeployment (D22) (#1568)
* feat(handover): auto-seed owner UserAccess CR on chroot (D21)
Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.
After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:
apiVersion: access.openova.io/v1alpha1
kind: UserAccess
metadata:
name: useraccess-owner-<sanitized-email>
annotations:
catalyst.openova.io/user-email: <email> # rbac_matrix:309 hint
spec:
user:
keycloakSubject: <email>
sovereignRef: <fqdn-first-label>
applications:
- app: "*"
role: admin # owner -> admin
The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.
Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.
Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).
Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21
* chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked)
PR #1562 (D31 wordpress-tenant activeHotStandby) + PR #1564 (D21 owner
UserAccess auto-seed at handover, catalyst-api:8d2a947) both packaged
into chart 1.4.147. Pin slot so t133+ gets both gates on first prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(chart): regionsJson uses toJson to defeat YAML flow-seq re-parse (D5)
PR #1551 single-quoted SOVEREIGN_REGIONS_JSON in the slot file
substitute, but Flux Kustomize's postBuild can still re-parse the
JSON-shaped string as a YAML flow-sequence depending on quoting context.
When that happens .Values.sovereign.regionsJson is a Go []interface{}
of map[interface{}]interface{} and `| quote` prints Go's
`[map[cloudRegion:hel1 ...]]` syntax — catalyst-api's json.Unmarshal of
the env var then fails and Request.Regions is empty.
toJson normalises both string and list inputs to valid JSON.
Caught live on t132 2026-05-16 chart 1.4.147: env var rendered as
`[map[cloudRegion:hel1 ...]]` despite #1551 being in effect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(chroot): populate deployment Result + Request fields for D22
Settings page on Sovereign Console renders `—` for Region / Sovereign /
Created / DeploymentID / Pool subdomain because chroot's GET
/api/v1/deployments/<id> returns empty strings for those fields.
Populate from existing env vars (best-effort — empty when chart hasn't
wired them yet, which is no worse than today's behaviour):
- Result.ConsoleURL = "https://console.<fqdn>" (derived from selfFQDN)
- Result.GitOpsRepoURL from GITOPS_REPO_URL env
- Result.ControlPlaneIP from SOVEREIGN_CONTROL_PLANE_IP env
- Request.Region = regions[0].CloudRegion (top-level legacy field)
- Request.OrgEmail from OPERATOR_EMAIL env
- Request.OrgName from ORG_NAME env
Companion chart PR will wire the env vars from .Values.global.* +
cloud-init substitute placeholders. This PR is BACKWARD-compatible —
unset env vars produce empty strings, same as today.
Caught live on t132 2026-05-16 —
`curl /api/v1/deployments/sovereign-t132.omani.works` returns empty ownerEmail/region/consoleURL.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(chroot): GetDeployment falls back to chrootEnsureDeployment (D22)
GetDeployment was the only handler that returned 404 without calling
chrootEnsureDeployment. After a catalyst-api Pod restart on the chroot
the in-memory store is empty until some other handler (StreamLogs,
jobs list) primes it via its own synth call — meanwhile the Sovereign
Console Settings page loads /api/v1/deployments/<id> first and gets
404, rendering the entire page broken.
Mirror the StreamLogs pattern (lines 1247-1254): try in-memory load,
fall through to chrootEnsureDeployment, return 404 only when both miss.
This unblocks PR #1567's deployment-record population — without the
fallback, GetDeployment can never serve the populated record on chroot.
Caught live on t132 2026-05-16 after #1567 image roll: Settings page
404 because in-memory store was empty.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
b094a354b7 |
deploy: update catalyst images to ed63ecd
|
||
|
|
ed63ecd09f
|
fix(chroot): populate deployment Result + Request fields for D22 settings (#1567)
* feat(handover): auto-seed owner UserAccess CR on chroot (D21)
Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.
After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:
apiVersion: access.openova.io/v1alpha1
kind: UserAccess
metadata:
name: useraccess-owner-<sanitized-email>
annotations:
catalyst.openova.io/user-email: <email> # rbac_matrix:309 hint
spec:
user:
keycloakSubject: <email>
sovereignRef: <fqdn-first-label>
applications:
- app: "*"
role: admin # owner -> admin
The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.
Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.
Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).
Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21
* chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked)
PR #1562 (D31 wordpress-tenant activeHotStandby) + PR #1564 (D21 owner
UserAccess auto-seed at handover, catalyst-api:8d2a947) both packaged
into chart 1.4.147. Pin slot so t133+ gets both gates on first prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(chart): regionsJson uses toJson to defeat YAML flow-seq re-parse (D5)
PR #1551 single-quoted SOVEREIGN_REGIONS_JSON in the slot file
substitute, but Flux Kustomize's postBuild can still re-parse the
JSON-shaped string as a YAML flow-sequence depending on quoting context.
When that happens .Values.sovereign.regionsJson is a Go []interface{}
of map[interface{}]interface{} and `| quote` prints Go's
`[map[cloudRegion:hel1 ...]]` syntax — catalyst-api's json.Unmarshal of
the env var then fails and Request.Regions is empty.
toJson normalises both string and list inputs to valid JSON.
Caught live on t132 2026-05-16 chart 1.4.147: env var rendered as
`[map[cloudRegion:hel1 ...]]` despite #1551 being in effect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(chroot): populate deployment Result + Request fields for D22
Settings page on Sovereign Console renders `—` for Region / Sovereign /
Created / DeploymentID / Pool subdomain because chroot's GET
/api/v1/deployments/<id> returns empty strings for those fields.
Populate from existing env vars (best-effort — empty when chart hasn't
wired them yet, which is no worse than today's behaviour):
- Result.ConsoleURL = "https://console.<fqdn>" (derived from selfFQDN)
- Result.GitOpsRepoURL from GITOPS_REPO_URL env
- Result.ControlPlaneIP from SOVEREIGN_CONTROL_PLANE_IP env
- Request.Region = regions[0].CloudRegion (top-level legacy field)
- Request.OrgEmail from OPERATOR_EMAIL env
- Request.OrgName from ORG_NAME env
Companion chart PR will wire the env vars from .Values.global.* +
cloud-init substitute placeholders. This PR is BACKWARD-compatible —
unset env vars produce empty strings, same as today.
Caught live on t132 2026-05-16 —
`curl /api/v1/deployments/sovereign-t132.omani.works` returns empty ownerEmail/region/consoleURL.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
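Editorial sketch for commit ed63ecd09f (#1567): the field mapping listed in that message can be illustrated as below. The struct layout, the `populate` function and the injected `getenv` parameter are assumptions made for the example; the environment variable names and the `https://console.<fqdn>` derivation come from the commit message. In a real process `os.Getenv` would be passed as `getenv`.

```go
package main

import "fmt"

// Region, Request, Result and Deployment are hypothetical, trimmed-down
// shapes; only the field names mentioned in the commit message are kept.
type Region struct{ CloudRegion string }

type Request struct {
	Region   string // top-level legacy field
	OrgEmail string
	OrgName  string
	Regions  []Region
}

type Result struct {
	ConsoleURL     string
	GitOpsRepoURL  string
	ControlPlaneIP string
}

type Deployment struct {
	Request Request
	Result  Result
}

// populate fills the record best-effort: unset variables yield empty
// strings, which matches the behaviour before the change.
func populate(d *Deployment, selfFQDN string, getenv func(string) string) {
	if selfFQDN != "" { // guard is an assumption of this sketch
		d.Result.ConsoleURL = "https://console." + selfFQDN
	}
	d.Result.GitOpsRepoURL = getenv("GITOPS_REPO_URL")
	d.Result.ControlPlaneIP = getenv("SOVEREIGN_CONTROL_PLANE_IP")
	if len(d.Request.Regions) > 0 {
		d.Request.Region = d.Request.Regions[0].CloudRegion
	}
	d.Request.OrgEmail = getenv("OPERATOR_EMAIL")
	d.Request.OrgName = getenv("ORG_NAME")
}

func main() {
	env := map[string]string{"ORG_NAME": "Example Org"}
	d := &Deployment{Request: Request{Regions: []Region{{CloudRegion: "hel1"}}}}
	populate(d, "sovereign.example.com", func(k string) string { return env[k] })
	fmt.Printf("%+v\n", *d)
}
```

Injecting `getenv` keeps the mapping testable without touching the process environment.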
|
|
d82e06bfe9 |
deploy: update catalyst images to 0a45fb0
|
||
|
|
0a45fb0449
|
fix(chart): regionsJson uses toJson to defeat YAML flow-seq re-parse (D5) (#1566)
* feat(handover): auto-seed owner UserAccess CR on chroot (D21)
Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.
After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:
apiVersion: access.openova.io/v1alpha1
kind: UserAccess
metadata:
name: useraccess-owner-<sanitized-email>
annotations:
catalyst.openova.io/user-email: <email> # rbac_matrix:309 hint
spec:
user:
keycloakSubject: <email>
sovereignRef: <fqdn-first-label>
applications:
- app: "*"
role: admin # owner -> admin
The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.
Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.
Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).
Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21
* chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked)
PR #1562 (D31 wordpress-tenant activeHotStandby) + PR #1564 (D21 owner
UserAccess auto-seed at handover, catalyst-api:8d2a947) both packaged
into chart 1.4.147. Pin slot so t133+ gets both gates on first prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(chart): regionsJson uses toJson to defeat YAML flow-seq re-parse (D5)
PR #1551 single-quoted SOVEREIGN_REGIONS_JSON in the slot file
substitute, but Flux Kustomize's postBuild can still re-parse the
JSON-shaped string as a YAML flow-sequence depending on quoting context.
When that happens .Values.sovereign.regionsJson is a Go []interface{}
of map[interface{}]interface{} and `| quote` prints Go's
`[map[cloudRegion:hel1 ...]]` syntax — catalyst-api's json.Unmarshal of
the env var then fails and Request.Regions is empty.
toJson normalises both string and list inputs to valid JSON.
Caught live on t132 2026-05-16 chart 1.4.147: env var rendered as
`[map[cloudRegion:hel1 ...]]` despite #1551 being in effect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
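Editorial sketch for commit 0a45fb0449 (#1566): the failure mode described there can be reproduced in plain Go. Once the value has been re-parsed into a list, printing it with Go's default formatting yields `[map[cloudRegion:hel1]]`, which `json.Unmarshal` rejects, while JSON-encoding the same value round-trips. The `Region` struct and helper names are hypothetical, and the sketch uses `map[string]interface{}` (not the `map[interface{}]interface{}` mentioned in the message) because `encoding/json` cannot marshal the latter. It illustrates the symptom only, not the chart template itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Region is a hypothetical shape for one entry of the regions list.
type Region struct {
	CloudRegion string `json:"cloudRegion"`
}

// goSyntax renders a value with Go's default formatting, which is what a
// plain string conversion of an already-parsed list produces.
func goSyntax(v interface{}) string { return fmt.Sprint(v) }

// jsonSyntax renders the same value as JSON.
func jsonSyntax(v interface{}) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

// parseRegions mimics the consumer side: decode the env var as JSON.
func parseRegions(env string) ([]Region, error) {
	var regions []Region
	err := json.Unmarshal([]byte(env), &regions)
	return regions, err
}

func main() {
	v := []interface{}{map[string]interface{}{"cloudRegion": "hel1"}}

	bad := goSyntax(v)
	_, err := parseRegions(bad)
	fmt.Println(bad, "-> parse failed:", err != nil)

	good, _ := jsonSyntax(v)
	regions, err := parseRegions(good)
	fmt.Println(good, "->", regions, err)
}
```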
|
|
3f8e2b925e
|
chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked) (#1565)
* feat(handover): auto-seed owner UserAccess CR on chroot (D21)
Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.
After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:
apiVersion: access.openova.io/v1alpha1
kind: UserAccess
metadata:
name: useraccess-owner-<sanitized-email>
annotations:
catalyst.openova.io/user-email: <email> # rbac_matrix:309 hint
spec:
user:
keycloakSubject: <email>
sovereignRef: <fqdn-first-label>
applications:
- app: "*"
role: admin # owner -> admin
The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.
Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.
Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).
Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21
* chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked)
PR #1562 (D31 wordpress-tenant activeHotStandby) + PR #1564 (D21 owner
UserAccess auto-seed at handover, catalyst-api:8d2a947) both packaged
into chart 1.4.147. Pin slot so t133+ gets both gates on first prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f8c8a87151 |
deploy: update catalyst images to 8d2a947
|
||
|
|
8d2a947cfb
|
feat(handover): auto-seed owner UserAccess CR on chroot (D21) (#1564)
Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.
After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:
apiVersion: access.openova.io/v1alpha1
kind: UserAccess
metadata:
name: useraccess-owner-<sanitized-email>
annotations:
catalyst.openova.io/user-email: <email> # rbac_matrix:309 hint
spec:
user:
keycloakSubject: <email>
sovereignRef: <fqdn-first-label>
applications:
- app: "*"
role: admin # owner -> admin
The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.
Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.
Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).
Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
|
||
|
|
5510ab91f9
|
chore(slot-13): pin bp-catalyst-platform to 1.4.146 (D29 billing JWT bypass) (#1563)
PR #1561 added billing-service JWT exemptions matching gateway public routes (D29 voucher-redeem zero-touch). Pin slot so future provisions inherit the full D29 unblocker chain.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d6b6aca581 |
deploy: update sme service images to c04b2ec + bump chart to 1.4.147
|
||
|
|
c04b2ec76d
|
feat(wordpress-tenant): activeHotStandby option wires bp-cnpg-pair (D31) (#1562)
Sovereign DoD D31 — tenants subscribing to an HA-capable marketplace app may opt into a cross-region active-hot-standby Postgres pair for their WordPress instance instead of the default single CNPG Cluster. Mirrors the canonical bp-cnpg-pair pattern (primary + replica Cluster CRs with WAL streaming over Cilium ClusterMesh via a managed Service annotated service.cilium.io/global=true).
When the new pg.activeHotStandby.enabled flag is false (default), templates render the existing single Cluster bit-for-bit — no regression for non-HA tenants. Catalog seed flags WordPress with ha + cnpg-pair tags so the marketplace HA filter can surface it.
Chart bumped 0.2.1 -> 0.3.0. New render-gate test asserts both default single-cluster shape AND the enabled 2-Cluster shape with the right nodeSelectors, replica.source, externalCluster.host, Cilium global annotation, and bootstrap.pg_basebackup; all 5 cases pass.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
af4d9b1b87 |
deploy: update sme service images to f9ed292 + bump chart to 1.4.146
|
||
|
|
f9ed292198
|
fix(billing): /redeem-preview + plans + addons bypass JWT (D29) (#1561)
* chore(slot-13): pin bp-catalyst-platform to 1.4.145 (D29 gateway public routes)
PR #1559 added /api/billing/{vouchers/redeem-preview,plans,addons} as public gateway routes — required for the marketplace /redeem zero-touch flow. Pin the slot so future provisions inherit it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(billing): /redeem-preview + plans + addons bypass JWT (D29)
Mirror PR #1559's gateway public routes in the billing service's own middleware chain. The gateway now lets these requests through without an Authorization header (D29 voucher-redeem landing), but billing service's main.go was JWT-gating EVERY /billing/* path except /billing/webhook — so the request still got 401, just one hop later.
Caught live on t132 2026-05-16 after PR #1559 rolled.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
936e76f79a
|
chore(slot-13): pin bp-catalyst-platform to 1.4.145 (D29 gateway public routes) (#1560)
PR #1559 added /api/billing/{vouchers/redeem-preview,plans,addons} as public gateway routes — required for the marketplace /redeem zero-touch flow. Pin the slot so future provisions inherit it.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
696aa26f83 |
deploy: update sme service images to a11067d + bump chart to 1.4.145
|