Commit Graph

2189 Commits

Author SHA1 Message Date
github-actions[bot]
473a2ba4b9 deploy: update catalyst images to 52be4d4 2026-05-17 07:02:25 +00:00
e3mrah
52be4d4d3a
fix(catalyst-api): D16 PR H — resolveChrootClusterID multi-cluster + dashboard alias (#1587)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)

PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:

  internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
  internal/handler/deployment_handover_export.go:199:6: other declaration of itoa

Rename my helper to `regionSlotIndex` — more descriptive of its actual
use (deriving the per-region slot suffix for the kubeconfig filename).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalyst-api): D16/D17 — 3 bugs caught on t138

Founder caught on t136 (now wiped) that /dashboard cluster grouping
still showed 1 region and /cloud nodes showed 1 node despite earlier
D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced
on t138 fresh prov.

1. exportSecondaryKubeconfigsToChild was guarded behind the early
   return of exportDeploymentToChild's failed POST. The child's
   ingress + cert + gateway are still racing to become reachable
   in the seconds after handover fires, so the first POST gets
   EOF and the fan-out goroutine never starts. Fix: kick off the
   D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild
   in its own goroutine, BEFORE the deployment-record POST.

2. Both exports now retry with exponential backoff (5s → 60s) for
   up to 5 min total. Most handovers will succeed on attempt 2-4.
   Was: no retry, single shot, silent failure.

3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the
   auth group (rg) into the top-level router (r), alongside
   /api/v1/internal/deployments/import. The previous registration
   required an operator session that doesn't exist at handover —
   mothership POSTs were 401'd silently. Validation is now via
   safeIDPattern regex on depID + regionKey (same security model
   as the deployments/import companion endpoint).

4. HandleSovereignCloud now fans out across h.k8sCache.Clusters()
   instead of using only the in-cluster client. Adds Cluster
   field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
   group/filter by region. Without this, /cloud?view=list&kind=nodes
   shows 1 node even when 3 secondary kubeconfigs are registered.

Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
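
The retry behaviour in item 2 above can be sketched as follows. This is a minimal illustration, not the code from the PR: `retryWithBackoff` and `demo` are hypothetical names, and the demo uses millisecond delays where the commit describes 5s growing to 60s over a 5 min budget.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryWithBackoff (hypothetical helper) calls fn until it succeeds, the total
// budget would be exceeded by the next wait, or ctx is cancelled. The delay
// doubles after every failure and is capped at maxDelay.
func retryWithBackoff(ctx context.Context, initial, maxDelay, budget time.Duration, fn func(attempt int) error) error {
	deadline := time.Now().Add(budget)
	delay := initial
	for attempt := 1; ; attempt++ {
		err := fn(attempt)
		if err == nil {
			return nil
		}
		// Surface the last error instead of failing silently.
		if time.Now().Add(delay).After(deadline) {
			return fmt.Errorf("gave up after %d attempts: %w", attempt, err)
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
		}
		if delay *= 2; delay > maxDelay {
			delay = maxDelay
		}
	}
}

// demo simulates a child endpoint that only becomes reachable on attempt 3.
func demo() (int, error) {
	calls := 0
	err := retryWithBackoff(context.Background(),
		5*time.Millisecond, 60*time.Millisecond, 300*time.Millisecond,
		func(attempt int) error {
			calls++
			if attempt < 3 {
				return errors.New("EOF") // ingress not reachable yet
			}
			return nil
		})
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println("calls:", calls, "err:", err)
}
```

In a real handover the production values (5s, 60s, 5 min) would presumably be passed in, with the POST to the child as `fn`.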

* fix(catalog): D27 — fresh-seed apps default Published+Deployable

Founder caught on t136: marketplace.t136/apps shows blank application
grid. Root cause: catalog seed.go calls migrateAppPublished +
migrateAppDeployable ONLY on the "already populated" path. On a fresh
Sovereign install (empty catalog) seedAllData inserts 27 rows with
zero-value bools — Published=false, Deployable=false. The marketplace
storefront filters with `?published=true`, gets [], renders blank.

Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished
+ seedSystemApps. Both migrations are idempotent (skip rows already
true), so re-runs are safe.

Verified the bug live on t138 (eaaee1ea24184c2a):
  http://catalog.sme:8082/catalog/apps returns 27 apps
  http://catalog.sme:8082/catalog/apps?published=true returns 0

With this fix the latter returns 27.

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
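
The fresh-seed fix above can be illustrated with an in-memory sketch. The `App` type, the `store` slice, and the bodies of the functions below are assumptions made for illustration; only the function names and the 27-row count come from the commit message, and `seedSystemApps` is left out.

```go
package main

import "fmt"

// App is a hypothetical, reduced catalog row.
type App struct {
	Slug       string
	Published  bool
	Deployable bool
}

// seedAllData inserts rows with zero-value bools, as on a fresh install.
func seedAllData(n int) []App {
	apps := make([]App, n)
	for i := range apps {
		apps[i].Slug = fmt.Sprintf("app-%02d", i+1)
	}
	return apps
}

// migrateAppPublished flips Published on rows that are still false and
// returns how many rows it changed. Rows already true are skipped, so a
// second run changes nothing.
func migrateAppPublished(apps []App) int {
	changed := 0
	for i := range apps {
		if !apps[i].Published {
			apps[i].Published = true
			changed++
		}
	}
	return changed
}

// migrateAppDeployable mirrors migrateAppPublished for the Deployable flag.
func migrateAppDeployable(apps []App) int {
	changed := 0
	for i := range apps {
		if !apps[i].Deployable {
			apps[i].Deployable = true
			changed++
		}
	}
	return changed
}

// seed runs the migrations on both the fresh and the already-populated path.
func seed(store []App) []App {
	if len(store) == 0 {
		store = seedAllData(27)
	}
	migrateAppDeployable(store)
	migrateAppPublished(store)
	return store
}

// countPublished emulates the storefront's ?published=true filter.
func countPublished(apps []App) int {
	n := 0
	for _, a := range apps {
		if a.Published {
			n++
		}
	}
	return n
}

func main() {
	store := seed(nil)
	fmt.Println("published after fresh seed:", countPublished(store))
	fmt.Println("rows changed on re-run:", migrateAppPublished(store))
}
```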

* fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign

Founder caught on t136: console.t136.../app/bp-alloy renders the
catalog grid (AppsPage) instead of AppDetail. Three earlier PRs
(#1572 + chart bumps) flipped the appRoute beforeLoad logic but
the actual route-matching collision was not fixed.

Root cause: appRoute.addChildren registers appDeploymentRoute at
`/$deploymentId` (effective `/app/$deploymentId`, mother-only)
BEFORE consoleLayoutRoute registers consoleAppDetailRoute at
`/app/$componentId`. TanStack Router resolves equally-specific
dynamic routes by declaration order — so on the Sovereign Console
URL `/app/bp-alloy` matches appDeploymentRoute first and renders
AppsPage with deploymentId="bp-alloy".

Fix: at routeTree build time, filter appRoute children to exclude
every mother-only `/$deploymentId/*` route when running in
Sovereign mode. DETECTED_MODE.mode is fixed per-page-load so this
is a one-time check, no runtime overhead. With those routes
absent, consoleAppDetailRoute is the only matcher for
`/app/<componentId>` on Sovereign Console — AppDetail renders.

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148

The founder-flagged bug fixes from the t136/t138/t139 verify cycle
shipped as 3 PRs, and the catalyst chart's Chart.yaml was bumped to
1.4.148 (d985f27c) with new images:
- catalystApi/Ui: 2ab8a0e (PR #1583 D16 fan-out + retry + auth-bypass,
  PR #1585 D17 router collision)
- smeTag: 964dc15 (PR #1584 D27 catalog fresh-seed Published)

But bootstrap-kit/13-bp-catalyst-platform.yaml stayed pinned to
1.4.147 — every fresh provision installs the OLDER chart with the
OLDER images, so the founder-flagged bugs persist.

Caught on t139 (b4a7ee052d844da0) post-handover verify: chart
installed = bp-catalyst-platform@1.4.147, catalog returns 0
published apps, /app/bp-alloy renders catalog grid.

Bumping the pin makes fresh provs install 1.4.148 (which has all 3
PRs baked).

Refs: feedback_test_theater_3rd_violation_2026_05_17.md
      feedback_overlap_provs_dont_serialize_wait.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalyst-api): D16 PR H — resolveChrootClusterID multi-cluster + dashboard alias

Founder caught on t140 (29b7e14918178f7e) after D16 fan-out chain shipped:
- /dashboard is empty (no treemap rendered)
- "none of the k8s resources are streaming"

Root cause: after the D16 secondary-kubeconfig export (PR #1579/#1581)
landed, chroot's k8sCache went from 1 cluster (primary self-register)
to 3 clusters (primary + 2 secondaries). Two cascading bugs:

1. resolveChrootClusterID had a `len(clusters) != 1` guard — it only
   aliased when chroot had exactly one cluster. After D16 it returned
   the URL deployment_id unchanged → has-cluster check failed →
   every chroot handler (networking, k8s_search, k8s_resource_metrics,
   k8s_exec, dashboard) saw "not found" → returned empty.

2. dashboard.go::GetDashboardTreemap was the one chroot handler that
   didn't call resolveChrootClusterID before the has-cluster check —
   so even with #1 fixed, the dashboard would still 404.

Fix:
- resolveChrootClusterID: when N>1, prefer the cluster whose id is
  prefixed "sovereign-" (the FactoryFromEnv self-registered primary
  per buildChrootClusterRef). Falls back to clusters[0] if no match.
- GetDashboardTreemap: call resolveChrootClusterID before has-cluster
  check, matching the pattern in every other chroot handler.

Refs: feedback_test_theater_3rd_violation_2026_05_17.md (don't ship
D16 fan-out without verifying every handler that depends on
single-cluster k8sCache assumption).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
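
The selection rule described in the fix above could look roughly like this. `pickChrootCluster` is a hypothetical stand-in for resolveChrootClusterID that works on plain string IDs; returning the URL ID unchanged when no cluster is registered is an assumption, and the cluster IDs in `main` are invented.

```go
package main

import (
	"fmt"
	"strings"
)

// pickChrootCluster maps the deployment ID from the URL onto a registered
// cluster ID:
//   - no clusters registered: return urlID unchanged (assumption)
//   - exactly one cluster: alias to it
//   - several clusters: prefer the one prefixed "sovereign-", else clusters[0]
func pickChrootCluster(urlID string, clusters []string) string {
	switch len(clusters) {
	case 0:
		return urlID
	case 1:
		return clusters[0]
	}
	for _, id := range clusters {
		if strings.HasPrefix(id, "sovereign-") {
			return id
		}
	}
	return clusters[0]
}

func main() {
	// Hypothetical IDs: one self-registered primary plus two secondaries.
	clusters := []string{"dep-fsn1-1", "sovereign-primary", "dep-hel1-2"}
	fmt.Println(pickChrootCluster("29b7e14918178f7e", clusters))
}
```

A handler would call this before its has-cluster check, which is the ordering the commit restores for GetDashboardTreemap.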

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 10:59:43 +04:00
e3mrah
8c1ccfae07
chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148 (#1586)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)

PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:

  internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
  internal/handler/deployment_handover_export.go:199:6: other declaration of itoa

Rename my helper to `regionSlotIndex` — more descriptive of its actual
use (deriving the per-region slot suffix for the kubeconfig filename).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalyst-api): D16/D17 — 3 bugs caught on t138

Founder caught on t136 (now wiped) that /dashboard cluster grouping
still showed 1 region and /cloud nodes showed 1 node despite earlier
D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced
on t138 fresh prov.

1. exportSecondaryKubeconfigsToChild was guarded behind the early
   return of exportDeploymentToChild's failed POST. The child's
   ingress + cert + gateway are still racing to become reachable
   in the seconds after handover fires, so the first POST gets
   EOF and the fan-out goroutine never starts. Fix: kick off the
   D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild
   in its own goroutine, BEFORE the deployment-record POST.

2. Both exports now retry with exponential backoff (5s → 60s) for
   up to 5 min total. Most handovers will succeed on attempt 2-4.
   Was: no retry, single shot, silent failure.

3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the
   auth group (rg) into the top-level router (r), alongside
   /api/v1/internal/deployments/import. The previous registration
   required an operator session that doesn't exist at handover —
   mothership POSTs were 401'd silently. Validation is now via
   safeIDPattern regex on depID + regionKey (same security model
   as the deployments/import companion endpoint).

4. HandleSovereignCloud now fans out across h.k8sCache.Clusters()
   instead of using only the in-cluster client. Adds Cluster
   field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
   group/filter by region. Without this, /cloud?view=list&kind=nodes
   shows 1 node even when 3 secondary kubeconfigs are registered.

Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalog): D27 — fresh-seed apps default Published+Deployable

Founder caught on t136: marketplace.t136/apps shows blank application
grid. Root cause: catalog seed.go calls migrateAppPublished +
migrateAppDeployable ONLY on the "already populated" path. On a fresh
Sovereign install (empty catalog) seedAllData inserts 27 rows with
zero-value bools — Published=false, Deployable=false. The marketplace
storefront filters with `?published=true`, gets [], renders blank.

Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished
+ seedSystemApps. Both migrations are idempotent (skip rows already
true), so re-runs are safe.

Verified the bug live on t138 (eaaee1ea24184c2a):
  http://catalog.sme:8082/catalog/apps returns 27 apps
  http://catalog.sme:8082/catalog/apps?published=true returns 0

With this fix the latter returns 27.

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign

Founder caught on t136: console.t136.../app/bp-alloy renders the
catalog grid (AppsPage) instead of AppDetail. Three earlier PRs
(#1572 + chart bumps) flipped the appRoute beforeLoad logic but
the actual route-matching collision was not fixed.

Root cause: appRoute.addChildren registers appDeploymentRoute at
`/$deploymentId` (effective `/app/$deploymentId`, mother-only)
BEFORE consoleLayoutRoute registers consoleAppDetailRoute at
`/app/$componentId`. TanStack Router resolves equally-specific
dynamic routes by declaration order — so on the Sovereign Console
URL `/app/bp-alloy` matches appDeploymentRoute first and renders
AppsPage with deploymentId="bp-alloy".

Fix: at routeTree build time, filter appRoute children to exclude
every mother-only `/$deploymentId/*` route when running in
Sovereign mode. DETECTED_MODE.mode is fixed per-page-load so this
is a one-time check, no runtime overhead. With those routes
absent, consoleAppDetailRoute is the only matcher for
`/app/<componentId>` on Sovereign Console — AppDetail renders.

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148

The founder-flagged bug fixes from the t136/t138/t139 verify cycle
shipped as 3 PRs, and the catalyst chart's Chart.yaml was bumped to
1.4.148 (d985f27c) with new images:
- catalystApi/Ui: 2ab8a0e (PR #1583 D16 fan-out + retry + auth-bypass,
  PR #1585 D17 router collision)
- smeTag: 964dc15 (PR #1584 D27 catalog fresh-seed Published)

But bootstrap-kit/13-bp-catalyst-platform.yaml stayed pinned to
1.4.147 — every fresh provision installs the OLDER chart with the
OLDER images, so the founder-flagged bugs persist.

Caught on t139 (b4a7ee052d844da0) post-handover verify: chart
installed = bp-catalyst-platform@1.4.147, catalog returns 0
published apps, /app/bp-alloy renders catalog grid.

Bumping the pin makes fresh provs install 1.4.148 (which has all 3
PRs baked).

Refs: feedback_test_theater_3rd_violation_2026_05_17.md
      feedback_overlap_provs_dont_serialize_wait.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 10:15:16 +04:00
github-actions[bot]
b61e9afabf deploy: update catalyst images to 2ab8a0e 2026-05-17 05:37:01 +00:00
e3mrah
2ab8a0e653
fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign (#1585)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)

PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:

  internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
  internal/handler/deployment_handover_export.go:199:6: other declaration of itoa

Rename my helper to `regionSlotIndex` — more descriptive of its actual
use (deriving the per-region slot suffix for the kubeconfig filename).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalyst-api): D16/D17 — 3 bugs caught on t138

Founder caught on t136 (now wiped) that /dashboard cluster grouping
still showed 1 region and /cloud nodes showed 1 node despite earlier
D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced
on t138 fresh prov.

1. exportSecondaryKubeconfigsToChild was guarded behind the early
   return of exportDeploymentToChild's failed POST. The child's
   ingress + cert + gateway are still racing to become reachable
   in the seconds after handover fires, so the first POST gets
   EOF and the fan-out goroutine never starts. Fix: kick off the
   D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild
   in its own goroutine, BEFORE the deployment-record POST.

2. Both exports now retry with exponential backoff (5s → 60s) for
   up to 5 min total. Most handovers will succeed on attempt 2-4.
   Was: no retry, single shot, silent failure.

3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the
   auth group (rg) into the top-level router (r), alongside
   /api/v1/internal/deployments/import. The previous registration
   required an operator session that doesn't exist at handover —
   mothership POSTs were 401'd silently. Validation is now via
   safeIDPattern regex on depID + regionKey (same security model
   as the deployments/import companion endpoint).

4. HandleSovereignCloud now fans out across h.k8sCache.Clusters()
   instead of using only the in-cluster client. Adds Cluster
   field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
   group/filter by region. Without this, /cloud?view=list&kind=nodes
   shows 1 node even when 3 secondary kubeconfigs are registered.

Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalog): D27 — fresh-seed apps default Published+Deployable

Founder caught on t136: marketplace.t136/apps shows blank application
grid. Root cause: catalog seed.go calls migrateAppPublished +
migrateAppDeployable ONLY on the "already populated" path. On a fresh
Sovereign install (empty catalog) seedAllData inserts 27 rows with
zero-value bools — Published=false, Deployable=false. The marketplace
storefront filters with `?published=true`, gets [], renders blank.

Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished
+ seedSystemApps. Both migrations are idempotent (skip rows already
true), so re-runs are safe.

Verified the bug live on t138 (eaaee1ea24184c2a):
  http://catalog.sme:8082/catalog/apps returns 27 apps
  http://catalog.sme:8082/catalog/apps?published=true returns 0

With this fix the latter returns 27.

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign

Founder caught on t136: console.t136.../app/bp-alloy renders the
catalog grid (AppsPage) instead of AppDetail. Three earlier PRs
(#1572 + chart bumps) flipped the appRoute beforeLoad logic but
the actual route-matching collision was not fixed.

Root cause: appRoute.addChildren registers appDeploymentRoute at
`/$deploymentId` (effective `/app/$deploymentId`, mother-only)
BEFORE consoleLayoutRoute registers consoleAppDetailRoute at
`/app/$componentId`. TanStack Router resolves equally-specific
dynamic routes by declaration order — so on the Sovereign Console
URL `/app/bp-alloy` matches appDeploymentRoute first and renders
AppsPage with deploymentId="bp-alloy".

Fix: at routeTree build time, filter appRoute children to exclude
every mother-only `/$deploymentId/*` route when running in
Sovereign mode. DETECTED_MODE.mode is fixed per-page-load so this
is a one-time check, no runtime overhead. With those routes
absent, consoleAppDetailRoute is the only matcher for
`/app/<componentId>` on Sovereign Console — AppDetail renders.

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 09:34:01 +04:00
github-actions[bot]
d985f27c8b deploy: update sme service images to 964dc15 + bump chart to 1.4.148 2026-05-17 05:29:35 +00:00
e3mrah
964dc15570
fix(catalog): D27 — fresh-seed apps default Published+Deployable (#1584)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)

PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:

  internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
  internal/handler/deployment_handover_export.go:199:6: other declaration of itoa

Rename my helper to `regionSlotIndex` — more descriptive of its actual
use (deriving the per-region slot suffix for the kubeconfig filename).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalyst-api): D16/D17 — 3 bugs caught on t138

Founder caught on t136 (now wiped) that /dashboard cluster grouping
still showed 1 region and /cloud nodes showed 1 node despite earlier
D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced
on t138 fresh prov.

1. exportSecondaryKubeconfigsToChild was guarded behind the early
   return of exportDeploymentToChild's failed POST. The child's
   ingress + cert + gateway are still racing to become reachable
   in the seconds after handover fires, so the first POST gets
   EOF and the fan-out goroutine never starts. Fix: kick off the
   D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild
   in its own goroutine, BEFORE the deployment-record POST.

2. Both exports now retry with exponential backoff (5s → 60s) for
   up to 5 min total. Most handovers will succeed on attempt 2-4.
   Was: no retry, single shot, silent failure.

3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the
   auth group (rg) into the top-level router (r), alongside
   /api/v1/internal/deployments/import. The previous registration
   required an operator session that doesn't exist at handover —
   mothership POSTs were 401'd silently. Validation is now via
   safeIDPattern regex on depID + regionKey (same security model
   as the deployments/import companion endpoint).

4. HandleSovereignCloud now fans out across h.k8sCache.Clusters()
   instead of using only the in-cluster client. Adds Cluster
   field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
   group/filter by region. Without this, /cloud?view=list&kind=nodes
   shows 1 node even when 3 secondary kubeconfigs are registered.

Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalog): D27 — fresh-seed apps default Published+Deployable

Founder caught on t136: marketplace.t136/apps shows blank application
grid. Root cause: catalog seed.go calls migrateAppPublished +
migrateAppDeployable ONLY on the "already populated" path. On a fresh
Sovereign install (empty catalog) seedAllData inserts 27 rows with
zero-value bools — Published=false, Deployable=false. The marketplace
storefront filters with `?published=true`, gets [], renders blank.

Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished
+ seedSystemApps. Both migrations are idempotent (skip rows already
true), so re-runs are safe.

Verified the bug live on t138 (eaaee1ea24184c2a):
  http://catalog.sme:8082/catalog/apps returns 27 apps
  http://catalog.sme:8082/catalog/apps?published=true returns 0

With this fix the latter returns 27.

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 09:28:35 +04:00
github-actions[bot]
f7ea19000e deploy: update catalyst images to 9fc2850 2026-05-17 05:28:28 +00:00
e3mrah
9fc2850504
fix(catalyst-api): D16/D17 — 3 bugs caught on t138 fresh prov (#1583)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)

PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:

  internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
  internal/handler/deployment_handover_export.go:199:6: other declaration of itoa

Rename my helper to `regionSlotIndex` — more descriptive of its actual
use (deriving the per-region slot suffix for the kubeconfig filename).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalyst-api): D16/D17 — 3 bugs caught on t138

Founder caught on t136 (now wiped) that /dashboard cluster grouping
still showed 1 region and /cloud nodes showed 1 node despite earlier
D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced
on t138 fresh prov.

1. exportSecondaryKubeconfigsToChild was guarded behind the early
   return of exportDeploymentToChild's failed POST. The child's
   ingress + cert + gateway are still racing to become reachable
   in the seconds after handover fires, so the first POST gets
   EOF and the fan-out goroutine never starts. Fix: kick off the
   D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild
   in its own goroutine, BEFORE the deployment-record POST.

2. Both exports now retry with exponential backoff (5s → 60s) for
   up to 5 min total. Most handovers will succeed on attempt 2-4.
   Was: no retry, single shot, silent failure.

3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the
   auth group (rg) into the top-level router (r), alongside
   /api/v1/internal/deployments/import. The previous registration
   required an operator session that doesn't exist at handover —
   mothership POSTs were 401'd silently. Validation is now via
   safeIDPattern regex on depID + regionKey (same security model
   as the deployments/import companion endpoint).

4. HandleSovereignCloud now fans out across h.k8sCache.Clusters()
   instead of using only the in-cluster client. Adds Cluster
   field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
   group/filter by region. Without this, /cloud?view=list&kind=nodes
   shows 1 node even when 3 secondary kubeconfigs are registered.

Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 09:26:16 +04:00
github-actions[bot]
ccbe51e3e4 deploy: update catalyst images to 9237c1e 2026-05-17 04:48:41 +00:00
e3mrah
9237c1e6ee
fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go) (#1582)
PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:

  internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
  internal/handler/deployment_handover_export.go:199:6: other declaration of itoa

Rename my helper to `regionSlotIndex` — more descriptive of its actual
use (deriving the per-region slot suffix for the kubeconfig filename).

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 08:45:49 +04:00
e3mrah
ce4ef6ba98
feat(handover): export secondary kubeconfigs to chroot at handover (D16 PR B) (#1581)
* fix(cloudinit): escape $${ORG_EMAIL:-}/$${ORG_NAME:-} in comment (D22)

PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate ${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".

Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.

The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal ${...} to cloud-init for Flux envsubst.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)

When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:

  1. Burning a Dynadot API credit on a flip that would be idempotent.
  2. The D30 blocker — current Dynadot creds return pdm-status-401
     even when the desired NS state already exists. Caught on t132
     2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
     parentDomains attempt.

Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.

This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
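
A short-circuit check of this kind might be written as below. Only `net.DefaultResolver.LookupNS`, the 5s timeout, and the false-on-error or partial-match behaviour come from the commit message. The comparison helper `containsAllNS`, the choice to tolerate extra NS records beyond the expected ones, and the example hostnames are assumptions.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strings"
	"time"
)

// normalize lower-cases a host name and drops the trailing root dot.
func normalize(host string) string {
	return strings.ToLower(strings.TrimSuffix(host, "."))
}

// containsAllNS reports whether every expected name server appears in got.
// Extra records in got are tolerated (assumption); a missing one is a
// partial match and yields false.
func containsAllNS(got, want []string) bool {
	if len(want) == 0 {
		return false
	}
	seen := make(map[string]bool, len(got))
	for _, h := range got {
		seen[normalize(h)] = true
	}
	for _, h := range want {
		if !seen[normalize(h)] {
			return false
		}
	}
	return true
}

// nsAlreadyMatches looks up the domain's NS records with a 5s timeout.
// Any lookup error returns false so the caller falls through to the
// full registrar pipeline.
func nsAlreadyMatches(ctx context.Context, domain string, want []string) bool {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	records, err := net.DefaultResolver.LookupNS(ctx, domain)
	if err != nil {
		return false
	}
	got := make([]string, 0, len(records))
	for _, r := range records {
		got = append(got, r.Host)
	}
	return containsAllNS(got, want)
}

func main() {
	// Offline demo of the comparison only; no DNS query is made here.
	got := []string{"NS1.example.org.", "ns2.example.org.", "ns3.example.org."}
	want := []string{"ns1.example.org", "ns2.example.org"}
	fmt.Println(containsAllNS(got, want))
}
```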

* fix(handover): D21 owner seed uses catalyst-system namespace

PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).

Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.

Verified the CRD shape on t134 2026-05-17:
  $ kubectl api-resources --api-group=access.openova.io
  useraccesses   access.openova.io/v1alpha1   true   UserAccess
                                                ^^^^
                                                NAMESPACED

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): D21 owner seed uses tierRoleRef not wildcard app

PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}]
but the useraccess XRD schema rejects `app: "*"` (pattern
^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged
"spec.applications[0].app: Invalid value: \"*\"" on every handover.

The XRD has a `tierRoleRef` field (pattern
^openova:tier-(viewer|developer|operator|admin|owner)$) that's the
canonical owner-tier semantic — when set, useraccess-controller binds
the named ClusterRole on the target via RoleBinding/ClusterRoleBinding.
`openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's
tier-clusterroles.yaml.

Drop the applications[] block + use tierRoleRef = openova:tier-owner.
Verified live on t135 2026-05-17 — error log showed exact pattern
mismatch before this fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
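
The two schema patterns quoted above can be checked directly with Go's regexp package. The helper names `validApp` and `validTierRoleRef` are hypothetical; the patterns are copied from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Pattern the XRD applies to spec.applications[].app.
	appPattern = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{0,62}$`)
	// Pattern the XRD applies to tierRoleRef.
	tierRoleRefPattern = regexp.MustCompile(`^openova:tier-(viewer|developer|operator|admin|owner)$`)
)

// validApp reports whether s would pass the app pattern.
func validApp(s string) bool { return appPattern.MatchString(s) }

// validTierRoleRef reports whether s would pass the tierRoleRef pattern.
func validTierRoleRef(s string) bool { return tierRoleRefPattern.MatchString(s) }

func main() {
	fmt.Println(validApp("*"))                          // false: the wildcard is rejected
	fmt.Println(validApp("bp-alloy"))                   // true
	fmt.Println(validTierRoleRef("openova:tier-owner")) // true
	fmt.Println(validTierRoleRef("openova:tier-root"))  // false: not a listed tier
}
```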

* feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A)

D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to
have all 3 regions' kubeconfigs registered so dashboard handler's
per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each.

Today the chroot only auto-registers its own in-cluster apiserver via
FactoryFromEnv's chroot self-registration branch. Secondary
kubeconfigs live on the mothership PVC + aren't replicated.

This handler bridges the gap:
- Accepts JSON {deploymentId, regionKey, kubeconfigYaml}
- Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in
  depth — filename composed from these)
- Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml
  (canonical FactoryFromEnv path so restart re-registers)
- Calls k8sCache.AddCluster — idempotent per Factory contract

PR B (next): mothership-side handover hook iterates secondary regions
and POSTs each kubeconfig to the chroot.

PR C (next): dashboard.go fan-out across all registered cluster IDs
when group_by includes cluster/region.

Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are written 0o600.

Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
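
The validate-then-write steps of this handler can be sketched without the HTTP layer. `kubeconfigPath` and `writeKubeconfig` are hypothetical helpers, the demo writes to a temporary directory rather than /var/lib/catalyst/kubeconfigs, and the region key `fsn1-1` is an invented example. The k8sCache.AddCluster call is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"regexp"
)

// safeID is the id pattern quoted in the commit message. It admits no
// slashes or dots, so a composed filename cannot escape the directory.
var safeID = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{0,62}$`)

// kubeconfigPath validates both ids and composes <dir>/<depID>-<region>.yaml.
func kubeconfigPath(dir, depID, regionKey string) (string, error) {
	if !safeID.MatchString(depID) || !safeID.MatchString(regionKey) {
		return "", errors.New("invalid deploymentId or regionKey")
	}
	return filepath.Join(dir, depID+"-"+regionKey+".yaml"), nil
}

// writeKubeconfig writes the bytes with mode 0o600. Errors mention only
// the path, never the kubeconfig contents.
func writeKubeconfig(dir, depID, regionKey string, kubeconfig []byte) (string, error) {
	path, err := kubeconfigPath(dir, depID, regionKey)
	if err != nil {
		return "", err
	}
	if err := os.WriteFile(path, kubeconfig, 0o600); err != nil {
		return "", fmt.Errorf("write %s: %w", path, err)
	}
	return path, nil
}

func main() {
	dir, err := os.MkdirTemp("", "kubeconfigs")
	if err != nil {
		fmt.Println("mkdir:", err)
		return
	}
	defer os.RemoveAll(dir)

	path, err := writeKubeconfig(dir, "29b7e14918178f7e", "fsn1-1", []byte("apiVersion: v1\nkind: Config\n"))
	fmt.Println(filepath.Base(path), err)

	_, err = kubeconfigPath(dir, "../etc", "fsn1-1")
	fmt.Println("traversal rejected:", err != nil)
}
```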

* feat(dashboard): multi-cluster fan-out when group_by=cluster|region (D16 PR C)

When group_by includes "cluster" or "region", enumerate ALL registered
k8sCache clusters (primary + secondaries synced via PR #1579's POST
/api/v1/sovereign/secondary-kubeconfig endpoint) and concatenate
podRows from each before aggregation.

Layer-1=Cluster on /dashboard now renders 3 bubbles on a 3-region
Sovereign (was 1 bubble before).

For group_by that ONLY contains {namespace,family,application,vcluster,
sovereign} the primary clusterID's pods are sufficient and faster — no
fan-out cost.

PR B (mothership-side handover hook to POST each secondary kubeconfig)
will complete the chain. Until then, secondaries don't appear in
k8sCache.Clusters() so this fan-out is a no-op on existing provs — but
the code is in place for when PR B lands.

Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handover): export secondary kubeconfigs to chroot at handover (D16 PR B)

Closes the D16 multi-cluster fan-out chain:
- PR #1579 (PR A): chroot endpoint accepts kubeconfigs
- PR #1580 (PR C): dashboard handler fans out across registered clusters
- This PR (PR B): mothership-side hook iterates secondary regions at
  handover, reads each region's kubeconfig from the mothership PVC,
  and POSTs to the chroot's endpoint

After handover-fire, exportSecondaryKubeconfigsToChild fires as a
goroutine (alongside exportDeploymentToChild). Best-effort per region:
a failure on region N doesn't abort N+1.
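
The best-effort behaviour can be sketched as a loop that records a failure and carries on. `exportAll`, the `post` callback and `errSimulated` are hypothetical names used for illustration; the actual POST to the chroot is not shown.

```go
package main

import (
	"errors"
	"fmt"
)

// errSimulated stands in for a transport error in this sketch.
var errSimulated = errors.New("simulated POST failure")

// exportAll calls post once per region key. A failure is recorded and the
// loop continues, so one unreachable region does not block the others.
func exportAll(regionKeys []string, post func(regionKey string) error) map[string]error {
	failures := make(map[string]error)
	for _, key := range regionKeys {
		if err := post(key); err != nil {
			failures[key] = err // record and move on to the next region
		}
	}
	return failures
}

func main() {
	var attempted []string
	failures := exportAll([]string{"region-a", "region-b", "region-c"}, func(key string) error {
		attempted = append(attempted, key)
		if key == "region-b" {
			return errSimulated
		}
		return nil
	})
	fmt.Println("attempted:", attempted)
	fmt.Println("failed:", len(failures))
}
```

Returning the per-region failures lets the caller log which regions were skipped without logging any kubeconfig content.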

The chroot's k8sCache.Factory.AddCluster runs on every POST so
dashboard /api/v1/dashboard/treemap?group_by=cluster|region now
enumerates pods from all N regions and Layer-1=Cluster renders N
bubbles on an N-region Sovereign.

regionKeysForExport derives the filename convention `<region>-<slot>`
from dep.Request.Regions[1:] (primary is auto-registered by the
chroot's FactoryFromEnv self-registration so we skip index 0).
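
A rough sketch of the key derivation follows. It rests on two assumptions that the message does not confirm: that regions can be treated as plain strings, and that the slot suffix is the region's index in the full list. `secondaryRegionKeys` is a hypothetical name, not the real regionKeysForExport.

```go
package main

import "fmt"

// secondaryRegionKeys skips index 0 (the primary) and returns one
// "<region>-<slot>" key per remaining region. The slot numbering used
// here (index in the full list) is an assumption made for illustration.
func secondaryRegionKeys(regions []string) []string {
	if len(regions) <= 1 {
		return nil
	}
	keys := make([]string, 0, len(regions)-1)
	for i, region := range regions[1:] {
		keys = append(keys, fmt.Sprintf("%s-%d", region, i+1))
	}
	return keys
}

func main() {
	fmt.Println(secondaryRegionKeys([]string{"hel1", "fsn1", "nbg1"}))
	fmt.Println(len(secondaryRegionKeys([]string{"hel1"})))
}
```

A single-region deployment yields no keys, so the export loop has nothing to do.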

Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are read with stdlib os.ReadFile.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 08:22:01 +04:00
github-actions[bot]
b07e5206a1 deploy: update catalyst images to d92f734 2026-05-17 04:09:34 +00:00
e3mrah
d92f734374
feat(dashboard): multi-cluster fan-out when group_by=cluster|region (D16 PR C) (#1580)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)

PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".

Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.

The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)

When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:

  1. Burning a Dynadot API credit on a flip that would be idempotent.
  2. The D30 blocker — current Dynadot creds return pdm-status-401
     even when the desired NS state already exists. Caught on t132
     2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
     parentDomains attempt.

Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.
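
A check of this shape could look like the sketch below. It is not the helper from the PR: the function bodies are illustrative, and the matching rule (every expected name server must appear in the lookup result, extra servers tolerated) is an assumption. Only net.DefaultResolver.LookupNS and the 5s timeout come from the message.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strings"
	"time"
)

// normalizeNS lower-cases a host and strips the trailing dot DNS answers carry.
func normalizeNS(host string) string {
	return strings.ToLower(strings.TrimSuffix(host, "."))
}

// containsAllNS reports whether every expected name server is present in got.
// An empty expectation never matches, so the caller falls through to the flip.
func containsAllNS(got, want []string) bool {
	if len(want) == 0 {
		return false
	}
	seen := make(map[string]struct{}, len(got))
	for _, g := range got {
		seen[normalizeNS(g)] = struct{}{}
	}
	for _, w := range want {
		if _, ok := seen[normalizeNS(w)]; !ok {
			return false
		}
	}
	return true
}

// nsAlreadyMatches looks up the domain's NS records with a 5s timeout and
// returns false on any lookup error, so a failed lookup never skips the flip.
func nsAlreadyMatches(ctx context.Context, domain string, want []string) bool {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	records, err := net.DefaultResolver.LookupNS(ctx, domain)
	if err != nil {
		return false
	}
	got := make([]string, 0, len(records))
	for _, r := range records {
		got = append(got, r.Host)
	}
	return containsAllNS(got, want)
}

func main() {
	want := []string{"ns1.example.org", "ns2.example.org"}
	// Comparison only; no network access is needed for this demonstration.
	fmt.Println(containsAllNS([]string{"NS1.example.org.", "ns2.example.org."}, want))
	fmt.Println(containsAllNS([]string{"ns1.example.org."}, want))
}
```

Keeping the comparison separate from the lookup makes the matching rule testable without DNS.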

This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): D21 owner seed uses catalyst-system namespace

PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).

Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.

Verified the CRD shape on t134 2026-05-17:
  $ kubectl api-resources --api-group=access.openova.io
  useraccesses   access.openova.io/v1alpha1   true   UserAccess
                                                ^^^^
                                                NAMESPACED

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): D21 owner seed uses tierRoleRef not wildcard app

PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}]
but the useraccess XRD schema rejects `app: "*"` (pattern
^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged
"spec.applications[0].app: Invalid value: \"*\"" on every handover.

The XRD has a `tierRoleRef` field (pattern
^openova:tier-(viewer|developer|operator|admin|owner)$) that's the
canonical owner-tier semantic — when set, useraccess-controller binds
the named ClusterRole on the target via RoleBinding/ClusterRoleBinding.
`openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's
tier-clusterroles.yaml.

Drop the applications[] block + use tierRoleRef = openova:tier-owner.
Verified live on t135 2026-05-17 — error log showed exact pattern
mismatch before this fix.
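
The two patterns quoted above can be checked side by side in a few lines of Go. The helper names `validApp` and `validTierRoleRef` are hypothetical; the regular expressions are copied from the message.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// appPattern is the spec.applications[].app pattern from the XRD.
	appPattern = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{0,62}$`)
	// tierRoleRefPattern is the tierRoleRef pattern from the XRD.
	tierRoleRefPattern = regexp.MustCompile(`^openova:tier-(viewer|developer|operator|admin|owner)$`)
)

// validApp reports whether s is an acceptable application name.
func validApp(s string) bool { return appPattern.MatchString(s) }

// validTierRoleRef reports whether s names one of the five tiers.
func validTierRoleRef(s string) bool { return tierRoleRefPattern.MatchString(s) }

func main() {
	fmt.Println(validApp("*"))                          // wildcard is rejected
	fmt.Println(validApp("bp-cnpg"))                    // ordinary name is accepted
	fmt.Println(validTierRoleRef("openova:tier-owner")) // owner tier is accepted
}
```

The wildcard fails because `*` is outside the allowed character class, which is why the seed switches to tierRoleRef.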

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A)

D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to
have all 3 regions' kubeconfigs registered so dashboard handler's
per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each.

Today the chroot only auto-registers its own in-cluster apiserver via
FactoryFromEnv's chroot self-registration branch. Secondary
kubeconfigs live on the mothership PVC + aren't replicated.

This handler bridges the gap:
- Accepts JSON {deploymentId, regionKey, kubeconfigYaml}
- Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in
  depth — filename composed from these)
- Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml
  (canonical FactoryFromEnv path so restart re-registers)
- Calls k8sCache.AddCluster — idempotent per Factory contract

PR B (next): mothership-side handover hook iterates secondary regions
and POSTs each kubeconfig to the chroot.

PR C (next): dashboard.go fan-out across all registered cluster IDs
when group_by includes cluster/region.

Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are written 0o600.

Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dashboard): multi-cluster fan-out when group_by=cluster|region (D16 PR C)

When group_by includes "cluster" or "region", enumerate ALL registered
k8sCache clusters (primary + secondaries synced via PR #1579's POST
/api/v1/sovereign/secondary-kubeconfig endpoint) and concatenate
podRows from each before aggregation.

Layer-1=Cluster on /dashboard now renders 3 bubbles on a 3-region
Sovereign (was 1 bubble before).

For group_by that ONLY contains {namespace,family,application,vcluster,
sovereign} the primary clusterID's pods are sufficient and faster — no
fan-out cost.

PR B (mothership-side handover hook to POST each secondary kubeconfig)
will complete the chain. Until then, secondaries don't appear in
k8sCache.Clusters() so this fan-out is a no-op on existing provs — but
the code is in place for when PR B lands.

Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 08:07:26 +04:00
e3mrah
bcab6430cb
feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A) (#1579)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)

PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".

Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.

The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)

When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:

  1. Burning a Dynadot API credit on a flip that would be idempotent.
  2. The D30 blocker — current Dynadot creds return pdm-status-401
     even when the desired NS state already exists. Caught on t132
     2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
     parentDomains attempt.

Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.

This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): D21 owner seed uses catalyst-system namespace

PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).

Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.

Verified the CRD shape on t134 2026-05-17:
  $ kubectl api-resources --api-group=access.openova.io
  useraccesses   access.openova.io/v1alpha1   true   UserAccess
                                                ^^^^
                                                NAMESPACED

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): D21 owner seed uses tierRoleRef not wildcard app

PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}]
but the useraccess XRD schema rejects `app: "*"` (pattern
^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged
"spec.applications[0].app: Invalid value: \"*\"" on every handover.

The XRD has a `tierRoleRef` field (pattern
^openova:tier-(viewer|developer|operator|admin|owner)$) that's the
canonical owner-tier semantic — when set, useraccess-controller binds
the named ClusterRole on the target via RoleBinding/ClusterRoleBinding.
`openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's
tier-clusterroles.yaml.

Drop the applications[] block + use tierRoleRef = openova:tier-owner.
Verified live on t135 2026-05-17 — error log showed exact pattern
mismatch before this fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A)

D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to
have all 3 regions' kubeconfigs registered so dashboard handler's
per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each.

Today the chroot only auto-registers its own in-cluster apiserver via
FactoryFromEnv's chroot self-registration branch. Secondary
kubeconfigs live on the mothership PVC + aren't replicated.

This handler bridges the gap:
- Accepts JSON {deploymentId, regionKey, kubeconfigYaml}
- Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in
  depth — filename composed from these)
- Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml
  (canonical FactoryFromEnv path so restart re-registers)
- Calls k8sCache.AddCluster — idempotent per Factory contract

PR B (next): mothership-side handover hook iterates secondary regions
and POSTs each kubeconfig to the chroot.

PR C (next): dashboard.go fan-out across all registered cluster IDs
when group_by includes cluster/region.

Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are written 0o600.

Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 08:06:08 +04:00
github-actions[bot]
6e329e27ae deploy: update catalyst images to 4f62dd2 2026-05-17 00:10:50 +00:00
e3mrah
4f62dd21b3
fix(handover): D21 owner seed uses tierRoleRef not wildcard app (#1578)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)

PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".

Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.

The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)

When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:

  1. Burning a Dynadot API credit on a flip that would be idempotent.
  2. The D30 blocker — current Dynadot creds return pdm-status-401
     even when the desired NS state already exists. Caught on t132
     2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
     parentDomains attempt.

Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.

This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): D21 owner seed uses catalyst-system namespace

PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).

Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.

Verified the CRD shape on t134 2026-05-17:
  $ kubectl api-resources --api-group=access.openova.io
  useraccesses   access.openova.io/v1alpha1   true   UserAccess
                                                ^^^^
                                                NAMESPACED

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): D21 owner seed uses tierRoleRef not wildcard app

PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}]
but the useraccess XRD schema rejects `app: "*"` (pattern
^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged
"spec.applications[0].app: Invalid value: \"*\"" on every handover.

The XRD has a `tierRoleRef` field (pattern
^openova:tier-(viewer|developer|operator|admin|owner)$) that's the
canonical owner-tier semantic — when set, useraccess-controller binds
the named ClusterRole on the target via RoleBinding/ClusterRoleBinding.
`openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's
tier-clusterroles.yaml.

Drop the applications[] block + use tierRoleRef = openova:tier-owner.
Verified live on t135 2026-05-17 — error log showed exact pattern
mismatch before this fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 04:08:45 +04:00
github-actions[bot]
6466f97f6c deploy: update catalyst images to ea30ded 2026-05-16 23:28:04 +00:00
e3mrah
ea30ded120
fix(handover): D21 owner seed uses catalyst-system namespace (#1577)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)

PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".

Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.

The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)

When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:

  1. Burning a Dynadot API credit on a flip that would be idempotent.
  2. The D30 blocker — current Dynadot creds return pdm-status-401
     even when the desired NS state already exists. Caught on t132
     2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
     parentDomains attempt.

Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.

This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): D21 owner seed uses catalyst-system namespace

PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).

Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.

Verified the CRD shape on t134 2026-05-17:
  $ kubectl api-resources --api-group=access.openova.io
  useraccesses   access.openova.io/v1alpha1   true   UserAccess
                                                ^^^^
                                                NAMESPACED

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 03:26:06 +04:00
github-actions[bot]
18b5fa1466 deploy: update catalyst images to 33ed484 2026-05-16 23:24:34 +00:00
e3mrah
33ed484e04
fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30) (#1576)
* fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22)

PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".

Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.

The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)

When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:

  1. Burning a Dynadot API credit on a flip that would be idempotent.
  2. The D30 blocker — current Dynadot creds return pdm-status-401
     even when the desired NS state already exists. Caught on t132
     2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
     parentDomains attempt.

Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.

This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 03:21:42 +04:00
github-actions[bot]
a65a024114 deploy: update catalyst images to c148ec6 2026-05-16 22:33:19 +00:00
e3mrah
c148ec6a34
fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22) (#1575)
PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu
expression — failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".

Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s.

The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). $$ prefix tells tofu's templatefile to
emit literal \${...} to cloud-init for Flux envsubst.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 02:31:26 +04:00
github-actions[bot]
c5f777056f deploy: update catalyst images to 3568b72 2026-05-16 22:20:19 +00:00
e3mrah
3568b72b5e
fix(cloud): hide non-active 0/0 chips (D15) (#1574)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)

Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment
reads to populate the deployment record so Sovereign Console Settings
page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL,
orgName (instead of `—` placeholders).

Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName,
controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with
empty defaults. Per-Sovereign overlays wire actual values from cloud-
init substitute placeholders (mirrors regionsJson pattern).

Catalyst-api Pod now reads them via valueFrom configMapKeyRef +
optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap
so env stays empty there — correct, mothership is signer not validator).

Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP
post-#1568. This PR fills the remaining 3 D22 fields when operator wires
the values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(slot-13): add D22 sovereign-side identity placeholders

Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} +
${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires
them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569)
→ catalyst-api env → chrootEnsureDeployment populates the deployment
record → Settings page renders real values instead of `—`.

This PR alone is a no-op (placeholders default to empty, same as today).
The cloud-init substitute lines + provisioner.go tfvars need to land in
a companion PR to actually populate the values on next-prov.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)

Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block
now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit
Kustomization's postBuild.substitute env, which the slot-13 placeholders
(#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.

Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile
substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap
keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment
populates the deployment record (#1567 + #1568 fallback).

SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes
the dependency cycle (hcloud_server.cp doesn't exist at cloudinit
render time). Separate PR will source it via metadata-service or
post-create ConfigMap patch.

Next-prov (t133+) Sovereign Console Settings page now renders real
ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b)

PR #1552 stripped the `/app` prefix on Sovereign mode to make
`/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match.
But consoleAppDetailRoute is registered at `/app/$componentId` under
consoleLayoutRoute — no chroot route matches `/<componentId>` directly,
so stripping leaves an empty render path. Playwright walkthrough on
t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render
body_len=9 (empty).

Invert the logic: only redirect mothership-only sub-paths (/dashboard
Fleet view, /install wizard, /sre, /sec, /blueprints) which have no
Sovereign Console equivalent. For everything else (component names like
`/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match
pick consoleAppDetailRoute / consoleAppsRoute.
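
The inverted rule can be illustrated with a small predicate, written in Go here for illustration only; the router itself is front-end code and is not shown. The sketch assumes the mothership-only views sit directly under /app, and `shouldRedirect` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// mothershipOnly lists the sub-paths named in the message that have no
// Sovereign Console equivalent.
var mothershipOnly = map[string]bool{
	"dashboard":  true,
	"install":    true,
	"sre":        true,
	"sec":        true,
	"blueprints": true,
}

// shouldRedirect reports whether a path under /app targets a mothership-only
// view. Everything else is left to the router's normal route matching.
func shouldRedirect(path string) bool {
	if !strings.HasPrefix(path, "/app/") {
		return false
	}
	rest := strings.TrimPrefix(path, "/app/")
	first, _, _ := strings.Cut(rest, "/")
	return mothershipOnly[first]
}

func main() {
	for _, p := range []string{"/app/install", "/app/bp-cnpg", "/app"} {
		fmt.Println(p, shouldRedirect(p))
	}
}
```

An allow-list of redirects fails safe: an unknown component name is rendered rather than rewritten.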

Caught live on t132 via Playwright walker3.js — agent a4825c5a.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): re-mint handover JWT on every GetDeployment (D0)

D0 Playwright walkthrough on t132 2026-05-17 caught: handoverURL
persisted at handover-fire time carries a JWT that expires per
DefaultTTL (5min). Operators who click /jobs hours later get the stale
token → Sovereign-side /auth/handover rejects with raw JSON
{"error":"invalid token"} — no UI fallback, no /auth/handover-error,
auto-redirect to /dashboard never fires.

Re-mint the JWT on every GetDeployment when deployment is ready +
handover-fired so the URL returned to the wizard is always
freshly-signed.

Best-effort: on mint failure, leave the existing URL in place so a
transient signer error doesn't break polling. Helper is idempotent +
locked.
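
The re-mint logic can be sketched as follows. The `deployment` type, its fields, `refreshHandoverURL` and the `mint` callback are hypothetical stand-ins; the real signer and record types are not shown in this message.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errMint stands in for a transient signer failure in this sketch.
var errMint = errors.New("signer unavailable")

// deployment is a hypothetical, trimmed-down record.
type deployment struct {
	mu            sync.Mutex
	Ready         bool
	HandoverFired bool
	HandoverURL   string
}

// refreshHandoverURL re-mints the URL only for ready, handed-over
// deployments. On a mint error the stored URL is left untouched, so
// polling keeps working through a transient signer failure.
func (d *deployment) refreshHandoverURL(mint func() (string, error)) string {
	d.mu.Lock()
	defer d.mu.Unlock()
	if !d.Ready || !d.HandoverFired {
		return d.HandoverURL
	}
	fresh, err := mint()
	if err != nil {
		return d.HandoverURL
	}
	d.HandoverURL = fresh
	return d.HandoverURL
}

func main() {
	d := &deployment{Ready: true, HandoverFired: true, HandoverURL: "https://example.invalid/?token=old"}
	fmt.Println(d.refreshHandoverURL(func() (string, error) { return "", errMint }))
	fmt.Println(d.refreshHandoverURL(func() (string, error) {
		return "https://example.invalid/?token=new", nil
	}))
}
```

Calling it on every read is safe because a successful call only replaces the URL with a newer one and a failed call changes nothing.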

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cloud): hide non-active 0/0 chips (D15)

Playwright walkthrough on t132 2026-05-17 caught D15 PARTIAL: 15 chips
are correct but Bucket+Volume show 0/0. Founder rule (DoD D15):
"No kind chip shows 0/0 for a resource that actually exists in the
cluster". Bucket+Volume genuinely don't exist on this Sovereign so
showing 0/0 is noise.

Hide chips with count exactly 0 unless they're the active selection
(operator who navigated to an empty kind keeps context).
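
The filter rule can be illustrated in a few lines. It is written in Go for illustration; the real chip component is UI code that is not shown, and the `chip` type and `visibleChips` function are hypothetical.

```go
package main

import "fmt"

// chip is a hypothetical model of one kind chip on the /cloud page.
type chip struct {
	Kind  string
	Count int
}

// visibleChips drops chips whose count is zero, except the active
// selection, which stays so the operator keeps context on an empty kind.
func visibleChips(chips []chip, active string) []chip {
	out := make([]chip, 0, len(chips))
	for _, c := range chips {
		if c.Count > 0 || c.Kind == active {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	chips := []chip{{"Node", 3}, {"Bucket", 0}, {"Volume", 0}}
	fmt.Println(len(visibleChips(chips, "Node")))   // Bucket and Volume hidden
	fmt.Println(len(visibleChips(chips, "Bucket"))) // active empty kind kept
}
```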

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 02:18:24 +04:00
github-actions[bot]
44e612f39d deploy: update catalyst images to 58dbb92 2026-05-16 22:18:16 +00:00
e3mrah
58dbb92f4f
fix(handover): re-mint handover JWT on every GetDeployment (D0) (#1573)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)

Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment
reads to populate the deployment record so Sovereign Console Settings
page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL,
orgName (instead of `—` placeholders).

Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName,
controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with
empty defaults. Per-Sovereign overlays wire actual values from cloud-
init substitute placeholders (mirrors regionsJson pattern).

Catalyst-api Pod now reads them via valueFrom configMapKeyRef +
optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap
so env stays empty there — correct, mothership is signer not validator).

Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP
post-#1568. This PR fills the remaining 3 D22 fields when operator wires
the values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(slot-13): add D22 sovereign-side identity placeholders

Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} +
${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires
them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569)
→ catalyst-api env → chrootEnsureDeployment populates the deployment
record → Settings page renders real values instead of `—`.

This PR alone is a no-op (placeholders default to empty, same as today).
The cloud-init substitute lines + provisioner.go tfvars need to land in
a companion PR to actually populate the values on next-prov.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)

Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block
now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit
Kustomization's postBuild.substitute env, which the slot-13 placeholders
(#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.

Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile
substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap
keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment
populates the deployment record (#1567 + #1568 fallback).

SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes
the dependency cycle (hcloud_server.cp doesn't exist at cloudinit
render time). Separate PR will source it via metadata-service or
post-create ConfigMap patch.

Next-prov (t133+) Sovereign Console Settings page now renders real
ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b)

PR #1552 stripped the `/app` prefix on Sovereign mode to make
`/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match.
But consoleAppDetailRoute is registered at `/app/$componentId` under
consoleLayoutRoute — no chroot route matches `/<componentId>` directly,
so stripping leaves an empty render path. Playwright walkthrough on
t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render
body_len=9 (empty).

Invert the logic: only redirect mothership-only sub-paths (/dashboard
Fleet view, /install wizard, /sre, /sec, /blueprints) which have no
Sovereign Console equivalent. For everything else (component names like
`/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match
pick consoleAppDetailRoute / consoleAppsRoute.

Caught live on t132 via Playwright walker3.js — agent a4825c5a.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(handover): re-mint handover JWT on every GetDeployment (D0)

D0 Playwright walkthrough on t132 2026-05-17 caught: handoverURL
persisted at handover-fire time carries a JWT that expires per
DefaultTTL (5min). Operators who click /jobs hours later get the stale
token → Sovereign-side /auth/handover rejects with raw JSON
{"error":"invalid token"} — no UI fallback, no /auth/handover-error,
auto-redirect to /dashboard never fires.

Re-mint the JWT on every GetDeployment when deployment is ready +
handover-fired so the URL returned to the wizard is always
freshly-signed.

Best-effort: on mint failure, leave the existing URL in place so a
transient signer error doesn't break polling. Helper is idempotent +
locked.
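
A minimal sketch of the best-effort re-mint described above. The `Deployment` struct, the `mintFn` signer type, and `refreshHandoverURL` are hypothetical names for illustration only; they are not taken from the catalyst-api source, and real token signing is replaced by a stub.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errSigner is an illustrative transient signer failure (assumption).
var errSigner = errors.New("signer unavailable")

// Deployment is a hypothetical stand-in for the stored deployment record.
type Deployment struct {
	mu            sync.Mutex
	Ready         bool
	HandoverFired bool
	HandoverURL   string
}

// mintFn is an assumed signer that returns a freshly signed handover URL.
type mintFn func() (string, error)

// refreshHandoverURL re-mints the URL only when the deployment is ready and
// handover has fired. On a mint failure the existing URL stays in place so a
// transient signer error does not break polling.
func refreshHandoverURL(d *Deployment, mint mintFn) string {
	d.mu.Lock()
	defer d.mu.Unlock()
	if !d.Ready || !d.HandoverFired {
		return d.HandoverURL
	}
	fresh, err := mint()
	if err != nil {
		return d.HandoverURL // best-effort: keep the previous URL
	}
	d.HandoverURL = fresh
	return d.HandoverURL
}

func main() {
	d := &Deployment{Ready: true, HandoverFired: true, HandoverURL: "https://example.invalid/auth/handover?token=old"}
	ok := func() (string, error) { return "https://example.invalid/auth/handover?token=new", nil }
	bad := func() (string, error) { return "", errSigner }
	fmt.Println(refreshHandoverURL(d, ok))  // freshly minted URL
	fmt.Println(refreshHandoverURL(d, bad)) // previous URL is kept
}
```

Calling the helper on every read keeps the stored URL current without a background refresher; the mutex is what makes concurrent GetDeployment calls safe in this sketch.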

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 02:16:26 +04:00
github-actions[bot]
dea683f5e4 deploy: update catalyst images to 9e1e422 2026-05-16 22:08:01 +00:00
e3mrah
9e1e4224d8
fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b) (#1572)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)

Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment
reads to populate the deployment record so Sovereign Console Settings
page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL,
orgName (instead of `—` placeholders).

Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName,
controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with
empty defaults. Per-Sovereign overlays wire actual values from
cloud-init substitute placeholders (mirrors regionsJson pattern).

Catalyst-api Pod now reads them via valueFrom configMapKeyRef +
optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap
so env stays empty there — correct, mothership is signer not validator).

Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP
post-#1568. This PR fills the remaining 3 D22 fields when operator wires
the values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(slot-13): add D22 sovereign-side identity placeholders

Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} +
${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires
them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569)
→ catalyst-api env → chrootEnsureDeployment populates the deployment
record → Settings page renders real values instead of `—`.

This PR alone is a no-op (placeholders default to empty, same as today).
The cloud-init substitute lines + provisioner.go tfvars need to land in
a companion PR to actually populate the values on next-prov.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)

Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block
now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit
Kustomization's postBuild.substitute env, which the slot-13 placeholders
(#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.

Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile
substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap
keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment
populates the deployment record (#1567 + #1568 fallback).

SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes
the dependency cycle (hcloud_server.cp doesn't exist at cloudinit
render time). Separate PR will source it via metadata-service or
post-create ConfigMap patch.

From the next provision (t133+), the Sovereign Console Settings page
renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b)

PR #1552 stripped the `/app` prefix on Sovereign mode to make
`/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match.
But consoleAppDetailRoute is registered at `/app/$componentId` under
consoleLayoutRoute — no chroot route matches `/<componentId>` directly,
so stripping leaves an empty render path. Playwright walkthrough on
t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render
body_len=9 (empty).

Invert the logic: only redirect mothership-only sub-paths (/dashboard
Fleet view, /install wizard, /sre, /sec, /blueprints) which have no
Sovereign Console equivalent. For everything else (component names like
`/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match
pick consoleAppDetailRoute / consoleAppsRoute.

Caught live on t132 via Playwright walker3.js — agent a4825c5a.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 02:05:54 +04:00
github-actions[bot]
4cc880cafd deploy: update catalyst images to 5793958 2026-05-16 21:48:54 +00:00
e3mrah
57939585c0
feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22) (#1571)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)

Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment
reads to populate the deployment record so Sovereign Console Settings
page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL,
orgName (instead of `—` placeholders).

Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName,
controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with
empty defaults. Per-Sovereign overlays wire actual values from
cloud-init substitute placeholders (mirrors regionsJson pattern).

Catalyst-api Pod now reads them via valueFrom configMapKeyRef +
optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap
so env stays empty there — correct, mothership is signer not validator).

Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP
post-#1568. This PR fills the remaining 3 D22 fields when operator wires
the values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(slot-13): add D22 sovereign-side identity placeholders

Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} +
${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires
them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569)
→ catalyst-api env → chrootEnsureDeployment populates the deployment
record → Settings page renders real values instead of `—`.

This PR alone is a no-op (placeholders default to empty, same as today).
The cloud-init substitute lines + provisioner.go tfvars need to land in
a companion PR to actually populate the values on next-prov.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)

Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block
now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit
Kustomization's postBuild.substitute env, which the slot-13 placeholders
(#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.

Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile
substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap
keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment
populates the deployment record (#1567 + #1568 fallback).

SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes
the dependency cycle (hcloud_server.cp doesn't exist at cloudinit
render time). Separate PR will source it via metadata-service or
post-create ConfigMap patch.

From the next provision (t133+), the Sovereign Console Settings page
renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 01:47:04 +04:00
e3mrah
700d28967f
chore(slot-13): add D22 sovereign-side identity placeholders (#1570)
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)

Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment
reads to populate the deployment record so Sovereign Console Settings
page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL,
orgName (instead of `—` placeholders).

Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName,
controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with
empty defaults. Per-Sovereign overlays wire actual values from
cloud-init substitute placeholders (mirrors regionsJson pattern).

Catalyst-api Pod now reads them via valueFrom configMapKeyRef +
optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap
so env stays empty there — correct, mothership is signer not validator).

Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP
post-#1568. This PR fills the remaining 3 D22 fields when operator wires
the values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(slot-13): add D22 sovereign-side identity placeholders

Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} +
${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires
them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569)
→ catalyst-api env → chrootEnsureDeployment populates the deployment
record → Settings page renders real values instead of `—`.

This PR alone is a no-op (placeholders default to empty, same as today).
The cloud-init substitute lines + provisioner.go tfvars need to land in
a companion PR to actually populate the values on next-prov.
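
To illustrate why the placeholders are a no-op until something supplies values, here is a small sketch of `${NAME:-default}` expansion. It is not Flux's or envsubst's implementation; the `substitute` helper and its regular expression are assumptions that cover only the `${NAME}` and `${NAME:-default}` forms.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches ${NAME} and ${NAME:-default}. Illustrative only.
var placeholder = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}`)

// substitute replaces each placeholder with the variable's value, or with the
// default (possibly empty) when the variable is unset or empty.
func substitute(in string, vars map[string]string) string {
	return placeholder.ReplaceAllStringFunc(in, func(m string) string {
		g := placeholder.FindStringSubmatch(m)
		if v := vars[g[1]]; v != "" {
			return v
		}
		return g[2] // empty for ${NAME:-} and for ${NAME}
	})
}

func main() {
	tmpl := "orgName: ${ORG_NAME:-}"
	fmt.Printf("%q\n", substitute(tmpl, nil))
	fmt.Printf("%q\n", substitute(tmpl, map[string]string{"ORG_NAME": "Acme"}))
}
```

With no variables supplied the placeholder collapses to an empty string, which matches the "same as today" behaviour the message describes.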

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 01:29:59 +04:00
github-actions[bot]
df193d340e deploy: update catalyst images to 9cbcd23 2026-05-16 21:03:01 +00:00
e3mrah
9cbcd230da
feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22) (#1569)
Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment
reads to populate the deployment record so Sovereign Console Settings
page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL,
orgName (instead of `—` placeholders).

Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName,
controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with
empty defaults. Per-Sovereign overlays wire actual values from
cloud-init substitute placeholders (mirrors regionsJson pattern).

Catalyst-api Pod now reads them via valueFrom configMapKeyRef +
optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap
so env stays empty there — correct, mothership is signer not validator).

Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP
post-#1568. This PR fills the remaining 3 D22 fields when operator wires
the values.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 01:01:00 +04:00
github-actions[bot]
0e0280bbe0 deploy: update catalyst images to 6618392 2026-05-16 20:56:10 +00:00
e3mrah
6618392407
fix(chroot): GetDeployment falls back to chrootEnsureDeployment (D22) (#1568)
* feat(handover): auto-seed owner UserAccess CR on chroot (D21)

Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.

After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:

  apiVersion: access.openova.io/v1alpha1
  kind: UserAccess
  metadata:
    name: useraccess-owner-<sanitized-email>
    annotations:
      catalyst.openova.io/user-email: <email>   # rbac_matrix:309 hint
  spec:
    user:
      keycloakSubject: <email>
    sovereignRef: <fqdn-first-label>
    applications:
      - app: "*"
        role: admin                              # owner -> admin

The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.

Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.

Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).

Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21

* chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked)

PR #1562 (D31 wordpress-tenant activeHotStandby) + PR #1564 (D21 owner
UserAccess auto-seed at handover, catalyst-api:8d2a947) both packaged
into chart 1.4.147. Pin slot so t133+ gets both gates on first prov.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chart): regionsJson uses toJson to defeat YAML flow-seq re-parse (D5)

PR #1551 single-quoted SOVEREIGN_REGIONS_JSON in the slot file
substitute, but Flux Kustomize's postBuild can still re-parse the
JSON-shaped string as a YAML flow-sequence depending on quoting context.
When that happens .Values.sovereign.regionsJson is a Go []interface{}
of map[interface{}]interface{} and `| quote` prints Go's
`[map[cloudRegion:hel1 ...]]` syntax — catalyst-api's json.Unmarshal of
the env var then fails and Request.Regions is empty.

toJson normalises both string and list inputs to valid JSON.

Caught live on t132 2026-05-16 chart 1.4.147: env var rendered as
`[map[cloudRegion:hel1 ...]]` despite #1551 being in effect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chroot): populate deployment Result + Request fields for D22

Settings page on Sovereign Console renders `—` for Region / Sovereign /
Created / DeploymentID / Pool subdomain because chroot's GET
/api/v1/deployments/<id> returns empty strings for those fields.

Populate from existing env vars (best-effort — empty when chart hasn't
wired them yet, which is no worse than today's behaviour):
- Result.ConsoleURL = "https://console.<fqdn>" (derived from selfFQDN)
- Result.GitOpsRepoURL from GITOPS_REPO_URL env
- Result.ControlPlaneIP from SOVEREIGN_CONTROL_PLANE_IP env
- Request.Region = regions[0].CloudRegion (top-level legacy field)
- Request.OrgEmail from OPERATOR_EMAIL env
- Request.OrgName from ORG_NAME env

Companion chart PR will wire the env vars from .Values.global.* +
cloud-init substitute placeholders. This PR is BACKWARD-compatible —
unset env vars produce empty strings, same as today.

Caught live on t132 2026-05-16:
`curl /api/v1/deployments/sovereign-t132.omani.works` returns empty
ownerEmail/region/consoleURL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chroot): GetDeployment falls back to chrootEnsureDeployment (D22)

GetDeployment was the only handler that returned 404 without calling
chrootEnsureDeployment. After a catalyst-api Pod restart on the chroot
the in-memory store is empty until some other handler (StreamLogs,
jobs list) primes it via its own synth call — meanwhile the Sovereign
Console Settings page loads /api/v1/deployments/<id> first and gets
404, rendering the entire page broken.

Mirror the StreamLogs pattern (lines 1247-1254): try in-memory load,
fall through to chrootEnsureDeployment, return 404 only when both miss.
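
The lookup order above can be sketched as follows. This is a simplified illustration, not the handler itself: `store`, `ensureFn`, and `lookup` are hypothetical names, and `ensureFn` only stands in for whatever chrootEnsureDeployment does to synthesize a record.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// Deployment is a hypothetical, trimmed-down deployment record.
type Deployment struct{ ID string }

// store is an assumed in-memory deployment store that is empty after a restart.
type store struct {
	mu   sync.RWMutex
	data map[string]*Deployment
}

func (s *store) load(id string) (*Deployment, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	d, ok := s.data[id]
	return d, ok
}

// ensureFn stands in for chrootEnsureDeployment: it synthesizes a record for
// id, or reports false when it cannot.
type ensureFn func(id string) (*Deployment, bool)

// lookup tries the in-memory store first, then the ensure fallback, and
// reports 404 only when both miss.
func lookup(s *store, ensure ensureFn, id string) (*Deployment, int) {
	if d, ok := s.load(id); ok {
		return d, http.StatusOK
	}
	if d, ok := ensure(id); ok {
		return d, http.StatusOK
	}
	return nil, http.StatusNotFound
}

func main() {
	s := &store{data: map[string]*Deployment{}} // empty, as after a Pod restart
	ensure := func(id string) (*Deployment, bool) {
		if id != "self" {
			return nil, false
		}
		return &Deployment{ID: id}, true
	}
	_, code := lookup(s, ensure, "self")
	fmt.Println(code)
	_, code = lookup(s, ensure, "other")
	fmt.Println(code)
}
```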

This unblocks PR #1567's deployment-record population — without the
fallback, GetDeployment can never serve the populated record on chroot.

Caught live on t132 2026-05-16 after #1567 image roll: Settings page
404 because in-memory store was empty.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 00:54:20 +04:00
github-actions[bot]
b094a354b7 deploy: update catalyst images to ed63ecd 2026-05-16 20:31:39 +00:00
e3mrah
ed63ecd09f
fix(chroot): populate deployment Result + Request fields for D22 settings (#1567)
* feat(handover): auto-seed owner UserAccess CR on chroot (D21)

Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.

After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:

  apiVersion: access.openova.io/v1alpha1
  kind: UserAccess
  metadata:
    name: useraccess-owner-<sanitized-email>
    annotations:
      catalyst.openova.io/user-email: <email>   # rbac_matrix:309 hint
  spec:
    user:
      keycloakSubject: <email>
    sovereignRef: <fqdn-first-label>
    applications:
      - app: "*"
        role: admin                              # owner -> admin

The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.

Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.

Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).

Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21

* chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked)

PR #1562 (D31 wordpress-tenant activeHotStandby) + PR #1564 (D21 owner
UserAccess auto-seed at handover, catalyst-api:8d2a947) both packaged
into chart 1.4.147. Pin slot so t133+ gets both gates on first prov.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chart): regionsJson uses toJson to defeat YAML flow-seq re-parse (D5)

PR #1551 single-quoted SOVEREIGN_REGIONS_JSON in the slot file
substitute, but Flux Kustomize's postBuild can still re-parse the
JSON-shaped string as a YAML flow-sequence depending on quoting context.
When that happens .Values.sovereign.regionsJson is a Go []interface{}
of map[interface{}]interface{} and `| quote` prints Go's
`[map[cloudRegion:hel1 ...]]` syntax — catalyst-api's json.Unmarshal of
the env var then fails and Request.Regions is empty.

toJson normalises both string and list inputs to valid JSON.

Caught live on t132 2026-05-16 chart 1.4.147: env var rendered as
`[map[cloudRegion:hel1 ...]]` despite #1551 being in effect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chroot): populate deployment Result + Request fields for D22

Settings page on Sovereign Console renders `—` for Region / Sovereign /
Created / DeploymentID / Pool subdomain because chroot's GET
/api/v1/deployments/<id> returns empty strings for those fields.

Populate from existing env vars (best-effort — empty when chart hasn't
wired them yet, which is no worse than today's behaviour):
- Result.ConsoleURL = "https://console.<fqdn>" (derived from selfFQDN)
- Result.GitOpsRepoURL from GITOPS_REPO_URL env
- Result.ControlPlaneIP from SOVEREIGN_CONTROL_PLANE_IP env
- Request.Region = regions[0].CloudRegion (top-level legacy field)
- Request.OrgEmail from OPERATOR_EMAIL env
- Request.OrgName from ORG_NAME env

Companion chart PR will wire the env vars from .Values.global.* +
cloud-init substitute placeholders. This PR is BACKWARD-compatible —
unset env vars produce empty strings, same as today.
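
A minimal sketch of the best-effort population listed above. The `Request`, `Result`, and `RegionSpec` types and the `populate` helper are illustrative assumptions; only the env var names and the `https://console.<fqdn>` shape come from the message.

```go
package main

import (
	"fmt"
	"os"
)

// Hypothetical, trimmed-down record types for illustration.
type Request struct{ Region, OrgEmail, OrgName string }
type Result struct{ ConsoleURL, GitOpsRepoURL, ControlPlaneIP string }
type RegionSpec struct{ CloudRegion string }

// populate fills the fields from the environment. Unset variables yield empty
// strings, so the output matches the old behaviour until the chart wires them.
func populate(selfFQDN string, regions []RegionSpec, getenv func(string) string) (Request, Result) {
	req := Request{
		OrgEmail: getenv("OPERATOR_EMAIL"),
		OrgName:  getenv("ORG_NAME"),
	}
	if len(regions) > 0 {
		req.Region = regions[0].CloudRegion // top-level legacy field
	}
	res := Result{
		GitOpsRepoURL:  getenv("GITOPS_REPO_URL"),
		ControlPlaneIP: getenv("SOVEREIGN_CONTROL_PLANE_IP"),
	}
	if selfFQDN != "" {
		res.ConsoleURL = "https://console." + selfFQDN
	}
	return req, res
}

func main() {
	req, res := populate("example.invalid", []RegionSpec{{CloudRegion: "hel1"}}, os.Getenv)
	fmt.Printf("%+v\n%+v\n", req, res)
}
```

Passing `getenv` as a parameter is a design choice of this sketch; it keeps the function testable without mutating the process environment.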

Caught live on t132 2026-05-16:
`curl /api/v1/deployments/sovereign-t132.omani.works` returns empty
ownerEmail/region/consoleURL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 00:29:44 +04:00
github-actions[bot]
d82e06bfe9 deploy: update catalyst images to 0a45fb0 2026-05-16 20:03:41 +00:00
e3mrah
0a45fb0449
fix(chart): regionsJson uses toJson to defeat YAML flow-seq re-parse (D5) (#1566)
* feat(handover): auto-seed owner UserAccess CR on chroot (D21)

Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.

After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:

  apiVersion: access.openova.io/v1alpha1
  kind: UserAccess
  metadata:
    name: useraccess-owner-<sanitized-email>
    annotations:
      catalyst.openova.io/user-email: <email>   # rbac_matrix:309 hint
  spec:
    user:
      keycloakSubject: <email>
    sovereignRef: <fqdn-first-label>
    applications:
      - app: "*"
        role: admin                              # owner -> admin

The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.

Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.

Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).

Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21

* chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked)

PR #1562 (D31 wordpress-tenant activeHotStandby) + PR #1564 (D21 owner
UserAccess auto-seed at handover, catalyst-api:8d2a947) both packaged
into chart 1.4.147. Pin slot so t133+ gets both gates on first prov.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chart): regionsJson uses toJson to defeat YAML flow-seq re-parse (D5)

PR #1551 single-quoted SOVEREIGN_REGIONS_JSON in the slot file
substitute, but Flux Kustomize's postBuild can still re-parse the
JSON-shaped string as a YAML flow-sequence depending on quoting context.
When that happens .Values.sovereign.regionsJson is a Go []interface{}
of map[interface{}]interface{} and `| quote` prints Go's
`[map[cloudRegion:hel1 ...]]` syntax — catalyst-api's json.Unmarshal of
the env var then fails and Request.Regions is empty.

toJson normalises both string and list inputs to valid JSON.
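
The failure mode can be reproduced in plain Go. This sketch covers only the list case and assumes string-keyed maps (which `encoding/json` can marshal); `goSyntax`, `asJSON`, `parseRegions`, and the `Region` type are illustrative names, not Helm or catalyst-api code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Region is a hypothetical target type for the env var payload.
type Region struct {
	CloudRegion string `json:"cloudRegion"`
}

// goSyntax renders v with Go's default %v formatting, the shape the message
// reports seeing in the rendered env var.
func goSyntax(v interface{}) string { return fmt.Sprintf("%v", v) }

// asJSON renders v as JSON, which is the normalisation the fix relies on.
func asJSON(v interface{}) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

// parseRegions mimics the consumer: json.Unmarshal of the env var value.
func parseRegions(s string) ([]Region, error) {
	var out []Region
	err := json.Unmarshal([]byte(s), &out)
	return out, err
}

func main() {
	// What the value looks like after being re-parsed as a YAML flow sequence.
	parsed := []interface{}{map[string]interface{}{"cloudRegion": "hel1"}}

	bad := goSyntax(parsed)
	_, err := parseRegions(bad)
	fmt.Println(bad, "parse failed:", err != nil)

	good, _ := asJSON(parsed)
	regions, err := parseRegions(good)
	fmt.Println(good, regions, err)
}
```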

Caught live on t132 2026-05-16 chart 1.4.147: env var rendered as
`[map[cloudRegion:hel1 ...]]` despite #1551 being in effect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 00:01:43 +04:00
e3mrah
3f8e2b925e
chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked) (#1565)
* feat(handover): auto-seed owner UserAccess CR on chroot (D21)

Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.

After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:

  apiVersion: access.openova.io/v1alpha1
  kind: UserAccess
  metadata:
    name: useraccess-owner-<sanitized-email>
    annotations:
      catalyst.openova.io/user-email: <email>   # rbac_matrix:309 hint
  spec:
    user:
      keycloakSubject: <email>
    sovereignRef: <fqdn-first-label>
    applications:
      - app: "*"
        role: admin                              # owner -> admin

The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.

Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.

Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).

Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21

* chore(slot-13): pin bp-catalyst-platform to 1.4.147 (D21+D31 baked)

PR #1562 (D31 wordpress-tenant activeHotStandby) + PR #1564 (D21 owner
UserAccess auto-seed at handover, catalyst-api:8d2a947) both packaged
into chart 1.4.147. Pin slot so t133+ gets both gates on first prov.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:58:46 +04:00
github-actions[bot]
f8c8a87151 deploy: update catalyst images to 8d2a947 2026-05-16 19:51:40 +00:00
e3mrah
8d2a947cfb
feat(handover): auto-seed owner UserAccess CR on chroot (D21) (#1564)
Closes the D21 gap on Sovereign DoD: /users page returned empty after
fresh handover because Keycloak `sovereign-admins` membership was
established but no UserAccess CR existed for the operator.

After `keycloak.EnsureUser` succeeds in `AuthHandover`, the helper
`EnsureOwnerUserAccess` upserts a cluster-scoped UserAccess CR shaped
like the canonical user_access.go `CreateUserAccess` write:

  apiVersion: access.openova.io/v1alpha1
  kind: UserAccess
  metadata:
    name: useraccess-owner-<sanitized-email>
    annotations:
      catalyst.openova.io/user-email: <email>   # rbac_matrix:309 hint
  spec:
    user:
      keycloakSubject: <email>
    sovereignRef: <fqdn-first-label>
    applications:
      - app: "*"
        role: admin                              # owner -> admin

The Composition (issue #322) reconciles the Claim into per-app
RoleBindings on the Sovereign so the operator surfaces in /users.

Best-effort + idempotent: AlreadyExists on the second handover is
folded to nil; any other error is logged at Warn and the handover
itself never fails. If the access.openova.io CRD has not rolled yet,
the next handover retries automatically.
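
A sketch of the best-effort, idempotent shape described above, using only the standard library. `errAlreadyExists`, `errCRDMissing`, `createFn`, `ensureOwnerAccess`, and `handover` are hypothetical stand-ins; the real helper presumably checks the Kubernetes API error rather than a local sentinel.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Illustrative sentinel errors (assumptions, not Kubernetes API errors).
var (
	errAlreadyExists = errors.New("already exists")
	errCRDMissing    = errors.New("no matches for kind UserAccess")
)

// createFn is an assumed stand-in for the call that creates the CR.
type createFn func(name string) error

// ensureOwnerAccess folds "already exists" to nil so repeated handovers are
// idempotent; any other error is returned to the caller.
func ensureOwnerAccess(create createFn, name string) error {
	err := create(name)
	if errors.Is(err, errAlreadyExists) {
		return nil
	}
	return err
}

// handover logs a seeding failure as a warning and still succeeds, so the
// next handover can retry the seed.
func handover(create createFn, name string) bool {
	if err := ensureOwnerAccess(create, name); err != nil {
		log.Printf("WARN owner UserAccess seed failed: %v", err)
	}
	return true
}

func main() {
	exists := func(string) error { return errAlreadyExists }
	missing := func(string) error { return errCRDMissing }
	fmt.Println(ensureOwnerAccess(exists, "useraccess-owner-example"))
	fmt.Println(handover(missing, "useraccess-owner-example"))
}
```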

Architect-first: mirrors `userAccessToUnstructured` shape and uses
existing `sovereignDynamicClient` + `rbacAssignSlug` seams. Tier
mapping follows the documented lossy `owner -> admin` rule in
`userAccessTierToRole` (CRD only accepts admin|editor|viewer).

Refs: docs/SOVEREIGN-MULTI-REGION-DOD.md D21

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-05-16 23:49:32 +04:00
e3mrah
5510ab91f9
chore(slot-13): pin bp-catalyst-platform to 1.4.146 (D29 billing JWT bypass) (#1563)
PR #1561 added billing-service JWT exemptions matching gateway public
routes (D29 voucher-redeem zero-touch). Pin slot so future provisions
inherit the full D29 unblocker chain.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:41:34 +04:00
github-actions[bot]
d6b6aca581 deploy: update sme service images to c04b2ec + bump chart to 1.4.147 2026-05-16 19:41:18 +00:00
e3mrah
c04b2ec76d
feat(wordpress-tenant): activeHotStandby option wires bp-cnpg-pair (D31) (#1562)
Sovereign DoD D31 — tenants subscribing to an HA-capable marketplace app
may opt into a cross-region active-hot-standby Postgres pair for their
WordPress instance instead of the default single CNPG Cluster.

Mirrors the canonical bp-cnpg-pair pattern (primary + replica Cluster
CRs with WAL streaming over Cilium ClusterMesh via a managed Service
annotated service.cilium.io/global=true). When the new
pg.activeHotStandby.enabled flag is false (default), templates render
the existing single Cluster bit-for-bit — no regression for non-HA
tenants.

Catalog seed flags WordPress with ha + cnpg-pair tags so the marketplace
HA filter can surface it.

Chart bumped 0.2.1 -> 0.3.0. New render-gate test asserts both default
single-cluster shape AND the enabled 2-Cluster shape with the right
nodeSelectors, replica.source, externalCluster.host, Cilium global
annotation, and bootstrap.pg_basebackup; all 5 cases pass.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:39:29 +04:00
github-actions[bot]
af4d9b1b87 deploy: update sme service images to f9ed292 + bump chart to 1.4.146 2026-05-16 19:29:50 +00:00
e3mrah
f9ed292198
fix(billing): /redeem-preview + plans + addons bypass JWT (D29) (#1561)
* chore(slot-13): pin bp-catalyst-platform to 1.4.145 (D29 gateway public routes)

PR #1559 added /api/billing/{vouchers/redeem-preview,plans,addons} as
public gateway routes — required for the marketplace /redeem zero-touch
flow. Pin the slot so future provisions inherit it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(billing): /redeem-preview + plans + addons bypass JWT (D29)

Mirror PR #1559's gateway public routes in the billing service's own
middleware chain. The gateway now lets these requests through without
an Authorization header (D29 voucher-redeem landing), but billing
service's main.go was JWT-gating EVERY /billing/* path except
/billing/webhook — so the request still got 401, just one hop later.
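
A minimal sketch of a JWT gate with a public-path allowlist, assuming the billing service sees the gateway routes as `/billing/vouchers/redeem-preview`, `/billing/plans`, and `/billing/addons`. Those service-side paths, the `publicPaths` map, and the `requireJWT` and `statusFor` helpers are assumptions, and token verification is reduced to a header-presence check.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// publicPaths lists routes that skip the JWT gate (assumed path shapes).
var publicPaths = map[string]bool{
	"/billing/webhook":                 true,
	"/billing/vouchers/redeem-preview": true,
	"/billing/plans":                   true,
	"/billing/addons":                  true,
}

// requireJWT lets public paths through and rejects everything else that has
// no Authorization header. A real middleware would also verify the token.
func requireJWT(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if publicPaths[r.URL.Path] || r.Header.Get("Authorization") != "" {
			next.ServeHTTP(w, r)
			return
		}
		http.Error(w, "unauthorized", http.StatusUnauthorized)
	})
}

// statusFor runs one request through the middleware and returns the status.
func statusFor(path, auth string) int {
	h := requireJWT(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if auth != "" {
		req.Header.Set("Authorization", auth)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/billing/plans", ""))
	fmt.Println(statusFor("/billing/invoices", ""))
}
```

Keeping the allowlist in one map makes it easier to compare against the gateway's public routes, which is the mismatch this fix addresses.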

Caught live on t132 2026-05-16 after PR #1559 rolled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:28:48 +04:00
e3mrah
936e76f79a
chore(slot-13): pin bp-catalyst-platform to 1.4.145 (D29 gateway public routes) (#1560)
PR #1559 added /api/billing/{vouchers/redeem-preview,plans,addons} as
public gateway routes — required for the marketplace /redeem zero-touch
flow. Pin the slot so future provisions inherit it.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:25:14 +04:00
github-actions[bot]
696aa26f83 deploy: update sme service images to a11067d + bump chart to 1.4.145 2026-05-16 19:18:09 +00:00