openova/products/catalyst/bootstrap
e3mrah 9fc2850504
fix(catalyst-api): D16/D17 — 3 bugs caught on t138 fresh prov (#1583)
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)

PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:

  internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
  internal/handler/deployment_handover_export.go:199:6: other declaration of itoa

Rename my helper to `regionSlotIndex` — more descriptive of its actual
use (deriving the per-region slot suffix for the kubeconfig filename).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalyst-api): D16/D17 — 3 bugs caught on t138

Founder caught on t136 (now wiped) that /dashboard cluster grouping
still showed 1 region and /cloud nodes showed 1 node despite earlier
D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced
on t138 fresh prov.

1. exportSecondaryKubeconfigsToChild was guarded behind the early
   return of exportDeploymentToChild's failed POST. The child's
   ingress + cert + gateway are still racing to reach reachable
   state in the seconds after handover fires, so the first POST
   gets EOF and the goroutine never fires. Fix: kick off the
   D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild
   in its own goroutine, BEFORE the deployment-record POST.

2. Both exports now retry with exponential backoff (5s → 60s) for
   up to 5 min total. Most handovers will succeed on attempt 2-4.
   Was: no retry, single shot, silent failure.

3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the
   auth group (rg) into the top-level router (r), alongside
   /api/v1/internal/deployments/import. The previous registration
   required an operator session that doesn't exist at handover —
   mothership POSTs were 401'd silently. Validation is now via
   safeIDPattern regex on depID + regionKey (same security model
   as the deployments/import companion endpoint).

4. HandleSovereignCloud now fans out across h.k8sCache.Clusters()
   instead of using only the in-cluster client. Adds Cluster
   field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
   group/filter by region. Without this, /cloud?view=list&kind=nodes
   shows 1 node even when 3 secondary kubeconfigs are registered.

Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)

Refs: feedback_test_theater_3rd_violation_2026_05_17.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 09:26:16 +04:00
..
api fix(catalyst-api): D16/D17 — 3 bugs caught on t138 fresh prov (#1583) 2026-05-17 09:26:16 +04:00
ui fix(cloud): hide non-active 0/0 chips (D15) (#1574) 2026-05-17 02:18:24 +04:00