* fix(bootstrap-kit): remove bp-hcloud-csi slot 17a: chicken-and-egg with harbor

Family G (PR #1601) added bp-hcloud-csi at bootstrap-kit slot 17a to ship the `hcloud-volumes` default StorageClass for C9-006. Caught live on t11 fresh prov 2026-05-17:

- Flux source-controller chart pull went through harbor.t11.<sov> OCI endpoint BEFORE harbor itself was reachable on the network.
- Chicken-and-egg: harbor depends on Gateway. Gateway lives in `sovereign-tls` Kustomization which dependsOn bootstrap-kit Ready. bp-hcloud-csi blocked bootstrap-kit Ready → sovereign-tls never applied → no Gateway CR → console.t11.<sov> ERR_CONNECTION_CLOSED.
- Entire UI test matrix on t11 was BLOCKED on the missing Gateway (5 test agents reported the same root cause).

C9-006 (hcloud-volumes default SC) is a cosmetic operator-facing improvement; Gateway availability is launch-critical. Removing slot 17a unblocks the chain. Follow-up PR will re-add at a later slot (e.g., 19a AFTER bp-harbor 19) OR fix the pull path to bypass the registry pivot during bootstrap.

Also bumps chart 1.4.155 → 1.4.156 + bootstrap-kit pin per the chart-bump-needs-both rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(bootstrap-kit): also drop 17a-bp-hcloud-csi from kustomization.yaml resources list

Companion commit to b96d8c50: the prior commit only removed the file itself; this commit removes the resources: list entry that referenced it (otherwise Kustomize fails the dry-run with 'no such file').

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
133 lines
6.4 KiB
YAML
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

# Order is documented but not enforced here: Flux respects HelmRelease
# dependsOn declarations for actual install order. Listing in canonical
# Phase 0 sequence per SOVEREIGN-PROVISIONING.md §3.
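#
# Illustrative sketch only (not part of this kustomization): the ordering
# mechanism named above is the `spec.dependsOn` list on the HelmRelease
# inside each slot file. The release name `bp-example` and the
# `flux-system` namespace are hypothetical; bp-gitea and bp-harbor are
# the slot 10 and slot 19 releases referenced later in this file. The
# apiVersion assumes a Flux release that serves helm.toolkit.fluxcd.io/v2.
#
#   apiVersion: helm.toolkit.fluxcd.io/v2
#   kind: HelmRelease
#   metadata:
#     name: bp-example            # hypothetical release name
#     namespace: flux-system      # assumed namespace
#   spec:
#     dependsOn:                  # Flux waits until both are Ready
#       - name: bp-gitea
#       - name: bp-harbor
#     # interval, chart and values omitted for brevity
#
# Under this assumption, moving a filename up or down in the list below
# changes nothing about install order; only editing dependsOn does.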
resources:
- 01-cilium.yaml
- 01a-gateway-api.yaml
- 02-cert-manager.yaml
- 03-flux.yaml
- 04-crossplane.yaml
- 05-sealed-secrets.yaml
- 05a-reflector.yaml
- 07-nats-jetstream.yaml
- 08-openbao.yaml
- 09-keycloak.yaml
- 10-gitea.yaml
- 11-powerdns.yaml
- 12-external-dns.yaml
- 13-bp-catalyst-platform.yaml
- 14-crossplane-claims.yaml
- 15-external-secrets.yaml
- 15a-external-secrets-stores.yaml
- 16-cnpg.yaml
- 17-valkey.yaml
# bp-hcloud-csi (formerly slot 17a) REMOVED 2026-05-17 (Wave 7):
# the Flux source-controller chart pull went through the harbor.t11.*
# OCI endpoint BEFORE harbor itself was reachable. Chicken-and-egg:
# harbor depends on Gateway; Gateway lives in sovereign-tls, which
# dependsOn bootstrap-kit Ready; bootstrap-kit never went Ready because
# bp-hcloud-csi was stuck on the harbor pull. Caught live on a fresh t11
# provision, 2026-05-17: bootstrap-kit Reconciliation-in-progress for
# 30+ min → sovereign-tls "not ready: dependency bootstrap-kit not ready"
# → no Gateway CR → console.t11.<sov> ERR_CONNECTION_CLOSED →
# entire UI test matrix BLOCKED. C9-006 (hcloud-volumes default SC)
# is a cosmetic operator-facing nice-to-have; Gateway availability
# is launch-critical. Removing this slot unblocks the chain. A follow-up
# PR will re-add it at a later slot (e.g., 19a, AFTER bp-harbor 19)
# OR fix the pull path to bypass the registry pivot during bootstrap.
- 18-seaweedfs.yaml
- 19-harbor.yaml
# 06a: Post-handover Self-Sovereignty Cutover (issue #791). Filename
# carries the 06a prefix to colocate cohorts visually, but the slot's
# dependsOn pins actual install order to AFTER bp-gitea (slot 10) and
# bp-harbor (slot 19). Chart installs DORMANT: catalyst-api stamps
# Jobs only on operator-driven cutover trigger.
- 06a-bp-self-sovereign-cutover.yaml
- 20-opentelemetry.yaml
- 21-alloy.yaml
- 22-loki.yaml
- 23-mimir.yaml
- 24-tempo.yaml
- 25-grafana.yaml
- 27-kyverno.yaml
- 28-reloader.yaml
- 29-vpa.yaml
- 30-trivy.yaml
- 31-falco.yaml
- 32-sigstore.yaml
- 33-syft-grype.yaml
- 34-velero.yaml
- 35-coraza.yaml
- 49-bp-cert-manager-powerdns-webhook.yaml
- 50-cluster-autoscaler.yaml
# qa-loop iter-7 Fix #39: exec fan-out (Apache Guacamole + per-node
# k8s-ws-proxy DaemonSet). Slots 51/52. Slots 36-48 reserved for the
# W2.K4 AI-runtime cohort (bp-stunner / bp-knative / bp-kserve / vllm
# / bp-llm-gateway / etc.); see scripts/expected-bootstrap-deps.yaml.
# The k8s-ws-proxy is the apiserver-side proxy; bp-guacamole is the
# operator-facing browser gateway that mounts it via the chart's
# NetworkPolicy egress rule. Both are dependsOn-ordered so Flux
# installs proxy → gateway.
- 51-bp-k8s-ws-proxy.yaml
- 52-bp-guacamole.yaml
# qa-loop iter-12 Fix #53C: EPIC-5 leftovers (NetBird zero-trust mesh
# + DMZ vCluster isolation). Slots 53/54. Both default-OFF; flip on
# via NETBIRD_ENABLED=true / DMZ_VCLUSTER_ENABLED=true on the
# bootstrap-kit Kustomization substitute.
#
# Slot 54 (bp-dmz-vcluster) implements
# docs/SOVEREIGN-MULTI-REGION-DOD.md A4 ("each region runs a DMZ
# vCluster") + A2 ("inter-region link = DMZ WireGuard over PUBLIC IPs").
# Default-ON because the DMZ vCluster is the public-fronted vCluster AND
# the inter-region WG hop; every region needs it for the topology to
# converge.
- 54-bp-dmz-vcluster.yaml
# qa-loop iter-12 Fix #54 Workstream 1: bp-hcloud-ccm (slot 55).
# Hetzner Cloud Controller Manager. The CCM owns node providerID
# flips (k3s://… → hcloud://<server-id>) AND materialisation of
# Service-of-type-LoadBalancer as Hetzner Cloud LBs. Without this,
# every LB-typed Service stays Pending, which is the proximate reason
# clustermesh-apiserver could not migrate from NodePort to LB on
# omantel multi-region (qa-loop iter-12 Fix #53D).
- 55-bp-hcloud-ccm.yaml
# OpenovaFlow observability cohort: slots 56/57. Three-agent split
# (Agent #1: TS @openova/flow-core + @openova/flow-canvas, Agent #2:
# Go server + flux adapter, Agent #3: bootstrap-kit + catalyst-api
# proxy integration). Slot 56 (server) installs on PRIMARY clusters
# only; per-Sovereign overlay disables on secondaries. Slot 57
# (emitter) is a DaemonSet that runs on every cluster (mother + every
# Sovereign + every secondary region) so each region's Flux events
# land in the same per-deployment flow.
- 56-bp-openova-flow-server.yaml
- 57-bp-openova-flow-emitter.yaml
# DoD A4 vCluster topology (2026-05-16): slots 58 + 59 finish the
# primary-mgmt + secondary-rtz pair that goes alongside the slot 54
# DMZ vCluster (every region). Combined topology per region:
#   primary region   → MGMT (58) + DMZ (54) vCluster
#   secondary region → DMZ (54) + RTZ (59) vCluster
# Slot 58 default-OFF until the per-CP postBuild substitute follow-up
# PR adds MGMT_VCLUSTER_ENABLED only on primary. Slot 59 same shape
# for secondaries via RTZ_VCLUSTER_ENABLED. See each slot's header
# comment for the migration plan.
- 58-bp-mgmt-vcluster.yaml
- 59-bp-rtz-vcluster.yaml
# bp-newapi (slot 80): multi-tenant LLM marketplace gateway. Sequenced
# after the W2.K1 dependency wave (cnpg/keycloak/openbao Ready) so
# NewAPI's ExternalSecret + DSN dependencies resolve on first reconcile.
# See clusters/_template/bootstrap-kit/80-newapi.yaml for full
# dependsOn rationale and per-Sovereign override surface.
- 80-newapi.yaml
# bp-stalwart-sovereign (slot 95): REMOVED 2026-05-05.
# Phase-2 Sovereign-local mail (per-Sovereign Stalwart for Console
# PIN/magic-link delivery, umbrella #924) is OUT OF SCOPE for the
# current Phase-1 cutover. The Phase-1 design is mothership SMTP
# relay (mail.openova.io:587); see products/catalyst/chart/values.yaml
# `sovereign.smtp.*` and the catalyst-api `sovereign_smtp_seed.go`
# path. The chart's post-install Job was timing out on otech113 and
# blocking the bootstrap-kit Kustomization. Re-introduce this slot
# only when Phase-2 is explicitly in scope and the chart's readiness
# gate is reliable. See platform/stalwart-sovereign/ for the chart
# itself (kept in-tree for future Phase-2 work).
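#
# Illustrative sketch only (an assumption, not copied from this repo):
# the *_ENABLED toggles mentioned in the slot comments above
# (NETBIRD_ENABLED, DMZ_VCLUSTER_ENABLED, MGMT_VCLUSTER_ENABLED,
# RTZ_VCLUSTER_ENABLED) would be supplied through
# `spec.postBuild.substitute` on the Flux Kustomization that applies
# this directory. The metadata, interval, path and sourceRef values
# below are hypothetical placeholders.
#
#   apiVersion: kustomize.toolkit.fluxcd.io/v1
#   kind: Kustomization
#   metadata:
#     name: bootstrap-kit           # name used in the comments above
#     namespace: flux-system        # assumed namespace
#   spec:
#     interval: 10m                 # hypothetical
#     path: ./clusters/<sovereign>/bootstrap-kit   # hypothetical path
#     prune: true
#     sourceRef:
#       kind: GitRepository
#       name: flux-system           # hypothetical source name
#     postBuild:
#       substitute:                 # values are strings, hence the quotes
#         NETBIRD_ENABLED: "true"
#         DMZ_VCLUSTER_ENABLED: "true"
#
# Flux replaces ${NETBIRD_ENABLED}-style references in the rendered
# manifests with these values before applying them.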