PR #1537 set `use-private-ip: "true"` on the clustermesh-apiserver Service
annotations. CCM rejected with:

  ReconcileHCLBTargets: use private ip: missing network id

The per-region Hetzner LB allocated by CCM has no private-network
attachment by default (LB private_net is empty), so it can't route to
the backend's private IP. Result: LB never allocated, clustermesh
apiserver Service stays `<pending>`, orchestrator waits 5min and bails
with empty peerEntries. Caught on t130 (30463cd0a5a931be, 2026-05-16).

PR #1538's canonical fix opens TCP 30000-32767 in the Hetzner firewall
so the public-IP LB→backend health checks pass. Revert use-private-ip
to false so the chain works end-to-end.

Refs DoD D11.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
306 lines
16 KiB
YAML
# bp-cilium — Catalyst bootstrap-kit Blueprint. CNI must come first; k3s started with --flannel-backend=none precisely so Cilium can take over.
#
# Wrapper chart: platform/cilium/chart/
# Catalyst-curated values: platform/cilium/chart/values.yaml
# Reconciled by: Flux on the new Sovereign's k3s control plane.

---
# kube-system is built into every Kubernetes cluster — never re-declare it.
# Earlier revisions of 01-cilium.yaml AND 05-sealed-secrets.yaml both
# declared it, which collided when kustomize tried to merge the two:
# "may not add resource with an already registered id:
# Namespace.v1.[noGrp]/kube-system.[noNs]"
# This Blueprint installs Cilium INTO kube-system; the HelmRelease's
# targetNamespace field below is sufficient.
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: bp-cilium
  namespace: flux-system
spec:
  type: oci
  interval: 15m
  url: oci://ghcr.io/openova-io
  secretRef:
    name: ghcr-pull
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: bp-cilium
  namespace: flux-system
  labels:
    catalyst.openova.io/slot: "01"
spec:
  interval: 15m
  releaseName: cilium
  targetNamespace: kube-system
  chart:
    spec:
      chart: bp-cilium
      # 1.3.4 (prov #55, 2026-05-12): flip kubeProxyReplacement false→true
      # in chart defaults so the BPF masquerade datapath (bpf.masquerade:
      # true, already on by default) gets the NodePort it needs at startup.
      # Worker cilium-agent on prov 8d85a64cb8807cdc crashloop'd with
      # "BPF masquerade requires NodePort" → node.cilium.io/agent-not-ready
      # taint persisted → every post-install Job pod (keycloak-config-cli,
      # powerdns, mimir, openbao) stayed Pending → bootstrap-kit chain
      # stalled. Aligns with the cloud-init pre-Flux Cilium install which
      # already used kubeProxyReplacement: true.
      # 1.3.3 (qa-loop iter-16 Fix #70): Hubble UI HTTPRoute defaults
      # corrected — gatewayRef.namespace=kube-system (was the stale
      # cilium-gateway), serviceRef.namespace=kube-system (was the stale
      # cilium), plus chart auto-derives hubble.<sovereignFQDN> when only
      # SOVEREIGN_FQDN is provided. Combined with the bootstrap-kit
      # default flip below (HUBBLE_ENABLED=true, hubble.relay/ui enabled,
      # SOVEREIGN_FQDN forwarded), every Sovereign exposes Hubble UI at
      # https://hubble.<sovereignFQDN>/ out of the box. TC-289 NXDOMAIN
      # is closed because external-dns now sees the HTTPRoute hostname
      # and writes the A record into PowerDNS.
      # 1.3.1 (qa-loop iter-12 Fix #54 Workstream 2): bpf.preallocateMaps=true
      # + socketLB.hostNamespaceOnly=true defaults so fresh worker pods can
      # resolve DNS reliably on first-join (cilium/cilium#28456 mitigation).
      # 1.3.0 (qa-loop iter-12 Fix #53C+D): adds the Hubble UI HTTPRoute
      # overlay (slice H7 #1095) that the catalystOverlay.hubbleUI block
      # below depends on.
      version: 1.3.5
      sourceRef:
        kind: HelmRepository
        name: bp-cilium
        namespace: flux-system
  # Event-driven install: Helm completes when manifests apply, not when
  # cilium-agent reaches Ready (agent waits for envoyconfig CRDs that the
  # SAME chart installs — legitimate slow-Ready). Replaces blanket
  # spec.timeout: 15m band-aid from PR #221.
  install:
    timeout: 15m
    disableWait: true
    remediation:
      retries: 3
  upgrade:
    timeout: 15m
    disableWait: true
    remediation:
      retries: 3
  values:
    cilium:
      # Multi-region (operator mandate 2026-05-12) — each region's k3s
      # is an INDEPENDENT cluster per NAMING-CONVENTION §1.3, so each
      # region's cilium MUST talk to its OWN local CP, not the primary's
      # 10.0.1.2. Flux postBuild.substitute in
      # cloudinit-control-plane.tftpl renders CILIUM_K8S_SERVICE_HOST to
      # the local CP's private IP per region (10.0.1.2 for primary,
      # 10.0.<10+idx>.2 for secondaries — see main.tf:267
      # secondary_region_cp_ips). Without this, secondary regions'
      # cilium-operator crash-loops with x509 unknown authority (the
      # primary's CA doesn't sign the secondary cluster's API cert).
      # The :=10.0.1.2 fallback preserves single-region (primary-only)
      # provisions where the substitute var would be empty/absent.
      k8sServiceHost: ${CILIUM_K8S_SERVICE_HOST:=10.0.1.2}
      # Phase-8a bug #15 (otech8 deployment 1bfc46347564467b 2026-05-01):
      # cilium-agent waits forever for the operator to register
      # ciliumenvoyconfigs + ciliumclusterwideenvoyconfigs CRDs.
      # Setting `envoy.enabled: true` (chart-level) runs Envoy as a separate
      # daemonset but does NOT register those CRDs — that requires
      # `envoyConfig.enabled: true`, a separate upstream chart toggle.
      # Without it, the agent's node taint `node.cilium.io/agent-not-ready`
      # never lifts and every other HelmRelease (37 of them) blocks on its
      # dependsOn chain.
      envoyConfig:
        enabled: true
      l7Proxy: true
      prometheus:
        enabled: false
        serviceMonitor:
          enabled: false
      hubble:
        metrics:
          # `null` (NOT [] and NOT a populated list) suppresses the
          # upstream chart's metrics ServiceMonitor render. Hubble flow
          # collection still works for Hubble Relay/UI without a
          # ServiceMonitor — that pulls the kube-prometheus-stack CRDs
          # which do not exist on a fresh Sovereign at bp-cilium install
          # time. Operators flip metrics on once
          # bp-kube-prometheus-stack is reconciled (issue #182).
          enabled: null
          serviceMonitor:
            enabled: false
        # qa-loop iter-16 Fix #70: default Hubble UI ON for every
        # Sovereign so TC-289 (https://hubble.<fqdn>/) resolves out of
        # the box. Hubble flow telemetry is the canonical L3-L7 visibility
        # surface for the Catalyst control plane (EPIC-5 #1100); shipping
        # it default-OFF made every Sovereign blind by default and
        # required a per-Sovereign overlay touch that nobody remembered
        # to wire. Per-Sovereign overlay can still set HUBBLE_ENABLED=false
        # for fully air-gapped lab Sovereigns where Relay traffic to
        # cilium-agents is not desired.
        relay:
          enabled: ${HUBBLE_ENABLED:=true}
        ui:
          enabled: ${HUBBLE_ENABLED:=true}

      # qa-loop iter-12 Fix #53C: BGP control plane (default off, opt-in
      # via BGP_ENABLED=true). Per ADR-0001 §9 the BGP control plane is the
      # canonical path for Sovereign-to-customer-router prefix advertisement
      # (LoadBalancer VIPs, Pod CIDRs to customer's existing core network).
      bgpControlPlane:
        enabled: ${BGP_ENABLED:=false}

      # ── Cilium ClusterMesh — multi-region peering ──────────────────
      #
      # Per ADR-0001 §9 + EPIC-6 #1101 (multi-region active-hotstandby DR),
      # ClusterMesh is the canonical inter-region transport for replication
      # and Service-of-type-global traffic between Sovereign peer clusters.
      #
      # cluster.name + cluster.id are PER-SOVEREIGN anchors; per
      # docs/INVIOLABLE-PRINCIPLES.md #4 (never hardcode), they MUST come
      # from the Flux Kustomization's postBuild.substitute block — which
      # in turn flows from infra/hetzner/cloudinit-control-plane.tftpl
      # (CLUSTER_MESH_NAME, CLUSTER_MESH_ID) and ultimately from the
      # operator-supplied request.cluster_mesh_name + cluster_mesh_id at
      # provision time. Mesh registry: docs/CLUSTERMESH-CLUSTER-IDS.md
      # tracks the cluster.id allocation across the OpenOva fleet.
      #
      # NodePort 32379: clustermesh-apiserver Pod is exposed on every
      # Cilium node so peers reach it over the Hetzner private network on
      # `<cp-private-ip>:32379` WITHOUT requiring a Hetzner LoadBalancer
      # per peer (LB count is project-quota'd). Hetzner firewall must
      # open 32379/tcp from peer Sovereigns' Hetzner CIDRs.
      #
      # A Sovereign that does NOT join a mesh leaves CLUSTER_MESH_NAME
      # empty (Flux envsubst rule: ${VAR:=default} -> "default" when
      # unset/empty). The cilium subchart accepts an empty cluster.name
      # provided cluster.id stays 0; the clustermesh-apiserver Pod still
      # runs but no peer connects (single-cluster no-op).
      cluster:
        name: ${CLUSTER_MESH_NAME:=}
        id: ${CLUSTER_MESH_ID:=0}
      clustermesh:
        useAPIServer: true
        apiserver:
          service:
            # 2026-05-15: default flipped NodePort → LoadBalancer per DoD A3
            # (docs/SOVEREIGN-MULTI-REGION-DOD.md). Founder ruling:
            # "ClusterMesh apiserver Service = LoadBalancer (NEVER NodePort)".
            #
            # On Hetzner, hcloud-ccm allocates a public-IPv4 LB per peer
            # region; AutoEstablishClusterMesh (handler/clustermesh.go,
            # PR #1508) hard-fails on type != LoadBalancer and reads the
            # LB ingress IP for the peer endpoint. Cilium WG node
            # encryption secures the LB→node→pod path end-to-end.
            #
            # ${CLUSTERMESH_SERVICE_TYPE:=LoadBalancer} keeps the
            # operator escape hatch (e.g. bare-metal Sovereigns with
            # MetalLB or non-cloud peers can override to NodePort) but
            # the cloud-Hetzner default is now A3-compliant out of the
            # box.
            type: ${CLUSTERMESH_SERVICE_TYPE:=LoadBalancer}
            # Hetzner CCM requires location OR network-zone annotation
            # to allocate the LB. ${HCLOUD_LB_LOCATION} flows from the
            # bootstrap-kit Kustomization substitute, set by the
            # cloud-init template for EVERY region (primary CP renders
            # var.region; secondary CPs render each.value.cloudRegion).
            # No default fallback: a missing substitute is a tofu
            # rendering bug, not a runtime fallback opportunity. The
            # previous `:=hel1` default silently masked the 2026-05-16
            # multi-region rendering regression (t114-omani-works
            # primary=hel1 — fallback APPEARED correct but every
            # secondary also rendered hel1; an explicit empty render
            # would have failed cilium chart admission and surfaced
            # the bug at provision time instead of at clustermesh-
            # apiserver LB allocation time).
            annotations:
              load-balancer.hetzner.cloud/location: "${HCLOUD_LB_LOCATION}"
              load-balancer.hetzner.cloud/type: "lb11"
              # use-private-ip: false — LB→backend connection transits
              # the PUBLIC IP. PR #1537 had set this to "true" attempting
              # to bypass the firewall NodePort block; that approach was
              # NOT viable because the per-region Hetzner LB has no
              # private-network attachment by default. CCM rejected:
              # "ReconcileHCLBTargets: use private ip: missing network id"
              # → LB never allocated → clustermesh apiserver Service
              # stayed `<pending>` → clustermesh orchestrator waited 5min
              # for LB IP then bailed with empty peerEntries.
              #
              # PR #1538's canonical fix opens TCP 30000-32767 in the
              # Hetzner firewall so the public-IP LB health checks pass.
              # This file reverts to use-private-ip=false to align with
              # that approach. Caught on t130 (30463cd0a5a931be, 2026-05-16).
              load-balancer.hetzner.cloud/use-private-ip: "false"
              # 2026-05-16: per-region LB name suffix. Without
              # ${SOVEREIGN_REGION_KEY} interpolated, all 3 regions'
              # clustermesh-apiserver Services adopted the FIRST LB
              # CCM-created (Hetzner LBs are unique by name; second
              # creation just reuses the first). Caught on t121
              # (48d8fe77...): primary + nbg1 both reported external_ip
              # 167.233.14.208 (nbg1 LB), sin stayed <pending>.
              load-balancer.hetzner.cloud/name: "${SOVEREIGN_FQDN_SLUG:=catalyst}-${SOVEREIGN_REGION_KEY:=primary}-clustermesh"

    # ── Catalyst overlay templates (chart/templates/) ────────────────────
    # qa-loop iter-16 Fix #70: Hubble UI HTTPRoute now defaults ON for
    # every Sovereign. The chart auto-derives hostname `hubble.${SOVEREIGN_FQDN}`
    # so the operator only needs the SOVEREIGN_FQDN substitute (already
    # mandatory for every Sovereign — see clusters/_template/bootstrap-kit/
    # 13-bp-catalyst-platform.yaml `host: console.${SOVEREIGN_FQDN}`).
    # Per-Sovereign overlay can still:
    # - HUBBLE_ENABLED=false → disable Hubble UI on this Sovereign
    # - HUBBLE_HOSTNAME=... → override the auto-derived hostname
    # - HUBBLE_AUTH=oidc → enable OIDC enforcement once the
    # Keycloak realm wires the hubble-ui client
    catalystOverlay:
      hubbleUI:
        enabled: ${HUBBLE_ENABLED:=true}
        # Explicit override; empty triggers the chart to derive
        # `hubble.${SOVEREIGN_FQDN}` from sovereignFQDN below.
        hostname: ${HUBBLE_HOSTNAME:=}
        sovereignFQDN: ${SOVEREIGN_FQDN}
        gatewayRef:
          # The Sovereign Gateway lives in kube-system — installed by
          # clusters/_template/sovereign-tls/cilium-gateway.yaml. Every
          # other bootstrap-kit HTTPRoute (gitea, auth, grafana, harbor,
          # openbao, powerdns, console/catalyst-platform) attaches to
          # cilium-gateway/kube-system; this overlay matches.
          name: cilium-gateway
          namespace: kube-system
        # `none` until the Keycloak `hubble-ui` OIDC client is wired by
        # bp-keycloak realm-config; flip to `oidc` per per-Sovereign
        # overlay once that lands. Until then Hubble UI is publicly
        # reachable — acceptable for the in-progress qa-loop iter-16
        # observability slice; lock down before production handover via
        # HUBBLE_AUTH=oidc.
        auth: ${HUBBLE_AUTH:=none}
        serviceRef:
          name: hubble-ui
          namespace: kube-system
          port: 80
---
# ─── Per-Sovereign Gateway API resources (issue #387) ────────────────────
#
# Cilium owns the GatewayClass (`cilium`) installed by the chart above
# (gatewayAPI.enabled=true, envoy.enabled=true in platform/cilium/chart/
# values.yaml). The single per-Sovereign Gateway listening on
# *.${SOVEREIGN_FQDN}:443 lives here so it boots alongside the CNI
# without needing a new bootstrap-kit slot — every Sovereign HTTP
# blueprint (catalyst-platform, gitea, keycloak, harbor, grafana,
# openbao, powerdns) attaches its HTTPRoute to this Gateway via
# parentRefs.
#
# TLS material: a wildcard Certificate is requested from
# letsencrypt-dns01-prod-powerdns (cert-manager + bp-cert-manager-
# powerdns-webhook from #373; webhook calls contabo's central PowerDNS
# at https://pdns.openova.io). The resulting Secret
# `sovereign-wildcard-tls` is referenced by the Gateway listener.
#
# Cross-namespace HTTPRoute attachment: allowedRoutes.namespaces.from=All
# permits every blueprint namespace (catalyst-system, gitea, keycloak,
# harbor, grafana-system, openbao, powerdns-system) to bind without a
# ReferenceGrant. This matches the Catalyst single-tenant Sovereign
# model — cross-tenant isolation is enforced by per-tenant vClusters
# (bp-vcluster), not by Gateway-level RBAC.
#
# Per ADR-0001 §9.4 and docs/INVIOLABLE-PRINCIPLES.md #4: this resource
# only renders when ${SOVEREIGN_FQDN} is set by Flux envsubst at the
# Sovereign apply time — contabo's bootstrap path does NOT include this
# template, so Traefik continues to serve console.openova.io/nova
# unchanged.