openova/products/catalyst/bootstrap
e3mrah d0fd32dc04
fix(clustermesh): use peer's clustermesh-apiserver-remote-cert (D11) (#1539)
The orchestrator was minting a fresh client cert (CN = local cluster
name) for each peer connection. Even with PR #1530's "sign with
peer's CA" fix, the TLS handshake succeeded but etcd RBAC rejected
the request:

    error="etcdserver: permission denied"

Cilium's clustermesh-apiserver etcd has RBAC with a `remote` user
that has read access on the cilium/* prefix. The chart generates
`kube-system/clustermesh-apiserver-remote-cert` with CN=`remote`.

The canonical `cilium clustermesh connect` CLI copies THIS Secret's
tls.crt/tls.key as the client cert the REMOTE cluster presents, so
the cert's CN matches the etcd RBAC user verbatim.
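
The CN match described above can be checked offline before wiring a
cert into the mesh. The following is a minimal sketch, not code from
this PR: `certCommonName` and `selfSignedPEM` are hypothetical helper
names, and the self-signed certs exist only to make the demo
self-contained. It assumes etcd maps the client cert's CN to the RBAC
user, as the message above describes.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// certCommonName (hypothetical helper) parses the first PEM block of a
// tls.crt payload and returns the subject CN of that certificate.
func certCommonName(pemBytes []byte) (string, error) {
	block, _ := pem.Decode(pemBytes)
	if block == nil || block.Type != "CERTIFICATE" {
		return "", errors.New("no PEM certificate found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return "", err
	}
	return cert.Subject.CommonName, nil
}

// selfSignedPEM builds a throwaway self-signed cert with the given CN.
// Demo only: the real cert is generated by the chart, not by this code.
func selfSignedPEM(cn string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	// "local-cluster" stands in for the CN the orchestrator used to mint.
	for _, cn := range []string{"remote", "local-cluster"} {
		crt, err := selfSignedPEM(cn)
		if err != nil {
			panic(err)
		}
		got, err := certCommonName(crt)
		if err != nil {
			panic(err)
		}
		fmt.Printf("CN=%q matches etcd user \"remote\": %v\n", got, got == "remote")
	}
}
```

A check like this only confirms the CN. It says nothing about whether
the cert chains to the CA the peer's etcd trusts.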

This PR adopts that pattern: snapshotRemoteCert() reads the peer's
existing `clustermesh-apiserver-remote-cert` Secret, returns
tls.crt + tls.key bytes, and the orchestrator writes them into
A's `cilium-clustermesh` Secret instead of minting a new cert.
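
The snapshot step can be sketched as follows. This is an illustration
under stated assumptions, not the PR's implementation: the real
`snapshotRemoteCert()` signature is not shown in this message, so the
`secretData` map (standing in for a Kubernetes Secret's data) and the
`remoteCert` struct are hypothetical. Only the Secret names and the
`tls.crt`/`tls.key` keys come from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

// Secret names as given in the commit message.
const (
	remoteCertSecret = "clustermesh-apiserver-remote-cert" // read from the peer
	meshSecret       = "cilium-clustermesh"                // written on the connecting side
)

// secretData is a hypothetical stand-in for the data map of a Kubernetes Secret.
type secretData map[string][]byte

// remoteCert is a hypothetical holder for the copied client credentials.
type remoteCert struct {
	Cert []byte // contents of tls.crt
	Key  []byte // contents of tls.key
}

// snapshotRemoteCert returns copies of tls.crt and tls.key from the peer's
// Secret data, and fails if either entry is missing or empty.
func snapshotRemoteCert(peer secretData) (remoteCert, error) {
	crt, key := peer["tls.crt"], peer["tls.key"]
	if len(crt) == 0 || len(key) == 0 {
		return remoteCert{}, errors.New(remoteCertSecret + ": tls.crt or tls.key missing")
	}
	// Copy the bytes so later changes to the source map cannot alter the snapshot.
	return remoteCert{
		Cert: append([]byte(nil), crt...),
		Key:  append([]byte(nil), key...),
	}, nil
}

func main() {
	// Placeholder payloads; a real Secret would hold PEM-encoded material.
	peer := secretData{"tls.crt": []byte("CERT"), "tls.key": []byte("KEY")}
	rc, err := snapshotRemoteCert(peer)
	if err != nil {
		panic(err)
	}
	fmt.Printf("copy %d cert bytes and %d key bytes into %s\n",
		len(rc.Cert), len(rc.Key), meshSecret)
}
```

Failing when either entry is absent, rather than falling back to
minting, keeps the failure visible at connect time instead of
surfacing later as `permission denied` from etcd.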

Caught on t129 (6cddff7ef4432bdc, 2026-05-16):
- TLS handshake succeeded after the firewall fix (PR #1538) opened
  the NodePort range so the LB→backend health check passed
- cilium-dbg status reported `etcd: 1/1 connected, has-quorum=true`
  (TLS path working)
- BUT `remote configuration: expected=true, retrieved=false` and
  agent logs spammed `etcdserver: permission denied`

With this PR's CN=remote cert, etcd authorizes the kvstore List
and clustermesh sync completes — agent should flip to
`2/2 remote clusters ready`.

Completes the D11 chain: #1525 (regionKeyFromSpec) → #1528
(clusterName derivation) → #1530 (cert with peer's CA — no longer
needed but kept as defense-in-depth) → #1536 (hostAlias pattern)
→ #1538 (firewall NodePort range) → this.

Refs DoD D11.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 18:58:22 +04:00
api fix(clustermesh): use peer's clustermesh-apiserver-remote-cert (D11) (#1539) 2026-05-16 18:58:22 +04:00
ui fix(sovereign-ui): derive synthetic Apps/Handover stage status from deployment record + auto-redirect after handover (#1522) 2026-05-16 14:56:16 +04:00