openova/platform/powerdns/chart/Chart.yaml
e3mrah ce76a7b7ab
fix(bp-powerdns): root-cause Job DeadlineExceeded recurrence (post Fix #144) (#1425)
Fix #144 raised zoneBootstrap.activeDeadlineSeconds 300s → 840s after
prov #22 hit a 5m DeadlineExceeded on the bp-powerdns post-install hook.
That fix was insufficient: prov #37 + #38 (chroot omantel.biz, 2026-05-12)
both wedged on the SAME chart slot with `BackoffLimitExceeded`, NOT
`DeadlineExceeded`. The deadline never got a chance to fire.

Trace from prov #38 chroot (`KUBECONFIG=/tmp/prov38.kubeconfig kubectl
get hr bp-powerdns -o yaml`):

  status:
    Helm install failed for release powerdns/powerdns with chart
    bp-powerdns@1.2.2: failed post-install: 1 error occurred:
      * job powerdns-zone-bootstrap failed: BackoffLimitExceeded

Pod events for powerdns-zone-bootstrap-tq7qq:
  59m Started container zone-bootstrap
  56m Back-off restarting failed container zone-bootstrap
  55m Job has reached the specified backoff limit

Root cause walked end-to-end (per CLAUDE.md TRACE rule):

  TEST: bp-powerdns HR Ready=True
    ↑
  HR: Helm install succeeds (post-install Job exits 0)
    ↑
  Zone-bootstrap Job: curl POST succeeds
    ↑
  powerdns:8081 Service: reachable (has Ready endpoints)
    ↑
  powerdns Deployment: Pods Ready (3 replicas)  ← Pending, blocked here
    ↑
  CNPG cluster: pdns-pg-app Secret exists
    ↑
  pdns-pg-1-initdb Pod: scheduled, Running, Completed  ← Pending too
    ↑
  Worker node has capacity                              ← 99% CPU requested

The zone-bootstrap container curl'd `http://powerdns:8081`, hit
"connection refused" (empty Service endpoints), exited 7 (curl's
"failed to connect" code), and was restarted under
`restartPolicy: OnFailure`. After 6 Kubernetes-level backoffs (≈10min
wall-time with exponential delay), the Job declared
`BackoffLimitExceeded`, well before activeDeadlineSeconds=840s (14min)
could even consider firing.
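
A rough sketch of that arithmetic, assuming the restart delay starts at
10s, doubles after each failure, and is capped at 300s. These three
numbers are assumptions about the kubelet's defaults, not values read
from this chart, and `backoff_budget_seconds` is a hypothetical helper:

```shell
# Sum the delays of N exponential backoffs: base, 2*base, 4*base, ...
# with each delay capped at `cap`. Defaults (10s base, 300s cap) are
# assumed, not taken from the chart.
backoff_budget_seconds() {
  retries=$1
  base=${2:-10}
  cap=${3:-300}
  total=0
  delay=$base
  i=0
  while [ "$i" -lt "$retries" ]; do
    total=$((total + delay))
    delay=$((delay * 2))
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
    i=$((i + 1))
  done
  echo "$total"
}

backoff_budget_seconds 6   # 10+20+40+80+160+300 = 610
```

Under those assumptions 6 backoffs add up to 610s, which is in line with
the ≈10min figure and short of the 840s deadline. Real wall time also
includes however long each container ran before exiting.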

Fix #144 was directionally right (the upstream IS slow on cold k3s) but
turned the wrong knob. The outer retry budget is bounded by backoffLimit
and the cumulative backoff delay between restarts, not by
activeDeadlineSeconds. Raising only the deadline left the BackoffLimit
ceiling unchanged.

Architectural fix (this commit):

1. Move the wait-for-API loop INSIDE the container (one Pod, one inner
   poll loop, restartPolicy=Never). The inner loop polls
   GET /api/v1/servers every 10s until HTTP 200, bounded by new
   `apiReadyTimeoutSeconds` (default 600s = 10min). Now ONE container
   run owns the full wait budget instead of N short-lived containers
   racing the backoff timer.

2. restartPolicy: OnFailure → Never. The container script handles its
   own retry; Kubernetes-level backoff is reserved for genuinely
   transient pod failures (image-pull, OS eviction) where the Job-level
   backoffLimit=6 still triggers a fresh Pod.

3. Surface POWERDNS_API_READY_TIMEOUT_S env var so operators on slower
   clusters can raise the inner deadline without forking the chart
   (per docs/INVIOLABLE-PRINCIPLES.md #4).

4. New value `zoneBootstrap.apiReadyTimeoutSeconds` (default 600s).
   Sits below activeDeadlineSeconds (840s) so the zone-creation phase
   keeps ≥240s of headroom AFTER the API comes Ready.
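
An operator raising the inner timeout may want to check that it still
fits under the Job deadline. A minimal sketch of that check follows;
`check_headroom` is a hypothetical helper, and treating 240s as the
minimum acceptable headroom is an assumption derived from the defaults
above (840s - 600s), not a rule enforced by the chart:

```shell
# Print the headroom left between the inner wait budget and the Job-level
# deadline, and return 0 only if it is at least min_headroom seconds.
# The 240s default is an assumption based on the chart defaults.
check_headroom() {
  api_ready_timeout=$1
  active_deadline=$2
  min_headroom=${3:-240}
  headroom=$((active_deadline - api_ready_timeout))
  echo "$headroom"
  [ "$headroom" -ge "$min_headroom" ]
}

check_headroom 600 840   # prints 240, returns 0
```

Raising apiReadyTimeoutSeconds past activeDeadlineSeconds would let the
outer deadline kill the Pod before the inner loop gives up, so both
values would need to move together.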

Curl status handling in the wait loop:
  200          → API up, proceed to bootstrap
  401|403      → auth failure, FATAL (no retry — operator misconfig)
  000|5xx|...  → transient, sleep & retry until inner deadline
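
The loop and the status table above could look roughly like this. It is
a sketch, not the script shipped in zone-bootstrap-job.yaml:
`probe_status`, `wait_for_api`, POWERDNS_API_URL, POWERDNS_API_KEY and
POLL_INTERVAL_S are hypothetical names. Only
POWERDNS_API_READY_TIMEOUT_S, the 10s interval, the 600s default and
GET /api/v1/servers come from this commit.

```shell
# Probe the API once and print the HTTP status. curl prints 000 when no
# HTTP response was received (e.g. connection refused).
# POWERDNS_API_URL and POWERDNS_API_KEY are hypothetical variable names.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    -H "X-API-Key: ${POWERDNS_API_KEY:-}" \
    "${POWERDNS_API_URL:-http://powerdns:8081}/api/v1/servers" || true
}

# Poll until the API answers 200.
# Returns 0 on 200, 2 on 401/403 (fatal), 1 when the inner deadline expires.
# `waited` counts sleep time only, so time spent inside each probe is extra.
wait_for_api() {
  timeout_s=${POWERDNS_API_READY_TIMEOUT_S:-600}
  interval_s=${POLL_INTERVAL_S:-10}
  waited=0
  while :; do
    code=$(probe_status)
    case "$code" in
      200)
        return 0 ;;
      401|403)
        echo "auth failure (HTTP $code), not retrying" >&2
        return 2 ;;
    esac
    # Anything else (000, 5xx, ...) is treated as transient.
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "API not ready after ${waited}s (last status $code)" >&2
      return 1
    fi
    sleep "$interval_s"
    waited=$((waited + interval_s))
  done
}
```

Calling `wait_for_api || exit 1` ahead of the zone POSTs, combined with
restartPolicy: Never, would keep the whole wait inside one container
run.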

Files changed:
- platform/powerdns/chart/Chart.yaml         1.2.2 → 1.2.3 + history
- platform/powerdns/chart/values.yaml        + apiReadyTimeoutSeconds knob
- platform/powerdns/chart/templates/
    zone-bootstrap-job.yaml                  inner wait-for-API loop;
                                              restartPolicy: Never
- clusters/_template/bootstrap-kit/
    11-powerdns.yaml                         pin to 1.2.3 + HR comment

Why this is sufficient where Fix #144 was not:

Fix #144 only raised the chart-level deadline. This commit changes who
owns the wait budget: the script inside the container, not the Job spec
arithmetic (backoffLimit × backoff-delay). The Job's outer
activeDeadlineSeconds still caps the worst-case runtime (no runaway
poll), but the script now actually GETS to use it.

Verification:
- helm template renders cleanly (deps build OK, empty-zones short-
  circuit preserved, non-empty zones render Job + RBAC + Audit CM)
- kubectl create --dry-run=client --validate=false: 5/5 resources
  created (sa, role, rb, cm, job)
- chart 1.2.3 pinned in clusters/_template/bootstrap-kit/11-powerdns.yaml

Companion infrastructure note (NOT addressed by this commit, flagged
for Coordinator):

The DEEPER bottom of the trace stack is worker capacity. Prov #38's
single cpx32 worker (8 vCPU / 16 GB) is at 99% CPU requested. The
cluster-autoscaler attempted 2→3 scale-up but is in backoff because
two unscheduled pods (gitea/gitea-* PV affinity conflict from a
previous wedged install; trivy-system/node-collector NodeAffinity)
poison the autoscaler's "can the template node fit" check. Even with
this chart fix in place, the powerdns Deployment cannot become Ready
until either:
  (a) the worker autoscales successfully (gitea PV migrated / trivy
      taints relaxed), or
  (b) worker_count is bumped from 2 to 3 in the provisioning body, or
  (c) qa_worker_size is bumped to cpx42.

This chart fix ensures bp-powerdns survives a slow CNPG cold-start.
It does NOT fix a fundamentally undersized cluster. Coordinator next
step: reprov with worker_count=3 OR qa_worker_size=cpx42 + this chart
landed. Either should converge.

Co-authored-by: e3mrah <1234567+e3mrah@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 02:13:34 +04:00


apiVersion: v2
name: bp-powerdns
version: 1.2.3
description: |
  Catalyst-curated Blueprint wrapper for PowerDNS Authoritative.
  Carries Catalyst-specific values.yaml + templates (CNPG cluster, dnsdist
  companion, Traefik api-ingress, placeholder anycast endpoint, multi-zone
  bootstrap Job) on top of the upstream Helm chart. The bootstrap-kit
  installer (products/catalyst/bootstrap/api/internal/bootstrap/) reads
  the upstream chart reference from values.yaml's catalystBlueprint
  metadata block and applies the values overlay at helm install time.

  Mirrors the bp-cilium / bp-keycloak / bp-cert-manager wrapper shape:
  Chart.yaml lists the upstream chart as a Helm dependency so
  `helm dependency build` resolves it; values.yaml carries both the
  catalystBlueprint metadata block and the upstream subchart values.

  1.2.3: Fix #144 recurrence (prov #37+#38 InstallFailed
  BackoffLimitExceeded, 2026-05-12). Bumping activeDeadlineSeconds alone
  (Fix #144) was insufficient: the post-install hook Job's container
  exited within backoffLimit=6 (~10min wall-time) because curl against
  http://powerdns:8081 hit a Service with empty Ready endpoints (powerdns
  Deployment Pods Pending behind a CNPG initdb that itself waited for
  worker capacity). The 14m deadline never got a chance; the Job died
  at BackoffLimit. Fix moves the wait-for-API loop INSIDE the container
  (restartPolicy: Never) so one Pod owns the full activeDeadlineSeconds
  budget. New value apiReadyTimeoutSeconds (default 600s) bounds the
  inner poll; surfaced via values.yaml per INVIOLABLE-PRINCIPLES #4.

  Bumped to 1.2.0: multi-zone bootstrap (issue #827, parent epic #825).
  A franchised Sovereign now supports N parent zones, NOT one. New
  values key `zones: []` declares the parent domains the operator
  brings to the Sovereign at signup; a Helm post-install/post-upgrade
  hook Job (templates/zone-bootstrap-job.yaml) creates each one in
  PowerDNS via the REST API. Idempotent on HTTP 409. Pairs with the
  per-zone wildcard Certificate templates in bp-catalyst-platform 1.4.0+
  and the post-provision /api/v1/sovereign/parent-domains/<name>/zone
  endpoint in catalyst-api (issue #829).
type: application
keywords: [catalyst, blueprint, powerdns, dns, dnssec, lua-records, dnsdist]
maintainers:
  - name: OpenOva Catalyst
    email: catalyst@openova.io
# Upstream chart pulled in as a Helm subchart so `helm dependency build`
# bundles it into the OCI artifact. Pinned to pschichtel/powerdns 0.10.0
# (verified publisher on Artifact Hub, tracks docker.io/powerdns/pdns-auth-50
# at appVersion 5.0.3 — see values.yaml `catalystBlueprint.upstream` for
# the rationale).
dependencies:
  - name: powerdns
    version: "0.10.0"
    repository: "https://schich.tel/helm-charts"