New platform/opentelemetry-operator/ Blueprint scaffold per design doc
§3.9 row 5. Companion to existing bp-opentelemetry (the collector) —
this Blueprint ships the OPERATOR that auto-injects OTel SDK
instrumentation (an init container plus env vars, not a sidecar) into
Pods based on annotations:
  instrumentation.opentelemetry.io/inject-{java|nodejs|python|dotnet}: "default"
Two-Blueprint split is intentional: collector and operator are separate
upgrade cycles. Mixing them risks coupling observability cadence to
auto-instrumentation cadence, and the operator's mutating admission
webhook intercepts every Pod creation cluster-wide so misconfiguration
is high-blast-radius.
What ships:
- platform/opentelemetry-operator/README.md — activation contract
- platform/opentelemetry-operator/blueprint.yaml — bp-opentelemetry-operator 1.0.0
- platform/opentelemetry-operator/chart/Chart.yaml — wraps upstream
opentelemetry-operator:0.61.0 from open-telemetry-helm-charts.
Subchart `condition: enabled` — default-off skips it entirely.
- platform/opentelemetry-operator/chart/values.yaml — gate, default
Instrumentation CR config (exporterEndpoint, sampler, per-language
toggles), upstream subchart values (manager.collectorImage.repository
required, serviceAccount, cert-manager-backed admission webhook)
- platform/opentelemetry-operator/chart/templates/instrumentation-default.yaml
— Catalyst overlay Instrumentation CR with parentbased_traceidratio
sampler @ 0.25 default, propagators (tracecontext + baggage + b3),
per-language injection toggles. Default OFF; namespace = cilium by
default (operator overrides per Sovereign).
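For illustration, the subchart gate described above could be declared roughly as follows in chart/Chart.yaml. This is a sketch rather than the committed file: the chart `name` and chart `version` are assumptions, and the repository URL is the public open-telemetry-helm-charts index.

```yaml
# Sketch of chart/Chart.yaml (name and chart version are assumed).
apiVersion: v2
name: bp-opentelemetry-operator   # assumed; mirrors the Blueprint name
version: 1.0.0                    # assumed to track the Blueprint version
dependencies:
  - name: opentelemetry-operator
    version: 0.61.0
    repository: https://open-telemetry.github.io/opentelemetry-helm-charts
    # Helm resolves this path against the parent chart's values; when
    # .Values.enabled is false the subchart is not rendered at all.
    condition: enabled
```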
Default-OFF for both layers:
- .Values.enabled: false → upstream subchart's `condition: enabled`
also fires, so 0 resources rendered total
- Even after .Values.enabled=true, the Catalyst Instrumentation CR
is gated again by .Values.defaultInstrumentation.enabled=false so
installing the chart doesn't auto-inject anywhere
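A minimal sketch of how the second gate might look in templates/instrumentation-default.yaml. The value keys `namespace`, `sampler.ratio`, and `java.image` are hypothetical names chosen for illustration; whether the real template also re-checks `.Values.enabled` is an assumption.

```yaml
{{- /* Sketch only: value key names below are assumed, not confirmed. */}}
{{- if and .Values.enabled .Values.defaultInstrumentation.enabled }}
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: default
  namespace: {{ .Values.defaultInstrumentation.namespace }}
spec:
  exporter:
    endpoint: {{ .Values.defaultInstrumentation.exporter.endpoint | quote }}
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    # The CRD expects the ratio as a string, hence the quote filter.
    argument: {{ .Values.defaultInstrumentation.sampler.ratio | quote }}
  {{- if .Values.defaultInstrumentation.java.enabled }}
  java:
    image: {{ .Values.defaultInstrumentation.java.image | quote }}
  {{- end }}
{{- end }}
```

With both flags at their default of false, the outer `if` renders nothing, which matches the "0 resources rendered" check.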
Per docs/INVIOLABLE-PRINCIPLES.md #4 every parameter (sampler ratio,
exporter endpoint, per-language toggles, namespace) is in values.yaml.
Validated:
- helm dependency build pulls upstream cleanly
- helm template with default values: 0 resources rendered
- helm template with enabled=true defaultInstrumentation.enabled=true:
22 resources rendered (upstream operator manager Deployment, CRDs,
RBAC, mutating + validating webhooks, cert-manager Issuer +
Certificate, plus the Catalyst Instrumentation CR)
Out of scope for this slice:
- Add this Blueprint to clusters/_template/bootstrap-kit/ — EPIC-5
(#1100) sequences both bp-opentelemetry (collector first) and this
Blueprint as part of the observability roll-out
- Per-Application Instrumentation CRs from Blueprint.spec.observability.
traces=otlp — application-controller (slice C4 of #1095) renders
those at install time
Refs: #1094, #1095, #1100, docs/EPICS-1-6-unified-design.md §3.9 row 5
+ §8.4 (EPIC-5 Networking).
Co-authored-by: hatiyildiz <hatiyildiz@noreply.openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bp-opentelemetry-operator
Status: Phase-0 scaffold (#1095 slice H5). Activated by EPIC-5 (#1100). Updated: 2026-05-08
The OpenTelemetry Operator. Provides the Instrumentation CRD and the
mutating webhook that auto-inject OTel SDK instrumentation into Pods
based on annotations:
```yaml
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-java: "true"
    # or inject-dotnet / inject-nodejs / inject-python
```
When the annotation is set, the operator's mutating admission webhook
adds an init container that copies the OTel SDK into a shared volume and
edits the main container's env vars (OTEL_EXPORTER_OTLP_ENDPOINT,
OTEL_RESOURCE_ATTRIBUTES, OTEL_TRACES_EXPORTER, etc.) to point at the
collector deployed by bp-opentelemetry.
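To make the mutation concrete, here is roughly what a Java Pod might look like after admission. This is an illustrative sketch: the workload name and image are hypothetical, and the exact init container, volume, and path names are assumptions that can differ between operator versions.

```yaml
# Illustrative result of inject-java; names and paths may vary by version.
apiVersion: v1
kind: Pod
metadata:
  name: orders-api                      # hypothetical workload
  annotations:
    instrumentation.opentelemetry.io/inject-java: "true"
spec:
  initContainers:
    # Added by the webhook: copies the agent into the shared volume.
    - name: opentelemetry-auto-instrumentation-java
      image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
      command: ["cp", "/javaagent.jar", "/otel-auto-instrumentation-java/javaagent.jar"]
      volumeMounts:
        - name: opentelemetry-auto-instrumentation-java
          mountPath: /otel-auto-instrumentation-java
  containers:
    - name: app
      image: registry.example.com/orders-api:1.0.0   # hypothetical image
      env:
        # Added by the webhook: loads the agent and points it at the collector.
        - name: JAVA_TOOL_OPTIONS
          value: "-javaagent:/otel-auto-instrumentation-java/javaagent.jar"
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: http://opentelemetry-collector.monitoring.svc:4317
      volumeMounts:
        - name: opentelemetry-auto-instrumentation-java
          mountPath: /otel-auto-instrumentation-java
  volumes:
    - name: opentelemetry-auto-instrumentation-java
      emptyDir: {}
```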
This Blueprint is separate from bp-opentelemetry. The latter is the
collector (DaemonSet/Deployment scraping + forwarding to Tempo/Loki/Mimir);
this one is the operator that injects per-Pod instrumentation. Two
distinct upgrade cycles, two distinct opt-ins.
What it ships
| Template | Effect |
|---|---|
| Upstream opentelemetry-operator Helm subchart | The operator Pod + Instrumentation CRD. |
| instrumentation-default.yaml | A default Instrumentation CR named default in each Org namespace. Operator + per-Org overlays opt in to Java/.NET/Node/Python auto-injection. |
Activation contract
```yaml
# values.yaml override (or per-Sovereign overlay)
enabled: true
defaultInstrumentation:
  enabled: true
  # Where the auto-injected SDK ships traces/logs/metrics. The collector
  # Service is created by bp-opentelemetry; this references it.
  exporter:
    endpoint: http://opentelemetry-collector.monitoring.svc:4317
  java: { enabled: true, image: "ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest" }
  nodejs: { enabled: true }
  python: { enabled: true }
  dotnet: { enabled: false }
```
When enabled: false (the default), no resources render — installing
this chart is a no-op until the operator opts in.
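Once the chart is enabled, individual workloads still have to opt in. The webhook reads the annotation from Pod metadata, so for a Deployment it belongs on the Pod template rather than on the Deployment itself. A minimal sketch, with a hypothetical workload name, namespace, and image:

```yaml
# Hypothetical workload; only the annotation placement is the point here.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: billing-worker
  namespace: org-example        # assumed Org namespace holding the default CR
spec:
  replicas: 1
  selector:
    matchLabels:
      app: billing-worker
  template:
    metadata:
      labels:
        app: billing-worker
      annotations:
        # Must sit on the Pod template so each created Pod carries it.
        instrumentation.opentelemetry.io/inject-python: "true"
    spec:
      containers:
        - name: app
          image: registry.example.com/billing-worker:1.0.0
```

Pods that already exist are not mutated retroactively; they pick up instrumentation the next time they are recreated.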
Why default-OFF
- The Operator's mutating admission webhook intercepts every Pod creation in the cluster. A misconfigured CR can break workloads cluster-wide.
- The Instrumentation CR ties traces to a collector endpoint: bp-opentelemetry (collector) must be reconciled FIRST and reachable on the configured Service URL.
- EPIC-5 (#1100) sequences both: collector first, exporters wired (Tempo/Loki/Mimir), then operator + Instrumentation CR.
References
- docs/EPICS-1-6-unified-design.md §3.9 row 5 + §8.4 (EPIC-5)
- platform/opentelemetry/README.md — the collector
- Upstream: https://github.com/open-telemetry/opentelemetry-operator