Deploy an OpenTelemetry collector for metrics, traces, and logs

Uses: Kong Mesh
TL;DR

Run an OpenTelemetry collector as a per-node Kubernetes DaemonSet that receives metrics, traces, and access logs from sidecars over OTLP and forwards them to your backends.

Prerequisites

This guide requires a running Kubernetes cluster. It can run locally, for example with Docker or minikube, or in a public cloud such as AWS EKS or GCP GKE. If you already have a cluster running, skip the cluster creation step below.

For example, if you are using minikube:

minikube start -p mesh-zone

You will need Helm, a package manager for Kubernetes.

  1. Install Kong Mesh:

    helm repo add kong-mesh https://kong.github.io/kong-mesh-charts
    helm repo update
    helm upgrade \
      --install \
      --create-namespace \
      --namespace kong-mesh-system \
      kong-mesh kong-mesh/kong-mesh
    kubectl wait -n kong-mesh-system --for=condition=ready pod --selector=app=kong-mesh-control-plane --timeout=90s
    
  2. Apply the demo configuration:

    echo "
    apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        kuma.io/sidecar-injection: enabled
      name: kong-mesh-demo
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: demo-app
      namespace: kong-mesh-demo
    spec:
      ports:
      - appProtocol: http
        port: 5050
        protocol: TCP
        targetPort: 5050
      selector:
        app: demo-app
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: demo-app-v1
      namespace: kong-mesh-demo
    spec:
      ports:
      - appProtocol: http
        port: 5050
        protocol: TCP
        targetPort: 5050
      selector:
        app: demo-app
        version: v1
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: demo-app-v2
      namespace: kong-mesh-demo
    spec:
      ports:
      - appProtocol: http
        port: 5050
        protocol: TCP
        targetPort: 5050
      selector:
        app: demo-app
        version: v2
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: kv
      namespace: kong-mesh-demo
    spec:
      ports:
      - appProtocol: http
        port: 5050
        protocol: TCP
        targetPort: 5050
      selector:
        app: kv
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: demo-app
        version: v1
      name: demo-app
      namespace: kong-mesh-demo
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: demo-app
          version: v1
      template:
        metadata:
          labels:
            app: demo-app
            version: v1
        spec:
          containers:
          - env:
            - name: OTEL_SERVICE_NAME
              value: demo-app
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://opentelemetry-collector.mesh-observability:4317
            - name: KV_URL
              value: http://kv.kong-mesh-demo.svc.cluster.local:5050
            - name: APP_VERSION
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['version']
            image: ghcr.io/kumahq/kuma-counter-demo:latest@sha256:daf8f5cffa10b576ff845be84e4e3bd5a8a6470c7e66293c5e03a148f08ac148
            name: app
            ports:
            - containerPort: 5050
              name: http
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: demo-app
        version: v2
      name: demo-app-v2
      namespace: kong-mesh-demo
    spec:
      replicas: 0
      selector:
        matchLabels:
          app: demo-app
          version: v2
      template:
        metadata:
          labels:
            app: demo-app
            version: v2
        spec:
          containers:
          - env:
            - name: OTEL_SERVICE_NAME
              value: demo-app
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://opentelemetry-collector.mesh-observability:4317
            - name: KV_URL
              value: http://kv.kong-mesh-demo.svc.cluster.local:5050
            - name: APP_VERSION
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['version']
            image: ghcr.io/kumahq/kuma-counter-demo:latest@sha256:daf8f5cffa10b576ff845be84e4e3bd5a8a6470c7e66293c5e03a148f08ac148
            name: demo-app
            ports:
            - containerPort: 5050
              name: http
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: kv
      namespace: kong-mesh-demo
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: kv
      template:
        metadata:
          labels:
            app: kv
        spec:
          containers:
          - env:
            - name: OTEL_SERVICE_NAME
              value: kv
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://opentelemetry-collector.mesh-observability:4317
            - name: APP_VERSION
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['version']
            image: ghcr.io/kumahq/kuma-counter-demo:latest@sha256:daf8f5cffa10b576ff845be84e4e3bd5a8a6470c7e66293c5e03a148f08ac148
            name: app
            ports:
            - containerPort: 5050
              name: http
    ---
    apiVersion: kuma.io/v1alpha1
    kind: Mesh
    metadata:
      name: default
    spec:
      meshServices:
        mode: Exclusive
      mtls:
        backends:
        - name: ca-1
          type: builtin
        enabledBackend: ca-1
    ---
    apiVersion: kuma.io/v1alpha1
    kind: MeshTrafficPermission
    metadata:
      name: kv
      namespace: kong-mesh-demo
    spec:
      from:
      - default:
          action: Allow
        targetRef:
          kind: MeshSubset
          tags:
            app: demo-app
            k8s.kuma.io/namespace: kong-mesh-demo
      targetRef:
        kind: Dataplane
        labels:
          app: kv" | kubectl apply -f -
    kubectl wait -n kong-mesh-demo --for=condition=available --timeout=120s deployment --all
    

Install Grafana Tempo as the trace backend. The collector config in this guide pushes traces to tempo.observability:4317:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

echo "
tempo:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318
" > values-tempo.yaml

helm install tempo grafana/tempo \
  --namespace observability --create-namespace \
  -f values-tempo.yaml

kubectl wait -n observability --for=condition=ready pod \
  -l app.kubernetes.io/name=tempo --timeout=120s

Install Grafana Loki as the log backend. The collector config in this guide pushes logs to http://loki.observability:3100/otlp:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

echo "
deploymentMode: SingleBinary
loki:
  auth_enabled: false
  commonConfig:
    replication_factor: 1
  storage:
    type: filesystem
  schemaConfig:
    configs:
      - from: '2024-01-01'
        store: tsdb
        object_store: filesystem
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
singleBinary:
  replicas: 1
read:
  replicas: 0
write:
  replicas: 0
backend:
  replicas: 0
chunksCache:
  enabled: false
resultsCache:
  enabled: false
" > values-loki.yaml

helm install loki grafana/loki \
  --namespace observability --create-namespace \
  -f values-loki.yaml

Install Prometheus with a scrape job for the collector’s /metrics endpoint at otel-collector.observability:8889:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

echo "
extraScrapeConfigs: |
  - job_name: otel-collector
    static_configs:
      - targets: ['otel-collector.observability:8889']
" > values-prometheus.yaml

helm install prometheus prometheus-community/prometheus \
  --namespace observability --create-namespace \
  -f values-prometheus.yaml

This guide deploys an OpenTelemetry collector as a per-node Kubernetes DaemonSet that receives all three telemetry signals from Kong Mesh: metrics from MeshMetric, traces from MeshTrace, and access logs from MeshAccessLog. Sidecars push to it over OTLP gRPC on port 4317. It also covers what to change when mesh passthrough is off.

For background on the push model and topology trade-offs, see OpenTelemetry collector in the observability reference docs.

Deploy the collector

  1. Create a dedicated namespace for the collector:

    kubectl create namespace observability --dry-run=client -o yaml | kubectl apply -f -
    

    The collector Pod must run without a sidecar. Otherwise, the sidecar would push its telemetry back through the collector it runs alongside, creating a circular dependency.

  2. Exclude the namespace from sidecar injection:

    kubectl label namespace observability kuma.io/sidecar-injection=disabled
    
  3. Apply the collector configuration:

    echo "apiVersion: v1
    kind: ConfigMap
    metadata:
      name: otel-collector-config
      namespace: observability
    data:
      config.yaml: |
        receivers:
          otlp:
            protocols:
              grpc:
                endpoint: 0.0.0.0:4317
              http:
                endpoint: 0.0.0.0:4318
    
        processors:
          memory_limiter:
            check_interval: 5s
            limit_mib: 500
            spike_limit_mib: 400
          batch:
            send_batch_size: 4096
            send_batch_max_size: 8192
            timeout: 10s
    
        exporters:
          debug:
            verbosity: basic
          otlp/tempo:
            endpoint: tempo.observability:4317
            tls:
              insecure: true
          prometheus:
            endpoint: 0.0.0.0:8889
          otlphttp/loki:
            endpoint: http://loki.observability:3100/otlp
    
        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: [memory_limiter, batch]
              exporters: [otlp/tempo, debug]
            metrics:
              receivers: [otlp]
              processors: [memory_limiter, batch]
              exporters: [prometheus, debug]
            logs:
              receivers: [otlp]
              processors: [memory_limiter, batch]
              exporters: [otlphttp/loki, debug]" | kubectl apply -f -
    

    The configuration defines three pipelines: traces, metrics, and logs. It also runs a memory limiter, a tuned batch processor, and a debug exporter on every pipeline so you can see telemetry flowing during testing.

    Notes on this configuration:

    • memory_limiter runs first. The OpenTelemetry project recommends this order so the collector can shed load before later processors allocate memory. If batching ran first, a burst could OOM the pod before the limiter ever saw it.
    • batch reduces export overhead. send_batch_size: 4096 is a reasonable starting point. Raise it if your backend complains about request rate; lower it if your backend complains about batch size.
    • The debug exporter runs in every pipeline at verbosity: basic so each batch shows up as one log line. Drop it from the pipelines once you’ve verified the setup, or raise it to verbosity: detailed when you need to see individual records.
    • otlp/tempo, otlphttp/loki, and prometheus are examples. The trace and log exporters send OTLP to a backend; the prometheus exporter exposes a /metrics endpoint on port 8889 for Prometheus to scrape. Swap the addresses to match your own backends if needed.
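    • tls.insecure: true on the Tempo exporter turns off TLS for that connection, so the in-cluster example sends traces in plaintext. Skipping certificate verification is a separate setting, insecure_skip_verify. In production, point the exporter at a TLS endpoint with a trusted CA and remove the insecure flag.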
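
As a rough check on the memory_limiter values above: the processor derives a soft limit as limit_mib minus spike_limit_mib and begins refusing data once usage crosses it, while limit_mib acts as the hard limit. The helper below is a minimal sketch of that arithmetic; the function name soft_limit_mib is hypothetical and not part of any tool.

```shell
# soft_limit_mib LIMIT_MIB SPIKE_LIMIT_MIB
# Prints the memory_limiter soft limit in MiB (limit_mib - spike_limit_mib).
# Hypothetical helper for illustration only.
soft_limit_mib() {
  limit=$1
  spike=$2
  # The processor requires spike_limit_mib to be smaller than limit_mib.
  if [ "$spike" -ge "$limit" ]; then
    echo "spike_limit_mib must be less than limit_mib" >&2
    return 1
  fi
  echo $((limit - spike))
}

# Values from the ConfigMap in this guide.
soft_limit_mib 500 400   # prints 100
```

With the values in this guide, the collector starts shedding load at roughly 100 MiB, and the 500 MiB hard limit sits 12 MiB below the 512Mi container limit set on the DaemonSet. Whether that headroom suits your workload is something to confirm under real traffic.
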
  4. Apply the DaemonSet and node-local service:

    echo "apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: otel-collector
      namespace: observability
    spec:
      selector:
        matchLabels:
          app: otel-collector
      template:
        metadata:
          labels:
            app: otel-collector
        spec:
          containers:
            - name: otel-collector
              image: otel/opentelemetry-collector-contrib:0.141.0
              args: ['--config=/conf/config.yaml']
              ports:
                - name: otlp-grpc
                  containerPort: 4317
                - name: otlp-http
                  containerPort: 4318
                - name: prometheus
                  containerPort: 8889
              resources:
                requests:
                  cpu: 100m
                  memory: 256Mi
                limits:
                  cpu: 500m
                  memory: 512Mi
              volumeMounts:
                - name: config
                  mountPath: /conf
          volumes:
            - name: config
              configMap:
                name: otel-collector-config
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: otel-collector
      namespace: observability
    spec:
      selector:
        app: otel-collector
      internalTrafficPolicy: Local
      ports:
        - name: otlp-grpc
          port: 4317
          targetPort: otlp-grpc
          appProtocol: grpc
        - name: otlp-http
          port: 4318
          targetPort: otlp-http
        - name: prometheus
          port: 8889
          targetPort: prometheus" | kubectl apply -f -
    

    Sidecars connect to otel-collector.observability:4317, and kube-proxy routes each connection to the collector Pod running on the sidecar’s own node.

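    internalTrafficPolicy: Local keeps the hop node-local but does not fail over to another node. If the collector Pod on a node restarts, that node’s telemetry drops until the Pod is ready. The policy applies to every port on the Service, so a Prometheus scrape of otel-collector.observability:8889 only reaches the collector on the node where Prometheus runs.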

  5. Wait for the collector to be ready:

    kubectl wait -n observability --for=condition=ready pod -l app=otel-collector --timeout=120s
    

Point Kong Mesh policies at the collector

All three policies use the same endpoint. Apply them at the Mesh level to cover every sidecar in the mesh:

echo "apiVersion: kuma.io/v1alpha1
kind: MeshMetric
metadata:
  name: all-metrics
  namespace: kong-mesh-system
  labels:
    kuma.io/mesh: default
spec:
  targetRef:
    kind: Mesh
  default:
    backends:
      - type: OpenTelemetry
        openTelemetry:
          endpoint: otel-collector.observability:4317
---
apiVersion: kuma.io/v1alpha1
kind: MeshTrace
metadata:
  name: all-traces
  namespace: kong-mesh-system
  labels:
    kuma.io/mesh: default
spec:
  targetRef:
    kind: Mesh
  default:
    backends:
      - type: OpenTelemetry
        openTelemetry:
          endpoint: otel-collector.observability:4317
    sampling:
      overall: 100
---
apiVersion: kuma.io/v1alpha1
kind: MeshAccessLog
metadata:
  name: all-access-logs
  namespace: kong-mesh-system
  labels:
    kuma.io/mesh: default
spec:
  targetRef:
    kind: Mesh
  rules:
    - default:
        backends:
          - type: OpenTelemetry
            openTelemetry:
              endpoint: otel-collector.observability:4317" | kubectl apply -f -

The MeshTrace policy samples 100% of traces so you see something during testing. Drop the rate to single digits in production.
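
One way to lower the sampling rate later without re-applying the whole policy is a JSON merge patch. The sketch below only builds the patch body; the helper name sampling_patch is hypothetical, and it assumes the policy is named all-traces in kong-mesh-system as in this guide.

```shell
# sampling_patch RATE
# Prints a JSON merge patch that sets spec.default.sampling.overall on a MeshTrace.
# Hypothetical helper for illustration only.
sampling_patch() {
  rate=$1
  # Accept only whole numbers without leading zeros.
  case $rate in
    ''|*[!0-9]*|0?*)
      echo "rate must be a whole number from 0 to 100" >&2
      return 1
      ;;
  esac
  if [ "$rate" -gt 100 ]; then
    echo "rate must be a whole number from 0 to 100" >&2
    return 1
  fi
  printf '{"spec":{"default":{"sampling":{"overall":%d}}}}\n' "$rate"
}

sampling_patch 5
# To apply it against the cluster:
#   kubectl patch meshtrace all-traces -n kong-mesh-system \
#     --type merge -p "$(sampling_patch 5)"
```

The patch touches only the sampling field, so the backend endpoint in the policy stays as applied.
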

Reach the collector when passthrough is off

By default, sidecars reach the collector through passthrough mode. If you’ve disabled passthrough on the Mesh, sidecars can no longer reach it that way, and you need to declare the collector with a MeshExternalService.

MeshExternalService requires ZoneEgress and mutual TLS on the mesh. If you already disabled passthrough, you likely already have mTLS enabled.

Declare the collector as a MeshExternalService:

echo "apiVersion: kuma.io/v1alpha1
kind: MeshExternalService
metadata:
  name: otel-collector
  namespace: kong-mesh-system
  labels:
    kuma.io/mesh: default
spec:
  match:
    type: HostnameGenerator
    port: 4317
    protocol: grpc
  endpoints:
    - address: otel-collector.observability
      port: 4317" | kubectl apply -f -

The hostname generator publishes the service under otel-collector.extsvc.mesh.local. Update the three policies to point at that hostname on port 4317 instead of otel-collector.observability:4317.
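
If you keep the three policies in a file, the endpoint swap can be done with a single substitution. This is a minimal sketch: the function name repoint_endpoint and the file name policies.yaml are hypothetical, and it assumes the endpoint strings match the ones used in this guide exactly.

```shell
# repoint_endpoint
# Reads policy YAML on stdin and rewrites the collector endpoint to the
# MeshExternalService hostname. Hypothetical helper for illustration only.
repoint_endpoint() {
  sed 's/otel-collector\.observability:4317/otel-collector.extsvc.mesh.local:4317/g'
}

# Demonstration on a single line of policy YAML.
printf 'endpoint: otel-collector.observability:4317\n' | repoint_endpoint
# To rewrite and apply a saved copy of the policies:
#   repoint_endpoint < policies.yaml | kubectl apply -f -
```
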

Verify the collector

  1. Check that the collector is receiving metrics:

    kubectl logs -n observability -l app=otel-collector --tail=20
    

    With the debug exporter at verbosity: basic, each batch shows up as one line per signal. Metrics flow continuously from Envoy stats, so you should see Metrics lines within a minute or so.

  2. List the collector Pods with their node assignments:

    kubectl get pod -n observability -o wide -l app=otel-collector
    
  3. Inspect the endpoint slice to confirm traffic is going node-local:

    kubectl get endpointslice -n observability -l kubernetes.io/service-name=otel-collector -o yaml
    

    The endpoint slice lists one collector Pod per node, and each node’s kube-proxy only routes to its own entry.
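
To compare the two lists without reading them by eye, you could diff the node names against the nodes that host a collector Pod. The sketch below is one way to do that; the function name nodes_without_collector is hypothetical.

```shell
# nodes_without_collector ALL_NODES COLLECTOR_NODES
# Both arguments are newline-separated node names. Prints every node from the
# first list that does not appear in the second.
# Hypothetical helper for illustration only.
nodes_without_collector() {
  printf '%s\n' "$1" | while IFS= read -r node; do
    [ -n "$node" ] || continue
    # -x matches the whole line, -F treats the node name as a fixed string.
    printf '%s\n' "$2" | grep -qxF -- "$node" || printf '%s\n' "$node"
  done
}

# Against a live cluster:
#   all=$(kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')
#   have=$(kubectl get pod -n observability -l app=otel-collector \
#     -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}')
#   nodes_without_collector "$all" "$have"
```

Empty output means every node has a collector. Nodes with taints the DaemonSet does not tolerate, often control-plane nodes, are likely to show up here, and sidecars on those nodes would have no local collector to reach.
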

Generate traffic

  1. Port-forward the demo app service on port 5050:

    kubectl port-forward svc/demo-app -n kong-mesh-demo 5050:5050
    
  2. Go to http://127.0.0.1:5050 to open the demo app UI.

  3. Enable Auto-increment to generate traffic.

Validate

In a new terminal, re-check the collector logs and confirm all three signals appear:

kubectl logs -n observability -l app=otel-collector --tail=20

You should now see Traces and Logs lines alongside Metrics. If Metrics appears but one of the others is missing, the corresponding MeshTrace or MeshAccessLog policy isn’t matching. In that case, verify the targetRef.
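
Scanning the log tail by eye gets tedious on a busy collector. As a sketch, the helper below counts debug-exporter lines per signal. It assumes the signal name (Traces, Metrics, or Logs) appears as its own whitespace-separated field on each line, which may differ between collector versions; the function name count_signals is hypothetical.

```shell
# count_signals
# Reads collector log lines on stdin and prints how many lines mention each
# signal as a standalone field. Hypothetical helper for illustration only.
count_signals() {
  awk '
    {
      for (i = 1; i <= NF; i++) {
        if ($i == "Traces" || $i == "Metrics" || $i == "Logs") {
          seen[$i]++
          break   # count each line once
        }
      }
    }
    END {
      printf "Traces=%d Metrics=%d Logs=%d\n", seen["Traces"], seen["Metrics"], seen["Logs"]
    }
  '
}

# Against a live cluster:
#   kubectl logs -n observability -l app=otel-collector --tail=200 | count_signals
```

A zero next to Traces or Logs while Metrics is non-zero points at the corresponding policy, as described above.
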

FAQs

If no telemetry shows up in the collector logs, walk back through these checks:

  • Did you apply the policy to the right Mesh?
  • Does the collector Pod’s address match the policy endpoint?
  • Can a debug Pod in a mesh namespace reach otel-collector.observability:4317 on TCP?

You can also run the collector as a Deployment instead of a DaemonSet. Use a Deployment for small and medium clusters, or any cluster where collector throughput isn’t a bottleneck. See OpenTelemetry collector topologies for the trade-offs.

Replace the DaemonSet workload and service in Deploy the collector with a Deployment behind a ClusterIP service:

echo "apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: observability
spec:
  replicas: 2
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib:0.141.0
          args: ['--config=/conf/config.yaml']
          ports:
            - name: otlp-grpc
              containerPort: 4317
            - name: otlp-http
              containerPort: 4318
            - name: prometheus
              containerPort: 8889
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          volumeMounts:
            - name: config
              mountPath: /conf
      volumes:
        - name: config
          configMap:
            name: otel-collector-config
---
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  namespace: observability
spec:
  selector:
    app: otel-collector
  ports:
    - name: otlp-grpc
      port: 4317
      targetPort: otlp-grpc
      appProtocol: grpc
    - name: otlp-http
      port: 4318
      targetPort: otlp-http
    - name: prometheus
      port: 8889
      targetPort: prometheus" | kubectl apply -f -

Sidecars resolve otel-collector.observability:4317 to the Service IP, and Kubernetes forwards traffic to one of the collector replicas. With a Deployment, you can skip the node-local checks (Pod node assignments and endpoint slice) in Verify the collector.
