v3.13+ AI Gateway supports OpenTelemetry instrumentation for generative AI traffic. When the OpenTelemetry (OTEL) plugin is enabled, AI Gateway emits a set of Gen AI-specific attributes on tracing spans. These attributes complement the core tracing instrumentations described in the Kong Gateway tracing guide, giving insight into the Gen AI request lifecycle (inputs, model, and outputs), token usage, and tool/agent interactions.
v3.14+ Agent-to-agent (A2A) traffic is also instrumented via the AI A2A Proxy plugin.
You can export these attributes through Kong's OpenTelemetry plugin or Zipkin plugin to a supported backend such as Jaeger, and use them to:
- Inspect which model or provider handled a request
- Track conversation/session identifiers across requests
- Analyze prompt structure (system vs. user vs. tool messages)
- Evaluate model parameters (such as temperature and top-k)
- Measure tool-call behavior (which tools were invoked, and their metadata)
- Monitor token usage (input vs. output) for cost or performance analysis
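As a rough illustration of the last point, Gen AI span attributes broadly resemble the OpenTelemetry GenAI semantic conventions. The attribute names and values below are representative examples, not an exact or exhaustive list of what AI Gateway emits:

```python
# Hypothetical Gen AI span attributes, loosely following the
# OpenTelemetry GenAI semantic conventions. The exact attribute
# names and values emitted by AI Gateway may differ.
span_attributes = {
    "gen_ai.operation.name": "chat",          # type of Gen AI operation
    "gen_ai.request.model": "gpt-4o",         # model requested
    "gen_ai.request.temperature": 0.7,        # sampling parameter
    "gen_ai.usage.input_tokens": 412,         # prompt-side tokens
    "gen_ai.usage.output_tokens": 128,        # completion-side tokens
}

# Splitting input vs. output tokens supports per-request cost analysis,
# since providers typically price the two sides differently.
total_tokens = (span_attributes["gen_ai.usage.input_tokens"]
                + span_attributes["gen_ai.usage.output_tokens"])
print(total_tokens)  # 540
```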
The span data is sent to the configured OTEL collector endpoint through these existing tracing plugins; no additional exporter configuration is needed in AI Gateway itself.
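As a minimal sketch, enabling the OpenTelemetry plugin in declarative (decK) configuration might look like the following, assuming a local OTLP/HTTP collector at `otel-collector:4318`. Field names can vary across Kong versions (recent releases use `traces_endpoint`; older ones used `endpoint`), so check the plugin reference for your version:

```yaml
# Sketch only: globally enable trace export to an OTLP/HTTP collector.
# The collector hostname and the exact config field names are assumptions.
_format_version: "3.0"
plugins:
  - name: opentelemetry
    config:
      traces_endpoint: http://otel-collector:4318/v1/traces
```

Note that span generation also depends on tracing being enabled in the Kong Gateway configuration (see the Kong Gateway tracing guide for the relevant `kong.conf` settings, such as the tracing instrumentations and sampling rate).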
This page covers span attributes (per-request tracing data). AI Gateway also supports OTLP metrics (aggregated counters and histograms for latency, token usage, cost, and error rates). See the Gen AI OpenTelemetry metrics reference for details.