To set up an OpenTelemetry backend, you need support for OTLP over HTTP with Protobuf encoding. You can:

- Send data directly to an OpenTelemetry-compatible backend that natively supports OTLP over HTTP with Protobuf encoding, such as Jaeger (v1.35.0 or later). This is the simplest setup, since it doesn't require any additional components between the data plane and the backend.
- Use the OpenTelemetry Collector, which acts as an intermediary between the data plane and one or more backends. The Collector can receive all OpenTelemetry signals supported by the OpenTelemetry plugin, including traces, metrics, and logs, and then process, transform, or route that data before exporting it to a compatible backend. This option is useful when you need capabilities such as signal fan-out, filtering, enrichment, batching, or exporting to multiple backends. The Collector supports a wide range of exporters, available in the open-telemetry/opentelemetry-collector-contrib repository. A minimal configuration sketch follows this list.
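For the Collector option, the following is a minimal configuration sketch, not a definitive setup: it assumes the data plane sends OTLP over HTTP to the Collector on port 4318 and that traces are forwarded to a hypothetical OTLP-compatible backend at `backend.example.com`. Adjust the receivers, processors, and exporters to match your environment.

```yaml
# Minimal OpenTelemetry Collector configuration (illustrative sketch).
# All endpoints are placeholders; replace them with your own.
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318            # receive OTLP/HTTP from the data plane

processors:
  batch: {}                               # batch telemetry before exporting

exporters:
  otlphttp:
    endpoint: http://backend.example.com:4318   # hypothetical OTLP/HTTP backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```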
For more details, see the OpenTelemetry and Kong Gateway tracing documentation.
These attributes identify the Gen AI provider and the type of operation requested (such as chat completion or embeddings generation).
| Key | Value Type | Description |
|-----|------------|-------------|
| `gen_ai.operation.name` | string | Operation requested from the provider, such as chat or embeddings. |
| `gen_ai.provider.name` | string | Name of the Generative AI provider handling the request. |
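For example, a chat request routed to OpenAI could carry values like the following. The values are hypothetical, and the JSON map is simply a compact way to show attribute key/value pairs, not an export format:

```json
{
  "gen_ai.operation.name": "chat",
  "gen_ai.provider.name": "openai"
}
```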
These attributes capture model configuration parameters sent with the request. They control generation behavior such as randomness, token limits, and sampling strategies.
| Key | Value Type | Description |
|-----|------------|-------------|
| `gen_ai.request.choice.count` | int | Number of result candidates requested in a response. |
| `gen_ai.request.encoding_formats` | string[] | Requested encoding formats for embeddings results. |
| `gen_ai.request.frequency_penalty` | double | Penalty that reduces repetition of frequent tokens. |
| `gen_ai.request.max_tokens` | int | Maximum number of tokens the model may generate. |
| `gen_ai.request.model` | string | Model name targeted by the request. |
| `gen_ai.request.presence_penalty` | double | Penalty that discourages tokens that have already appeared, encouraging new content. |
| `gen_ai.request.seed` | int | Seed value that increases response reproducibility. |
| `gen_ai.request.stop_sequences` | string[] | Token sequences that stop further generation. |
| `gen_ai.request.temperature` | double | Randomness factor for generated results. |
| `gen_ai.request.top_k` | double | Top-k sampling setting that limits the number of candidate tokens considered. |
| `gen_ai.request.top_p` | double | Probability threshold applied during nucleus sampling. |
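As an illustration, a request that caps output length and lowers randomness might be annotated like this (hypothetical values, shown as an attribute map):

```json
{
  "gen_ai.request.model": "gpt-4o",
  "gen_ai.request.max_tokens": 1024,
  "gen_ai.request.temperature": 0.2,
  "gen_ai.request.top_p": 0.9,
  "gen_ai.request.stop_sequences": ["END"]
}
```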
These attributes contain the actual input and output messages exchanged with the model, along with output format specifications and system-level instructions. Payload attributes are only emitted when payload logging is enabled.
The gen_ai.input.messages and gen_ai.output.messages attributes log full request and response payloads. These may contain personally identifiable information (PII), credentials, or other sensitive data.
Make sure your tracing backend has appropriate access controls and retention policies before enabling payload logging.
Attributes with the any type contain JSON-serialized objects. The structure follows the message format of the underlying provider API (for example, OpenAI’s chat completion message schema).
| Key | Value Type | Description |
|-----|------------|-------------|
| `gen_ai.input.messages` | any | Structured messages sent as input when payload logging is enabled. |
| `gen_ai.output.messages` | any | Structured messages returned by the model when payload logging is enabled. |
| `gen_ai.output.type` | string | Requested output format, such as text or JSON. |
| `gen_ai.system_instructions` | string | System-level instructions provided to steer model behavior. |
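With payload logging enabled, a simple chat exchange could be captured roughly as follows. The message structure mirrors OpenAI's chat completion schema, as noted above; the content is invented for illustration:

```json
{
  "gen_ai.input.messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is OTLP?"}
  ],
  "gen_ai.output.messages": [
    {"role": "assistant", "content": "OTLP is the OpenTelemetry Protocol, used to transport telemetry data."}
  ],
  "gen_ai.output.type": "text"
}
```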
These attributes capture metadata from the model’s response, including token consumption metrics used for cost analysis and performance monitoring.
| Key | Value Type | Description |
|-----|------------|-------------|
| `gen_ai.response.finish_reasons` | string[] | Reasons returned for why token generation stopped. |
| `gen_ai.response.id` | string | Unique identifier assigned to the completion by the provider. |
| `gen_ai.response.model` | string | Model name reported by the provider in the response. |
| `gen_ai.usage.input_tokens` | int | Number of tokens processed as input to the model. |
| `gen_ai.usage.output_tokens` | int | Number of tokens generated by the model in the response. |
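A completed response might then add metadata such as the following (hypothetical values). Summing `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` across spans is a common starting point for cost dashboards:

```json
{
  "gen_ai.response.id": "chatcmpl-abc123",
  "gen_ai.response.model": "gpt-4o-2024-08-06",
  "gen_ai.response.finish_reasons": ["stop"],
  "gen_ai.usage.input_tokens": 42,
  "gen_ai.usage.output_tokens": 180
}
```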
These attributes provide context for advanced Gen AI features such as tool calling, agent-based architectures, and data source grounding.
| Key | Value Type | Description |
|-----|------------|-------------|
| `gen_ai.agent.description` | string | Description of the agent's purpose or role. |
| `gen_ai.agent.id` | string | Identifier representing the application-defined agent. |
| `gen_ai.token.type` | string | Type of token being counted, such as input or output. |
| `gen_ai.tool.call.id` | string | Unique identifier assigned to a tool call from the model. |
| `gen_ai.tool.description` | string | Description of the tool being invoked. |
| `gen_ai.tool.name` | string | Name of the tool invoked by the model. |
| `gen_ai.tool.type` | string | Type of tool invoked, such as function. |
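For instance, a span for a model-initiated function call might include attributes like these (all names and IDs are hypothetical):

```json
{
  "gen_ai.tool.type": "function",
  "gen_ai.tool.name": "get_weather",
  "gen_ai.tool.description": "Look up the current weather for a city",
  "gen_ai.tool.call.id": "call_abc123"
}
```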