In Kong Gateway, metrics are natively supported by the OpenTelemetry plugin.
You can send metrics using the parameters under config.metrics.
To collect AI OTel metrics for AI Gateway, use the OpenTelemetry plugin together with either the AI Proxy or the AI Proxy Advanced plugin. See the Gen AI OpenTelemetry metrics reference for details.
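As a rough sketch of what such a configuration could look like, the Python snippet below builds a plugin payload as a plain dictionary. Only the config.metrics block and the enable_request_metrics and enable_latency_metrics flags are named on this page. Placing those flags directly under config.metrics, the endpoint key, and the collector URL are assumptions made for illustration, so check the plugin's configuration reference before relying on them.

```python
import json


def build_otel_metrics_plugin(collector_url: str) -> dict:
    """Return a hypothetical OpenTelemetry plugin payload with metrics enabled."""
    return {
        "name": "opentelemetry",
        "config": {
            "metrics": {
                # Assumption: the key name "endpoint" is illustrative only.
                "endpoint": collector_url,
                # Flags mentioned on this page; their placement here is assumed.
                "enable_request_metrics": True,
                "enable_latency_metrics": True,
            }
        },
    }


# The collector URL is a placeholder, not a real endpoint.
payload = build_otel_metrics_plugin("http://otel-collector.example:4318/v1/metrics")
print(json.dumps(payload, indent=2))
```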
The following metrics are exposed:
http.server.request.count
Total number of incoming HTTP requests.
- Instrument unit: {request}
- Instrument type: Sum
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.response.source: Origin of the current response. Possible values:
    - upstream, if the response originated by successfully contacting the upstream service
    - kong, otherwise
  - kong.workspace.name: Name of the Workspace.
  - http.request.method: Method used in the HTTP request.
  - kong.response.status_code: HTTP status code of the response.
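To illustrate how these attributes can be used once the counter reaches a backend, the following sketch groups request counts by kong.response.source and computes the share of 5xx responses for each source. The data-point dictionaries are made-up sample values, not real exporter output, and the helper name is hypothetical.

```python
from collections import defaultdict


def server_error_share_by_source(points):
    """Return {source: fraction of requests that ended with a 5xx status code}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for point in points:
        attrs = point["attributes"]
        source = attrs["kong.response.source"]
        totals[source] += point["value"]
        if 500 <= int(attrs["kong.response.status_code"]) <= 599:
            errors[source] += point["value"]
    # Skip sources with no recorded requests to avoid dividing by zero.
    return {src: errors[src] / total for src, total in totals.items() if total}


# Sample data points (invented for this example).
sample_points = [
    {"attributes": {"kong.response.source": "upstream", "kong.response.status_code": 200}, "value": 90},
    {"attributes": {"kong.response.source": "upstream", "kong.response.status_code": 502}, "value": 10},
    {"attributes": {"kong.response.source": "kong", "kong.response.status_code": 429}, "value": 5},
]
shares = server_error_share_by_source(sample_points)
print(shares)
```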
kong.latency.total
Complete end-to-end duration of a request in seconds.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
kong.latency.internal
Kong’s internal processing time in seconds, from when the Gateway receives the request from the client to when it sends the request to the upstream service.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
kong.latency.upstream
Upstream processing time in seconds, from when the Gateway sends the request to the upstream, to when the data is returned to Kong.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
http.server.request.size
Size of each incoming HTTP request in bytes.
- Instrument unit: By
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.workspace.name: Name of the Workspace.
http.server.response.size
Total size of the HTTP response sent back to the client in bytes.
- Instrument unit: By
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.workspace.name: Name of the Workspace.
kong.shared_dict.usage
Current memory usage of a shared dict in bytes.
- Instrument unit: By
- Instrument type: Gauge
- Attributes:
  - kong.shared_dict.name: Name of the shared dict.
  - kong.subsystem: Nginx subsystem that produced the metric. Possible values: http, stream
kong.shared_dict.size
Total memory size of a shared dict in bytes.
- Instrument unit: By
- Instrument type: Gauge
- Attributes:
  - kong.shared_dict.name: Name of the shared dict.
  - kong.subsystem: Nginx subsystem that produced the metric. Possible values: http, stream
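The two shared dict gauges can be combined into a utilization ratio. The sketch below assumes the latest values of both gauges have already been collected into dictionaries keyed by (kong.shared_dict.name, kong.subsystem). The helper name, the dict name, and the byte counts are sample values chosen for illustration.

```python
def shared_dict_utilization(usage_by_key, size_by_key):
    """Return {(dict_name, subsystem): usage / size} for keys present in both inputs."""
    ratios = {}
    for key, size in size_by_key.items():
        # Ignore dicts with no usage sample or a non-positive reported size.
        if key in usage_by_key and size > 0:
            ratios[key] = usage_by_key[key] / size
    return ratios


# Sample gauge values in bytes (invented for this example).
usage = {("kong_db_cache", "http"): 32 * 1024 * 1024}
size = {("kong_db_cache", "http"): 128 * 1024 * 1024}
utilization = shared_dict_utilization(usage, size)
print(utilization)
```

A ratio close to 1 suggests the shared dict is nearly full.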
kong.memory.workers.lua_vm
Memory used by the worker’s Lua VM in bytes.
- Instrument unit: By
- Instrument type: Gauge
- Attributes:
  - kong.pid: Worker process ID.
  - kong.subsystem: Nginx subsystem that produced the metric. Possible values: http, stream
kong.nginx.connection.count
Number of client connections in Nginx.
- Instrument unit: {connection}
- Instrument type: Gauge
- Attributes:
  - kong.subsystem: Nginx subsystem that produced the metric. Possible values: http, stream
  - kong.connection.state: State of the client connection. Possible values: accepted, handled, total, active, reading, writing, waiting
kong.nginx.timer.count
Number of internal scheduled timers Nginx is running in the background.
- Instrument unit: {timer}
- Instrument type: Gauge
- Attributes:
  - kong.timer.state: State of the timer. Possible values: pending, running
kong.db.connection.status
Shows whether Kong has an active database connection. A value of 1 means connected. A value of 0 means not connected.
- Instrument unit: 1
- Instrument type: Gauge
- No attributes
kong.cp.connection.status
Shows whether the data plane has an active connection to the control plane. A value of 1 means connected. A value of 0 means not connected.
- Instrument unit: 1
- Instrument type: Gauge
- No attributes
kong.upstream.target.status
Health of an Upstream Target. The actual status is reported in the kong.upstream.state attribute, and the metric value is set to 1 when a state is populated.
- Instrument unit: 1
- Instrument type: Gauge
- Attributes:
  - kong.upstream.name: Name of the Upstream.
  - kong.target.address: Address of the Target.
  - server.address: Address of the server.
  - kong.upstream.state: Health of the Upstream Target. Possible values: healthy, unhealthy, dns_error
  - kong.subsystem: Nginx subsystem that produced the metric. Possible values: http, stream
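Because the health information lives in the kong.upstream.state attribute rather than in the metric value, a consumer has to read that attribute from each data point whose value is 1. A minimal sketch, using made-up sample data points and a hypothetical helper name:

```python
def unhealthy_targets(points):
    """Return sorted (upstream, target, state) tuples for targets not reported as healthy."""
    flagged = []
    for point in points:
        attrs = point["attributes"]
        state = attrs["kong.upstream.state"]
        # The value only signals that a state is populated; the state is the attribute.
        if point["value"] == 1 and state != "healthy":
            flagged.append((attrs["kong.upstream.name"], attrs["kong.target.address"], state))
    return sorted(flagged)


# Sample data points (invented for this example).
sample_points = [
    {"attributes": {"kong.upstream.name": "orders", "kong.target.address": "10.0.0.1:8080",
                    "kong.upstream.state": "healthy"}, "value": 1},
    {"attributes": {"kong.upstream.name": "orders", "kong.target.address": "10.0.0.2:8080",
                    "kong.upstream.state": "unhealthy"}, "value": 1},
    {"attributes": {"kong.upstream.name": "billing", "kong.target.address": "billing.internal:443",
                    "kong.upstream.state": "dns_error"}, "value": 1},
]
flagged = unhealthy_targets(sample_points)
print(flagged)
```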
kong.dp.cluster_cert.expiry
Timestamp when the data plane’s cluster certificate will expire.
- Instrument unit: s
- Instrument type: Gauge
- No attributes
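For alerting, an expiry gauge like this one is usually compared with the current time. The sketch below assumes the gauge value is a Unix epoch timestamp in seconds (the page gives the unit as s but does not state the epoch explicitly). The function name and the sample timestamps are hypothetical.

```python
import time

SECONDS_PER_DAY = 86_400


def days_until_expiry(expiry_epoch_s, now_epoch_s=None):
    """Return the number of days between now and the expiry timestamp (negative if expired)."""
    if now_epoch_s is None:
        now_epoch_s = time.time()
    return (expiry_epoch_s - now_epoch_s) / SECONDS_PER_DAY


# Sample values: the expiry is exactly 864,000 seconds after the reference time.
remaining = days_until_expiry(1_700_864_000, now_epoch_s=1_700_000_000)
print(remaining)
```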
kong.db.entity.count
Number of entities stored in Kong’s database.
- Instrument unit: {entity}
- Instrument type: Gauge
- No attributes
kong.db.entity.error.count
Number of errors seen during database entity count collection.
- Instrument unit: {error}
- Instrument type: Sum
- No attributes
kong.ee.license.signature
Last 8 bytes of the Enterprise license signature as a number.
- Instrument type: Gauge
- No attributes
kong.ee.license.expiration
Unix epoch time when the license expires, shifted by 24 hours to avoid timezone differences.
- Instrument unit: s
- Instrument type: Gauge
- No attributes
kong.ee.license.features
Indicates whether the data plane can read or write entities under the current license.
Each capability (ee_entity_read and ee_entity_write) is reported as its own metric, where 1 means allowed and 0 means not allowed.
- Instrument unit: 1
- Instrument type: Gauge
- Attributes:
  - kong.ee.license.feature: Enterprise feature. Possible values: ee_entity_read, ee_entity_write
kong.ee.license.error.count
Number of errors that occurred while collecting license information.
- Instrument unit: {error}
- Instrument type: Sum
- No attributes
kong.websocket.server.total_connections v3.14+
Total number of incoming WebSocket connections, including active and previously closed connections.
- Instrument unit: {request}
- Instrument type: Sum
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
kong.websocket.server.active_connections v3.14+
Number of currently active WebSocket connections.
- Instrument unit: {request}
- Instrument type: Gauge
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
kong.websocket.server.received.message.size v3.14+
Size of incoming WebSocket message bodies in bytes, from the client.
- Instrument unit: By
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
kong.websocket.server.sent.message.size v3.14+
Size of WebSocket messages sent back to the client in bytes.
- Instrument unit: By
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
kong.websocket.handshake.error.count v3.14+
Number of errors seen during the WebSocket handshake.
- Instrument unit: {error}
- Instrument type: Sum
- Attributes:
  - error.type: Type of error that occurred.
kong.websocket.connection.duration v3.14+
Duration of successfully established WebSocket connections in seconds.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
gen_ai.client.operation.duration v3.14+
Total time Kong spends processing a Gen AI operation, such as an LLM request. Requires enable_request_metrics to populate the error.type attribute.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
  - error.type: Type of error that occurred.
gen_ai.server.request.duration v3.14+
Time the LLM provider spends processing the request. Requires enable_latency_metrics set to true. Requires enable_request_metrics to populate the error.type attribute.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
  - error.type: Type of error that occurred.
gen_ai.client.token.usage v3.14+
Number of tokens consumed by the Gen AI operation.
- Instrument unit: {token}
- Instrument type: Sum
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.token.type: Token category, one of input, output, or total.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
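When aggregating this counter, it helps to keep the gen_ai.token.type values separate, since total is reported as its own category alongside input and output. The sketch below sums sample data points per token type. The data points, the model name, and the helper name are invented for illustration.

```python
from collections import Counter


def tokens_by_type(points):
    """Return {token_type: summed token count} without mixing the categories."""
    totals = Counter()
    for point in points:
        totals[point["attributes"]["gen_ai.token.type"]] += point["value"]
    return dict(totals)


# Sample data points (invented for this example).
sample_points = [
    {"attributes": {"gen_ai.request.model": "example-model", "gen_ai.token.type": "input"}, "value": 120},
    {"attributes": {"gen_ai.request.model": "example-model", "gen_ai.token.type": "output"}, "value": 80},
    {"attributes": {"gen_ai.request.model": "example-model", "gen_ai.token.type": "total"}, "value": 200},
    {"attributes": {"gen_ai.request.model": "example-model", "gen_ai.token.type": "input"}, "value": 30},
]
usage_by_type = tokens_by_type(sample_points)
print(usage_by_type)
```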
gen_ai.server.time_to_first_token v3.14+
Time from when the model server receives the request until the first output token is generated.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
gen_ai.server.time_per_output_token v3.14+
Time between successive output tokens generated by the model server after the first token.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
kong.gen_ai.llm.cost v3.14+
Cost of AI requests.
- Instrument unit: {cost}
- Instrument type: Sum
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.gen_ai.cache.status: Cache status, either hit or empty if not cached.
  - kong.gen_ai.vector_db: Vector database used for caching, such as redis.
  - kong.gen_ai.embeddings.provider: Embeddings provider used for caching.
  - kong.gen_ai.embeddings.model: Embeddings model used for caching.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
kong.gen_ai.cache.fetch.latency v3.14+
Time to fetch a response from the semantic cache.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.gen_ai.cache.status: Cache status, either hit or empty if not cached.
  - kong.gen_ai.vector_db: Vector database used for caching, such as redis.
  - kong.gen_ai.embeddings.provider: Embeddings provider used for caching.
  - kong.gen_ai.embeddings.model: Embeddings model used for caching.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
kong.gen_ai.cache.embeddings.latency v3.14+
Time to generate embeddings during cache operations.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.gen_ai.cache.status: Cache status, either hit or empty if not cached.
  - kong.gen_ai.vector_db: Vector database used for caching, such as redis.
  - kong.gen_ai.embeddings.provider: Embeddings provider used for caching.
  - kong.gen_ai.embeddings.model: Embeddings model used for caching.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
kong.gen_ai.rag.fetch.latency v3.14+
Time to fetch data from a RAG source.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.gen_ai.cache.status: Cache status, either hit or empty if not cached.
  - kong.gen_ai.vector_db: Vector database used for caching, such as redis.
  - kong.gen_ai.embeddings.provider: Embeddings provider used for caching.
  - kong.gen_ai.embeddings.model: Embeddings model used for caching.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
kong.gen_ai.rag.embeddings.latency v3.14+
Time to generate embeddings for RAG operations.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - gen_ai.provider.name: Name of the Gen AI provider.
  - gen_ai.request.model: Model name targeted by the request.
  - gen_ai.response.model: Model name reported by the provider in the response.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
  - kong.gen_ai.cache.status: Cache status, either hit or empty if not cached.
  - kong.gen_ai.vector_db: Vector database used for caching, such as redis.
  - kong.gen_ai.embeddings.provider: Embeddings provider used for caching.
  - kong.gen_ai.embeddings.model: Embeddings model used for caching.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
  - kong.gen_ai.request.mode: Request mode, one of oneshot, stream, or realtime.
kong.gen_ai.aws.guardrails.latency v3.14+
Time for AWS Guardrails to process a request.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - kong.gen_ai.aws.guardrails.id: ID of the AWS Guardrails configuration.
  - kong.gen_ai.aws.guardrails.version: Version of the AWS Guardrails configuration.
  - kong.gen_ai.aws.guardrails.mode: Mode of the AWS Guardrails evaluation.
  - kong.gen_ai.aws.guardrails.region: AWS region of the Guardrails service.
  - kong.workspace.name: Name of the Workspace.
  - kong.auth.consumer.name: Name of the authenticated Consumer.
kong.gen_ai.mcp.response.size v3.14+
Size of the MCP response body in bytes.
- Instrument unit: By
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - mcp.method.name: MCP method name, such as tools/call.
  - gen_ai.tool.name: Name of the MCP tool invoked.
kong.gen_ai.mcp.request.error.count v3.14+
Number of MCP request errors.
- Instrument unit: {error}
- Instrument type: Sum
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - mcp.method.name: MCP method name, such as tools/call.
  - gen_ai.tool.name: Name of the MCP tool invoked.
  - error.type: Type of error that occurred.
mcp.client.operation.duration v3.14+
Duration of the MCP request as observed by the sender. Only available when the AI MCP Proxy plugin is in passthrough-listener mode. Requires enable_latency_metrics set to true.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - mcp.method.name: MCP method name, such as tools/call.
  - gen_ai.tool.name: Name of the MCP tool invoked.
  - error.type: Type of error that occurred.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
mcp.server.operation.duration v3.14+
Duration of the MCP request as observed by the receiver.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - mcp.method.name: MCP method name, such as tools/call.
  - gen_ai.tool.name: Name of the MCP tool invoked.
  - error.type: Type of error that occurred.
  - gen_ai.operation.name: Operation requested, such as chat or embeddings.
kong.gen_ai.mcp.acl.allowed v3.14+
Number of MCP requests allowed by ACL rules.
- Instrument unit: {request}
- Instrument type: Sum
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.gen_ai.mcp.primitive: MCP primitive type, such as tool.
  - kong.gen_ai.mcp.primitive_name: Name of the MCP primitive.
kong.gen_ai.mcp.acl.denied v3.14+
Number of MCP requests denied by ACL rules.
- Instrument unit: {request}
- Instrument type: Sum
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.gen_ai.mcp.primitive: MCP primitive type, such as tool.
  - kong.gen_ai.mcp.primitive_name: Name of the MCP primitive.
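The allowed and denied counters share the same attributes, so they can be combined into a denial ratio per MCP primitive. A minimal sketch, assuming the two counter values for one attribute set have already been read from a backend. The function name and the numbers are hypothetical.

```python
def acl_denial_ratio(allowed: int, denied: int) -> float:
    """Return the fraction of ACL decisions that were denials, or 0.0 with no traffic."""
    total = allowed + denied
    return denied / total if total else 0.0


# Sample counter values for a single MCP primitive (invented for this example).
ratio = acl_denial_ratio(allowed=75, denied=25)
print(ratio)
```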
kong.gen_ai.a2a.request.count v3.14+
Total number of A2A requests.
- Instrument unit: {request}
- Instrument type: Sum
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.gen_ai.a2a.method: A2A method name.
  - kong.gen_ai.a2a.binding: A2A binding type.
kong.gen_ai.a2a.request.duration v3.14+
Duration of an A2A request in seconds.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.gen_ai.a2a.method: A2A method name.
  - kong.gen_ai.a2a.binding: A2A binding type.
kong.gen_ai.a2a.response.size v3.14+
Size of the A2A response body in bytes.
- Instrument unit: By
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.gen_ai.a2a.method: A2A method name.
  - kong.gen_ai.a2a.binding: A2A binding type.
kong.gen_ai.a2a.ttfb v3.14+
Time to first byte for A2A streaming responses in seconds.
- Instrument unit: s
- Instrument type: Histogram
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.gen_ai.a2a.method: A2A method name.
  - kong.gen_ai.a2a.binding: A2A binding type.
kong.gen_ai.a2a.request.error.count v3.14+
Number of A2A request errors.
- Instrument unit: {error}
- Instrument type: Sum
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.gen_ai.a2a.method: A2A method name.
  - kong.gen_ai.a2a.binding: A2A binding type.
  - kong.gen_ai.a2a.error.type: Type of the A2A error.
kong.gen_ai.a2a.task.state.count v3.14+
Number of A2A task state transitions.
- Instrument unit: {state}
- Instrument type: Sum
- Attributes:
  - kong.service.name: Name of the Gateway Service.
  - kong.route.name: Name of the Route.
  - kong.workspace.name: Name of the Workspace.
  - kong.gen_ai.a2a.task.state: Task state, such as completed, failed, or in_progress.