The plugin runs in four phases:
- Access phase: Detects A2A protocol binding (JSON-RPC or REST). Starts an OpenTelemetry span when config.logging.log_statistics is enabled. Records the request body for payload logging when config.logging.log_payloads is enabled.
- Header filter phase: Detects streaming responses (Content-Type: text/event-stream) and records time to first byte (TTFB). Buffers agent card responses for URL rewriting.
- Body filter phase: Streams SSE chunks through to the client without buffering, preserving low latency. Buffers non-streaming responses to extract task metadata. Rewrites agent card URLs to the gateway address. Emits analytics data to the Konnect pipeline at end-of-response.
- Log phase: Finalizes the OpenTelemetry span with task state, task ID, and error information.

sequenceDiagram
    autonumber
    participant Client as A2A Client
    participant Kong as Kong Gateway<br/>(AI A2A Proxy)
    participant Agent as Upstream A2A Agent
    Client->>Kong: A2A request (JSON-RPC or REST)
    Note over Kong: Detect A2A binding and method<br/>Start OTel span (if logging enabled)
    Kong->>Agent: Proxied request<br/>(Accept-Encoding removed if logging enabled)
    alt Streaming response (SSE)
        Agent-->>Kong: text/event-stream chunks
        Note over Kong: Pass through each chunk<br/>Count SSE events, track TTFB
        Kong-->>Client: SSE chunks (unchanged)
        Note over Kong: On final chunk:<br/>Extract task state, set analytics
    else Non-streaming response
        Agent->>Kong: JSON response
        Note over Kong: Buffer response<br/>Extract task metadata
        Kong->>Client: Response (unchanged)
    end
    Note over Kong: Finish OTel span<br/>Emit ai.a2a metrics to log plugins

The plugin auto-detects A2A traffic without requiring explicit configuration per route. It inspects each request and applies A2A processing only when a match is found; non-A2A traffic passes through without overhead. Detection works across two protocol bindings:
REST binding. The plugin detects A2A endpoints by path suffix and HTTP method. The match anchors to the end of the request path, so any prefix added by the Kong Route is ignored. For example, both /v1/message:send and /api/agents/v1/message:send match SendMessage:
| HTTP method | Path suffix | A2A operation | Canonical method |
|---|---|---|---|
| POST | /v1/message:send | SendMessage | message/send |
| POST | /v1/message:stream | SendStreamingMessage | message/stream |
| GET | /.well-known/agent-card.json | GetAgentCard | agent/getCard |
| GET | /v1/extendedAgentCard | GetExtendedAgentCard | agent/getExtendedAgentCard |
| GET | /v1/tasks/{id} | GetTask | tasks/get |
| GET | /v1/tasks | ListTasks | tasks/list |
| POST | /v1/tasks/{id}:cancel | CancelTask | tasks/cancel |
| POST | /v1/tasks/{id}:subscribe | SubscribeToTask | tasks/resubscribe |
| POST | /v1/tasks | ListTasks | tasks/list |

The canonical method name is used in OTel span attributes and log output.
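The suffix matching described above can be illustrated with a minimal Python sketch. This is not the plugin's implementation: the `REST_ROUTES` list, the `detect_rest` function, and the assumption that a task ID contains no `/` or `:` are hypothetical, chosen only to mirror the rows of the table.

```python
import re

# Hypothetical pattern for a task ID: assumed to contain no "/" or ":".
_ID = r"[^/:]+"

# (HTTP method, path-suffix regex, A2A operation, canonical method), one per table row.
REST_ROUTES = [
    ("POST", r"/v1/message:send", "SendMessage", "message/send"),
    ("POST", r"/v1/message:stream", "SendStreamingMessage", "message/stream"),
    ("GET", r"/\.well-known/agent-card\.json", "GetAgentCard", "agent/getCard"),
    ("GET", r"/v1/extendedAgentCard", "GetExtendedAgentCard", "agent/getExtendedAgentCard"),
    ("GET", rf"/v1/tasks/{_ID}", "GetTask", "tasks/get"),
    ("GET", r"/v1/tasks", "ListTasks", "tasks/list"),
    ("POST", rf"/v1/tasks/{_ID}:cancel", "CancelTask", "tasks/cancel"),
    ("POST", rf"/v1/tasks/{_ID}:subscribe", "SubscribeToTask", "tasks/resubscribe"),
    ("POST", r"/v1/tasks", "ListTasks", "tasks/list"),
]

# Anchor every pattern to the END of the path so any route prefix is ignored.
_COMPILED = [(m, re.compile(p + r"$"), op, canon) for m, p, op, canon in REST_ROUTES]


def detect_rest(method, path):
    """Return (operation, canonical_method) for an A2A REST request, else None.

    `path` is assumed to be the request path without a query string.
    """
    for route_method, pattern, operation, canonical in _COMPILED:
        if route_method == method.upper() and pattern.search(path):
            return operation, canonical
    return None  # not A2A traffic: pass through untouched
```

Under these assumptions, `detect_rest("POST", "/api/agents/v1/message:send")` and `detect_rest("POST", "/v1/message:send")` both resolve to `SendMessage`, while a path such as `/health` yields `None`.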
JSON-RPC binding. Detected by the "jsonrpc" field in the request body, combined with a recognized method name or an A2A-Version request header. Recognized methods: message/send, message/stream, tasks/get, tasks/list, tasks/cancel, tasks/resubscribe, tasks/pushNotificationConfig/set, tasks/pushNotificationConfig/get, tasks/pushNotificationConfig/list, tasks/pushNotificationConfig/delete, agent/getExtendedAgentCard.
A request carrying an A2A-Version header is treated as JSON-RPC even if the method is not in the known list. When an unrecognized method is accepted this way, the method field in log output is recorded as "unknown" to bound metric cardinality. The OTel span’s kong.a2a.operation attribute still receives the actual method name.
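The JSON-RPC detection rules can be sketched roughly as follows. The function name `detect_jsonrpc` and the shape of its return value are hypothetical and only illustrate the behavior described above; the plugin's internals may differ.

```python
import json

# Recognized method names, copied from the list above.
KNOWN_METHODS = frozenset({
    "message/send", "message/stream",
    "tasks/get", "tasks/list", "tasks/cancel", "tasks/resubscribe",
    "tasks/pushNotificationConfig/set", "tasks/pushNotificationConfig/get",
    "tasks/pushNotificationConfig/list", "tasks/pushNotificationConfig/delete",
    "agent/getExtendedAgentCard",
})


def detect_jsonrpc(body, headers):
    """Return a dict describing the A2A JSON-RPC call, or None if not A2A.

    `body` is the raw request body (bytes or str); `headers` maps header
    names to values. Both parameter shapes are assumptions of this sketch.
    """
    try:
        payload = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return None
    if not isinstance(payload, dict) or "jsonrpc" not in payload:
        return None

    method = payload.get("method")
    has_version = any(name.lower() == "a2a-version" for name in headers)

    if isinstance(method, str) and method in KNOWN_METHODS:
        return {"log_method": method, "span_operation": method}
    if has_version:
        # Unrecognized method accepted via the header: logs record "unknown"
        # to bound metric cardinality; the span keeps the actual name.
        return {"log_method": "unknown", "span_operation": method}
    return None
```

For example, a body with `"method": "custom/op"` and an `A2A-Version` header would produce `log_method` of `"unknown"` and `span_operation` of `"custom/op"`; the same body without the header would not be treated as A2A.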
When an upstream agent returns an agent card document at /.well-known/agent-card.json, the plugin rewrites the url field — and any additionalInterfaces[].url fields — to the gateway address. A2A clients then discover the gateway endpoint rather than the upstream agent’s direct address.
The rewrite uses X-Forwarded-* headers to construct the correct scheme, host, and port when Kong is deployed behind a load balancer or reverse proxy.
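As an illustration only, the rewrite could look like the following Python sketch. The helper names `gateway_base_url` and `rewrite_agent_card`, the `route_path` parameter, the fallback to the `Host` header, and the choice to omit default ports are all assumptions; the source does not specify how the gateway address is assembled beyond its use of X-Forwarded-* headers.

```python
import copy


def gateway_base_url(headers, default_scheme="http", default_host="localhost"):
    """Build scheme://host[:port] from X-Forwarded-* headers (hypothetical helper)."""
    h = {name.lower(): value for name, value in headers.items()}
    scheme = h.get("x-forwarded-proto", default_scheme)
    host = h.get("x-forwarded-host") or h.get("host", default_host)
    port = h.get("x-forwarded-port")
    default_port = {"http": "80", "https": "443"}.get(scheme)
    # Append the port only when it is non-default and the host has none yet.
    if port and ":" not in host and port != default_port:
        host = f"{host}:{port}"
    return f"{scheme}://{host}"


def rewrite_agent_card(card, headers, route_path=""):
    """Return a copy of the agent card whose URLs point at the gateway.

    `route_path` is an assumed parameter for the gateway-side path prefix.
    """
    base = gateway_base_url(headers) + route_path
    rewritten = copy.deepcopy(card)  # leave the upstream document untouched
    if "url" in rewritten:
        rewritten["url"] = base
    for interface in rewritten.get("additionalInterfaces", []):
        if isinstance(interface, dict) and "url" in interface:
            interface["url"] = base
    return rewritten
```

With `X-Forwarded-Proto: https` and `X-Forwarded-Host: api.example.com`, a card advertising `http://agent.internal:9000/` would come back advertising `https://api.example.com` plus whatever route path is supplied.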
For responses with Content-Type: text/event-stream, the plugin passes SSE chunks through to the client without buffering, preserving low latency. It counts SSE events and extracts task state from the final event for analytics. TTFB is measured from request receipt to the first response header.
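The pass-through behavior can be pictured with a small Python sketch: each chunk is returned to the caller unchanged while a side buffer tracks event boundaries. The `SSETracker` class is hypothetical, and the location of the task state (`result.status.state`, or `status.state` when there is no JSON-RPC envelope) is an assumption about the event payload rather than something stated above.

```python
import json


class SSETracker:
    """Counts SSE events and remembers the last data payload (illustrative only)."""

    def __init__(self):
        self.event_count = 0
        self._buf = b""
        self._last_data = None

    def feed(self, chunk):
        """Inspect a chunk and return it unchanged for forwarding to the client."""
        # Normalize CRLF so events are always delimited by a blank line ("\n\n").
        self._buf = (self._buf + chunk).replace(b"\r\n", b"\n")
        while b"\n\n" in self._buf:
            raw, self._buf = self._buf.split(b"\n\n", 1)
            data_lines = []
            for line in raw.split(b"\n"):
                if line.startswith(b"data:"):
                    value = line[5:]
                    # SSE strips a single leading space after the colon.
                    data_lines.append(value[1:] if value.startswith(b" ") else value)
            if data_lines:
                self.event_count += 1
                self._last_data = b"\n".join(data_lines).decode("utf-8", errors="replace")
        return chunk

    def final_task_state(self):
        """Extract the task state from the last event seen, or None."""
        if self._last_data is None:
            return None
        try:
            event = json.loads(self._last_data)
        except ValueError:
            return None
        if not isinstance(event, dict):
            return None
        result = event.get("result", event)  # assumed JSON-RPC envelope
        status = result.get("status") if isinstance(result, dict) else None
        return status.get("state") if isinstance(status, dict) else None
```

Only the partial, not-yet-terminated event is held in memory by the tracker; the client still receives every chunk as soon as it arrives, which is the property the pass-through design is meant to preserve.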