AI A2A Proxy

AI License Required
Made by: Kong Inc.
Supported Gateway Topologies: hybrid, db-less, traditional
Supported Konnect Deployments: hybrid, cloud-gateways, serverless
Compatible Protocols: grpc, grpcs, http, https
Minimum Version: Kong Gateway 3.14
AI Gateway Enterprise: This plugin is only available as part of our AI Gateway Enterprise offering.

The AI A2A Proxy plugin provides observability and control for Agent-to-Agent (A2A) protocol traffic routed through AI Gateway. It detects and processes A2A requests using both JSON-RPC and REST protocol bindings, rewrites agent card URLs to the gateway address, and feeds structured A2A metrics into the Konnect analytics pipeline and OpenTelemetry tracing.

The plugin operates as a transparent proxy. It does not modify request routing, aggregate responses, or manage task state. It changes traffic in only two ways: when config.logging.log_statistics is enabled, it removes the Accept-Encoding request header so that upstream responses arrive uncompressed, and it rewrites the url field of agent card responses to the AI Gateway address. All other traffic passes through without modification.

Core protocol elements

The A2A protocol defines the following fundamental communication elements between agents. The plugin surfaces data tied to these elements in its log output and OpenTelemetry spans.

| Element | Description | Purpose |
|---|---|---|
| Agent Card | A JSON metadata document describing an agent’s identity, capabilities, endpoint, skills, and authentication requirements. | Enables clients to discover agents and understand how to interact with them. |
| Task | A stateful unit of work initiated by an agent, with a unique ID and defined lifecycle. | Tracks long-running operations and supports multi-turn interactions. |
| Message | A single turn of communication between a client and an agent, containing content and a role (user or agent). | Conveys instructions, context, questions, answers, or status updates that are not formal artifacts. |
| Part | The fundamental content container (for example, TextPart, FilePart, DataPart) used within messages and artifacts. | Provides flexibility for agents to exchange different content types within messages and artifacts. |
| Artifact | A tangible output generated by an agent during a task (for example, a document, image, or structured data). | Carries the concrete output of a task in a structured, retrievable form. |

How it works

The plugin runs in four phases:

  • Access phase: Detects A2A protocol binding (JSON-RPC or REST). Starts an OpenTelemetry span when config.logging.log_statistics is enabled. Records the request body for payload logging when config.logging.log_payloads is enabled.
  • Header filter phase: Detects streaming responses (Content-Type: text/event-stream) and records time to first byte (TTFB). Buffers agent card responses for URL rewriting.
  • Body filter phase: Streams SSE chunks through to the client without buffering, preserving low latency. Buffers non-streaming responses to extract task metadata. Rewrites agent card URLs to the gateway address. Emits analytics data to the Konnect pipeline at end-of-response.
  • Log phase: Finalizes the OpenTelemetry span with task state, task ID, and error information.
 
sequenceDiagram
    autonumber
    participant Client as A2A Client
    participant Kong as Kong Gateway<br/>(AI A2A Proxy)
    participant Agent as Upstream A2A Agent
    Client->>Kong: A2A request (JSON-RPC or REST)
    Note over Kong: Detect A2A binding and method<br/>Start OTel span (if logging enabled)
    Kong->>Agent: Proxied request<br/>(Accept-Encoding removed if logging enabled)
    alt Streaming response (SSE)
        Agent-->>Kong: text/event-stream chunks
        Note over Kong: Pass through each chunk<br/>Count SSE events, track TTFB
        Kong-->>Client: SSE chunks (unchanged)
        Note over Kong: On final chunk:<br/>Extract task state, set analytics
    else Non-streaming response
        Agent->>Kong: JSON response
        Note over Kong: Buffer response<br/>Extract task metadata
        Kong->>Client: Response (unchanged)
    end
    Note over Kong: Finish OTel span<br/>Emit ai.a2a metrics to log plugins

A2A protocol detection

The plugin auto-detects A2A traffic without requiring explicit configuration per route. It inspects each request and applies A2A processing only when a match is found; non-A2A traffic passes through without any A2A processing. Detection works across two protocol bindings:

REST binding. The plugin detects A2A endpoints by path suffix and HTTP method. The match anchors to the end of the request path, so any prefix added by the Kong Route is ignored. For example, both /v1/message:send and /api/agents/v1/message:send match SendMessage:

| HTTP method | Path suffix | A2A operation | Canonical method |
|---|---|---|---|
| POST | /v1/message:send | SendMessage | message/send |
| POST | /v1/message:stream | SendStreamingMessage | message/stream |
| GET | /.well-known/agent-card.json | GetAgentCard | agent/getCard |
| GET | /v1/extendedAgentCard | GetExtendedAgentCard | agent/getExtendedAgentCard |
| GET | /v1/tasks/{id} | GetTask | tasks/get |
| GET | /v1/tasks | ListTasks | tasks/list |
| POST | /v1/tasks/{id}:cancel | CancelTask | tasks/cancel |
| POST | /v1/tasks/{id}:subscribe | SubscribeToTask | tasks/resubscribe |
| POST | /v1/tasks | ListTasks | tasks/list |

The canonical method name is used in OTel span attributes and log output.
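The suffix matching described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's implementation: the function name detect_rest, the regular expressions, and the assumption that a task ID contains no "/" or ":" characters are all hypothetical. The route data is copied from the table above.

```python
import re

# (HTTP method, path-suffix pattern, A2A operation, canonical method).
# Values come from the REST binding table; the regexes are an assumption.
_REST_ROUTES = [
    ("POST", r"/v1/message:send", "SendMessage", "message/send"),
    ("POST", r"/v1/message:stream", "SendStreamingMessage", "message/stream"),
    ("GET", r"/\.well-known/agent-card\.json", "GetAgentCard", "agent/getCard"),
    ("GET", r"/v1/extendedAgentCard", "GetExtendedAgentCard", "agent/getExtendedAgentCard"),
    ("GET", r"/v1/tasks/[^/:]+", "GetTask", "tasks/get"),
    ("GET", r"/v1/tasks", "ListTasks", "tasks/list"),
    ("POST", r"/v1/tasks/[^/:]+:cancel", "CancelTask", "tasks/cancel"),
    ("POST", r"/v1/tasks/[^/:]+:subscribe", "SubscribeToTask", "tasks/resubscribe"),
    ("POST", r"/v1/tasks", "ListTasks", "tasks/list"),
]

# Anchor every pattern to the end of the path so Route prefixes are ignored.
_COMPILED = [(m, re.compile(p + r"$"), op, canon) for m, p, op, canon in _REST_ROUTES]


def detect_rest(http_method, path):
    """Return (operation, canonical_method) for an A2A REST request, else None.

    `path` is the request path without the query string.
    """
    for method, pattern, operation, canonical in _COMPILED:
        if http_method.upper() == method and pattern.search(path):
            return operation, canonical
    return None
```

Under these assumptions, detect_rest("POST", "/api/agents/v1/message:send") and detect_rest("POST", "/v1/message:send") resolve to the same operation, while a GET on the same path is not treated as A2A traffic.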

JSON-RPC binding. Detected by the "jsonrpc" field in the request body, combined with a recognized method name or an A2A-Version request header. Recognized methods: message/send, message/stream, tasks/get, tasks/list, tasks/cancel, tasks/resubscribe, tasks/pushNotificationConfig/set, tasks/pushNotificationConfig/get, tasks/pushNotificationConfig/list, tasks/pushNotificationConfig/delete, agent/getExtendedAgentCard.

A request carrying an A2A-Version header is treated as JSON-RPC even if the method is not in the known list. When an unrecognized method is accepted this way, the method field in log output is recorded as "unknown" to bound metric cardinality. The OTel span’s kong.a2a.operation attribute still receives the actual method name.
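The JSON-RPC rules above can be approximated as follows. This is a minimal sketch under stated assumptions, not the plugin's code: the function name detect_jsonrpc and the returned dictionary keys (log_method, span_operation) are hypothetical, and headers are assumed to arrive as a plain dictionary.

```python
import json

# Recognized A2A JSON-RPC methods, as listed above.
KNOWN_METHODS = {
    "message/send", "message/stream", "tasks/get", "tasks/list",
    "tasks/cancel", "tasks/resubscribe",
    "tasks/pushNotificationConfig/set", "tasks/pushNotificationConfig/get",
    "tasks/pushNotificationConfig/list", "tasks/pushNotificationConfig/delete",
    "agent/getExtendedAgentCard",
}


def detect_jsonrpc(body, headers):
    """Return logging/tracing method names for an A2A JSON-RPC request, else None."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return None  # not JSON, so not a JSON-RPC request
    if not isinstance(payload, dict) or "jsonrpc" not in payload:
        return None
    method = payload.get("method")
    # Header names are case-insensitive.
    has_version = any(name.lower() == "a2a-version" for name in headers)
    if isinstance(method, str) and method in KNOWN_METHODS:
        return {"log_method": method, "span_operation": method}
    if has_version:
        # Unrecognized method accepted via the header: the log field is
        # bounded to "unknown", the span keeps the actual method name.
        return {"log_method": "unknown", "span_operation": method}
    return None
```

The split between the two returned values mirrors the cardinality rule: log output sees a bounded set of method names, while the span attribute carries whatever the client sent.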

Agent card URL rewriting

When an upstream agent returns an agent card document at /.well-known/agent-card.json, the plugin rewrites the url field, along with any additionalInterfaces[].url fields, to the gateway address. A2A clients then discover the gateway endpoint rather than the upstream agent’s direct address.

The rewrite uses X-Forwarded-* headers to construct the correct scheme, host, and port when Kong is deployed behind a load balancer or reverse proxy.
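As a rough illustration of the rewrite, the sketch below replaces the scheme, host, and port of each URL in an agent card. Several details are assumptions rather than documented behavior: the specific header names (X-Forwarded-Proto, X-Forwarded-Host, X-Forwarded-Port), the decision to keep the original URL path, the fallback defaults, and the function names are all hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def gateway_origin(headers, default_scheme, default_host):
    """Derive the client-facing scheme and host[:port] (assumed header names)."""
    scheme = headers.get("X-Forwarded-Proto", default_scheme)
    host = headers.get("X-Forwarded-Host", default_host)
    port = headers.get("X-Forwarded-Port")
    default_port = {"http": "80", "https": "443"}.get(scheme)
    # Append the port only when it is non-default and not already in the host.
    if port and port != default_port and ":" not in host:
        host = f"{host}:{port}"
    return scheme, host


def rewrite_url(url, scheme, host):
    """Swap scheme and authority; keeping the path is an assumption."""
    parts = urlsplit(url)
    return urlunsplit((scheme, host, parts.path, parts.query, parts.fragment))


def rewrite_agent_card(card, headers, default_scheme="http", default_host="localhost:8000"):
    """Return a copy of the card with url and additionalInterfaces[].url rewritten."""
    scheme, host = gateway_origin(headers, default_scheme, default_host)
    rewritten = dict(card)
    if isinstance(card.get("url"), str):
        rewritten["url"] = rewrite_url(card["url"], scheme, host)
    interfaces = card.get("additionalInterfaces")
    if isinstance(interfaces, list):
        rewritten["additionalInterfaces"] = [
            {**item, "url": rewrite_url(item["url"], scheme, host)}
            if isinstance(item, dict) and isinstance(item.get("url"), str)
            else item
            for item in interfaces
        ]
    return rewritten
```

With a hypothetical upstream card whose url is http://agent.internal:9000/a2a and forwarded headers describing https://gw.example.com, the rewritten card would advertise https://gw.example.com/a2a.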

Streaming support

For responses with Content-Type: text/event-stream, the plugin passes SSE chunks through to the client without buffering, preserving low latency. It counts SSE events and extracts task state from the final event for analytics. TTFB is measured from request receipt to the first response header.
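The pass-through counting described above could look roughly like the following. It is a sketch only: the class name SseStats is hypothetical, and the JSON paths used to find the task state (result.status.state, or status.state when there is no JSON-RPC envelope) are assumptions about the event shape rather than documented plugin behavior.

```python
import json


class SseStats:
    """Counts `data:` events in an SSE stream fed chunk by chunk."""

    def __init__(self):
        self._buffer = ""
        self.events_count = 0
        self.last_data = None

    def feed(self, chunk):
        """Inspect a chunk and return it unchanged for the client."""
        # Normalize on the whole buffer so a CRLF split across chunks is handled.
        self._buffer = (self._buffer + chunk).replace("\r\n", "\n")
        while "\n\n" in self._buffer:  # a blank line terminates an event
            raw_event, self._buffer = self._buffer.split("\n\n", 1)
            data_lines = []
            for line in raw_event.split("\n"):
                if line.startswith("data:"):
                    value = line[5:]
                    # SSE strips a single leading space after the colon.
                    data_lines.append(value[1:] if value.startswith(" ") else value)
            if data_lines:
                self.events_count += 1
                self.last_data = "\n".join(data_lines)
        return chunk

    def final_task_state(self):
        """Task state from the last event, or "unknown" (assumed fallback)."""
        try:
            payload = json.loads(self.last_data) if self.last_data else None
        except ValueError:
            return "unknown"
        if not isinstance(payload, dict):
            return "unknown"
        result = payload.get("result", payload)
        status = result.get("status") if isinstance(result, dict) else None
        state = status.get("state") if isinstance(status, dict) else None
        return state if isinstance(state, str) else "unknown"
```

Only the partial event at the tail of the stream is held in memory; each chunk is returned to the caller as soon as it has been inspected, which is what keeps latency low.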

Configuration

Logging and observability

Enable observability with config.logging.log_statistics. When enabled, the plugin:

  • Starts an OpenTelemetry span per A2A request
  • Records A2A method, binding type, task state, task ID, context ID, latency, TTFB (streaming), SSE event count, and response size
  • Emits structured data to the ai.a2a namespace consumed by Konnect analytics and attached Kong log plugins

When log_statistics is enabled, the plugin removes the Accept-Encoding request header before forwarding to the upstream agent. This prevents compressed responses that the plugin cannot parse for metadata extraction.

To also capture request and response bodies, enable config.logging.log_payloads, which requires log_statistics to be enabled as well. Payloads are truncated at config.logging.max_payload_size (default 1 MB), which must be greater than 0. This limit is separate from config.max_request_body_size, which controls how much of the request body is read for detection; set that field to 0 if you need unlimited capture for request detection.

Payload logging may expose sensitive data. Enable with care in production environments.
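To make the constraints between these fields concrete, here is a small Python sketch that models a plugin configuration as a dictionary and checks the rules stated above. The plugin name ai-a2a-proxy, the use of bytes as the size unit, the value 1048576 for "1 MB", and the validate_logging helper are all assumptions made for illustration; Kong applies its own schema validation.

```python
# Hypothetical plugin entry; the name and byte-based sizes are assumptions.
plugin = {
    "name": "ai-a2a-proxy",
    "config": {
        "max_request_body_size": 1048576,  # 0 would mean no limit
        "logging": {
            "log_statistics": True,
            "log_payloads": True,          # requires log_statistics
            "max_payload_size": 1048576,   # must be greater than 0
        },
    },
}


def validate_logging(config):
    """Return a list of violations of the logging rules described above."""
    errors = []
    logging_cfg = config.get("logging", {})
    if logging_cfg.get("log_payloads") and not logging_cfg.get("log_statistics"):
        errors.append("logging.log_payloads requires logging.log_statistics")
    size = logging_cfg.get("max_payload_size")
    if size is not None and size <= 0:
        errors.append("logging.max_payload_size must be greater than 0")
    return errors
```

Calling validate_logging(plugin["config"]) on the example returns an empty list; enabling log_payloads without log_statistics, or setting max_payload_size to 0, each produces one error message.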

Log output fields

When config.logging.log_statistics is enabled, the plugin writes the following fields to the ai.a2a.rpc[] array:

| Field | Type | Description |
|---|---|---|
| ai.a2a.rpc[].method | string | A2A operation name |
| ai.a2a.rpc[].binding | string | Protocol binding: jsonrpc or rest |
| ai.a2a.rpc[].latency | number | End-to-end proxy latency in milliseconds |
| ai.a2a.rpc[].id | string | Request ID (JSON-RPC) or task ID (REST) |
| ai.a2a.rpc[].task_id | string | Task ID extracted from the response |
| ai.a2a.rpc[].task_state | string | Normalized task state (see task states) |
| ai.a2a.rpc[].context_id | string | A2A context ID extracted from the response |
| ai.a2a.rpc[].error | string | Error type string when the upstream returned an error |
| ai.a2a.rpc[].response_body_size | number | Response body size in bytes |
| ai.a2a.rpc[].streaming | boolean | true for SSE streaming responses |
| ai.a2a.rpc[].ttfb_latency | number | Time to first byte in milliseconds (streaming only) |
| ai.a2a.rpc[].sse_events_count | number | Count of SSE data: events received (streaming only) |
| ai.a2a.rpc[].payload.request | string | Request body (only when log_payloads is enabled) |
| ai.a2a.rpc[].payload.response | string | Response body (only when log_payloads is enabled) |
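A log consumer might aggregate these fields along the following lines. The sketch assumes that the dotted field names map to nested JSON objects (ai, then a2a, then an rpc list) in the serialized log entry; the summarize_rpc helper and the sample values are hypothetical and exist only to show how the fields fit together.

```python
def summarize_rpc(log_entry):
    """Group ai.a2a.rpc[] entries by method: count, errors, total latency (ms)."""
    rpcs = log_entry.get("ai", {}).get("a2a", {}).get("rpc", [])
    summary = {}
    for rpc in rpcs:
        stats = summary.setdefault(
            rpc.get("method", "unknown"),
            {"count": 0, "errors": 0, "total_latency": 0},
        )
        stats["count"] += 1
        stats["total_latency"] += rpc.get("latency", 0)
        if rpc.get("error"):
            stats["errors"] += 1
    return summary


# Hypothetical log entry with made-up values, for illustration only.
sample_entry = {
    "ai": {"a2a": {"rpc": [
        {"method": "message/send", "binding": "jsonrpc", "latency": 120,
         "task_state": "completed", "streaming": False},
        {"method": "message/send", "binding": "rest", "latency": 80,
         "task_state": "failed", "error": "upstream_error", "streaming": False},
    ]}}
}
```

For the sample entry, summarize_rpc reports two message/send calls, one of them with an error, and a combined latency of 200 ms.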

Task states

Task state values are normalized to lowercase A2A spec format regardless of the upstream SDK version: submitted, working, input-required, completed, canceled, failed, rejected, auth-required, unknown.
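One way to picture the normalization is the sketch below. The accepted input spellings (for example an enum-style TASK_STATE_ prefix, underscores, or the spelling "cancelled") are assumptions about what different SDK versions might emit; only the output set comes from the list above. The function name normalize_task_state is hypothetical.

```python
# Output values, as listed above.
SPEC_STATES = {
    "submitted", "working", "input-required", "completed", "canceled",
    "failed", "rejected", "auth-required", "unknown",
}


def normalize_task_state(raw):
    """Map an upstream task state string to the lowercase spec format."""
    if not isinstance(raw, str):
        return "unknown"
    state = raw.strip().lower().replace("_", "-")
    # Assumed enum-style prefix used by some SDK versions.
    if state.startswith("task-state-"):
        state = state[len("task-state-"):]
    # Assumed alternate spelling.
    if state == "cancelled":
        state = "canceled"
    return state if state in SPEC_STATES else "unknown"
```

Under these assumptions, both "TASK_STATE_INPUT_REQUIRED" and "input-required" normalize to input-required, and anything unrecognized falls back to unknown.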

OpenTelemetry span attributes

When config.logging.log_statistics is enabled and Kong tracing is configured, the plugin creates a kong.a2a child span with the following attributes:

| Attribute | Value Type | Description |
|---|---|---|
| kong.a2a.operation | string | A2A operation name |
| kong.a2a.protocol.version | string | Value of the A2A-Version request header, or unknown |
| kong.a2a.task.id | string | Task ID from the response |
| kong.a2a.task.state | string | Normalized task state |
| kong.a2a.context.id | string | A2A context ID |
| kong.a2a.error | string | Error type string when present |
| kong.a2a.streaming | boolean | true for SSE streaming responses |
| kong.a2a.ttfb_latency | int | Time to first byte in milliseconds (streaming only) |
| kong.a2a.sse_events_count | int | Count of SSE events (streaming only) |
| rpc.system | string | jsonrpc (JSON-RPC binding only) |
| rpc.method | string | A2A operation name (JSON-RPC binding only) |
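The conditional attributes (streaming only, JSON-RPC binding only) can be summarized with a plain function that builds the attribute dictionary. This sketch does not use the OpenTelemetry SDK and is not the plugin's code: the function name a2a_span_attributes, the choice to omit attributes whose values are absent, and the choice to always set kong.a2a.streaming are assumptions.

```python
def a2a_span_attributes(operation, binding, version_header=None, task_id=None,
                        task_state=None, context_id=None, error=None,
                        streaming=False, ttfb_latency=None, sse_events_count=None):
    """Build the kong.a2a span attributes listed above as a dict."""
    attrs = {
        "kong.a2a.operation": operation,
        # Falls back to "unknown" when the A2A-Version header is missing.
        "kong.a2a.protocol.version": version_header or "unknown",
        "kong.a2a.streaming": streaming,
    }
    optional = {
        "kong.a2a.task.id": task_id,
        "kong.a2a.task.state": task_state,
        "kong.a2a.context.id": context_id,
        "kong.a2a.error": error,
    }
    if streaming:  # streaming-only attributes
        optional["kong.a2a.ttfb_latency"] = ttfb_latency
        optional["kong.a2a.sse_events_count"] = sse_events_count
    if binding == "jsonrpc":  # JSON-RPC-only attributes
        optional["rpc.system"] = "jsonrpc"
        optional["rpc.method"] = operation
    # Assumption: attributes without a value are left off the span.
    attrs.update({key: value for key, value in optional.items() if value is not None})
    return attrs
```

A REST request therefore carries no rpc.system or rpc.method attributes, and a non-streaming request carries no TTFB or SSE event count.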

Request body size

The plugin reads the request body to detect JSON-RPC A2A requests. Use config.max_request_body_size to control the maximum body size parsed for detection (default 1 MB). Set to 0 for no limit. REST requests are detected by path and HTTP method without reading the body, so this setting applies to JSON-RPC detection only.

If a request body exceeds the limit, the plugin logs a warning and skips A2A detection for that request; the request is still proxied upstream.
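The size check described above amounts to a simple guard before JSON-RPC detection. The sketch below is illustrative only: the function name, the logger name, and the use of bytes with a 1048576 default are assumptions.

```python
import logging

log = logging.getLogger("a2a_sketch")  # hypothetical logger name


def jsonrpc_detection_allowed(body_size, max_request_body_size=1048576):
    """Decide whether the body may be parsed for JSON-RPC detection.

    A limit of 0 means no limit. When the body is too large, a warning is
    logged and detection is skipped; the request itself is still proxied.
    """
    if max_request_body_size == 0:
        return True
    if body_size > max_request_body_size:
        log.warning(
            "request body (%d bytes) exceeds max_request_body_size (%d); "
            "skipping A2A detection", body_size, max_request_body_size,
        )
        return False
    return True
```

Because REST requests are matched on path and HTTP method alone, a guard like this would only sit in front of the JSON-RPC detection path.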
