Made by
Kong Inc.
Supported Gateway Topologies
hybrid, db-less, traditional
Supported Konnect Deployments
hybrid, cloud-gateways, serverless
Compatible Protocols
grpc, grpcs, http, https
Minimum Version
Kong Gateway 3.6
Tags
#ai

3.10.0.1

Release date 2025/04/15

Bugfix

  • Fixed an issue where AI Proxy and AI Proxy Advanced would use corrupted plugin config.

3.10.0.0

Release date 2025/03/27

Breaking Change

  • Changed the serialized log key of AI metrics from ai.ai-proxy to ai.proxy to avoid conflicts with metrics generated from plugins other than AI Proxy and AI Proxy Advanced. If you are using logging plugins (for example, File Log, HTTP Log, etc.), you will have to update metrics pipeline configurations to reflect this change.
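For example, a serialized log entry that previously nested metrics under ai.ai-proxy now nests them under ai.proxy (illustrative fragment; the exact sibling fields depend on your providers and logging setup):

```json
{
  "ai": {
    "proxy": {
      "usage": {
        "prompt_tokens": 28,
        "completion_tokens": 115
      }
    }
  }
}
```

Any metrics pipeline filter or JSONPath that matches on the old ai.ai-proxy key needs to be updated accordingly.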

Deprecation

  • Deprecated preserve mode in config.route_type. Use config.llm_format instead. The preserve mode setting will be removed in a future release.
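As a migration sketch, a declarative plugin config that previously relied on route_type: preserve can declare the wire format explicitly instead (values are illustrative; check the plugin schema for the llm_format values accepted in your release):

```yaml
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat   # was: route_type: preserve
      llm_format: openai        # declare the client-side wire format explicitly
```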

Feature

  • Added support for boto3 SDKs for Bedrock provider, and for Google GenAI SDKs for Gemini provider.

  • Allowed authentication to Bedrock services using assumed roles in AWS.

Bugfix

  • Fixed a bug in the Azure provider where model.options.upstream_path overrides would always return a 404 response.

  • Fixed a bug where Azure streaming responses would be missing individual tokens.

  • Fixed a bug where response streaming in Gemini and Bedrock providers was returning whole chat responses in one chunk.

  • Fixed a bug with the Gemini provider, where multimodal requests (in OpenAI format) would not transform properly.

  • Fixed an issue where the trailing portion of the AI upstream URL would be empty.

  • Fixed an issue where Gemini streaming responses were getting truncated and/or missing tokens.

  • Fixed an incorrect error thrown when trying to log streaming responses.

  • Fixed an issue where templates weren’t being resolved correctly. The plugins now support nested fields.

  • Fixed tool calls not working in streaming mode for Bedrock and Gemini providers.

  • Fixed preserve mode.

Known Issues

  • Some active tracing latency values are incorrectly reported as having zero length when using the AI Proxy plugin.

3.9.1.1

Release date 2025/03/20

Bugfix

  • Fixed an issue where templates weren’t being resolved correctly, and added support for nested fields.

  • Fixed preserve mode.

3.9.1.0

Release date 2025/03/11

Bugfix

  • Fixed Gemini streaming responses getting truncated and/or missing tokens.

  • Fixed incorrect error thrown when trying to log streaming responses.

  • Fixed tool calls not working in streaming mode for Bedrock and Gemini providers.

3.9.0.1

Release date 2025/01/28

Bugfix

  • Fixed a bug in the Azure provider where model.options.upstream_path overrides would always return 404.

  • Fixed a bug where Azure streaming responses would be missing individual tokens.

  • Fixed a bug where response streaming in Gemini and Bedrock providers was returning whole chat responses in one chunk.

  • Fixed a bug where multimodal requests (in OpenAI format) would not transform properly, when using the Gemini provider.

  • Reverted the analytics container key from “proxy” to “ai-proxy” to align with previous versions.

3.9.0.0

Release date 2024/12/12

Feature

  • Disabled the HTTP/2 ALPN handshake for connections on routes configured with AI Proxy.

Bugfix

  • Fixed a bug where tools (function) calls to Anthropic would return empty results.

  • Fixed a bug where tools (function) calls to Bedrock would return empty results.

  • Fixed a bug where Bedrock Guardrail config was ignored.

  • Fixed a bug where tools (function) calls to Cohere would return empty results.

  • Fixed a bug where Gemini provider would return an error if content safety failed in AI Proxy.

  • Fixed a bug where tools (function) calls to Gemini (or via Vertex) would return empty results.

  • Fixed an issue where AI Transformer plugins always returned a 404 error when using ‘Google One’ Gemini subscriptions.

  • Fixed an issue where multi-modal requests were blocked on the Azure provider.

3.8.1.0

Release date 2024/11/04

Bugfix

  • Fixed an issue where AI Transformer plugins always returned a 404 error when using ‘Google One’ Gemini subscriptions.

  • Fixed an issue where multi-modal requests were blocked on the Azure provider.

3.8.0.0

Release date 2024/09/11

Feature

  • AI plugins now retrieve latency data and push it to logs and metrics.

  • Allowed the AI plugin to read requests from a buffered file.

  • Added an allow_override option to allow overriding the upstream model auth parameter or header from the caller’s request.

  • Kong AI Gateway (AI Proxy and associated plugin family) now supports all AWS Bedrock “Converse API” models.

  • Kong AI Gateway (AI Proxy and associated plugin family) now supports the Google Gemini “chat” (generateContent) interface.

  • Allowed the Mistral provider to use the mistral.ai managed service by omitting upstream_url.

  • Added a new response header X-Kong-LLM-Model that displays the name of the language model used in the AI-Proxy plugin.
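The allow_override option added above sits under the plugin’s auth block; a minimal declarative sketch (header names and values are placeholders):

```yaml
plugins:
  - name: ai-proxy
    config:
      auth:
        header_name: Authorization
        header_value: Bearer <default-upstream-token>
        allow_override: true   # let the caller's own auth header replace the default
```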

Bugfix

  • Fixed a bug where certain Azure models would return partial tokens/words when in response-streaming mode.

  • Fixed a bug where the Cohere and Anthropic providers wouldn’t read the model parameter properly from the caller’s request body.

  • Fixed a bug where using “OpenAI Function” inference requests would log a request error, and then hang until timeout.

  • Fixed a bug where AI Proxy would still allow callers to specify their own model,
    ignoring the plugin-configured model name.

  • Fixed a bug where AI Proxy would not give the plugin’s configured model tuning options precedence over those in the user’s LLM request.

  • Fixed a bug where setting OpenAI SDK model parameter “null” caused analytics to not be written to the logging plugin(s).

  • Fixed an issue where the response was gzipped even if the client didn’t accept gzip encoding.

  • Fixed an issue where certain AI plugins couldn’t be applied per consumer or per service.

  • Resolved a bug where the object constructor would set data on the class instead of the instance.

  • Fixed an issue where multi-modal inputs were not properly validated and accounted for.

  • Fixed a bug where Azure Managed-Identity tokens would never rotate
    in case of a network failure when authenticating.

3.7.1.3

Release date 2024/11/26

Bugfix

  • Fixed a bug where certain Azure models would return partial tokens/words when in response-streaming mode.

  • Fixed a bug where the Cohere and Anthropic providers wouldn’t read the model parameter properly from the caller’s request body.

  • Fixed a bug where using “OpenAI Function” inference requests would log a request error, and then hang until timeout.

  • Fixed a bug where AI Proxy would still allow callers to specify their own model,
    ignoring the plugin-configured model name.

  • Fixed a bug where AI Proxy would not give the plugin’s configured model tuning options precedence over those in the user’s LLM request.

  • Fixed a bug where setting OpenAI SDK model parameter “null” caused analytics to not be written to the logging plugin(s).

3.7.1.0

Release date 2024/06/18

Bugfix

  • Resolved a bug where the object constructor would set data on the class instead of the instance.

3.7.0.0

Release date 2024/05/28

Breaking Change

  • To support Anthropic’s new Messages API, the upstream path of the Anthropic provider for the llm/v1/chat route type has changed from /v1/complete to /v1/messages.
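If you had pinned the Anthropic upstream path explicitly in the plugin config, it needs to point at the new endpoint; a minimal sketch:

```yaml
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      model:
        provider: anthropic
        options:
          upstream_path: /v1/messages   # was /v1/complete before 3.7.0.0
```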

Feature

  • AI Proxy now reads most prompt tuning parameters from the client, while the plugin config parameters under model_options are now just defaults. This fixes support for using the respective provider’s native SDK.

  • AI Proxy now has a preserve option for route_type, where the requests and responses are passed directly to the upstream LLM. This is to enable compatibility with any and all models and SDKs that may be used when calling the AI services.

  • Added support for streaming event-by-event responses back to the client on supported providers.

  • Added support for Managed Identity authentication when using the Azure provider with AI Proxy.
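A rough sketch of Azure Managed Identity authentication in the plugin config (the auth field names here are assumptions; verify them against the plugin’s configuration reference for your release):

```yaml
plugins:
  - name: ai-proxy
    config:
      model:
        provider: azure
      auth:
        azure_use_managed_identity: true
        azure_client_id: <managed-identity-client-id>   # for user-assigned identities
```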

Bugfix

  • Fixed a bug where the llm/v1/chat route_type didn’t include analytics in the responses.

3.6.0.0

Release date 2024/02/12

Feature

  • Introduced the new AI Proxy plugin that enables simplified integration with various AI provider Large Language Models.
