Release date 2025/12/10
Fixed intermittent 500 responses from the AI Proxy Advanced plugin when using Azure OpenAI.
Fixed an issue where Files content analytics extraction was not handled properly for Azure.
Release date 2025/11/18
Fixed an issue where the token count for Gemini Vertex embeddings API in native format was incorrect.
Fixed an issue where the token count for Gemini Vertex embeddings API in OpenAI format was incorrect.
Fixed an issue where the native format option did not work correctly for non-OpenAI formats.
Fixed an issue where the semantic load balancing with pgvector namespace was not functioning correctly.
Fixed an issue where the AWS Bedrock invoke command was not properly proxied.
Release date 2025/10/01
Added latency and cost observability to realtime API.
Added support for Gemini Rerank in native format.
Exposed input_tokens_details.cached_tokens_details.image_tokens_count in observability metrics.
Fixed an issue where the AI Proxy Advanced plugin’s lowest latency load balancer could fail to select a target when only one target was available, leading to 500 errors.
Fixed an issue where the ai-retry phase was not correctly setting the namespace in the kong.plugin.ctx, causing the ai-proxy-advanced balancer to retry the first target more than once.
Fixed an issue where the Gemini provider didn’t support Model Garden.
Fixed an issue where managed identity could be cached incorrectly.
Fixed an issue where Mistral models would return Unsupported field: seed when using some inference libraries.
Fixed an issue where array input wasn’t being properly validated for certain providers. Note that Bedrock, Gemini Public, and Mistral providers don’t support array input, as before.
Fixed an issue where some Titan embeddings models reported malformed requests.
Release date 2025/09/05
Added support for Gemini Rerank in native format.
Fixed a panic in LLM observability when populating AI Proxy token usage details.
Fixed an issue in AI Proxy Advanced where the Gemini provider didn't support Model Garden.
Fixed an issue where Mistral models would return Unsupported field: seed when using some inference libraries.
Skipped unknown cached-details keys in observability metrics.
Fixed an issue where array input was not recognized as a valid request for some providers. Note that the Bedrock, Gemini Public, and Mistral providers don't accept array input, as before.
Fixed an issue where some Titan embeddings models reported malformed requests.
Release date 2025/07/28
Fixed an issue where the ai-retry phase was not correctly setting the namespace in the kong.plugin.ctx, causing the ai-proxy-advanced balancer to retry the first target more than once.
Fixed an issue where AI Proxy and AI Proxy Advanced couldn't properly fail over to a Bedrock target.
Fixed an issue where AI Proxy and AI Proxy Advanced might produce duplicate content in the response when the SSE event was truncated.
Fixed an issue where AI Proxy and AI Proxy Advanced might drop content in the response when the SSE event was truncated.
Fixed an issue where managed identity might not be cached properly.
Release date 2025/07/16
Fixed an issue where the LLM license migration could fail if the license counter contained more than one week of data.
Release date 2025/07/03
ai-proxy, ai-proxy-advanced: Deprecated the preserve route_type. You are encouraged to use the new route_types added in version 3.11.x.x and onwards.
Added a tried_targets field to serialized analytics logs to record all tried AI targets.
Fixed an issue where some AI metrics were missing from analytics.
Fixed an issue where AI Proxy Advanced couldn't fail over from another provider to Bedrock.
Fixed an issue where the consistent-hashing algorithm was not using the correct value to hash.
Fixed an issue where a stale semantic key vector might not be refreshed after the plugin config was updated.
Fixed an issue where AI Proxy and AI Proxy Advanced would use corrupted plugin config.
If any AI Gateway plugin has been enabled in a self-managed Kong Gateway deployment for more than a week, upgrades from 3.10 versions to 3.11.0.0 will fail due to a license migration issue. This does not affect Konnect deployments.
A fix will be provided in 3.11.0.1.
See breaking changes in 3.11 for a temporary workaround.
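As a rough sketch of the preserve deprecation above, a plugin config that previously passed requests through unmodified would instead pin an explicit route_type. The field layout and the model name below are illustrative assumptions, not a canonical migration:

```yaml
# Before (deprecated):
#   route_type: preserve
# After (sketch) — use one of the explicit route_types:
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat   # instead of the deprecated "preserve"
      model:
        provider: openai
        name: gpt-4o            # illustrative model name
```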
Release date 2025/10/10
Fixed an issue where AI Proxy Advanced couldn't fail over from another provider to Bedrock.
Fixed model name escaping for Bedrock in the request.
Release date 2025/08/07
Fixed an issue where the Gemini provider could not use Anthropic ‘rawPredict’ endpoint models hosted in Vertex.
Fixed an issue where the ai-retry phase was not correctly setting the namespace in the kong.plugin.ctx, causing the ai-proxy-advanced balancer to retry the first target more than once.
Fixed an issue where AI Proxy and AI Proxy Advanced couldn't properly fail over to a Bedrock target.
Fixed an issue where the consistent-hashing algorithm was not using the correct value to hash.
Release date 2025/07/06
Fixed an issue where a stale semantic key vector might not be refreshed after the plugin config was updated.
Release date 2025/04/15
Fixed an issue where AI Proxy and AI Proxy Advanced would use corrupted plugin config.
Release date 2025/03/27
Changed the serialized log key of AI metrics from ai.ai-proxy to ai.proxy to avoid conflicts with metrics generated from plugins other than AI Proxy and AI Proxy Advanced. If you are using logging plugins (for example, File Log, HTTP Log, etc.), you will have to update metrics pipeline configurations to reflect this change.
Deprecated preserve mode in config.route_type. Use config.llm_format instead. The preserve mode setting will be removed in a future release.
Added support for boto3 SDKs for Bedrock provider, and for Google GenAI SDKs for Gemini provider.
Added a new priority balancer algorithm, which allows setting a priority group for each upstream model.
Added the failover_criteria configuration option, which allows retrying requests to the next upstream server in case of failure.
Added cost to tokens_count_strategy when using the lowest-usage load balancing strategy.
Added the huggingface, azure, vertex, and bedrock providers to embeddings. They can be used by the ai-proxy-advanced, ai-semantic-cache, ai-semantic-prompt-guard, and ai-rag-injector plugins.
Allow authentication to Bedrock services with assume roles in AWS.
Added the ability to set a catch-all target in semantic routing.
Fixed an issue where the trailing part of the AI upstream URL would be empty.
Fixed an issue where the ai-proxy-advanced plugin failed to failover between providers of different formats.
Fixed an issue where identity authentication in the ai-proxy-advanced plugin failed in retry scenarios.
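The priority balancer and failover_criteria additions above could be combined in a config along these lines. This is a sketch only: the per-target priority field name, the failover_criteria values, and the model names are assumptions based on the entries, not a verified schema:

```yaml
plugins:
  - name: ai-proxy-advanced
    config:
      balancer:
        algorithm: priority        # new balancer algorithm
        failover_criteria:         # retry the next target on these failures
          - error
          - timeout
      targets:
        - priority: 1              # highest-priority group (assumed field name)
          model:
            provider: openai
            name: gpt-4o
        - priority: 2              # fallback group
          model:
            provider: bedrock
            name: anthropic.claude-3-sonnet-20240229-v1:0
```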
Release date 2025/07/07
Fixed an issue where AI Proxy and AI Proxy Advanced would use corrupted plugin config.
Release date 2024/12/12
Added support for streaming responses to the AI Proxy Advanced plugin.
Made the embeddings.model.name config field a free text entry, enabling use of a self-hosted (or otherwise compatible) model.
Fixed an issue where stale plugin config was not updated in DB-less and hybrid modes.
Fixed an issue where the lowest-usage and lowest-latency strategies did not update data points correctly.
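The free-text embeddings.model.name change above might be used as follows. The plugin shown, the options.upstream_url field, the endpoint URL, and the model name are all illustrative assumptions:

```yaml
plugins:
  - name: ai-semantic-cache
    config:
      embeddings:
        model:
          provider: openai                    # an OpenAI-compatible embeddings server
          name: my-finetuned-embedding-model  # free text: any self-hosted model name
          options:
            upstream_url: http://embeddings.internal:8080/v1/embeddings  # placeholder URL
```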