You can proxy requests to Azure AI models through AI Gateway using the AI Proxy and AI Proxy Advanced plugins. This reference documents all supported AI capabilities, configuration requirements, and provider-specific details needed for proper integration.
Azure OpenAI provider
Upstream paths
AI Gateway automatically routes requests to the appropriate Azure API endpoints. The following table shows the upstream paths used for each capability.
| Capability | Upstream path or API |
|---|---|
| Chat completions | /openai/deployments/{deployment_name}/chat/completions |
| Completions | /openai/deployments/{deployment_name}/completions |
| Embeddings | /openai/deployments/{deployment_name}/embeddings |
| Function calling | /openai/deployments/{deployment_name}/chat/completions |
| Files | /openai/files |
| Batches | /openai/batches |
| Assistants | /openai/assistants |
| Responses | /openai/v1/responses |
| Speech | /openai/audio/speech |
| Transcriptions | /openai/audio/transcriptions |
| Translations | /openai/audio/translations |
| Image generations | /openai/images/generations |
| Image edits | /openai/images/edits |
| Video generations | /openai/v1/video/generations/jobs |
| Realtime | /openai/realtime |
Supported capabilities
The following tables show the AI capabilities supported by the Azure provider when used with the AI Proxy or AI Proxy Advanced plugin.
Set the plugin's `route_type` based on the capability you want to use. See the tables below for supported route types.
Text generation
Support for Azure basic text generation capabilities including chat, completions, and embeddings:
| Capability | Route type | Streaming | Model example | Min version |
|---|---|---|---|---|
| Chat completions | `llm/v1/chat` | ✅ | `gpt-4o` | 3.6 |
| Completions | `llm/v1/completions` | ✅ | `gpt-4o-mini` | 3.6 |
| Embeddings¹ | `llm/v1/embeddings` | ❌ | `text-embedding-3-small` | 3.11 |
¹ Use `text-embedding-3-small` or `text-embedding-3-large` for dynamic dimensions.
Advanced text generation
Support for Azure function calling to allow Azure models to use external tools and APIs:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Function calling | `llm/v1/chat` | `gpt-4o` | 3.6 |
Processing
Support for Azure file operations, batch operations, assistants, and response handling:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Files | `llm/v1/files` | n/a | 3.11 |
| Batches | `llm/v1/batches` | n/a | 3.11 |
| Assistants² | `llm/v1/assistants` | n/a | 3.11 |
| Responses³ | `llm/v1/responses` | n/a | 3.11 |
² The Assistants API requires the header `OpenAI-Beta: assistants=v2`.
³ The Responses API requires `config.azure_api_version` set to `"preview"`.
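As a sketch of footnote 3, a declarative plugin fragment might set the preview API version alongside the route type. The placement of `azure_api_version` follows the `config.azure_api_version` name used in the footnote and is an assumption; Service and Route wiring is omitted:

```yaml
# Illustrative fragment only: enables the Responses route and the
# preview API version it requires. Placement of azure_api_version
# follows the config.azure_api_version name used above.
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/responses
      azure_api_version: "preview"
```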
Audio
Support for Azure text-to-speech, transcription, and translation capabilities:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Speech | `audio/v1/audio/speech` | n/a | 3.11 |
| Transcriptions | `audio/v1/audio/transcriptions` | n/a | 3.11 |
| Translations | `audio/v1/audio/translations` | n/a | 3.11 |
For requests with large payloads, consider increasing `config.max_request_body_size` to three times the raw binary size.
Supported audio formats, voices, and parameters vary by model. Refer to your provider's documentation for available options.
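For example, a transcription route expecting roughly 16 MB audio uploads might raise the limit as in this sketch (the value is illustrative and assumed to be in bytes):

```yaml
# Illustrative fragment: raise the request body limit for large audio
# payloads to ~3x the expected raw file size (value assumed in bytes).
plugins:
  - name: ai-proxy
    config:
      route_type: audio/v1/audio/transcriptions
      max_request_body_size: 50331648   # 48 MB, ~3x a 16 MB audio file
```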
Image
Support for Azure image generation and editing capabilities:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Generations | `image/v1/images/generations` | n/a | 3.11 |
| Edits | `image/v1/images/edits` | n/a | 3.11 |
For requests with large payloads, consider increasing `config.max_request_body_size` to three times the raw binary size.
Supported image sizes and formats vary by model. Refer to your provider's documentation for allowed dimensions and requirements.
Video
Support for Azure video generation capabilities:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Generations | `video/v1/videos/generations` | `sora-2` | 3.13 |
For requests with large payloads (video generation), consider increasing `config.max_request_body_size` to three times the raw binary size.
Realtime
Support for Azure’s bidirectional streaming for realtime applications:
Realtime processing requires the AI Proxy Advanced plugin and uses the WebSocket protocol.
To use the realtime route, you must configure the `ws` and/or `wss` protocols on both the Service and the Route that the plugin is associated with.
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Realtime⁴ | `realtime/v1/realtime` | n/a | 3.11 |
⁴ For requests to the Azure OpenAI Realtime API, include the header `OpenAI-Beta: realtime=v1`.
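The WebSocket protocol requirement above can be sketched in declarative configuration. All names, the host, and the paths here are illustrative assumptions:

```yaml
# Illustrative sketch: a Service and Route that allow WebSocket traffic
# for the realtime route. Names, host, and paths are placeholders.
_format_version: "3.0"
services:
  - name: azure-realtime-service
    protocol: wss
    host: my-azure-instance.openai.azure.com
    port: 443
    path: /openai/realtime
    routes:
      - name: azure-realtime-route
        protocols:
          - ws
          - wss
        paths:
          - /azure/realtime
```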
Azure base URL
The base URL is `https://{azure_instance}.openai.azure.com:443/openai/deployments/{deployment_name}/{route_type_path}`, where `{route_type_path}` is determined by the capability.
AI Gateway uses this URL automatically. You only need to configure a URL if you're using a self-hosted or Azure-compatible endpoint, in which case set the `upstream_url` plugin option.
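As a sketch, an override for a self-hosted, Azure-compatible endpoint might look like the following. The URL is a placeholder, and nesting `upstream_url` under `model.options` is an assumption about the plugin schema:

```yaml
# Illustrative fragment: route chat traffic to a self-hosted,
# Azure-compatible endpoint instead of the default base URL.
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      model:
        provider: azure
        name: gpt-4o
        options:
          upstream_url: https://my-endpoint.example.com/openai/deployments/my-deployment/chat/completions
```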
Configure Azure with AI Proxy
To use Azure with AI Gateway, configure the AI Proxy or AI Proxy Advanced plugin.
Here’s a minimal configuration for chat completions:
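The following is a minimal sketch in declarative (decK-style) YAML. The Service, Route, instance, and deployment names are illustrative assumptions; replace the API key placeholder with your own:

```yaml
# Illustrative minimal configuration: AI Proxy on a Route, targeting an
# Azure chat completions deployment. All names and the key are placeholders.
_format_version: "3.0"
services:
  - name: azure-chat-service
    url: http://localhost:32000   # placeholder; AI Proxy rewrites the upstream
    routes:
      - name: azure-chat-route
        paths:
          - /chat
        plugins:
          - name: ai-proxy
            config:
              route_type: llm/v1/chat
              auth:
                header_name: api-key        # Azure OpenAI reads the key from this header
                header_value: <azure-api-key>
              model:
                provider: azure
                name: gpt-4o
                options:
                  azure_instance: my-azure-instance
                  azure_deployment_id: my-gpt-4o-deployment
```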
For more configuration options and examples, see:
FAQs
Can I authenticate to Azure AI with Azure Identity?
Yes. If Kong Gateway is running on Azure, AI Proxy can detect the designated Managed Identity or User-Assigned Identity of that Azure compute resource and use it accordingly. In your AI Proxy configuration, set the following parameters:
- Set `config.auth.azure_use_managed_identity` to `true` to use an Azure-assigned Managed Identity.
- Set `config.auth.azure_use_managed_identity` to `true` and set `config.auth.azure_client_id` to use a User-Assigned Identity.
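Putting these options together, a managed-identity auth block might look like this sketch (instance, deployment, and client ID values are placeholders):

```yaml
# Illustrative fragment: authenticate with a User-Assigned Identity.
# Omit azure_client_id to fall back to the system-assigned identity.
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      auth:
        azure_use_managed_identity: true
        azure_client_id: <user-assigned-identity-client-id>
      model:
        provider: azure
        name: gpt-4o
        options:
          azure_instance: my-azure-instance
          azure_deployment_id: my-gpt-4o-deployment
```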