Route Azure AI SDK requests to Azure OpenAI deployments

Deployment Platform    Minimum Version
Kong Gateway           3.6
TL;DR

Create a Route with a regex path that captures the deployment name, then use the $(uri_captures) template variable in AI Proxy Advanced to set the Azure deployment ID dynamically.

Prerequisites

This is a Konnect tutorial and requires a Konnect personal access token.

  1. Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.

  2. Export your token to an environment variable:

     export KONNECT_TOKEN='YOUR_KONNECT_PAT'
    
  3. Run the quickstart script to automatically provision a Control Plane and Data Plane, and configure your environment:

     curl -Ls https://get.konghq.com/quickstart | bash -s -- -k $KONNECT_TOKEN --deck-output
    

    This sets up a Konnect Control Plane named quickstart, provisions a local Data Plane, and prints out the following environment variable exports:

     export DECK_KONNECT_TOKEN=$KONNECT_TOKEN
     export DECK_KONNECT_CONTROL_PLANE_NAME=quickstart
     export KONNECT_CONTROL_PLANE_URL=https://us.api.konghq.com
     export KONNECT_PROXY_URL='http://localhost:8000'
    

    Copy and paste these into your terminal to configure your session.

This tutorial requires Kong Gateway Enterprise. If you’d rather run Kong Gateway locally instead of using Konnect and don’t have it set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running almost instantly.

  1. Export your license to an environment variable:

     export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'
    
  2. Run the quickstart script:

    curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA 
    

    Once Kong Gateway is ready, you will see the following message:

     Kong Gateway Ready
    

decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial, install decK version 1.43 or later.

This guide uses deck gateway apply, which directly applies entity configuration to your Gateway instance. We recommend upgrading your decK installation to take advantage of this tool.

You can check your current decK version with deck version.
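
 deck version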

For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function but installing them isn’t the focus of this guide. Follow these steps to pre-configure them:

  1. Run the following command:

    echo '
    _format_version: "3.0"
    services:
      - name: azure-openai-service
        # Placeholder upstream: AI Proxy Advanced overrides this URL and
        # sends traffic to Azure directly.
        url: http://localhost:8000
    routes:
      - name: azure-chat-route
        paths:
        - "~/openai/deployments/(?<azure_deployment>[^/]+)/chat/completions$"
        methods:
        - POST
        service:
          name: azure-openai-service
    ' | deck gateway apply -
    

To learn more about entities, you can read our entities documentation.

This tutorial uses Azure OpenAI service. Use the following steps to configure it:

  1. Create an Azure account.
  2. In the Azure Portal, click Create a resource.
    1. Search for Azure OpenAI and select Azure OpenAI Service.
    2. Configure your Azure resource.
    3. Once created, export the following environment variable:
       export DECK_AZURE_INSTANCE_NAME='YOUR AZURE RESOURCE NAME'
      
  3. Once you’ve created your Azure resource, go to Azure AI Foundry and do the following:
    1. In the My assets section of the main sidebar, click Models and deployments, then click Deploy model.
    2. Once deployed, export the following environment variables:
       export DECK_AZURE_OPENAI_API_KEY='YOUR AZURE OPENAI MODEL API KEY'
       export DECK_AZURE_DEPLOYMENT_ID='YOUR AZURE OPENAI DEPLOYMENT NAME'
      

To complete this tutorial, you’ll need Python (version 3.7 or later) and pip installed on your machine. You can verify both by running:

python3 --version
python3 -m pip --version
  1. Create a virtual environment:

    python3 -m venv myenv
    
  2. Activate it:

    source myenv/bin/activate
    

Install the OpenAI SDK:

pip install openai
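
Optionally, confirm the SDK installed correctly by printing its version:

python3 -c "import openai; print(openai.__version__)"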

The Azure OpenAI SDK can connect to Azure OpenAI Service through AI Gateway. With Azure, the model parameter in SDK calls maps to a deployment name on your Azure instance. The SDK constructs request URLs in the format https://{azure_instance}.openai.azure.com/openai/deployments/{azure_deployment_id}/chat/completions. When the SDK sends a request to /openai/deployments/gpt-4o/chat/completions, the Route captures gpt-4o into the azure_deployment named group.

Instead of creating a separate Route for each deployment, you can configure a single Route with a regex path that captures the deployment name from the URL. AI Proxy Advanced reads the captured value through a template variable and uses it as the Azure deployment ID.
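
To see how the capture works, here’s a small sketch you can run locally with Python’s re module. Note that Python spells named groups as (?P<name>...), while the Kong Route path above uses the equivalent PCRE form (?<name>...):

    import re

    # Python requires (?P<name>...) for named capture groups;
    # the Kong Route path uses the equivalent PCRE form (?<name>...).
    route_path = re.compile(r"/openai/deployments/(?P<azure_deployment>[^/]+)/chat/completions$")

    match = route_path.search("/openai/deployments/gpt-4o/chat/completions")
    print(match.group("azure_deployment"))  # prints: gpt-4o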

Configure the AI Proxy Advanced plugin

First, let’s configure AI Proxy Advanced to read the deployment name from the captured path segment. The $(uri_captures.azure_deployment) template variable resolves at request time:

echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy-advanced
    route: azure-chat-route
    config:
      targets:
      - route_type: llm/v1/chat
        auth:
          header_name: api-key
          header_value: "${{ env "DECK_AZURE_OPENAI_API_KEY" }}"
        model:
          provider: azure
          name: "$(uri_captures.azure_deployment)"
          options:
            azure_instance: "${{ env "DECK_AZURE_INSTANCE_NAME" }}"
            azure_deployment_id: "$(uri_captures.azure_deployment)"
' | deck gateway apply -
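
Before moving on to the SDK, you can optionally send a quick request with curl to confirm the Route and plugin are wired up. This sketch assumes you have a deployment named gpt-4o; substitute one of your own deployment names, and use http://localhost:8000 if KONNECT_PROXY_URL isn’t set:

    curl -s -X POST "$KONNECT_PROXY_URL/openai/deployments/gpt-4o/chat/completions" \
      -H "Content-Type: application/json" \
      -d '{"messages":[{"role":"user","content":"Hello!"}]}'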

Validate

Now, let’s create a test script that sends requests to different Azure deployments through the same Kong Gateway Route. The AzureOpenAI client constructs URLs with /openai/deployments/{model}/chat/completions, which matches the Route regex. The model parameter determines which deployment receives the request:

cat <<EOF > test_azure_deployments.py
from openai import AzureOpenAI
import os

# Point the SDK at Kong Gateway instead of Azure. The dummy api_key is fine:
# the AI Proxy Advanced plugin injects the real api-key header upstream.
client = AzureOpenAI(
    api_key="test",
    azure_endpoint=os.environ.get("KONNECT_PROXY_URL", "http://localhost:8000"),
    api_version="2025-01-01-preview"
)

for model in ["gpt-4o", "gpt-4.1-mini"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "What model are you? Reply with only your model name."}]
    )
    print(f"Requested: {model}, Got: {response.model}")
EOF

Run the script:

python test_azure_deployments.py

You should see each request routed to the corresponding Azure deployment, confirming that a single Kong Gateway Route handles multiple deployments dynamically:

Requested: gpt-4o, Got: gpt-4o-2024-11-20
Requested: gpt-4.1-mini, Got: gpt-4.1-mini-2025-04-14

Cleanup

If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.

curl -Ls https://get.konghq.com/quickstart | bash -s -- -d