Now, let’s create a test script that sends requests to different Azure deployments through the same Kong Gateway Route. The AzureOpenAI client constructs URLs with /openai/deployments/{model}/chat/completions, which matches the Route regex. The model parameter determines which deployment receives the request:
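To see why one Route can serve every deployment, it helps to look at the path the client builds. The sketch below is purely illustrative (the `request_path` helper is hypothetical, not part of the OpenAI SDK); it shows the URL shape the `AzureOpenAI` client produces, with the deployment name embedded in the path segment that the Route regex captures:

```python
# Hypothetical helper illustrating the path the AzureOpenAI client constructs.
# The `model` argument becomes the {deployment} path segment, so a single
# Route regex like /openai/deployments/[^/]+/chat/completions matches them all.
def request_path(deployment: str, api_version: str = "2025-01-01-preview") -> str:
    return f"/openai/deployments/{deployment}/chat/completions?api-version={api_version}"

for model in ["gpt-4o", "gpt-4.1-mini"]:
    print(request_path(model))
```

Because only the path segment changes, Kong needs no per-deployment configuration: the upstream deployment is selected dynamically from the request URL.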
cat <<EOF > test_azure_deployments.py
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key="test",
    azure_endpoint="http://localhost:8000",
    api_version="2025-01-01-preview"
)

for model in ["gpt-4o", "gpt-4.1-mini"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "What model are you? Reply with only your model name."}]
    )
    print(f"Requested: {model}, Got: {response.model}")
EOF
If you are routing through Konnect rather than a local Kong Gateway, point the client at your Konnect proxy URL instead:
cat <<EOF > test_azure_deployments.py
from openai import AzureOpenAI
import os

client = AzureOpenAI(
    api_key="test",
    azure_endpoint=os.environ['KONNECT_PROXY_URL'],
    api_version="2025-01-01-preview"
)

for model in ["gpt-4o", "gpt-4.1-mini"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "What model are you? Reply with only your model name."}]
    )
    print(f"Requested: {model}, Got: {response.model}")
EOF
Run the script:
python test_azure_deployments.py
You should see each request routed to the corresponding Azure deployment, confirming that a single Kong Gateway Route handles multiple deployments dynamically (the exact model version suffixes in the output may differ from those shown here):
Requested: gpt-4o, Got: gpt-4o-2024-11-20
Requested: gpt-4.1-mini, Got: gpt-4.1-mini-2025-04-14