Send batch requests to Azure OpenAI LLMs
Package your prompts into a JSONL file and upload it to the /files endpoint. Then launch a batch job with /batches to process everything asynchronously, and download the output from /files once the run completes.
Prerequisites
Kong Konnect
This is a Konnect tutorial and requires a Konnect personal access token.
- Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.
- Export your token to an environment variable:
export KONNECT_TOKEN='YOUR_KONNECT_PAT'
- Run the quickstart script to automatically provision a Control Plane and Data Plane, and configure your environment:
curl -Ls https://get.konghq.com/quickstart | bash -s -- -k $KONNECT_TOKEN --deck-output
This sets up a Konnect Control Plane named quickstart, provisions a local Data Plane, and prints out the following environment variable exports:
export DECK_KONNECT_TOKEN=$KONNECT_TOKEN
export DECK_KONNECT_CONTROL_PLANE_NAME=quickstart
export KONNECT_CONTROL_PLANE_URL=https://us.api.konghq.com
export KONNECT_PROXY_URL='http://localhost:8000'
Copy and paste these into your terminal to configure your session.
Kong Gateway running
This tutorial requires Kong Gateway Enterprise. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running almost instantly.
- Export your license to an environment variable:
export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'
- Run the quickstart script:
curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA
Once Kong Gateway is ready, you will see the following message:
Kong Gateway Ready
decK v1.43+
decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial, install decK version 1.43 or later.
This guide uses deck gateway apply, which directly applies entity configuration to your Gateway instance.
We recommend upgrading your decK installation to take advantage of this tool.
You can check your current decK version with deck version.
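The version requirement can also be checked in a script. The following is a minimal sketch, not part of decK: version_ge is a hypothetical helper that compares two dotted version strings numerically, so that 1.9 is correctly treated as older than 1.43. It takes the version strings as arguments and makes no assumption about the exact output format of deck version.

```shell
# Hypothetical helper (not part of decK): succeed when version $1 is at
# least version $2. Fields are compared numerically, left to right, and a
# leading "v" is ignored. Missing fields count as 0.
version_ge() {
  have="${1#v}"
  want="${2#v}"
  printf '%s %s\n' "$have" "$want" | awk '{
    n = split($1, a, ".")
    m = split($2, b, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      x = (i <= n) ? a[i] + 0 : 0
      y = (i <= m) ? b[i] + 0 : 0
      if (x > y) exit 0
      if (x < y) exit 1
    }
    exit 0
  }'
}

# Usage: pass the version number printed by `deck version`, for example:
#   version_ge 1.43.0 1.43 && echo "decK is recent enough"
```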
Required entities
For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function but installing them isn’t the focus of this guide. Follow these steps to pre-configure them:
- Run the following command:
echo '
_format_version: "3.0"
services:
  - name: files-service
    url: http://httpbin.konghq.com/files
  - name: batches-service
    url: http://httpbin.konghq.com/batches
routes:
  - name: files-route
    paths:
    - "/files"
    service:
      name: files-service
  - name: batches-route
    paths:
    - "/batches"
    service:
      name: batches-service
' | deck gateway apply -
To learn more about entities, you can read our entities documentation.
Azure OpenAI
This tutorial uses Azure OpenAI service. Configure it as follows:
- Create an Azure account.
- In the Azure Portal, click Create a resource.
- Search for Azure OpenAI and select Azure OpenAI Service.
- Configure your Azure resource.
- Export your instance name:
export DECK_AZURE_INSTANCE_NAME='YOUR_AZURE_RESOURCE_NAME'
- Deploy your model in Azure AI Foundry:
  - Go to My assets → Models and deployments → Deploy model.
Use a globalbatch or datazonebatch deployment type for batch operations since standard deployments (GlobalStandard) cannot process batch files.
- Export the API key and deployment ID:
export DECK_AZURE_OPENAI_API_KEY='YOUR_AZURE_OPENAI_MODEL_API_KEY'
export DECK_AZURE_DEPLOYMENT_ID='YOUR_AZURE_OPENAI_DEPLOYMENT_NAME'
Batch .jsonl file
To complete this tutorial, create a batch.jsonl file containing the requests that the LLM will process asynchronously. Each request targets /v1/chat/completions because it handles chat-based generation requests, instructing the LLM to produce conversational completions in batch mode.
Run the following command to create the file:
cat <<EOF > batch.jsonl
{"custom_id": "prod1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a compelling product description for a solar-powered smart garden light."}], "max_tokens": 60}}
{"custom_id": "prod2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a product description for an energy-efficient smart thermostat for home use."}], "max_tokens": 60}}
{"custom_id": "prod3", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write an engaging product description for a biodegradable bamboo kitchen utensil set."}], "max_tokens": 60}}
{"custom_id": "prod4", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a detailed product description for a water-saving smart shower head."}], "max_tokens": 60}}
{"custom_id": "prod5", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a concise product description for a compact indoor air purifier that uses natural filters."}], "max_tokens": 60}}
EOF
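Before uploading, it can help to sanity-check the file, since a malformed line is otherwise only reported once the batch is validated. The sketch below is an illustration rather than part of the tutorial: check_batch_file is a hypothetical helper that uses plain grep to confirm that every non-empty line carries the custom_id, method, url, and body fields shown above, and that no custom_id is repeated. It does not parse JSON, so a dedicated JSON tool would be a more thorough check.

```shell
# Hypothetical helper (not part of the Kong tutorial): basic sanity check
# for a batch input file. Text matching only; it does not validate JSON.
check_batch_file() {
  file="$1"
  # Count non-empty lines; each one should hold a single request object.
  total=$(grep -c . "$file")
  # Count lines that mention all four fields used in the batch format.
  valid=$(grep '"custom_id"' "$file" | grep '"method"' | grep '"url"' | grep -c '"body"')
  # Count custom_id values that appear more than once.
  dupes=$(grep -o '"custom_id": *"[^"]*"' "$file" | sort | uniq -d | wc -l | tr -d ' ')
  echo "requests=$total valid=$valid duplicate_ids=$dupes"
  # Succeed only if the file is non-empty, complete, and free of duplicates.
  [ "$total" -gt 0 ] && [ "$total" -eq "$valid" ] && [ "$dupes" -eq 0 ]
}

# Usage:
#   check_batch_file batch.jsonl
```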
Configure AI Proxy plugins for /files route
Let’s create an AI Proxy plugin for the llm/v1/files route type. It will be used to handle the upload and retrieval of JSONL files containing batch input and output data. This plugin instance ensures that input data is correctly staged for batch processing and that the results can be downloaded once the batch job completes.
echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    service: files-service
    config:
      model_name_header: false
      route_type: llm/v1/files
      auth:
        header_name: Authorization
        header_value: Bearer ${{ env "DECK_AZURE_OPENAI_API_KEY" }}
      model:
        provider: azure
        options:
          azure_api_version: 2025-01-01-preview
          azure_instance: "${{ env "DECK_AZURE_INSTANCE_NAME" }}"
          azure_deployment_id: "${{ env "DECK_AZURE_DEPLOYMENT_ID" }}"
' | deck gateway apply -
Configure AI Proxy plugins for /batches route
Next, create an AI Proxy plugin for the llm/v1/batches route. This plugin manages the submission, monitoring, and retrieval of asynchronous batch jobs. It communicates with Azure OpenAI’s batch deployment to process multiple LLM requests in a batch.
echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    service: batches-service
    config:
      model_name_header: false
      route_type: llm/v1/batches
      auth:
        header_name: Authorization
        header_value: Bearer ${{ env "DECK_AZURE_OPENAI_API_KEY" }}
      model:
        provider: azure
        options:
          azure_api_version: 2025-01-01-preview
          azure_instance: "${{ env "DECK_AZURE_INSTANCE_NAME" }}"
          azure_deployment_id: "${{ env "DECK_AZURE_DEPLOYMENT_ID" }}"
' | deck gateway apply -
Upload a .jsonl file for batching
Now, let’s use the following command to upload our batch file to the /files Route, which the AI Proxy plugin handles with the llm/v1/files route type.
If you are using Konnect:
curl "$KONNECT_PROXY_URL/files" \
  -F purpose="batch" \
  -F file="@batch.jsonl"
If you are running Kong Gateway locally:
curl "http://localhost:8000/files" \
  -F purpose="batch" \
  -F file="@batch.jsonl"
Once processed, you will see a JSON response like this:
{
"status": "processed",
"bytes": 1648,
"purpose": "batch",
"filename": "batch.jsonl",
"id": "file-da4364d8fd714dd9b29706b91236ab02",
"created_at": 1761817541,
"object": "file"
}
Now, let’s export the file ID:
export FILE_ID=YOUR_FILE_ID
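Instead of copying the ID by hand, the value can be pulled out of the response in the shell. This is a minimal sketch under stated assumptions: json_field is a hypothetical helper that uses sed to extract a top-level string field from a response shaped like the one above. It is a text match, not a JSON parser, so it is only suitable for simple responses where the field name appears once.

```shell
# Hypothetical helper (not part of Kong or decK): print the value of one
# string field from a JSON response read on stdin. The quoted field name
# must match exactly, so "id" does not match "input_file_id".
json_field() {
  sed -n "s/.*\"$1\": *\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Usage (assumes the local proxy URL used in this tutorial):
#   export FILE_ID=$(curl -s "http://localhost:8000/files" \
#     -F purpose="batch" -F file="@batch.jsonl" | json_field id)
```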
Create a batching request
Now, we can send a POST request to the /batches Route to create a batch using our uploaded file.
The completion window must be set to 24h, as it’s the only value currently supported by the OpenAI /batches API.
In this example we use the /v1/chat/completions route for batching because we are sending multiple structured chat-style prompts in OpenAI’s chat completions format to be processed in bulk.
If you are using Konnect:
curl "$KONNECT_PROXY_URL/batches" \
  --json '{
    "input_file_id": "'$FILE_ID'",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h"
  }'
If you are running Kong Gateway locally:
curl "http://localhost:8000/batches" \
  --json '{
    "input_file_id": "'$FILE_ID'",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h"
  }'
You will receive a response similar to:
{
"cancelled_at": null,
"cancelling_at": null,
"completed_at": null,
"completion_window": "24h",
"created_at": 1761817562,
"error_file_id": "",
"expired_at": null,
"expires_at": 1761903959,
"failed_at": null,
"finalizing_at": null,
"id": "batch_379f1007-8057-4f43-be38-12f3d456c7da",
"in_progress_at": null,
"input_file_id": "file-da4364d8fd714dd9b29706b91236ab02",
"errors": null,
"metadata": null,
"object": "batch",
"output_file_id": "",
"request_counts": {
"total": 0,
"completed": 0,
"failed": 0
},
"status": "validating",
"endpoint": ""
}
Copy the batch ID from this response and export it as an environment variable so that you can check the batch status later:
export BATCH_ID=YOUR_BATCH_ID
Check batching status
Wait a moment for the batch to finish processing, then check its status by sending the following request.
If you are using Konnect:
curl "$KONNECT_PROXY_URL/batches/$BATCH_ID"
If you are running Kong Gateway locally:
curl "http://localhost:8000/batches/$BATCH_ID"
A completed batch response looks like this:
{
"cancelled_at": null,
"cancelling_at": null,
"completed_at": 1761817685,
"completion_window": "24h",
"created_at": 1761817562,
"error_file_id": null,
"expired_at": null,
"expires_at": 1761903959,
"failed_at": null,
"finalizing_at": 1761817662,
"id": "batch_379f1007-8057-4f43-be38-12f3d456c7da",
"in_progress_at": null,
"input_file_id": "file-da4364d8fd714dd9b29706b91236ab02",
"errors": null,
"metadata": null,
"object": "batch",
"output_file_id": "file-93d91f55-0418-abcd-1234-81f4bb334951",
"request_counts": {
"total": 5,
"completed": 5,
"failed": 0
},
"status": "completed",
"endpoint": "/v1/chat/completions"
}
The "request_counts" object shows that all five requests in the batch completed successfully ("completed": 5, "failed": 0).
Now, you can copy the output_file_id to retrieve your batched responses and export it as an environment variable:
export OUTPUT_FILE_ID=YOUR_OUTPUT_FILE_ID
The output file ID will only be available once the batch request has completed. If the status is "in_progress", it won’t be set yet.
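Because the batch moves through several states before the output file ID is set, the status check is often wrapped in a polling loop. The following is a sketch only: wait_for_batch, POLL_INTERVAL, and POLL_ATTEMPTS are hypothetical names, and the terminal statuses other than completed (failed, expired, cancelled) are assumed from the OpenAI Batch API rather than taken from this tutorial. The helper runs whatever command it is given, so the status request shown earlier can be passed in unchanged.

```shell
# Hypothetical helper (not part of Kong or decK): run the given command
# repeatedly until the batch reports a terminal status.
# POLL_INTERVAL (seconds between checks) and POLL_ATTEMPTS are assumed names.
wait_for_batch() {
  interval="${POLL_INTERVAL:-10}"
  attempts="${POLL_ATTEMPTS:-60}"
  while [ "$attempts" -gt 0 ]; do
    # "$@" is the status command; it must print the batch JSON on stdout.
    response=$("$@") || return 1
    if printf '%s\n' "$response" | grep -q '"status": *"completed"'; then
      # Print the final response so the caller can read output_file_id.
      printf '%s\n' "$response"
      return 0
    fi
    if printf '%s\n' "$response" | grep -Eq '"status": *"(failed|expired|cancelled)"'; then
      printf '%s\n' "$response" >&2
      return 1
    fi
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "gave up waiting for the batch to finish" >&2
  return 2
}

# Usage (assumes the local proxy URL used in this tutorial):
#   wait_for_batch curl -s "http://localhost:8000/batches/$BATCH_ID"
```

The attempt limit keeps the loop from running forever if the gateway returns an error body that never contains a status field.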
Retrieve batched responses
Now, we can download the batched responses from the /files endpoint by appending /content to the file ID URL. For details, see the OpenAI API documentation.
curl http://localhost:8000/files/$OUTPUT_FILE_ID/content > batched-response.jsonl
This command saves the batched responses to the batched-response.jsonl file.
The batched response file contains one JSON object per line, each representing a single batched request’s response. Here is an example of content from batched-response.jsonl which contains the individual completion results for each request we submitted in the batch input file:
{"custom_id": "prod4", "response": {"body": {"id": "chatcmpl-AB12CD34EF56GH78IJ90KL12MN", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**EcoFlow Smart Shower Head: Revolutionize Your Daily Routine While Saving Water**\n\nExperience the perfect blend of luxury, sustainability, and smart technology with the **EcoFlow Smart Shower Head** — a cutting-edge solution for modern households looking to conserve water without compromising on comfort. Designed to elevate your shower experience", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 30, "total_tokens": 90}, "system_fingerprint": "fp_random1234"},"request_id": "req-111aaa22-bb33-cc44-dd55-ee66ff778899", "status_code": 200}, "error": null}
{"custom_id": "prod3", "response": {"body": {"id": "chatcmpl-ZX98YW76VU54TS32RQ10PO98LK", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**Eco-Friendly Elegance: Biodegradable Bamboo Kitchen Utensil Set**\n\nElevate your cooking experience while making a positive impact on the planet with our **Biodegradable Bamboo Kitchen Utensil Set**. Crafted from 100% natural, sustainably sourced bamboo, this set combines durability, functionality", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 31, "total_tokens": 91}, "system_fingerprint": "fp_random1234"},"request_id": "req-222bbb33-cc44-dd55-ee66-ff7788990011", "status_code": 200}, "error": null}
{"custom_id": "prod1", "response": {"body": {"id": "chatcmpl-MN34OP56QR78ST90UV12WX34YZ", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**Illuminate Your Garden with Brilliance: The Solar-Powered Smart Garden Light** \n\nTransform your outdoor space into a haven of sustainable beauty with the **Solar-Powered Smart Garden Light**—a perfect blend of modern innovation and eco-friendly design. Powered entirely by the sun, this smart light delivers effortless", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 30, "total_tokens": 90}, "system_fingerprint": "fp_random1234"},"request_id": "req-333ccc44-dd55-ee66-ff77-889900112233", "status_code": 200}, "error": null}
{"custom_id": "prod5", "response": {"body": {"id": "chatcmpl-AQ12WS34ED56RF78TG90HY12UJ", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "Breathe easy with our compact indoor air purifier, designed to deliver fresh and clean air using natural filters. This eco-friendly purifier quietly removes allergens, dust, and odors without synthetic materials, making it perfect for any small space. Stylish, efficient, and sustainable—experience pure air, naturally.", "refusal": null, "annotations": []}, "finish_reason": "stop", "logprobs": null}], "usage": {"completion_tokens": 59, "prompt_tokens": 33, "total_tokens": 92}, "system_fingerprint": "fp_random1234"},"request_id": "req-444ddd55-ee66-ff77-8899-001122334455", "status_code": 200}, "error": null}
{"custom_id": "prod2", "response": {"body": {"id": "chatcmpl-PO98LK76JI54HG32FE10DC98VB", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**EcoSmart Pro Wi-Fi Thermostat: Energy Efficiency Meets Smart Technology** \n\nUpgrade your home’s comfort and save energy with the EcoSmart Pro Wi-Fi Thermostat. Designed for modern living, this sleek and intuitive thermostat lets you take control of your heating and cooling while minimizing energy waste. Whether you're", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 31, "total_tokens": 91}, "system_fingerprint": "fp_random1234"},"request_id": "req-555eee66-ff77-8899-0011-223344556677", "status_code": 200}, "error": null}
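Note that the results above are not in the order the requests were submitted, so custom_id is what ties each response back to its prompt. As a rough illustration, the sketch below prints one summary line per result, sorted by custom_id. summarize_batch_output is a hypothetical helper, and its sed pattern assumes the field order seen in the sample above (custom_id, then finish_reason, then status_code); a JSON-aware tool would be more robust.

```shell
# Hypothetical helper (not part of Kong or decK): print
# "<custom_id> <status_code> <finish_reason>" for each line of a batch
# output file, sorted by custom_id. Relies on the field order shown in
# the sample output; lines that do not match are skipped.
summarize_batch_output() {
  sed -n 's/.*"custom_id": *"\([^"]*\)".*"finish_reason": *"\([^"]*\)".*"status_code": *\([0-9]*\).*/\1 \3 \2/p' "$1" | sort
}

# Usage:
#   summarize_batch_output batched-response.jsonl
```

A finish_reason of length means the completion stopped at the max_tokens limit (60 in batch.jsonl), which is why most of the sample descriptions end mid-sentence.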
Cleanup
Clean up Konnect environment
If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.
Destroy the Kong Gateway container
curl -Ls https://get.konghq.com/quickstart | bash -s -- -d