Send batch requests to Azure OpenAI LLMs

Minimum version: Kong Gateway 3.11
TL;DR

Package your prompts into a JSONL file and upload it to the /files endpoint. Then launch a batch job with /batches to process everything asynchronously, and download the output from /files once the run completes.

Prerequisites

This is a Konnect tutorial and requires a Konnect personal access token.

  1. Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.

  2. Export your token to an environment variable:

     export KONNECT_TOKEN='YOUR_KONNECT_PAT'
    
  3. Run the quickstart script to automatically provision a Control Plane and Data Plane, and configure your environment:

     curl -Ls https://get.konghq.com/quickstart | bash -s -- -k $KONNECT_TOKEN --deck-output
    

    This sets up a Konnect Control Plane named quickstart, provisions a local Data Plane, and prints out the following environment variable exports:

     export DECK_KONNECT_TOKEN=$KONNECT_TOKEN
     export DECK_KONNECT_CONTROL_PLANE_NAME=quickstart
     export KONNECT_CONTROL_PLANE_URL=https://us.api.konghq.com
     export KONNECT_PROXY_URL='http://localhost:8000'
    

    Copy and paste these into your terminal to configure your session.

This tutorial requires Kong Gateway Enterprise. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running almost instantly.

  1. Export your license to an environment variable:

     export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'
    
  2. Run the quickstart script:

    curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA 
    

    Once Kong Gateway is ready, you will see the following message:

     Kong Gateway Ready
    

decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial, install decK version 1.43 or later.

This guide uses deck gateway apply, which directly applies entity configuration to your Gateway instance. We recommend upgrading your decK installation to take advantage of this command.

You can check your current decK version with deck version.

For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function, but configuring them isn’t the focus of this guide. Follow these steps to pre-configure them:

  1. Run the following command:

    echo '
    _format_version: "3.0"
    services:
      - name: files-service
        url: http://httpbin.konghq.com/files
      - name: batches-service
        url: http://httpbin.konghq.com/batches
    routes:
      - name: files-route
        paths:
        - "/files"
        service:
          name: files-service
      - name: batches-route
        paths:
        - "/batches"
        service:
          name: batches-service
    ' | deck gateway apply -
    

To learn more about entities, you can read our entities documentation.

This tutorial uses the Azure OpenAI Service. Configure it as follows:

  1. Create an Azure account.
  2. In the Azure Portal, click Create a resource.
  3. Search for Azure OpenAI and select Azure OpenAI Service.
  4. Configure your Azure resource.
  5. Export your instance name:
    export DECK_AZURE_INSTANCE_NAME='YOUR_AZURE_RESOURCE_NAME'
    
  6. Deploy your model in Azure AI Foundry:
    1. Go to My assets → Models and deployments → Deploy model.

       Use a globalbatch or datazonebatch deployment type for batch operations, since standard deployments (GlobalStandard) cannot process batch files.

    2. Export the API key and deployment ID:

        export DECK_AZURE_OPENAI_API_KEY='YOUR_AZURE_OPENAI_MODEL_API_KEY'
        export DECK_AZURE_DEPLOYMENT_ID='YOUR_AZURE_OPENAI_DEPLOYMENT_NAME'

To complete this tutorial, create a batch.jsonl file containing the requests you want processed asynchronously. We use /v1/chat/completions because our requests are chat-style prompts, which the LLM processes as conversational completions in batch mode.

Run the following command to create the file:

cat <<EOF > batch.jsonl
{"custom_id": "prod1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a compelling product description for a solar-powered smart garden light."}], "max_tokens": 60}}
{"custom_id": "prod2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a product description for an energy-efficient smart thermostat for home use."}], "max_tokens": 60}}
{"custom_id": "prod3", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write an engaging product description for a biodegradable bamboo kitchen utensil set."}], "max_tokens": 60}}
{"custom_id": "prod4", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a detailed product description for a water-saving smart shower head."}], "max_tokens": 60}}
{"custom_id": "prod5", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a concise product description for a compact indoor air purifier that uses natural filters."}], "max_tokens": 60}}
EOF
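Before uploading, you can sanity-check that every line of the file parses as JSON. This is an optional step, not part of the tutorial flow, and assumes you have jq installed:

# jq parses the stream of JSON values and exits non-zero if any are malformed
jq -c . batch.jsonl > /dev/null && echo "batch.jsonl is valid JSONL"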

Configure an AI Proxy plugin for the /files route

Let’s create an AI Proxy plugin for the llm/v1/files route type. It will be used to handle the upload and retrieval of JSONL files containing batch input and output data. This plugin instance ensures that input data is correctly staged for batch processing and that the results can be downloaded once the batch job completes.

echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    service: files-service
    config:
      model_name_header: false
      route_type: llm/v1/files
      auth:
        header_name: Authorization
        header_value: Bearer ${{ env "DECK_AZURE_OPENAI_API_KEY" }}
      model:
        provider: azure
        options:
          azure_api_version: 2025-01-01-preview
          azure_instance: "${{ env "DECK_AZURE_INSTANCE_NAME" }}"
          azure_deployment_id: "${{ env "DECK_AZURE_DEPLOYMENT_ID" }}"
' | deck gateway apply -

Configure an AI Proxy plugin for the /batches route

Next, create an AI Proxy plugin for the llm/v1/batches route type. This plugin manages the submission, monitoring, and retrieval of asynchronous batch jobs. It communicates with Azure OpenAI’s batch deployment to process multiple LLM requests in a batch.

echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    service: batches-service
    config:
      model_name_header: false
      route_type: llm/v1/batches
      auth:
        header_name: Authorization
        header_value: Bearer ${{ env "DECK_AZURE_OPENAI_API_KEY" }}
      model:
        provider: azure
        options:
          azure_api_version: 2025-01-01-preview
          azure_instance: "${{ env "DECK_AZURE_INSTANCE_NAME" }}"
          azure_deployment_id: "${{ env "DECK_AZURE_DEPLOYMENT_ID" }}"
' | deck gateway apply -

Upload a .jsonl file for batching

Now, let’s use the following command to upload our batch file to the /files route:

curl "$KONNECT_PROXY_URL/files" \
     -F purpose="batch" \
     -F file="@batch.jsonl" 
curl "http://localhost:8000/files" \
     -F purpose="batch" \
     -F file="@batch.jsonl" 

Once processed, you will see a JSON response like this:

{
  "status": "processed",
  "bytes": 1648,
  "purpose": "batch",
  "filename": "batch.jsonl",
  "id": "file-da4364d8fd714dd9b29706b91236ab02",
  "created_at": 1761817541,
  "object": "file"
}

Now, let’s export the file ID:

export FILE_ID=YOUR_FILE_ID
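Alternatively, if you have jq installed, you can perform the upload and capture the file ID in one step. This is a convenience sketch rather than part of the tutorial flow; note that it runs the upload itself:

# Upload the file and extract the "id" field from the JSON response
export FILE_ID=$(curl -s "$KONNECT_PROXY_URL/files" \
     -F purpose="batch" \
     -F file="@batch.jsonl" | jq -r '.id')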

Create a batching request

Now, we can send a POST request to the /batches Route to create a batch using our uploaded file:

The completion window must be set to 24h, as it’s the only value currently supported by the OpenAI /batches API.

In this example we use the /v1/chat/completions route for batching because we are sending multiple structured chat-style prompts in OpenAI’s chat completions format to be processed in bulk.

curl "$KONNECT_PROXY_URL/batches" \
     --json '{
       "input_file_id": "'$FILE_ID'",
       "endpoint": "/v1/chat/completions",
       "completion_window": "24h"
     }'
curl "http://localhost:8000/batches" \
     --json '{
       "input_file_id": "'$FILE_ID'",
       "endpoint": "/v1/chat/completions",
       "completion_window": "24h"
     }'

You will receive a response similar to:

{
  "cancelled_at": null,
  "cancelling_at": null,
  "completed_at": null,
  "completion_window": "24h",
  "created_at": 1761817562,
  "error_file_id": "",
  "expired_at": null,
  "expires_at": 1761903959,
  "failed_at": null,
  "finalizing_at": null,
  "id": "batch_379f1007-8057-4f43-be38-12f3d456c7da",
  "in_progress_at": null,
  "input_file_id": "file-da4364d8fd714dd9b29706b91236ab02",
  "errors": null,
  "metadata": null,
  "object": "batch",
  "output_file_id": "",
  "request_counts": {
    "total": 0,
    "completed": 0,
    "failed": 0
  },
  "status": "validating",
  "endpoint": ""
}

To check the batch status, copy the batch ID from this response and export it as an environment variable by running the following command in your terminal:

export BATCH_ID=YOUR_BATCH_ID
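As with the file ID, you can create the batch and capture its ID in a single step if jq is available (a convenience sketch; it submits the batch itself):

# Create the batch and extract the "id" field from the JSON response
export BATCH_ID=$(curl -s "$KONNECT_PROXY_URL/batches" \
     --json '{
       "input_file_id": "'$FILE_ID'",
       "endpoint": "/v1/chat/completions",
       "completion_window": "24h"
     }' | jq -r '.id')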

Check batching status

Wait a moment for the batch to finish processing, then check its status by sending the following request:

curl "$KONNECT_PROXY_URL/batches/$BATCH_ID"
curl "http://localhost:8000/batches/$BATCH_ID"

A completed batch response looks like this:

{
  "cancelled_at": null,
  "cancelling_at": null,
  "completed_at": 1761817685,
  "completion_window": "24h",
  "created_at": 1761817562,
  "error_file_id": null,
  "expired_at": null,
  "expires_at": 1761903959,
  "failed_at": null,
  "finalizing_at": 1761817662,
  "id": "batch_379f1007-8057-4f43-be38-12f3d456c7da",
  "in_progress_at": null,
  "input_file_id": "file-da4364d8fd714dd9b29706b91236ab02",
  "errors": null,
  "metadata": null,
  "object": "batch",
  "output_file_id": "file-93d91f55-0418-abcd-1234-81f4bb334951",
  "request_counts": {
    "total": 5,
    "completed": 5,
    "failed": 0
  },
  "status": "completed",
  "endpoint": "/v1/chat/completions"
}

Notice that the "request_counts" object shows that all five requests in the batch completed successfully ("completed": 5, "failed": 0).

Now, copy the output_file_id to retrieve your batched responses and export it as an environment variable:

export OUTPUT_FILE_ID=YOUR_OUTPUT_FILE_ID

The output file ID will only be available once the batch request has completed. If the status is "in_progress", it won’t be set yet.
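Once the status is completed, you can also pull the output file ID straight from the status response instead of copying it by hand (assuming jq):

# Extract the "output_file_id" field from the batch status response
export OUTPUT_FILE_ID=$(curl -s "$KONNECT_PROXY_URL/batches/$BATCH_ID" | jq -r '.output_file_id')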

Retrieve batched responses

Now, we can download the batched responses from the /files endpoint by appending /content to the file ID URL. For details, see the OpenAI API documentation.

curl "$KONNECT_PROXY_URL/files/$OUTPUT_FILE_ID/content" > batched-response.jsonl

This command saves the batched responses to the batched-response.jsonl file.
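Each request in the input file produces one line of output, so for this tutorial the file should contain five lines:

# Count the output lines; expect one per batched request (5 here)
wc -l < batched-response.jsonl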

The batched response file contains one JSON object per line, each representing the response to a single batched request. Here is example content from batched-response.jsonl, showing the individual completion results for each request we submitted in the batch input file:

{"custom_id": "prod4", "response": {"body": {"id": "chatcmpl-AB12CD34EF56GH78IJ90KL12MN", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**EcoFlow Smart Shower Head: Revolutionize Your Daily Routine While Saving Water**\n\nExperience the perfect blend of luxury, sustainability, and smart technology with the **EcoFlow Smart Shower Head** — a cutting-edge solution for modern households looking to conserve water without compromising on comfort. Designed to elevate your shower experience", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 30, "total_tokens": 90}, "system_fingerprint": "fp_random1234"},"request_id": "req-111aaa22-bb33-cc44-dd55-ee66ff778899", "status_code": 200}, "error": null}
{"custom_id": "prod3", "response": {"body": {"id": "chatcmpl-ZX98YW76VU54TS32RQ10PO98LK", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**Eco-Friendly Elegance: Biodegradable Bamboo Kitchen Utensil Set**\n\nElevate your cooking experience while making a positive impact on the planet with our **Biodegradable Bamboo Kitchen Utensil Set**. Crafted from 100% natural, sustainably sourced bamboo, this set combines durability, functionality", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 31, "total_tokens": 91}, "system_fingerprint": "fp_random1234"},"request_id": "req-222bbb33-cc44-dd55-ee66-ff7788990011", "status_code": 200}, "error": null}
{"custom_id": "prod1", "response": {"body": {"id": "chatcmpl-MN34OP56QR78ST90UV12WX34YZ", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**Illuminate Your Garden with Brilliance: The Solar-Powered Smart Garden Light**  \n\nTransform your outdoor space into a haven of sustainable beauty with the **Solar-Powered Smart Garden Light**—a perfect blend of modern innovation and eco-friendly design. Powered entirely by the sun, this smart light delivers effortless", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 30, "total_tokens": 90}, "system_fingerprint": "fp_random1234"},"request_id": "req-333ccc44-dd55-ee66-ff77-889900112233", "status_code": 200}, "error": null}
{"custom_id": "prod5", "response": {"body": {"id": "chatcmpl-AQ12WS34ED56RF78TG90HY12UJ", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "Breathe easy with our compact indoor air purifier, designed to deliver fresh and clean air using natural filters. This eco-friendly purifier quietly removes allergens, dust, and odors without synthetic materials, making it perfect for any small space. Stylish, efficient, and sustainable—experience pure air, naturally.", "refusal": null, "annotations": []}, "finish_reason": "stop", "logprobs": null}], "usage": {"completion_tokens": 59, "prompt_tokens": 33, "total_tokens": 92}, "system_fingerprint": "fp_random1234"},"request_id": "req-444ddd55-ee66-ff77-8899-001122334455", "status_code": 200}, "error": null}
{"custom_id": "prod2", "response": {"body": {"id": "chatcmpl-PO98LK76JI54HG32FE10DC98VB", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**EcoSmart Pro Wi-Fi Thermostat: Energy Efficiency Meets Smart Technology**  \n\nUpgrade your home’s comfort and save energy with the EcoSmart Pro Wi-Fi Thermostat. Designed for modern living, this sleek and intuitive thermostat lets you take control of your heating and cooling while minimizing energy waste. Whether you're", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 31, "total_tokens": 91}, "system_fingerprint": "fp_random1234"},"request_id": "req-555eee66-ff77-8899-0011-223344556677", "status_code": 200}, "error": null}

Cleanup

If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.

curl -Ls https://get.konghq.com/quickstart | bash -s -- -d