Send batch requests to Azure OpenAI LLMs

Minimum version: Kong Gateway 3.11
TL;DR

Package your prompts into a JSONL file and upload it to the /files endpoint. Then launch a batch job with /batches to process everything asynchronously, and download the output from /files once the run completes.

Prerequisites

This is a Konnect tutorial and requires a Konnect personal access token.

  1. Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.

  2. Export your token to an environment variable:

     export KONNECT_TOKEN='YOUR_KONNECT_PAT'
    
  3. Run the quickstart script to automatically provision a Control Plane and Data Plane, and configure your environment:

     curl -Ls https://get.konghq.com/quickstart | bash -s -- -k $KONNECT_TOKEN --deck-output
    

    This sets up a Konnect Control Plane named quickstart, provisions a local Data Plane, and prints out the following environment variable exports:

     export DECK_KONNECT_TOKEN=$KONNECT_TOKEN
     export DECK_KONNECT_CONTROL_PLANE_NAME=quickstart
     export KONNECT_CONTROL_PLANE_URL=https://us.api.konghq.com
     export KONNECT_PROXY_URL='http://localhost:8000'
    

    Copy and paste these into your terminal to configure your session.

This tutorial requires Kong Gateway Enterprise. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running almost instantly.

  1. Export your license to an environment variable:

     export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'
    
  2. Run the quickstart script:

    curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA 
    

    Once Kong Gateway is ready, you will see the following message:

     Kong Gateway Ready
    

decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial, install decK version 1.43 or later.

This guide uses deck gateway apply, which directly applies entity configuration to your Gateway instance. We recommend upgrading your decK installation to take advantage of this command.

You can check your current decK version with deck version.

For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function, but configuring them isn’t the focus of this guide. Follow these steps to pre-configure them:

  1. Run the following command:

    echo '
    _format_version: "3.0"
    services:
      - name: files-service
        url: http://httpbin.konghq.com/files
      - name: batches-service
        url: http://httpbin.konghq.com/batches
    routes:
      - name: files-route
        paths:
        - "/files"
        service:
          name: files-service
      - name: batches-route
        paths:
        - "/batches"
        service:
          name: batches-service
    ' | deck gateway apply -
    

To learn more about entities, you can read our entities documentation.

This tutorial uses the Azure OpenAI Service. Configure it as follows:

  1. Create an Azure account.
  2. In the Azure Portal, click Create a resource.
  3. Search for Azure OpenAI and select Azure OpenAI Service.
  4. Configure your Azure resource.
  5. Export your instance name:
    export DECK_AZURE_INSTANCE_NAME='YOUR_AZURE_RESOURCE_NAME'
    
  6. Deploy your model in Azure AI Foundry:
    1. Go to My assets → Models and deployments → Deploy model.

       Use a globalbatch or datazonebatch deployment type for batch operations, since standard deployments (GlobalStandard) cannot process batch files.

    2. Export the API key and deployment ID:

        export DECK_AZURE_OPENAI_API_KEY='YOUR_AZURE_OPENAI_MODEL_API_KEY'
        export DECK_AZURE_DEPLOYMENT_ID='YOUR_AZURE_OPENAI_DEPLOYMENT_NAME'

To complete this tutorial, create a batch.jsonl file containing the requests you want processed asynchronously. We use /v1/chat/completions because our requests are chat-style prompts, which the LLM processes as conversational completions in batch mode.

Run the following command to create the file:

cat <<EOF > batch.jsonl
{"custom_id": "prod1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a compelling product description for a solar-powered smart garden light."}], "max_tokens": 60}}
{"custom_id": "prod2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a product description for an energy-efficient smart thermostat for home use."}], "max_tokens": 60}}
{"custom_id": "prod3", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write an engaging product description for a biodegradable bamboo kitchen utensil set."}], "max_tokens": 60}}
{"custom_id": "prod4", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a detailed product description for a water-saving smart shower head."}], "max_tokens": 60}}
{"custom_id": "prod5", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write a concise product description for a compact indoor air purifier that uses natural filters."}], "max_tokens": 60}}
EOF
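Before uploading, you can sanity-check that every line of the file parses as JSON. This is an optional step, not part of the tutorial flow, and assumes you have jq installed:

# jq parses the stream of JSON values and exits non-zero if any are malformed
jq -c . batch.jsonl > /dev/null && echo "batch.jsonl is valid JSONL"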

Configure an AI Proxy plugin for the /files route

Let’s create an AI Proxy plugin for the llm/v1/files route type. It will be used to handle the upload and retrieval of JSONL files containing batch input and output data. This plugin instance ensures that input data is correctly staged for batch processing and that the results can be downloaded once the batch job completes.

echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    service: files-service
    config:
      model_name_header: false
      route_type: llm/v1/files
      auth:
        header_name: Authorization
        header_value: Bearer ${{ env "DECK_AZURE_OPENAI_API_KEY" }}
      model:
        provider: azure
        options:
          azure_api_version: 2025-01-01-preview
          azure_instance: "${{ env "DECK_AZURE_INSTANCE_NAME" }}"
          azure_deployment_id: "${{ env "DECK_AZURE_DEPLOYMENT_ID" }}"
' | deck gateway apply -

Configure an AI Proxy plugin for the /batches route

Next, create an AI Proxy plugin for the llm/v1/batches route type. This plugin manages the submission, monitoring, and retrieval of asynchronous batch jobs. It communicates with Azure OpenAI’s batch deployment to process multiple LLM requests in a batch.

echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    service: batches-service
    config:
      model_name_header: false
      route_type: llm/v1/batches
      auth:
        header_name: Authorization
        header_value: Bearer ${{ env "DECK_AZURE_OPENAI_API_KEY" }}
      model:
        provider: azure
        options:
          azure_api_version: 2025-01-01-preview
          azure_instance: "${{ env "DECK_AZURE_INSTANCE_NAME" }}"
          azure_deployment_id: "${{ env "DECK_AZURE_DEPLOYMENT_ID" }}"
' | deck gateway apply -

Upload a .jsonl file for batching

Now, let’s use the following command to upload our batch file to the /files route:

curl "$KONNECT_PROXY_URL/files" \
     -F purpose="batch" \
     -F file="@batch.jsonl" 
curl "http://localhost:8000/files" \
     -F purpose="batch" \
     -F file="@batch.jsonl" 

Once processed, you will see a JSON response like this:

{
  "status": "processed",
  "bytes": 1648,
  "purpose": "batch",
  "filename": "batch.jsonl",
  "id": "file-da4364d8fd714dd9b29706b91236ab02",
  "created_at": 1761817541,
  "object": "file"
}

Now, let’s export the file ID:

export FILE_ID=YOUR_FILE_ID
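Alternatively, if you have jq installed, you can perform the upload and capture the file ID in one step. This is a convenience sketch rather than part of the tutorial flow; note that it runs the upload itself:

# Upload the file and extract the "id" field from the JSON response
export FILE_ID=$(curl -s "$KONNECT_PROXY_URL/files" \
     -F purpose="batch" \
     -F file="@batch.jsonl" | jq -r '.id')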

Create a batching request

Now, we can send a POST request to the /batches Route to create a batch using our uploaded file:

The completion window must be set to 24h, as it’s the only value currently supported by the OpenAI /batches API.

In this example we use the /v1/chat/completions route for batching because we are sending multiple structured chat-style prompts in OpenAI’s chat completions format to be processed in bulk.

curl "$KONNECT_PROXY_URL/batches" \
     --json '{
       "input_file_id": "'$FILE_ID'",
       "endpoint": "/v1/chat/completions",
       "completion_window": "24h"
     }'
curl "http://localhost:8000/batches" \
     --json '{
       "input_file_id": "'$FILE_ID'",
       "endpoint": "/v1/chat/completions",
       "completion_window": "24h"
     }'

You will receive a response similar to:

{
  "cancelled_at": null,
  "cancelling_at": null,
  "completed_at": null,
  "completion_window": "24h",
  "created_at": 1761817562,
  "error_file_id": "",
  "expired_at": null,
  "expires_at": 1761903959,
  "failed_at": null,
  "finalizing_at": null,
  "id": "batch_379f1007-8057-4f43-be38-12f3d456c7da",
  "in_progress_at": null,
  "input_file_id": "file-da4364d8fd714dd9b29706b91236ab02",
  "errors": null,
  "metadata": null,
  "object": "batch",
  "output_file_id": "",
  "request_counts": {
    "total": 0,
    "completed": 0,
    "failed": 0
  },
  "status": "validating",
  "endpoint": ""
}

To check the batch status, copy the batch ID from this response and export it as an environment variable by running the following command in your terminal:

export BATCH_ID=YOUR_BATCH_ID
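As with the file ID, you can create the batch and capture its ID in a single step if jq is available (a convenience sketch; it submits the batch itself):

# Create the batch and extract the "id" field from the JSON response
export BATCH_ID=$(curl -s "$KONNECT_PROXY_URL/batches" \
     --json '{
       "input_file_id": "'$FILE_ID'",
       "endpoint": "/v1/chat/completions",
       "completion_window": "24h"
     }' | jq -r '.id')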

Check batching status

Wait a moment for the batch to finish processing, then check its status by sending the following request:

curl "$KONNECT_PROXY_URL/batches/$BATCH_ID"
curl "http://localhost:8000/batches/$BATCH_ID"

A completed batch response looks like this:

{
  "cancelled_at": null,
  "cancelling_at": null,
  "completed_at": 1761817685,
  "completion_window": "24h",
  "created_at": 1761817562,
  "error_file_id": null,
  "expired_at": null,
  "expires_at": 1761903959,
  "failed_at": null,
  "finalizing_at": 1761817662,
  "id": "batch_379f1007-8057-4f43-be38-12f3d456c7da",
  "in_progress_at": null,
  "input_file_id": "file-da4364d8fd714dd9b29706b91236ab02",
  "errors": null,
  "metadata": null,
  "object": "batch",
  "output_file_id": "file-93d91f55-0418-abcd-1234-81f4bb334951",
  "request_counts": {
    "total": 5,
    "completed": 5,
    "failed": 0
  },
  "status": "completed",
  "endpoint": "/v1/chat/completions"
}

Notice that the "request_counts" object shows that all five requests in the batch completed successfully ("completed": 5, "failed": 0).

Now, copy the output_file_id to retrieve your batched responses and export it as an environment variable:

export OUTPUT_FILE_ID=YOUR_OUTPUT_FILE_ID

The output file ID will only be available once the batch request has completed. If the status is "in_progress", it won’t be set yet.
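Once the status is completed, you can also pull the output file ID straight from the status response instead of copying it by hand (assuming jq):

# Extract the "output_file_id" field from the batch status response
export OUTPUT_FILE_ID=$(curl -s "$KONNECT_PROXY_URL/batches/$BATCH_ID" | jq -r '.output_file_id')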

Retrieve batched responses

Now, we can download the batched responses from the /files endpoint by appending /content to the file ID URL. For details, see the OpenAI API documentation.

curl "$KONNECT_PROXY_URL/files/$OUTPUT_FILE_ID/content" > batched-response.jsonl

This command saves the batched responses to the batched-response.jsonl file.
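Each request in the input file produces one line of output, so for this tutorial the file should contain five lines:

# Count the output lines; expect one per batched request (5 here)
wc -l < batched-response.jsonl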

The batched response file contains one JSON object per line, each representing the response to a single batched request. Here is example content from batched-response.jsonl, showing the individual completion results for each request we submitted in the batch input file:

{"custom_id": "prod4", "response": {"body": {"id": "chatcmpl-AB12CD34EF56GH78IJ90KL12MN", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**EcoFlow Smart Shower Head: Revolutionize Your Daily Routine While Saving Water**\n\nExperience the perfect blend of luxury, sustainability, and smart technology with the **EcoFlow Smart Shower Head** — a cutting-edge solution for modern households looking to conserve water without compromising on comfort. Designed to elevate your shower experience", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 30, "total_tokens": 90}, "system_fingerprint": "fp_random1234"},"request_id": "req-111aaa22-bb33-cc44-dd55-ee66ff778899", "status_code": 200}, "error": null}
{"custom_id": "prod3", "response": {"body": {"id": "chatcmpl-ZX98YW76VU54TS32RQ10PO98LK", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**Eco-Friendly Elegance: Biodegradable Bamboo Kitchen Utensil Set**\n\nElevate your cooking experience while making a positive impact on the planet with our **Biodegradable Bamboo Kitchen Utensil Set**. Crafted from 100% natural, sustainably sourced bamboo, this set combines durability, functionality", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 31, "total_tokens": 91}, "system_fingerprint": "fp_random1234"},"request_id": "req-222bbb33-cc44-dd55-ee66-ff7788990011", "status_code": 200}, "error": null}
{"custom_id": "prod1", "response": {"body": {"id": "chatcmpl-MN34OP56QR78ST90UV12WX34YZ", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**Illuminate Your Garden with Brilliance: The Solar-Powered Smart Garden Light**  \n\nTransform your outdoor space into a haven of sustainable beauty with the **Solar-Powered Smart Garden Light**—a perfect blend of modern innovation and eco-friendly design. Powered entirely by the sun, this smart light delivers effortless", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 30, "total_tokens": 90}, "system_fingerprint": "fp_random1234"},"request_id": "req-333ccc44-dd55-ee66-ff77-889900112233", "status_code": 200}, "error": null}
{"custom_id": "prod5", "response": {"body": {"id": "chatcmpl-AQ12WS34ED56RF78TG90HY12UJ", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "Breathe easy with our compact indoor air purifier, designed to deliver fresh and clean air using natural filters. This eco-friendly purifier quietly removes allergens, dust, and odors without synthetic materials, making it perfect for any small space. Stylish, efficient, and sustainable—experience pure air, naturally.", "refusal": null, "annotations": []}, "finish_reason": "stop", "logprobs": null}], "usage": {"completion_tokens": 59, "prompt_tokens": 33, "total_tokens": 92}, "system_fingerprint": "fp_random1234"},"request_id": "req-444ddd55-ee66-ff77-8899-001122334455", "status_code": 200}, "error": null}
{"custom_id": "prod2", "response": {"body": {"id": "chatcmpl-PO98LK76JI54HG32FE10DC98VB", "object": "chat.completion", "created": 1761909664, "model": "gpt-4o-2024-11-20", "choices": [{"index": 0, "message": {"role": "assistant", "content": "**EcoSmart Pro Wi-Fi Thermostat: Energy Efficiency Meets Smart Technology**  \n\nUpgrade your home’s comfort and save energy with the EcoSmart Pro Wi-Fi Thermostat. Designed for modern living, this sleek and intuitive thermostat lets you take control of your heating and cooling while minimizing energy waste. Whether you're", "refusal": null, "annotations": []}, "finish_reason": "length", "logprobs": null}], "usage": {"completion_tokens": 60, "prompt_tokens": 31, "total_tokens": 91}, "system_fingerprint": "fp_random1234"},"request_id": "req-555eee66-ff77-8899-0011-223344556677", "status_code": 200}, "error": null}

Cleanup

If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.

curl -Ls https://get.konghq.com/quickstart | bash -s -- -d