AI Rate Limiting Advanced: Request prompt function with AWS ElastiCache cluster auth - Plugin

Request prompt function with AWS ElastiCache cluster auth

Protect your LLM services with rate limiting and AWS ElastiCache cluster auth. The AI Rate Limiting Advanced plugin will analyze query costs and token response to provide an enterprise-grade rate limiting strategy.

The following example uses request prompt rate limiting, which lets you you rate limit requests based on a custom token. See the how-to guide for a step-by-step walkthrough.

Prerequisites

AI Proxy plugin or AI Proxy Advanced plugin configured with an LLM service
A running Redis instance on an AWS ElastiCache cluster for Valkey 7.2 or later or ElastiCache for Redis OSS version 7.0 or later
Port 6379, or your custom Redis port is open and reachable from Kong Gateway.
The ElastiCache user needs to set “Authentication mode” to “IAM”

The following policy assigned to the IAM user/IAM role that is used to connect to the ElastiCache:

{
  "Version": "2012-10-17",
  "Statement": [
      {
          "Effect": "Allow",
          "Action": [
              "elasticache:Connect"
          ],
          "Resource": [
              "arn:aws:elasticache:ARN_OF_THE_ELASTICACHE",
              "arn:aws:elasticache:ARN_OF_THE_ELASTICACHE_USER"
          ]
      }
  ]
}

Copied!

Environment variables

CLUSTER_ADDRESS: The ElastiCache cluster address.
CLUSTER_USERNAME: The ElastiCache username with IAM Auth mode configured.
AWS_CACHE_NAME: Name of your AWS ElastiCache instance.
AWS_REGION: Your AWS ElastiCache instance region.
AWS_ACCESS_KEY_ID: (Optional) Your AWS access key ID.
AWS_ACCESS_SECRET_KEY: (Optional) Your AWS secret access key.

Set up the plugin

Add this section to your kong.yaml configuration file:

kong.yaml

Copied!

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    config:
      strategy: redis
      redis:
        cluster_nodes:
        - ip: ${{ env "DECK_CLUSTER_ADDRESS" }}
          port: 6379
        username: ${{ env "DECK_CLUSTER_USERNAME" }}
        port: 6379
        cloud_authentication:
          auth_provider: aws
          aws_cache_name: ${{ env "DECK_AWS_CACHE_NAME" }}
          aws_is_serverless: false
          aws_region: ${{ env "DECK_AWS_REGION" }}
          aws_access_key_id: ${{ env "DECK_AWS_ACCESS_KEY_ID" }}
          aws_secret_access_key: ${{ env "DECK_AWS_ACCESS_SECRET_KEY" }}
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0

Make the following request:

curl -i -X POST http://localhost:8001/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make the following request:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

echo "
apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
  labels:
    global: 'true'
config:
  strategy: redis
  redis:
    cluster_nodes:
    - ip: '$CLUSTER_ADDRESS'
      port: 6379
    username: '$CLUSTER_USERNAME'
    port: 6379
    cloud_authentication:
      auth_provider: aws
      aws_cache_name: '$AWS_CACHE_NAME'
      aws_is_serverless: false
      aws_region: '$AWS_REGION'
      aws_access_key_id: '$AWS_ACCESS_KEY_ID'
      aws_secret_access_key: '$AWS_ACCESS_SECRET_KEY'
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Copied!

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Copied!

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      cluster_nodes = [
        {
          ip = var.cluster_address
          port = 6379
        }      ]
      username = var.cluster_username
      port = 6379

      cloud_authentication = {
        auth_provider = "aws"
        aws_cache_name = var.aws_cache_name
        aws_is_serverless = false
        aws_region = var.aws_region
        aws_access_key_id = var.aws_access_key_id
        aws_secret_access_key = var.aws_access_secret_key
      }
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
}

Copied!

This example requires the following variables to be added to your manifest. You can specify values at runtime by setting TF_VAR_name=value.

variable "aws_access_secret_key" {
  type = string
}

Copied!

Add this section to your kong.yaml configuration file:

kong.yaml

Copied!

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    service: serviceName|Id
    config:
      strategy: redis
      redis:
        cluster_nodes:
        - ip: ${{ env "DECK_CLUSTER_ADDRESS" }}
          port: 6379
        username: ${{ env "DECK_CLUSTER_USERNAME" }}
        port: 6379
        cloud_authentication:
          auth_provider: aws
          aws_cache_name: ${{ env "DECK_AWS_CACHE_NAME" }}
          aws_is_serverless: false
          aws_region: ${{ env "DECK_AWS_REGION" }}
          aws_access_key_id: ${{ env "DECK_AWS_ACCESS_KEY_ID" }}
          aws_secret_access_key: ${{ env "DECK_AWS_ACCESS_SECRET_KEY" }}
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0

Make sure to replace the following placeholders with your own values:

serviceName|Id: The id or name of the service the plugin configuration will target.

Make the following request:

curl -i -X POST http://localhost:8001/services/{serviceName|Id}/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make sure to replace the following placeholders with your own values:

serviceName|Id: The id or name of the service the plugin configuration will target.

Make the following request:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceId}/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.
serviceId: The id of the service the plugin configuration will target.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
config:
  strategy: redis
  redis:
    cluster_nodes:
    - ip: '$CLUSTER_ADDRESS'
      port: 6379
    username: '$CLUSTER_USERNAME'
    port: 6379
    cloud_authentication:
      auth_provider: aws
      aws_cache_name: '$AWS_CACHE_NAME'
      aws_is_serverless: false
      aws_region: '$AWS_REGION'
      aws_access_key_id: '$AWS_ACCESS_KEY_ID'
      aws_secret_access_key: '$AWS_ACCESS_SECRET_KEY'
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Copied!

Next, apply the KongPlugin resource by annotating the service resource:

kubectl annotate -n kong service SERVICE_NAME konghq.com/plugins=ai-rate-limiting-advanced

Copied!

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Copied!

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      cluster_nodes = [
        {
          ip = var.cluster_address
          port = 6379
        }      ]
      username = var.cluster_username
      port = 6379

      cloud_authentication = {
        auth_provider = "aws"
        aws_cache_name = var.aws_cache_name
        aws_is_serverless = false
        aws_region = var.aws_region
        aws_access_key_id = var.aws_access_key_id
        aws_secret_access_key = var.aws_access_secret_key
      }
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  service = {
    id = konnect_gateway_service.my_service.id
  }
}

Copied!

This example requires the following variables to be added to your manifest. You can specify values at runtime by setting TF_VAR_name=value.

variable "aws_access_secret_key" {
  type = string
}

Copied!

Add this section to your kong.yaml configuration file:

kong.yaml

Copied!

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    route: routeName|Id
    config:
      strategy: redis
      redis:
        cluster_nodes:
        - ip: ${{ env "DECK_CLUSTER_ADDRESS" }}
          port: 6379
        username: ${{ env "DECK_CLUSTER_USERNAME" }}
        port: 6379
        cloud_authentication:
          auth_provider: aws
          aws_cache_name: ${{ env "DECK_AWS_CACHE_NAME" }}
          aws_is_serverless: false
          aws_region: ${{ env "DECK_AWS_REGION" }}
          aws_access_key_id: ${{ env "DECK_AWS_ACCESS_KEY_ID" }}
          aws_secret_access_key: ${{ env "DECK_AWS_ACCESS_SECRET_KEY" }}
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0

Make sure to replace the following placeholders with your own values:

routeName|Id: The id or name of the route the plugin configuration will target.

Make the following request:

curl -i -X POST http://localhost:8001/routes/{routeName|Id}/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make sure to replace the following placeholders with your own values:

routeName|Id: The id or name of the route the plugin configuration will target.

Make the following request:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/routes/{routeId}/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.
routeId: The id of the route the plugin configuration will target.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
config:
  strategy: redis
  redis:
    cluster_nodes:
    - ip: '$CLUSTER_ADDRESS'
      port: 6379
    username: '$CLUSTER_USERNAME'
    port: 6379
    cloud_authentication:
      auth_provider: aws
      aws_cache_name: '$AWS_CACHE_NAME'
      aws_is_serverless: false
      aws_region: '$AWS_REGION'
      aws_access_key_id: '$AWS_ACCESS_KEY_ID'
      aws_secret_access_key: '$AWS_ACCESS_SECRET_KEY'
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Copied!

Next, apply the KongPlugin resource by annotating the httproute or ingress resource:

kubectl annotate -n kong httproute  konghq.com/plugins=ai-rate-limiting-advanced

Copied!

kubectl annotate -n kong ingress  konghq.com/plugins=ai-rate-limiting-advanced

Copied!

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Copied!

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      cluster_nodes = [
        {
          ip = var.cluster_address
          port = 6379
        }      ]
      username = var.cluster_username
      port = 6379

      cloud_authentication = {
        auth_provider = "aws"
        aws_cache_name = var.aws_cache_name
        aws_is_serverless = false
        aws_region = var.aws_region
        aws_access_key_id = var.aws_access_key_id
        aws_secret_access_key = var.aws_access_secret_key
      }
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  route = {
    id = konnect_gateway_route.my_route.id
  }
}

Copied!

This example requires the following variables to be added to your manifest. You can specify values at runtime by setting TF_VAR_name=value.

variable "aws_access_secret_key" {
  type = string
}

Copied!

Add this section to your kong.yaml configuration file:

kong.yaml

Copied!

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    consumer: consumerName|Id
    config:
      strategy: redis
      redis:
        cluster_nodes:
        - ip: ${{ env "DECK_CLUSTER_ADDRESS" }}
          port: 6379
        username: ${{ env "DECK_CLUSTER_USERNAME" }}
        port: 6379
        cloud_authentication:
          auth_provider: aws
          aws_cache_name: ${{ env "DECK_AWS_CACHE_NAME" }}
          aws_is_serverless: false
          aws_region: ${{ env "DECK_AWS_REGION" }}
          aws_access_key_id: ${{ env "DECK_AWS_ACCESS_KEY_ID" }}
          aws_secret_access_key: ${{ env "DECK_AWS_ACCESS_SECRET_KEY" }}
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0

Make sure to replace the following placeholders with your own values:

consumerName|Id: The id or name of the consumer the plugin configuration will target.

Make the following request:

curl -i -X POST http://localhost:8001/consumers/{consumerName|Id}/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make sure to replace the following placeholders with your own values:

consumerName|Id: The id or name of the consumer the plugin configuration will target.

Make the following request:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/consumers/{consumerId}/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.
consumerId: The id of the consumer the plugin configuration will target.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
config:
  strategy: redis
  redis:
    cluster_nodes:
    - ip: '$CLUSTER_ADDRESS'
      port: 6379
    username: '$CLUSTER_USERNAME'
    port: 6379
    cloud_authentication:
      auth_provider: aws
      aws_cache_name: '$AWS_CACHE_NAME'
      aws_is_serverless: false
      aws_region: '$AWS_REGION'
      aws_access_key_id: '$AWS_ACCESS_KEY_ID'
      aws_secret_access_key: '$AWS_ACCESS_SECRET_KEY'
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Copied!

Next, apply the KongPlugin resource by annotating the KongConsumer resource:

kubectl annotate -n kong kongconsumer CONSUMER_NAME konghq.com/plugins=ai-rate-limiting-advanced

Copied!

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Copied!

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      cluster_nodes = [
        {
          ip = var.cluster_address
          port = 6379
        }      ]
      username = var.cluster_username
      port = 6379

      cloud_authentication = {
        auth_provider = "aws"
        aws_cache_name = var.aws_cache_name
        aws_is_serverless = false
        aws_region = var.aws_region
        aws_access_key_id = var.aws_access_key_id
        aws_secret_access_key = var.aws_access_secret_key
      }
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  consumer = {
    id = konnect_gateway_consumer.my_consumer.id
  }
}

Copied!

This example requires the following variables to be added to your manifest. You can specify values at runtime by setting TF_VAR_name=value.

variable "aws_access_secret_key" {
  type = string
}

Copied!

Add this section to your kong.yaml configuration file:

kong.yaml

Copied!

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    consumer_group: consumerGroupName|Id
    config:
      strategy: redis
      redis:
        cluster_nodes:
        - ip: ${{ env "DECK_CLUSTER_ADDRESS" }}
          port: 6379
        username: ${{ env "DECK_CLUSTER_USERNAME" }}
        port: 6379
        cloud_authentication:
          auth_provider: aws
          aws_cache_name: ${{ env "DECK_AWS_CACHE_NAME" }}
          aws_is_serverless: false
          aws_region: ${{ env "DECK_AWS_REGION" }}
          aws_access_key_id: ${{ env "DECK_AWS_ACCESS_KEY_ID" }}
          aws_secret_access_key: ${{ env "DECK_AWS_ACCESS_SECRET_KEY" }}
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0

Make sure to replace the following placeholders with your own values:

consumerGroupName|Id: The id or name of the consumer group the plugin configuration will target.

Make the following request:

curl -i -X POST http://localhost:8001/consumer_groups/{consumerGroupName|Id}/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make sure to replace the following placeholders with your own values:

consumerGroupName|Id: The id or name of the consumer group the plugin configuration will target.

Make the following request:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/consumer_groups/{consumerGroupId}/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "cluster_nodes": [
            {
              "ip": "'$CLUSTER_ADDRESS'",
              "port": 6379
            }
          ],
          "username": "'$CLUSTER_USERNAME'",
          "port": 6379,
          "cloud_authentication": {
            "auth_provider": "aws",
            "aws_cache_name": "'$AWS_CACHE_NAME'",
            "aws_is_serverless": false,
            "aws_region": "'$AWS_REGION'",
            "aws_access_key_id": "'$AWS_ACCESS_KEY_ID'",
            "aws_secret_access_key": "'$AWS_ACCESS_SECRET_KEY'"
          }
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Copied!

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.
consumerGroupId: The id of the consumer group the plugin configuration will target.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
config:
  strategy: redis
  redis:
    cluster_nodes:
    - ip: '$CLUSTER_ADDRESS'
      port: 6379
    username: '$CLUSTER_USERNAME'
    port: 6379
    cloud_authentication:
      auth_provider: aws
      aws_cache_name: '$AWS_CACHE_NAME'
      aws_is_serverless: false
      aws_region: '$AWS_REGION'
      aws_access_key_id: '$AWS_ACCESS_KEY_ID'
      aws_secret_access_key: '$AWS_ACCESS_SECRET_KEY'
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Copied!

Next, apply the KongPlugin resource by annotating the KongConsumerGroup resource:

kubectl annotate -n kong kongconsumergroup CONSUMERGROUP_NAME konghq.com/plugins=ai-rate-limiting-advanced

Copied!

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Copied!

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      cluster_nodes = [
        {
          ip = var.cluster_address
          port = 6379
        }      ]
      username = var.cluster_username
      port = 6379

      cloud_authentication = {
        auth_provider = "aws"
        aws_cache_name = var.aws_cache_name
        aws_is_serverless = false
        aws_region = var.aws_region
        aws_access_key_id = var.aws_access_key_id
        aws_secret_access_key = var.aws_access_secret_key
      }
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  consumer_group = {
    id = konnect_gateway_consumer_group.my_consumer_group.id
  }
}

Copied!

This example requires the following variables to be added to your manifest. You can specify values at runtime by setting TF_VAR_name=value.

variable "aws_access_secret_key" {
  type = string
}

Copied!

AI Rate Limiting Advanced

Request prompt function with AWS ElastiCache cluster auth

Prerequisites

Environment variables

Set up the plugin

Help us make these docs great!

Still need help