AI Rate Limiting Advanced

AI License Required

Request prompt function with AWS ElastiCache cluster auth

Protect your LLM services with rate limiting and AWS ElastiCache cluster auth. The AI Rate Limiting Advanced plugin analyzes query costs and response tokens to provide an enterprise-grade rate limiting strategy.

The following example uses request prompt rate limiting, which lets you rate limit requests based on a custom token. See the how-to guide for a step-by-step walkthrough.

Prerequisites

  • AI Proxy plugin or AI Proxy Advanced plugin configured with an LLM service

  • A running Redis instance on AWS ElastiCache: either ElastiCache for Valkey 7.2 or later, or ElastiCache for Redis OSS 7.0 or later

  • Port 6379, or your custom Redis port, open and reachable from Kong Gateway

  • An ElastiCache user with “Authentication mode” set to “IAM”

  • The following policy assigned to the IAM user or IAM role that is used to connect to the ElastiCache cluster:

    {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Effect": "Allow",
              "Action": [
                  "elasticache:Connect"
              ],
              "Resource": [
                  "arn:aws:elasticache:ARN_OF_THE_ELASTICACHE",
                  "arn:aws:elasticache:ARN_OF_THE_ELASTICACHE_USER"
              ]
          }
      ]
    }
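
The port prerequisite above can be checked before configuring the plugin. The following is a minimal sketch, assuming you run it from the host where Kong Gateway runs. The function name `is_port_reachable` is hypothetical and not part of Kong or AWS tooling. It only confirms that a TCP connection can be opened; it does not verify TLS, IAM authentication, or the policy shown above.

```python
import socket


def is_port_reachable(host: str, port: int = 6379, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: checks network reachability only, not Redis auth.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Call it with your cluster's hostname and, if you use a custom Redis port, that port as the second argument. A `False` result usually points to a security group, routing, or DNS issue rather than a plugin misconfiguration.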
    

Environment variables

  • CLUSTER_ADDRESS: The ElastiCache cluster address.

  • CLUSTER_USERNAME: The ElastiCache username with IAM Auth mode configured.

  • AWS_CACHE_NAME: Name of your AWS ElastiCache instance.

  • AWS_REGION: Your AWS ElastiCache instance region.

  • AWS_ACCESS_KEY_ID: (Optional) Your AWS access key ID.

  • AWS_ACCESS_SECRET_KEY: (Optional) Your AWS secret access key.
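
Before applying the configuration, it can help to confirm that the variables listed above are set. This is a rough sketch rather than part of the plugin: the function name `find_env_problems` is hypothetical, and the rule that the two optional credentials should be set together is an assumption, not something this page states.

```python
import os
from typing import List, Mapping

# Variables this page lists without an "(Optional)" marker.
REQUIRED = ("CLUSTER_ADDRESS", "CLUSTER_USERNAME", "AWS_CACHE_NAME", "AWS_REGION")
# Variables this page marks as optional.
OPTIONAL_PAIR = ("AWS_ACCESS_KEY_ID", "AWS_ACCESS_SECRET_KEY")


def find_env_problems(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    # Assumption: an access key ID without its secret (or vice versa) is a mistake.
    present = [name for name in OPTIONAL_PAIR if env.get(name)]
    if len(present) == 1:
        missing = next(name for name in OPTIONAL_PAIR if name not in present)
        problems.append(f"{present[0]} is set but {missing} is not")

    return problems
```

To check the current shell environment, call `find_env_problems(os.environ)` and print each returned message.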

Set up the plugin
