AI Rate Limiting Advanced

AI License Required

Request prompt function with Google Cloud Memorystore cluster auth

Protect your LLM services with rate limiting and Google Cloud Memorystore cluster auth. The AI Rate Limiting Advanced plugin analyzes query costs and token responses to provide an enterprise-grade rate limiting strategy.

The following example uses request prompt rate limiting, which lets you rate limit requests based on a custom token. See the how-to guide for a step-by-step walkthrough.
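To make the idea of charging each request a custom cost more concrete, here is a minimal Python sketch of a fixed-window limiter. It illustrates the concept only and is not the plugin's implementation: the class `PromptCostLimiter`, the helper `header_cost`, and the `x-prompt-count` header are all hypothetical names chosen for this example, and the plugin's own counting and windowing behavior may differ.

```python
import time
from typing import Callable, Dict, Mapping


class PromptCostLimiter:
    """Fixed-window limiter that charges each request a caller-defined cost.

    Hypothetical illustration; not the plugin's actual algorithm.
    """

    def __init__(
        self,
        limit: int,
        window_size: int,
        cost_fn: Callable[[Mapping[str, str]], int],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.limit = limit              # maximum total cost per window
        self.window_size = window_size  # window length in seconds
        self.cost_fn = cost_fn          # derives a cost from request headers
        self.clock = clock              # injectable so the limiter is testable
        self._usage: Dict[int, int] = {}  # window index -> cost consumed

    def allow(self, headers: Mapping[str, str]) -> bool:
        window = int(self.clock() // self.window_size)
        # Forget counters that belong to earlier windows.
        self._usage = {w: c for w, c in self._usage.items() if w == window}
        used = self._usage.get(window, 0)
        cost = max(0, self.cost_fn(headers))
        if used + cost > self.limit:
            return False  # this request would exceed the budget
        self._usage[window] = used + cost
        return True


def header_cost(headers: Mapping[str, str]) -> int:
    """Read the cost from a hypothetical x-prompt-count header (0 if absent or invalid)."""
    try:
        return int(headers.get("x-prompt-count", "0"))
    except ValueError:
        return 0


# Usage: a budget of 10 cost units per 60-second window.
limiter = PromptCostLimiter(limit=10, window_size=60, cost_fn=header_cost)
first = limiter.allow({"x-prompt-count": "6"})   # fits within the budget
second = limiter.allow({"x-prompt-count": "5"})  # 6 + 5 exceeds 10, so it is rejected
```

The cost function is passed in rather than hard-coded, which mirrors the idea that the cost of a request is defined by you and not inferred from the request itself.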

Prerequisites

Environment variables

  • CLUSTER_ADDRESS: The Memorystore cluster address.

  • GCP_SERVICE_ACCOUNT: The GCP service account JSON.
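Before configuring the plugin, it can help to confirm that both variables are present and that the service account value parses as JSON. The following is a small pre-flight sketch in Python; the function name `check_prerequisites` is hypothetical, and the check only validates that the value is a JSON object, not that the credentials are accepted by Google Cloud.

```python
import json
import os
from typing import List, Mapping

REQUIRED = ("CLUSTER_ADDRESS", "GCP_SERVICE_ACCOUNT")


def check_prerequisites(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    for name in REQUIRED:
        if not env.get(name):
            problems.append(f"{name} is not set")

    raw = env.get("GCP_SERVICE_ACCOUNT")
    if raw:
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            problems.append("GCP_SERVICE_ACCOUNT is not valid JSON")
        else:
            # A service account key is expected to be a JSON object.
            if not isinstance(parsed, dict):
                problems.append("GCP_SERVICE_ACCOUNT is not a JSON object")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```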

Set up the plugin
