AI Rate Limiting Advanced

AI License Required

Enable LLM model rate limiting (v3.14+)

Protect your LLM services with model rate limiting. The AI Rate Limiting Advanced plugin analyzes query costs and token usage in responses to provide an enterprise-grade rate-limiting strategy.

The following example uses GPT 5.1, but you can apply the same strategies to any LLM.

Prerequisites

Set up the plugin
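As a sketch, the plugin can be enabled through Kong's declarative configuration. The provider name and the field names below (`llm_providers`, `limit`, `window_size`) are assumptions based on typical Kong plugin schemas; confirm them against the plugin's configuration reference before use.

```yaml
# Hypothetical sketch of enabling the plugin in declarative (kong.yml) config.
# Field names and values are assumptions; verify against the plugin reference.
plugins:
  - name: ai-rate-limiting-advanced
    config:
      llm_providers:
        - name: openai
          limit:
            - 10000      # tokens allowed per window
          window_size:
            - 60         # window length in seconds
```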
