Managed cache for Redis

Uses: Kong Gateway

A managed cache for Dedicated Cloud Gateways is a Redis-compatible datastore that powers all Redis-enabled plugins. This is fully managed by Kong in the region and provider of your choice, so you don’t have to host Redis infrastructure. Using a managed cache allows you to get up and running faster with Redis-backed plugins, such as Proxy Caching, Rate Limiting, AI Rate Limiting, ACME, and so on.

Only AWS and Azure are currently supported as providers.

Managed cache sizing recommendations

You can choose from the following cache sizes:

  • micro: ~0.5 GiB capacity
  • small: ~1 GiB capacity
  • medium: ~3 GiB capacity
  • large: ~6 GiB capacity
  • xlarge: ~12 GiB capacity
  • 2xlarge: ~25 GiB capacity
  • 4xlarge: ~52 GiB capacity
  • 8xlarge: ~100 GiB capacity
  • 12xlarge: ~150 GiB capacity
  • 16xlarge: ~200 GiB capacity
  • 24xlarge: ~300 GiB capacity

Contact Kong to enable cache tiers

Specific cache sizes must be enabled on your account. Contact your Kong support team to enable a specific cache size before you create or upgrade one.

When sizing workloads, plan for approximately 70–75% of total managed cache memory to be available for cache data. The platform reserves around 25% of each managed cache instance for operational needs, such as replication, failover, and memory management, so the usable cache capacity will be less than the total provisioned size.
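As a rough sketch of that guidance, the usable capacity per tier can be estimated at ~75% of the provisioned size (the exact reserved fraction varies by instance and workload):

```shell
# Approximate usable capacity per tier, assuming ~75% of provisioned memory
# is available for cache data (the platform reserves ~25% for operations).
for tier_gib in 0.5 1 3 6 12 25 52 100 150 200 300; do
  awk -v t="$tier_gib" 'BEGIN { printf "%5.1f GiB provisioned -> ~%6.2f GiB usable\n", t, t * 0.75 }'
done
```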

To choose the right cache size, you’ll need to know your Redis key count, which determines your cache pressure. This is driven by the following equation:

Redis key count = Consumers × Routes × Rate limit windows

For example, if you have 5,000 Consumers, 3,000 Routes, and 3 windows, this produces a theoretical key space of 45 million counters per window cycle, each needing a periodic sync to Redis. The sync rate determines how aggressively these counters are pushed, and the cache instance must absorb both the read (fetch counters) and write (push diffs) load.
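The arithmetic behind that example can be checked directly:

```shell
# Theoretical rate limiting key space: Consumers x Routes x Windows.
# Values match the example above (5,000 Consumers, 3,000 Routes, 3 windows).
consumers=5000
routes=3000
windows=3
echo "$((consumers * routes * windows)) counters per window cycle"
# prints: 45000000 counters per window cycle
```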

The following table describes which cache size you should use based on your entity count (Consumers and Routes), rate limit windows, and target number of requests per second (RPS):

| Deployment profile | Entities (Consumers × Routes × Windows) | Target RPS | Recommended minimum instance | Recommended sync rate | Notes |
|---|---|---|---|---|---|
| Small/Dev/Test | ≤100 × ≤100 × 1 window | ≤1,000 | cache.t3.small | 0.5 | Micro fails at 10K RPS. Small handles a 1K RPS baseline cleanly. |
| Standard enterprise | ≤1,000 × ≤100 × 3 windows | ≤10,000 | cache.t3.medium | 0.5 | |
| Large enterprise | ≤5,000 × ≤3,000 × 3 windows | ≤10,000 | cache.m5.xlarge | 0.5–1.0 | Large instances are overwhelmed at a 0.1 sync rate with this entity count. xlarge provides headroom. |
| High-scale enterprise | ≤5,000 × ≤3,000 × 3 windows | ≤20,000 | cache.m5.2xlarge | 0.5–1.0 | |
| Ultra-high-scale | >5,000 × >3,000 × 3 windows | ≤65,000 | cache.m5.4xlarge | 0.5 | At this tier, it’s critical that the base RPS configured for the Dedicated Cloud Gateway matches your production traffic. |

Sync rate recommendations

The sync rate is the most impactful tuning lever and interacts directly with cache sizing:

| Sync rate | Syncs per second | Notes |
|---|---|---|
| 0.1 | 10 | Highest Redis command load. Only viable on cache.m5.xlarge or larger when entity counts exceed 1,000 Consumers; on smaller instances, it causes cache CPU saturation, Redis timeout cascades, and data plane node restarts. Only use this when sub-second rate limiting accuracy is business-critical. If you must use sync rate 0.1 for accuracy, size up the cache by at least one tier beyond what the entity count alone would suggest; if you can tolerate sync rate 0.5, you can use a smaller cache instance. |
| 0.5 | 2 | Recommended default for production. Best balance of accuracy and resource efficiency. Stable across all instance types for standard workloads. For high-entity deployments, this works well on cache.m5.large and above. |
| 1.0 | 1 | Lowest Redis load, but degrades rate limiting accuracy. At high entity counts, the rate limited percentage drops to 57–60% (expected: ~99%), which lets through requests that should be blocked. Only use for non-critical or approximate rate limiting at very low entity counts. |
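The "Syncs per second" values are the reciprocal of the sync rate, since the sync rate is the interval in seconds between counter syncs. A quick check:

```shell
# syncs per second = 1 / sync_rate (sync rate = seconds between syncs)
for rate in 0.1 0.5 1.0; do
  awk -v r="$rate" 'BEGIN { printf "sync rate %.1f -> %.0f syncs/sec\n", r, 1 / r }'
done
```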

Configure a managed cache

Managed caches are created at either the control plane or control plane group level.

Important: Dedicated Cloud Gateway control plane groups are gated by a feature flag. To enable the feature flag and use Dedicated Cloud Gateway control plane groups, contact your customer success team.

To create a managed cache at the control plane level, do the following:

For control plane managed caches, you don’t need to manually configure a Redis partial: after the managed cache is ready, Konnect automatically creates a Redis partial configuration for you. When you set up Redis-supported plugins, select this Konnect-managed Redis configuration.

You can’t use the Redis partial configuration in custom plugins. Instead, use env referenceable fields directly.
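As a sketch of what that might look like, a custom plugin can read its Redis connection details from environment variables on the data plane using Kong's `{vault://env/...}` reference syntax. The plugin name, field layout, and endpoint below are illustrative assumptions, not a documented example; verify against the Konnect API reference:

```shell
# Hypothetical custom plugin config: reference Redis settings from environment
# variables instead of the Konnect-managed Redis partial.
# "my-custom-plugin" and the config field names are placeholders.
curl -X POST "https://global.api.konghq.com/v2/control-planes/$CONTROL_PLANE_ID/core-entities/plugins" \
     --no-progress-meter --fail-with-body \
     -H "Authorization: Bearer $KONNECT_TOKEN" \
     --json '{
       "name": "my-custom-plugin",
       "config": {
         "redis": {
           "host": "{vault://env/managed-cache-host}",
           "port": 6379,
           "password": "{vault://env/managed-cache-password}"
         }
       }
     }'
```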

Resize a managed cache

Managed caches cannot be downsized

You can only upgrade the size of a managed cache; you can’t downsize one. If you need a smaller cache, you must delete and recreate it.
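If you do need to downsize, deleting the existing managed cache before recreating it would plausibly look like the following. The DELETE verb on this endpoint is an assumption mirroring the documented PATCH call for resizing; verify against the Konnect API reference before relying on it:

```shell
# Assumed deletion call: DELETE on the same add-ons endpoint used for resizing.
curl -X DELETE "https://global.api.konghq.com/v2/cloud-gateways/add-ons/$MANAGED_CACHE_ID" \
     --no-progress-meter --fail-with-body \
     -H "Authorization: Bearer $KONNECT_TOKEN"
```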

Before you resize a managed cache, consider the following:

  • Resizes take effect immediately.
  • Schedule cache resizes during low-traffic hours.
  • Caches remain online during a resize, but you may experience brief interruptions of a few seconds.

You can resize a managed cache by sending a PATCH request to the /cloud-gateways/add-ons/{addOnId} endpoint:

curl -X PATCH "https://global.api.konghq.com/v2/cloud-gateways/add-ons/$MANAGED_CACHE_ID" \
     --no-progress-meter --fail-with-body  \
     -H "Authorization: Bearer $KONNECT_TOKEN" \
     --json '{
       "config": {
         "kind": "managed-cache.v0",
         "capacity_config": {
           "kind": "tiered",
           "tier": "small"
         }
       }
     }'
