You can choose from the following cache sizes:
- micro: ~0.5 GiB capacity
- small: ~1 GiB capacity
- medium: ~3 GiB capacity
- large: ~6 GiB capacity
- xlarge: ~12 GiB capacity
- 2xlarge: ~25 GiB capacity
- 4xlarge: ~52 GiB capacity
- 8xlarge: ~100 GiB capacity
- 12xlarge: ~150 GiB capacity
- 16xlarge: ~200 GiB capacity
- 24xlarge: ~300 GiB capacity
Contact Kong to enable cache tiers
Specific cache sizes must be enabled on your account. Contact your Kong support team to enable the size you need before you create or upgrade a Dedicated Cloud Gateway that uses it.
When sizing workloads, plan for approximately 70–75% of total managed cache memory to be available for cache data.
The platform reserves around 25% of each managed cache instance for operational needs, such as replication, failover, and memory management, so the usable cache capacity will be less than the total provisioned size.
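As a rough capacity check, the reserve overhead can be folded into a small calculation. The sizes and the conservative 70% usable fraction come from the figures above; the function name is illustrative, not part of any Kong API:

```python
# Rough usable-capacity estimate: the platform reserves ~25% of each managed
# cache instance for replication, failover, and memory management, so plan
# for roughly 70-75% of the provisioned size to hold cache data.
PROVISIONED_GIB = {
    "micro": 0.5, "small": 1, "medium": 3, "large": 6,
    "xlarge": 12, "2xlarge": 25, "4xlarge": 52, "8xlarge": 100,
    "12xlarge": 150, "16xlarge": 200, "24xlarge": 300,
}

def usable_gib(size: str, usable_fraction: float = 0.70) -> float:
    """Conservative usable capacity for a cache size (70% lower bound)."""
    return PROVISIONED_GIB[size] * usable_fraction

print(f"xlarge usable:  ~{usable_gib('xlarge'):.1f} GiB of 12")
print(f"4xlarge usable: ~{usable_gib('4xlarge'):.1f} GiB of 52")
```

For example, an xlarge instance provisions ~12 GiB but should be planned as roughly 8.4–9 GiB of usable cache data.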
To choose the right cache size, you need to know your Redis key count, which determines your cache pressure.
This is driven by the following equation:

Consumers × Routes × Rate limit windows = Theoretical key space (counters per window cycle)
For example, if you have 5,000 Consumers, 3,000 Routes, and 3 windows, this produces a theoretical key space of 45 million counters per window cycle, each needing a periodic sync to Redis.
The sync rate determines how aggressively these counters are pushed, and the cache instance must absorb both the read (fetch counters) and write (push diffs) load.
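The arithmetic above can be sketched as a quick back-of-the-envelope calculation. The function names are illustrative; the sync rate is treated as the interval in seconds between counter sync passes, so a lower sync rate means more frequent syncs:

```python
# Theoretical rate-limit counter key space: one counter per
# Consumer x Route x rate-limit-window combination.
def key_space(consumers: int, routes: int, windows: int) -> int:
    return consumers * routes * windows

# The example from the text: 5,000 Consumers x 3,000 Routes x 3 windows.
counters = key_space(5_000, 3_000, 3)
print(f"{counters:,} counters per window cycle")  # 45,000,000

# Sync rate is the interval in seconds between sync passes, so each
# data plane node performs 1 / sync_rate passes per second.
def syncs_per_second(sync_rate: float) -> float:
    return 1.0 / sync_rate

print(syncs_per_second(0.5))  # 2.0 passes per second
```

Each sync pass both reads current counters and writes diffs, which is why the cache instance must absorb load in both directions.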
The following table describes which cache size you should use based on your entity count (Consumers and Routes), rate limit windows, and target number of requests per second (RPS):
| Deployment profile | Entities (Consumers × Routes × Windows) | Target RPS | Recommended minimum instance | Recommended sync rate | Notes |
|---|---|---|---|---|---|
| Small/Dev/Test | ≤100 × ≤100 × 1 window | ≤1,000 | cache.t3.small | 0.5 | Micro fails at 10K RPS. Small handles a 1K RPS baseline cleanly. |
| Standard enterprise | ≤1,000 × ≤100 × 3 windows | ≤10,000 | cache.t3.medium | 0.5 | – |
| Large enterprise | ≤5,000 × ≤3,000 × 3 windows | ≤10,000 | cache.m5.xlarge | 0.5–1.0 | Large instances are overwhelmed at a 0.1 sync rate with this entity count; xlarge provides headroom. |
| High-scale enterprise | ≤5,000 × ≤3,000 × 3 windows | ≤20,000 | cache.m5.2xlarge | 0.5–1.0 | – |
| Ultra-high-scale | 5,000 × >3,000 × 3 windows | ≤65,000 | cache.m5.4xlarge | 0.5 | At this tier, it's critical that the base RPS configured for the Dedicated Cloud Gateway matches your production traffic. |
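The sizing table can be mirrored in a small lookup helper. The thresholds are copied directly from the table; the function itself is a hypothetical sketch, not a Kong API:

```python
# Illustrative helper mirroring the sizing table: given entity counts and
# target RPS, return the recommended minimum cache instance.
def recommend_instance(consumers: int, routes: int, windows: int,
                       target_rps: int) -> str:
    if consumers <= 100 and routes <= 100 and windows <= 1 and target_rps <= 1_000:
        return "cache.t3.small"
    if consumers <= 1_000 and routes <= 100 and windows <= 3 and target_rps <= 10_000:
        return "cache.t3.medium"
    if consumers <= 5_000 and routes <= 3_000 and windows <= 3:
        if target_rps <= 10_000:
            return "cache.m5.xlarge"
        if target_rps <= 20_000:
            return "cache.m5.2xlarge"
    if target_rps <= 65_000:
        return "cache.m5.4xlarge"
    raise ValueError("beyond the profiles covered by the sizing table")

print(recommend_instance(5_000, 3_000, 3, 10_000))  # cache.m5.xlarge
```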
The sync rate is the most impactful tuning lever and interacts directly with cache sizing:
| Sync rate | Syncs per second | Notes |
|---|---|---|
| 0.1 | 10 | Highest Redis command load. Only viable on cache.m5.xlarge or larger when entity counts exceed 1,000 Consumers; on smaller instances, it causes cache CPU saturation, Redis timeout cascades, and data plane node restarts. Use this only when sub-second rate limiting accuracy is business-critical, and size up the cache by at least one tier beyond what the entity count alone would suggest. If you can tolerate a 0.5 sync rate, you can use a smaller cache instance. |
| 0.5 | 2 | Recommended default for production. Best balance of accuracy and resource efficiency, and stable across all instance types for standard workloads. For high-entity deployments, this works well on cache.m5.large and above. |
| 1.0 | 1 | Lowest Redis load, but degrades rate limiting accuracy. At high entity counts, the rate limited percentage drops to 57–60% (expected: ~99%), which allows requests through that should be blocked. Use only for non-critical or approximate rate limiting at very low entity counts. |
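The guidance above can be condensed into a simple decision sketch. This is a hypothetical helper that restates the table's recommendations, not part of any Kong API:

```python
# Illustrative sync-rate selection following the table above:
# 0.1 only when sub-second accuracy is business-critical (and the cache
# is sized up at least one tier), 0.5 as the production default, and
# 1.0 only for approximate limiting at very low entity counts.
def choose_sync_rate(subsecond_accuracy_critical: bool,
                     approximate_ok: bool) -> float:
    if subsecond_accuracy_critical:
        return 0.1  # highest Redis load; size the cache up a tier
    if approximate_ok:
        return 1.0  # lowest Redis load, degraded accuracy
    return 0.5      # recommended production default

print(choose_sync_rate(False, False))  # 0.5
```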