Route Claude CLI traffic through Kong AI Gateway and HuggingFace
Install Claude CLI, configure a pre-function plugin to remove the model field from requests, attach the AI Proxy plugin to forward requests to HuggingFace, enable file-log to inspect traffic, and point Claude CLI to the local proxy endpoint so all LLM requests pass through the AI Gateway for monitoring and control.
Prerequisites
Kong Konnect
This is a Konnect tutorial and requires a Konnect personal access token.
- Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.

- Export your token to an environment variable:

export KONNECT_TOKEN='YOUR_KONNECT_PAT'

- Run the quickstart script to automatically provision a Control Plane and Data Plane, and configure your environment:

curl -Ls https://get.konghq.com/quickstart | bash -s -- -k $KONNECT_TOKEN --deck-output

This sets up a Konnect Control Plane named quickstart, provisions a local Data Plane, and prints out the following environment variable exports:

export DECK_KONNECT_TOKEN=$KONNECT_TOKEN
export DECK_KONNECT_CONTROL_PLANE_NAME=quickstart
export KONNECT_CONTROL_PLANE_URL=https://us.api.konghq.com
export KONNECT_PROXY_URL='http://localhost:8000'

Copy and paste these into your terminal to configure your session.
Kong Gateway running
This tutorial requires Kong Gateway Enterprise. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running almost instantly.
- Export your license to an environment variable:

export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'

- Run the quickstart script:

curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA

Once Kong Gateway is ready, you will see the following message:

Kong Gateway Ready
decK v1.43+
decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial, install decK version 1.43 or later.
This guide uses deck gateway apply, which directly applies entity configuration to your Gateway instance.
We recommend upgrading your decK installation to take advantage of this tool.
You can check your current decK version with deck version.
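For example:

deck version

The output should report version 1.43 or later.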
Required entities
For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function but installing them isn’t the focus of this guide. Follow these steps to pre-configure them:
- Run the following command:

echo '
_format_version: "3.0"
services:
  - name: example-service
    url: http://httpbin.konghq.com/anything
routes:
  - name: example-route
    paths:
      - "/anything"
    service:
      name: example-service
' | deck gateway apply -
To learn more about entities, you can read our entities documentation.
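To confirm the Gateway Service and Route are working, you can send a test request through the local proxy; the httpbin upstream echoes the request back (an optional check, not required for the rest of the tutorial):

curl -i http://localhost:8000/anything

A 200 response with an echoed request confirms the route is live.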
HuggingFace
You need an active HuggingFace account with API access. Sign up at HuggingFace and obtain your API token from the Access Tokens page. Ensure you have access to the HuggingFace Inference API, and export your token to your environment:
export DECK_HUGGINGFACE_API_TOKEN='YOUR HUGGINGFACE API TOKEN'
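To sanity-check the token before wiring it into the gateway, you can call HuggingFace's whoami endpoint (an optional verification step, not part of the tutorial flow):

curl -s -H "Authorization: Bearer $DECK_HUGGINGFACE_API_TOKEN" https://huggingface.co/api/whoami-v2

A JSON response with your account details confirms the token is valid.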
Claude Code CLI
- Install Claude:

curl -fsSL https://claude.ai/install.sh | bash

- Create or edit the Claude settings file:

mkdir -p ~/.claude
nano ~/.claude/settings.json

Put this exact content in the file:

{ "apiKeyHelper": "~/.claude/anthropic_key.sh" }

- Create the API key helper script:

nano ~/.claude/anthropic_key.sh

Inside, put a dummy API key:

echo "x"

- Make the script executable:

chmod +x ~/.claude/anthropic_key.sh

- Verify it works by running the script:

~/.claude/anthropic_key.sh

You should see only the dummy key (x) printed.
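If you prefer to script these steps instead of using nano, the following is an equivalent non-interactive setup (the shebang line is an addition for portability; the file contents otherwise match the steps above):

mkdir -p ~/.claude
cat > ~/.claude/settings.json <<'EOF'
{ "apiKeyHelper": "~/.claude/anthropic_key.sh" }
EOF
cat > ~/.claude/anthropic_key.sh <<'EOF'
#!/bin/sh
# Dummy key: real authentication happens at the gateway via the AI Proxy plugin
echo "x"
EOF
chmod +x ~/.claude/anthropic_key.sh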
Configure the Pre-function plugin
Claude CLI automatically includes a model field in its request payload. However, when the AI Proxy plugin is configured with the HuggingFace provider and a specific model in its settings, this creates a conflict. The pre-function plugin removes the model field from incoming requests before they reach the AI Proxy plugin, ensuring the gateway uses the model you configured rather than the one Claude CLI sends.
echo '
_format_version: "3.0"
plugins:
  - name: pre-function
    config:
      access:
        - |
          local body = kong.request.get_body("application/json", nil, 10485760)
          if not body or body == "" then
            return
          end
          body.model = nil
          kong.service.request.set_body(body, "application/json")
' | deck gateway apply -
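Because the plugin is applied globally, you can verify the removal before attaching the AI Proxy plugin by posting a JSON body through the httpbin route created earlier; httpbin echoes the body it received, so the model field should be absent from the response (an optional check, not part of the original flow):

curl -s http://localhost:8000/anything \
  -H 'Content-Type: application/json' \
  -d '{"model": "should-be-removed", "messages": []}'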
Configure the AI Proxy plugin
Configure the AI Proxy plugin for the HuggingFace provider. This setup uses the default llm/v1/chat route. Claude Code sends its requests to this route.
The llm_format: anthropic parameter tells Kong AI Gateway to expect request and response payloads that match Claude’s native API format. Without this setting, the gateway would default to OpenAI’s format, which would cause request failures when Claude Code communicates with the HuggingFace endpoint.
echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    config:
      llm_format: anthropic
      route_type: llm/v1/chat
      logging:
        log_statistics: true
        log_payloads: false
      auth:
        header_name: Authorization
        header_value: Bearer ${{ env "DECK_HUGGINGFACE_API_TOKEN" }}
      model:
        provider: huggingface
        name: meta-llama/Llama-3.3-70B-Instruct
' | deck gateway apply -
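To exercise the plugin without Claude Code, you can send a minimal Anthropic-style chat request to the route. The body below is a sketch that assumes the shape of Anthropic's Messages API, with messages and max_tokens as the required fields; any model field would be stripped by the pre-function plugin anyway:

curl -s http://localhost:8000/anything \
  -H 'Content-Type: application/json' \
  -d '{
    "max_tokens": 128,
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'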
Configure the File Log plugin
Enable the File Log plugin to inspect the LLM traffic between Claude and the AI Gateway. This writes a claude.json file inside the Kong Gateway container. The file records each request and response so you can review what Claude sends through the AI Gateway.
echo '
_format_version: "3.0"
plugins:
  - name: file-log
    config:
      path: "/tmp/claude.json"
' | deck gateway apply -
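To watch entries arrive in real time, you can follow the file inside the Data Plane container (kong-quickstart-gateway is the container name the quickstart script creates, as used in the verification step below):

docker exec kong-quickstart-gateway tail -f /tmp/claude.json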
Verify traffic through Kong
Start a Claude Code session that points to the local AI Gateway endpoint:
The ANTHROPIC_MODEL value can be any string since the pre-function plugin removes it. The actual model used is meta-llama/Llama-3.3-70B-Instruct, as configured in the AI Proxy plugin.
ANTHROPIC_BASE_URL=http://localhost:8000/anything \
ANTHROPIC_MODEL=any-model-name \
claude
Claude Code asks for permission before it runs tools or interacts with files:
I'll need permission to work with your files.
This means I can:
- Read any file in this folder
- Create, edit, or delete files
- Run commands (like npm, git, tests, ls, rm)
- Use tools defined in .mcp.json
Learn more ( https://docs.claude.com/s/claude-code-security )
❯ 1. Yes, continue
2. No, exit
Select Yes, continue. The session starts. Ask a simple question to confirm that requests reach Kong AI Gateway.
Try creating a logging.py that logs simple http logs.
Claude Code might prompt you to approve a web search before answering the question. When you select Yes, Claude will produce a full-length response to your request:
Create file
╭───────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ logging.py │
│ │
│ import logging │
│ │
│ logging.basicConfig(filename='app.log', filemode='a', format='%(name)s - %(levelname)s - │
│ %(message)s') │
│ │
│ def log_info(message): │
│ logging.info(message) │
│ │
│ def log_warning(message): │
│ logging.warning(message) │
│ │
│ def log_error(message): │
│ logging.error(message) │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────╯
Do you want to create logging.py?
❯ 1. Yes
Next, inspect the Kong AI Gateway logs to verify that the traffic was proxied through it:
docker exec kong-quickstart-gateway cat /tmp/claude.json | jq
You should find an entry that shows the upstream request made by Claude Code. A typical log record looks like this:
{
...
"upstream_uri": "/v1/chat/completions?beta=true",
"request": {
"method": "POST",
"headers": {
"user-agent": "claude-cli/2.0.58 (external, cli)",
"content-type": "application/json",
"anthropic-version": "2023-06-01"
}
},
...
"ai": {
"proxy": {
"usage": {
"completion_tokens": 26,
"completion_tokens_details": {},
"total_tokens": 178,
"cost": 0,
"time_per_token": 52.538461538462,
"time_to_first_token": 1365,
"prompt_tokens": 152,
"prompt_tokens_details": {}
},
"meta": {
"llm_latency": 1366,
"request_mode": "oneshot",
"plugin_id": "0000b82c-5826-4abf-93b0-2fa230f5e030",
"provider_name": "huggingface",
"response_model": "meta-llama/Llama-3.3-70B-Instruct",
"request_model": "meta-llama/Llama-3.3-70B-Instruct"
}
}
}
...
}
This output confirms that Claude Code routed the request through Kong AI Gateway using HuggingFace with the meta-llama/Llama-3.3-70B-Instruct model.
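Each log entry is a single JSON object, so you can pull just the token usage and latency numbers with jq, using the field paths shown in the sample above:

docker exec kong-quickstart-gateway cat /tmp/claude.json | jq '.ai.proxy.usage'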
Cleanup
Clean up Konnect environment
If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.
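You can delete it from the Konnect UI, or from the command line with a sketch like the one below, which assumes the Konnect control plane API at /v2/control-planes and its name filter syntax; verify both against the Konnect API reference for your region before relying on it:

export CP_ID=$(curl -s -H "Authorization: Bearer $KONNECT_TOKEN" \
  "https://us.api.konghq.com/v2/control-planes?filter%5Bname%5D%5Beq%5D=quickstart" | jq -r '.data[0].id')
curl -X DELETE -H "Authorization: Bearer $KONNECT_TOKEN" \
  "https://us.api.konghq.com/v2/control-planes/$CP_ID"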
Destroy the Kong Gateway container
curl -Ls https://get.konghq.com/quickstart | bash -s -- -d