Minimum Version: Kong Gateway 3.6
Tags: #ai

You can proxy requests to Llama2 AI models through AI Gateway using the AI Proxy and AI Proxy Advanced plugins. This reference documents all supported AI capabilities, configuration requirements, and provider-specific details needed for proper integration.

Upstream paths

AI Gateway automatically routes requests to the appropriate Llama2 API endpoints. The following table shows the upstream paths used for each capability.

| Capability | Upstream path or API |
|---|---|
| Chat completions | User-defined |
| Completions | User-defined |
| Embeddings | User-defined |

Supported capabilities

The following tables show the AI capabilities supported by the Llama2 provider when used with the AI Proxy or AI Proxy Advanced plugin.

Set the plugin’s route_type based on the capability you want to use. See the tables below for supported route types.

Text generation

Support for Llama2's basic text generation capabilities, including chat, completions, and embeddings:

| Capability | Route type | Streaming | Model example | Min version |
|---|---|---|---|---|
| Chat completions | llm/v1/chat | | User-defined | 3.6 |
| Completions | llm/v1/completions | | User-defined | 3.6 |
| Embeddings | llm/v1/embeddings | | User-defined | 3.11 |
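As an illustration of the `llm/v1/chat` route type, the sketch below builds the OpenAI-style chat request body that such routes accept. The gateway URL is a placeholder assumption, not a real endpoint; only the body shape is meaningful here.

```python
import json

# Hypothetical Kong route exposing a Llama2 chat model; replace with your own.
GATEWAY_CHAT_URL = "http://localhost:8000/llama2/chat"

# OpenAI-style chat request body, as accepted by llm/v1/chat routes.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Kong Gateway?"},
    ]
}

# Serialize for sending, e.g. with curl or an HTTP client:
#   curl -X POST $GATEWAY_CHAT_URL -H "Content-Type: application/json" -d @body.json
body = json.dumps(payload)
print(body)
```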

Llama2 base URL

The base URL is $UPSTREAM_URL/{route_type_path}, where {route_type_path} is determined by the capability you're using.

AI Gateway uses this URL automatically. You only need to configure a URL if you’re using a self-hosted or Llama2-compatible endpoint, in which case set the upstream_url plugin option.

Configure Llama2 with AI Proxy

To use Llama2 with AI Gateway, configure the AI Proxy or AI Proxy Advanced.

Here’s a minimal configuration for chat completions:
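The following is a sketch in declarative configuration format, assuming a self-hosted Llama2 endpoint. The model name, `llama2_format` value, and upstream URL are placeholders to adapt to your deployment.

```yaml
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      model:
        provider: llama2
        name: llama2              # placeholder model name
        options:
          llama2_format: raw      # or "openai"/"ollama", depending on your runtime
          upstream_url: http://localhost:11434/api/chat  # placeholder self-hosted endpoint
```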

For more configuration options and examples, see the AI Proxy and AI Proxy Advanced plugin documentation.
