Set up AI Proxy with Ollama

Uses: Kong Gateway, AI Gateway, decK
Minimum Version: Kong Gateway 3.6
TL;DR

Create a Gateway Service and a Route, then enable the AI Proxy plugin and configure it to route chat requests to a local Ollama instance running the llama2 model.

Prerequisites

This is a Konnect tutorial and requires a Konnect personal access token.

  1. Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.

  2. Export your token to an environment variable:

     export KONNECT_TOKEN='YOUR_KONNECT_PAT'
    
  3. Run the quickstart script to automatically provision a Control Plane and Data Plane, and configure your environment:

     curl -Ls https://get.konghq.com/quickstart | bash -s -- -k $KONNECT_TOKEN --deck-output
    

    This sets up a Konnect Control Plane named quickstart, provisions a local Data Plane, and prints out the following environment variable exports:

     export DECK_KONNECT_TOKEN=$KONNECT_TOKEN
     export DECK_KONNECT_CONTROL_PLANE_NAME=quickstart
     export KONNECT_CONTROL_PLANE_URL=https://us.api.konghq.com
     export KONNECT_PROXY_URL='http://localhost:8000'
    

    Copy and paste these into your terminal to configure your session.
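
    If you want to confirm your token and session variables work before continuing, one quick optional check is to list your Control Planes through the Konnect API. This sketch assumes the v2 Control Planes endpoint in your region and that you have jq installed:

     # List the Control Plane names visible to your PAT; expect to see "quickstart"
     curl -s "$KONNECT_CONTROL_PLANE_URL/v2/control-planes" \
         -H "Authorization: Bearer $KONNECT_TOKEN" | jq -r '.data[].name'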

Alternatively, you can run this tutorial against a self-managed Kong Gateway Enterprise instance. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance running almost instantly.

  1. Export your license to an environment variable:

     export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'
    
  2. Run the quickstart script:

    curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA 
    

    Once Kong Gateway is ready, you will see the following message:

     Kong Gateway Ready
    
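
    As an optional sanity check, you can confirm the instance is up by calling the Admin API, which the quickstart script exposes on port 8001 by default:

     # Expect a JSON status payload if Kong Gateway is healthy
     curl -s http://localhost:8001/status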

decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial, you will first need to install decK.
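
If you haven’t installed decK yet, one common path is Homebrew, shown here as an example (see the decK installation docs for other platforms). Afterwards you can verify that decK can reach your Gateway:

     # Install decK (macOS/Linux with Homebrew)
     brew tap kong/deck
     brew install deck

     # Verify the install, then ping the Gateway using the DECK_* variables exported earlier
     deck version
     deck gateway ping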

For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function, but creating them isn’t the focus of this guide. Follow these steps to pre-configure them:

  1. Run the following command:

    echo '
    _format_version: "3.0"
    services:
      - name: example-service
        url: http://httpbin.konghq.com/anything
    routes:
      - name: example-route
        paths:
        - "/anything"
        service:
          name: example-service
    ' | deck gateway apply -
    

To learn more about entities, you can read our entities documentation.
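
Before configuring the plugin, you can optionally check that the Service and Route proxy correctly. With no plugin attached yet, this request passes straight through to httpbin (substitute http://localhost:8000 if you’re on a self-managed instance):

     # Expect an HTTP 200 with the request echoed back by httpbin
     curl -i "$KONNECT_PROXY_URL/anything"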

To complete this tutorial, make sure you have Ollama installed and running locally.

  1. Visit the Ollama download page and download the installer for your operating system. Follow the installation instructions for your platform.

  2. Start the Ollama server:

     ollama serve
    
  3. After installation, open a new terminal window and run:

    ollama run llama2
    

    You can replace llama2 with the model you want to run, such as llama3.

  4. To set up the AI Proxy plugin, you’ll need the upstream URL of your local Ollama instance. Export it to an environment variable:

    export DECK_OLLAMA_UPSTREAM_URL='http://localhost:11434/api/chat'
    

    By default, Ollama runs at localhost:11434. You can verify this by running:

    lsof -i :11434
    
    • You should see output similar to:
    COMMAND   PID            USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
    ollama   23909  your_user_name   4u  IPv4 0x...            0t0  TCP localhost:11434 (LISTEN)
    
    • If Ollama is running on a different port, run:
    sudo lsof -iTCP -sTCP:LISTEN -n -P
    
    • Then look for the ollama process in the output and note the port number it’s listening on.

    If you’re running Kong Gateway locally in a Docker container, export your upstream URL as http://host.docker.internal:11434/api/chat instead, so the container can reach Ollama on the host.
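
Before putting Kong Gateway in front of Ollama, you may also want to check that Ollama itself answers chat requests at that upstream URL. This calls Ollama’s /api/chat endpoint directly, with streaming disabled so the reply arrives as a single JSON object:

     # Talk to Ollama directly, bypassing Kong Gateway
     curl -s "$DECK_OLLAMA_UPSTREAM_URL" --json '{
       "model": "llama2",
       "messages": [{"role": "user", "content": "What is 1+1?"}],
       "stream": false
     }'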

Configure the plugin

Set up the AI Proxy plugin to route chat requests to Ollama’s llama2 model by configuring the model options: the llama2 provider with llama2_format set to ollama, and the upstream_url pointing to your local Ollama instance.

echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      model:
        provider: llama2
        name: llama2
        options:
          llama2_format: ollama
          upstream_url: "${{ env "DECK_OLLAMA_UPSTREAM_URL" }}"
' | deck gateway apply -
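
To confirm the plugin was created, you can optionally export the live configuration with decK and look for the ai-proxy entry. This sketch assumes decK’s -o - flag, which writes the dump to stdout instead of a file:

     # Print the current configuration and show the ai-proxy plugin block
     deck gateway dump -o - | grep -A 6 'name: ai-proxy'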

Validate

Send a request to the Route to validate. If you followed the Konnect quickstart, $KONNECT_PROXY_URL defaults to http://localhost:8000, which is also the proxy address for a local self-managed Kong Gateway.

 curl -X POST "$KONNECT_PROXY_URL/anything" \
     -H "Accept: application/json"\
     -H "Content-Type: application/json" \
     --json '{
       "messages": [
         {
           "role": "system",
           "content": "You are a mathematician"
         },
         {
           "role": "user",
           "content": "What is 1+1?"
         }
       ]
     }'
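
The AI Proxy plugin accepts the OpenAI-style chat body shown above and returns an OpenAI-compatible chat completion. Assuming that response shape and that jq is installed, you can extract just the model’s answer:

     # Send the same question and print only the assistant's reply
     curl -s -X POST "$KONNECT_PROXY_URL/anything" \
         --json '{"messages": [{"role": "user", "content": "What is 1+1?"}]}' \
       | jq -r '.choices[0].message.content'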

Cleanup

If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.

curl -Ls https://get.konghq.com/quickstart | bash -s -- -d