Use the AI Lakera Guard plugin

TL;DR

Configure the AI Proxy Advanced plugin to route requests to any LLM upstream, then apply the AI Lakera Guard plugin to inspect prompts and responses for unsafe content using Lakera’s threat detection service.

Prerequisites

This is a Konnect tutorial and requires a Konnect personal access token.

  1. Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.

  2. Export your token to an environment variable:

     export KONNECT_TOKEN='YOUR_KONNECT_PAT'
    
  3. Run the quickstart script to automatically provision a Control Plane and Data Plane, and configure your environment:

     curl -Ls https://get.konghq.com/quickstart | bash -s -- -k $KONNECT_TOKEN --deck-output
    

    This sets up a Konnect Control Plane named quickstart, provisions a local Data Plane, and prints out the following environment variable exports:

     export DECK_KONNECT_TOKEN=$KONNECT_TOKEN
     export DECK_KONNECT_CONTROL_PLANE_NAME=quickstart
     export KONNECT_CONTROL_PLANE_URL=https://us.api.konghq.com
     export KONNECT_PROXY_URL='http://localhost:8000'
    

    Copy and paste these into your terminal to configure your session.

This tutorial requires Kong Gateway Enterprise. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running almost instantly.

  1. Export your license to an environment variable:

     export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'
    
  2. Run the quickstart script:

    curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA 
    

    Once Kong Gateway is ready, you will see the following message:

     Kong Gateway Ready
    

decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial, install decK version 1.43 or later.

This guide uses deck gateway apply, which directly applies entity configuration to your Gateway instance. We recommend upgrading your decK installation to take advantage of this tool.

You can check your current decK version with deck version.

For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function but installing them isn’t the focus of this guide. Follow these steps to pre-configure them:

  1. Run the following command:

    echo '
    _format_version: "3.0"
    services:
      - name: example-service
        url: http://httpbin.konghq.com/anything
    routes:
      - name: example-route
        paths:
        - "/anything"
        service:
          name: example-service
    ' | deck gateway apply -
    

To learn more about entities, you can read our entities documentation.

For this task, you need an Anthropic API key.

  1. Create an Anthropic Console account.
  2. Generate an API key from the console settings.
  3. Create a decK variable with your API key:
    export DECK_ANTHROPIC_API_KEY='ANTHROPIC API KEY'
    

To use the AI Lakera Guard plugin, you need an API key from Lakera:

  1. Log in to the Lakera platform.

  2. Navigate to API Keys.

  3. Click Create New API key.

  4. Enter the name for your API key.

  5. Click Create.

  6. Copy your API key.

  7. Go to your terminal and export your API key as an environment variable:

    export DECK_LAKERA_API_KEY='your-api-key-here'
    
  8. Go back to Lakera UI and click Done.

To use the AI Lakera Guard plugin, you need to create a policy and project in Lakera:

Create policy from template:

  1. Go to Policies.

  2. Click New policy button.

  3. Select Public-facing Application template.

  4. Click Create policy.

The Public-facing Application policy includes the following guardrails at Lakera L2 (balanced) threshold:

  • Prompt defense (input and output): Prevents manipulation of LLM models by stopping prompt injection attacks, jailbreaks, and untrusted instructions overriding intended model behavior.
  • Content moderation (input and output)** - Protects users by ensuring harmful or inappropriate content (hate speech, sexual content, profanity, violence, weapons, crime) is not passed into or comes out of your GenAI application.
  • Data leakage prevention (input and output) - Prevents data leaks by ensuring Personally Identifiable Information (PII) or sensitive content is not passed into or comes out of your GenAI application. Detects addresses, credit cards, IP addresses, US social security numbers, and IBANs.
  • Unknown links (output) - Prevents malicious links being shown to users by flagging URLs that aren’t in the top 1 million most popular domains or your custom allowed domain list.

Create project:

  1. Go to Projects.
  2. Click New project button.

  3. Enter the name of your project in the Project details section.

  4. Scroll down to Assign a policy section.

  5. Click the dropdown and select Public-facing Application policy.

  6. Click Save project.

  7. Copy the project ID from the table.

  8. Go to your terminal and export the project ID as an environment variable:

    export DECK_LAKERA_PROJECT='your-project-id-here'
    

Configure the plugin

First, let’s configure the AI Proxy plugin. This plugin forwards requests to the LLM upstream, while the AI Lakera Guard plugin enforces content safety and guardrails on prompts and responses.

echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      auth:
        header_name: x-api-key
        header_value: "${{ env "DECK_ANTHROPIC_API_KEY" }}"
      model:
        provider: anthropic
        name: claude-sonnet-4-5-20250929
        options:
          anthropic_version: '2023-06-01'
          max_tokens: 512
          temperature: 1.0
      logging:
        log_statistics: true
        log_payloads: true
' | deck gateway apply -

Configure the AI Lakera Guard plugin

After configuring AI Proxy to route requests to Anthropic LLM, let’s apply the AI Lakera Guard plugin to enforce content safety on prompts and responses. In our example, the plugin is configured to use the project we created earlier and reveal blocked categories when content is filtered by setting reveal_failure_categories to true.

echo '
_format_version: "3.0"
plugins:
  - name: ai-lakera-guard
    config:
      api_key: "${{ env "DECK_LAKERA_API_KEY" }}"
      project_id: "${{ env "DECK_LAKERA_PROJECT" }}"
      reveal_failure_categories: true
' | deck gateway apply -

Validate configuration

Now that the AI Lakera Guard plugin is configured, let’s test different categories of prompts to make sure that the Lakera guardrails are working.

The system blocks prompt categories that you disallow and returns a 403 error message when the plugin detects a violation, including the detector type and a request UUID for traceability.

For more detailed log tracing, configure config.logging in the AI Proxy plugin and use any Kong Gateway logging plugin of your choice.

Prompt Defense

These tests verify that the prompt defense blocks injection attacks and jailbreak attempts.

Content Moderation

These tests ensure that the filter blocks harmful content including hate speech, violence, sexual content, and criminal activity.

Data leakage prevention

These tests check that various types of Personally Identifiable Information (PII) are detected and blocked.

Cleanup

If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.

curl -Ls https://get.konghq.com/quickstart | bash -s -- -d
Something wrong?

Help us make these docs great!

Kong Developer docs are open source. If you find these useful and want to make them better, contribute today!