Use the AI Custom Guardrail plugin with the Mistral AI Moderation API

Minimum Version: Kong Gateway 3.14

TL;DR

Enable the AI Custom Guardrail plugin with the Mistral AI URL and your API key, then define the parameters to send in your request to the Mistral Moderation API and create functions to parse the response content.

Prerequisites

If you're using Konnect, this tutorial requires a Konnect personal access token.

  1. Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.

  2. Export your token to an environment variable:

     export KONNECT_TOKEN='YOUR_KONNECT_PAT'
    
  3. Run the quickstart script to automatically provision a Control Plane and Data Plane, and configure your environment:

     curl -Ls https://get.konghq.com/quickstart | bash -s -- -k $KONNECT_TOKEN --deck-output
    

    This sets up a Konnect Control Plane named quickstart, provisions a local Data Plane, and prints out the following environment variable exports:

     export DECK_KONNECT_TOKEN=$KONNECT_TOKEN
     export DECK_KONNECT_CONTROL_PLANE_NAME=quickstart
     export KONNECT_CONTROL_PLANE_URL=https://us.api.konghq.com
     export KONNECT_PROXY_URL='http://localhost:8000'
    

    Copy and paste these into your terminal to configure your session.

If you're running a self-managed deployment instead of Konnect, this tutorial requires Kong Gateway Enterprise. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running quickly.

  1. Export your license to an environment variable:

     export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'
    
  2. Run the quickstart script:

    curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA 
    

    Once Kong Gateway is ready, you will see the following message:

     Kong Gateway Ready
    

decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial, install decK version 1.43 or later.

This guide uses deck gateway apply, which directly applies entity configuration to your Gateway instance. We recommend upgrading your decK installation to take advantage of this tool.

You can check your current decK version with deck version.
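
If you want to script the version check, the following is a minimal sketch. The `version_ge` helper is hypothetical, it relies on `sort -V` from GNU coreutils, and the `grep` pattern is an assumption about what `deck version` prints (it takes the first dotted number it finds).

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only query decK if it is installed. The grep pattern is an assumption about
# the output of `deck version`: it takes the first dotted number it finds.
if command -v deck >/dev/null 2>&1; then
  installed=$(deck version | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1)
  if version_ge "$installed" "1.43"; then
    echo "decK $installed is recent enough"
  else
    echo "decK $installed is older than 1.43; please upgrade"
  fi
fi
```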

For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function but installing them isn’t the focus of this guide. Follow these steps to pre-configure them:

  1. Run the following command:

    echo '
    _format_version: "3.0"
    services:
      - name: example-service
        url: http://httpbin.konghq.com/anything
    routes:
      - name: example-route
        paths:
        - "/anything"
        service:
          name: example-service
        protocols:
        - http
        - https
    ' | deck gateway apply -
    

To learn more about entities, you can read our entities documentation.

This tutorial uses OpenAI:

  1. Create an OpenAI account.
  2. Get an API key.
  3. Create a decK variable with the API key:

    export DECK_OPENAI_API_KEY='YOUR OPENAI API KEY'
    

This tutorial uses Mistral:

  1. Create a Mistral account.
  2. Get your API key.
  3. Export a decK environment variable with the Mistral API key:

    export DECK_MISTRAL_API_KEY='YOUR MISTRAL API KEY'
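
Both keys are read from the environment when the plugin configurations are applied, so it can help to confirm they are exported first. The following is a minimal sketch; the `missing_vars` helper is hypothetical and only checks that each variable is exported and non-empty, not that the key is valid.

```shell
# Hypothetical helper: prints the name of each listed variable that is
# unset or empty in the exported environment.
missing_vars() {
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
    fi
  done
}

missing=$(missing_vars DECK_OPENAI_API_KEY DECK_MISTRAL_API_KEY)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Set these variables before running deck gateway apply:" $missing
fi
```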
    

Configure the AI Proxy plugin

Enable the AI Proxy plugin with your OpenAI API key and the model details to proxy requests to OpenAI. In this example, we’ll use the GPT 5.1 model:

echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      auth:
        header_name: Authorization
        header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
      model:
        provider: openai
        name: gpt-5.1
' | deck gateway apply -

Configure the AI Custom Guardrail plugin

Enable the AI Custom Guardrail plugin with the following data:

  • The Mistral Moderation API URL
  • Your Mistral API key
  • The Mistral model to use
  • The input content to send to the Mistral Moderation API
  • The function that defines how to parse the response

In this example, the Mistral Moderation API response contains a results array. Each result has a categories object that maps moderation category names to boolean values, and a category is true when the input matches it. The function below blocks the request or response if at least one category is true, and returns the list of violated categories in the block message.

echo '
_format_version: "3.0"
plugins:
  - name: ai-custom-guardrail
    config:
      guarding_mode: BOTH
      text_source: concatenate_all_content
      params:
        api_key: "${{ env "DECK_MISTRAL_API_KEY" }}"
        model: mistral-moderation-2411
      request:
        url: https://api.mistral.ai/v1/moderations
        headers:
          Authorization: Bearer $(conf.params.api_key)
        body:
          model: "$(conf.params.model)"
          input: "$(content)"
      response:
        block: "$(check_response.block)"
        block_message: "$(check_response.block_message)"
      functions:
        check_response: |
          return function(resp)
              local blocked_categories = {}

              for _, result in ipairs(resp.results) do
                  for category, is_flagged in pairs(result.categories) do
                      if is_flagged then
                          table.insert(blocked_categories, category)
                      end
                  end
              end

              local block = #blocked_categories > 0
              local reason

              if block then
                  reason = "Content moderation failed in the following categories: " .. table.concat(blocked_categories, ", ")
              else
                  reason = "Content moderation passed"
              end

              return {
                  block = block,
                  block_message = reason
              }
          end
' | deck gateway apply -
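
To make the blocking logic easier to follow, the sketch below applies the same rule to a sample payload in the shell, without calling Mistral or Kong. The `flagged_categories` helper is hypothetical, the sample response is trimmed and its values are made up for illustration, and the grep-based parsing is a crude stand-in for a real JSON parser.

```shell
# Hypothetical helper: reads a Moderation API response on stdin and prints the
# flagged categories, joined with ", ". This is a crude grep-based parse for
# illustration; a real JSON parser is more robust.
flagged_categories() {
  { grep -oE '"[a-z_]+": *true' || true; } \
    | cut -d'"' -f2 \
    | paste -sd, - \
    | sed 's/,/, /g'
}

# Illustrative sample only: a trimmed response with made-up values.
sample_response='{"results":[{"categories":{"sexual":false,"violence_and_threats":false,"dangerous_and_criminal_content":true,"pii":false}}]}'

flagged=$(printf '%s\n' "$sample_response" | flagged_categories)

# Same decision as the plugin function: block if any category is true.
if [ -n "$flagged" ]; then
  echo "Content moderation failed in the following categories: $flagged"
else
  echo "Content moderation passed"
fi
```

With this sample, the script reports a failure in the dangerous_and_criminal_content category. Unlike the Lua function, which iterates with pairs and has no guaranteed order, this sketch lists categories in the order they appear in the payload.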

Test the configuration

Using this configuration, send the following AI Chat request that violates a moderation rule:

curl -X POST "$KONNECT_PROXY_URL/anything" \
     --no-progress-meter --fail-with-body  \
     -H "Content-Type: application/json" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "Should I take over the world?"
         },
         {
           "role": "assistant",
           "content": "Yes, absolutely!"
         }
       ]
     }'

If you're running a self-managed Kong Gateway instead of Konnect, send the same request to the local proxy:

curl -X POST "http://localhost:8000/anything" \
     --no-progress-meter --fail-with-body  \
     -H "Content-Type: application/json" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "Should I take over the world?"
         },
         {
           "role": "assistant",
           "content": "Yes, absolutely!"
         }
       ]
     }'

You should get the following result:

{
   "error":{
      "message":"Content moderation failed in the following categories: dangerous_and_criminal_content"
   }
}

Cleanup

If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.

To remove the local Kong Gateway containers created by the quickstart script, run:

curl -Ls https://get.konghq.com/quickstart | bash -s -- -d
