Use Google Generative AI SDK for Vertex AI service chats with Kong AI Gateway
Configure the AI Proxy Advanced plugin with llm_format set to gemini, then send requests using Vertex AI’s native API format with the contents array structure.
Prerequisites
Kong Konnect
This is a Konnect tutorial and requires a Konnect personal access token.
- Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.
- Export your token to an environment variable:

export KONNECT_TOKEN='YOUR_KONNECT_PAT'

- Run the quickstart script to automatically provision a Control Plane and Data Plane, and configure your environment:

curl -Ls https://get.konghq.com/quickstart | bash -s -- -k $KONNECT_TOKEN --deck-output

This sets up a Konnect Control Plane named quickstart, provisions a local Data Plane, and prints out the following environment variable exports:

export DECK_KONNECT_TOKEN=$KONNECT_TOKEN
export DECK_KONNECT_CONTROL_PLANE_NAME=quickstart
export KONNECT_CONTROL_PLANE_URL=https://us.api.konghq.com
export KONNECT_PROXY_URL='http://localhost:8000'

Copy and paste these into your terminal to configure your session.
Kong Gateway running
This tutorial requires Kong Gateway Enterprise. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running almost instantly.
- Export your license to an environment variable:

export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'

- Run the quickstart script:

curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA

Once Kong Gateway is ready, you will see the following message:

Kong Gateway Ready
decK v1.43+
decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial, install decK version 1.43 or later.
This guide uses deck gateway apply, which directly applies entity configuration to your Gateway instance.
We recommend upgrading your decK installation to take advantage of this tool.
You can check your current decK version with deck version.
Required entities
For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function but installing them isn’t the focus of this guide. Follow these steps to pre-configure them:
- Run the following command:

echo '
_format_version: "3.0"
services:
  - name: gemini-service
    url: http://httpbin.konghq.com/
routes:
  - name: gemini-route
    paths:
      - "/gemini"
    service:
      name: gemini-service
' | deck gateway apply -
To learn more about entities, you can read our entities documentation.
Vertex AI
Before you begin, you must get the following credentials from Google Cloud:
- Service Account Key: A JSON key file for a service account with Vertex AI permissions
- Project ID: Your Google Cloud project identifier
- Location ID: Your Google Cloud project location identifier
- API Endpoint: The global Vertex AI API endpoint, https://aiplatform.googleapis.com
After creating the key, convert the contents of the downloaded key file (modelarmor-admin-key.json in this example) into a single-line JSON string.
Escape all necessary characters. Quotes (") become \" and newlines become \n. The result must be a valid one-line JSON string.
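If you'd rather not escape the key by hand, Python's json module produces an equivalent single-line string. This is a minimal sketch, assuming your key file is named modelarmor-admin-key.json as above:

```python
import json

def to_single_line_json(path: str) -> str:
    """Return the key file contents as a compact one-line JSON string.

    json.dumps escapes embedded newlines (such as those in the
    private_key field) as \\n, so the result is a valid one-line
    JSON string suitable for a single environment variable.
    """
    with open(path) as f:
        return json.dumps(json.load(f), separators=(",", ":"))

# Example: print(to_single_line_json("modelarmor-admin-key.json"))
```

When exporting the printed value, wrapping it in single quotes preserves the embedded double quotes in most shells.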
Then export your credentials as environment variables:
export DECK_GCP_SERVICE_ACCOUNT_JSON="<single-line-escaped-json>"
export DECK_GCP_LOCATION_ID="<your_location_id>"
export DECK_GCP_API_ENDPOINT="<your_gcp_api_endpoint>"
export DECK_GCP_PROJECT_ID="<your-gcp-project-id>"
Set up GCP Application Default Credentials (ADC) with your quota project:
gcloud auth application-default set-quota-project <your_gcp_project_id>
Replace <your_gcp_project_id> with your actual project ID. This configures ADC to use your project for API quota and billing.
Python
To complete this tutorial, you’ll need Python (version 3.7 or later) and pip installed on your machine. You can verify both by running:

python3 --version
python3 -m pip --version
- Create a virtual environment:

python3 -m venv myenv

- Activate it:

source myenv/bin/activate
Google Generative AI SDK
Install the Google Gen AI SDK (the google-genai package, which provides the genai.Client interface used in the script below):

pip install google-genai
Configure the AI Proxy Advanced plugin
The AI Proxy Advanced plugin supports Google’s Vertex AI models with service account authentication. This configuration allows you to route requests in Vertex AI’s native format through Kong AI Gateway. The plugin handles authentication with GCP, manages the connection to Vertex AI endpoints, and proxies requests without modifying the Gemini-specific request structure.
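For context, the Gemini-native chat request body that the plugin passes through looks roughly like this; it is an illustrative sketch with field names following the public generateContent API, not output captured from the plugin:

```python
import json

# A minimal Gemini-native generateContent payload: the "contents"
# array holds the conversation turns, each with a role and text parts.
payload = {
    "contents": [
        {"role": "user", "parts": [{"text": "Hello! Say hello back to me!"}]}
    ]
}
print(json.dumps(payload, indent=2))
```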
Apply the plugin configuration with your GCP service account credentials:
echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy-advanced
    service: gemini-service
    config:
      llm_format: gemini
      genai_category: text/generation
      targets:
        - route_type: llm/v1/chat
          logging:
            log_payloads: false
            log_statistics: true
          model:
            provider: gemini
            name: gemini-2.0-flash-exp
            options:
              gemini:
                api_endpoint: "${{ env "DECK_GCP_API_ENDPOINT" }}"
                project_id: "${{ env "DECK_GCP_PROJECT_ID" }}"
                location_id: "${{ env "DECK_GCP_LOCATION_ID" }}"
          auth:
            allow_override: false
            gcp_use_service_account: true
            gcp_service_account_json: "${{ env "DECK_GCP_SERVICE_ACCOUNT_JSON" }}"
' | deck gateway apply -
Create Python script
Create a test script that sends a request using Vertex AI’s native API format. The script points the SDK’s base URL at the Kong route; the SDK then builds the Vertex AI endpoint path from your project ID and location and sends a properly formatted request:
cat << 'EOF' > vertex.py
#!/usr/bin/env python3
import os
from google import genai
import sys
import time
import threading

def spinner():
    chars = ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏']
    idx = 0
    while not stop_spinner:
        sys.stdout.write(f'\r{chars[idx % len(chars)]} Generating response...')
        sys.stdout.flush()
        idx += 1
        time.sleep(0.1)
    sys.stdout.write('\r' + ' ' * 30 + '\r')
    sys.stdout.flush()

client = genai.Client(
    vertexai=True,
    project=os.environ.get("DECK_GCP_PROJECT_ID", "gcp-sdet-test"),
    location=os.environ.get("DECK_GCP_LOCATION_ID", "us-central1"),
    http_options={
        "base_url": "http://localhost:8000/gemini"
    }
)

stop_spinner = False
spinner_thread = threading.Thread(target=spinner)
spinner_thread.start()

try:
    response = client.models.generate_content(
        model="gemini-2.0-flash-exp",
        contents="Hello! Say hello back to me!"
    )
    stop_spinner = True
    spinner_thread.join()
    print(f"Model: {response.model_version}")
    print(response.text)
except Exception as e:
    stop_spinner = True
    spinner_thread.join()
    print(f"Error: {e}")
EOF
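For reference, the SDK combines the base_url above with the standard Vertex AI generateContent path, so every request travels through the Kong route. A sketch of the resulting request URL, where the project and location values are placeholders:

```python
# Placeholders; your real values come from the DECK_GCP_* variables.
base_url = "http://localhost:8000/gemini"
project = "my-gcp-project"
location = "us-central1"
model = "gemini-2.0-flash-exp"

# Vertex AI generateContent REST path, prefixed by the Kong route,
# so the request reaches the gateway instead of Google directly.
url = (f"{base_url}/v1/projects/{project}/locations/{location}"
       f"/publishers/google/models/{model}:generateContent")
print(url)
```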
Validate the configuration
Now, let’s run the script we created in the previous step:
python3 vertex.py
Expected output:
Hello there!
Cleanup
Clean up Konnect environment
If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.
Destroy the Kong Gateway container
curl -Ls https://get.konghq.com/quickstart | bash -s -- -d