You can proxy requests to Gemini Vertex AI models through AI Gateway using the AI Proxy and AI Proxy Advanced plugins. This reference documents all supported AI capabilities, configuration requirements, and provider-specific details needed for proper integration.
Vertex AI provider
Upstream paths
AI Gateway automatically routes requests to the appropriate Gemini Vertex API endpoints. The following table shows the upstream paths used for each capability.
| Capability | Upstream path or API |
|---|---|
| Chat completions | Uses generateContent API |
| Completions | Uses generateContent API |
| Embeddings | Uses generateContent API |
| Function calling | Uses generateContent API with function declarations |
| Files | /openai/files |
| Batches | Uses batchPredictionJobs API |
| Image generations | Uses generateContent API |
| Image edits | Uses generateContent API |
| Video generations | Uses predictLongRunning API |
Supported capabilities
The following tables show the AI capabilities supported by the Gemini Vertex provider when used with the AI Proxy or AI Proxy Advanced plugin.
Set the plugin’s route_type based on the capability you want to use. See the tables below for supported route types.
Text generation
Support for Gemini Vertex basic text generation capabilities including chat, completions, and embeddings:
| Capability | Route type | Streaming | Model example | Min version |
|---|---|---|---|---|
| Chat completions | llm/v1/chat | | gemini-2.5-flash | 3.8 |
| Completions | llm/v1/completions | | gemini-2.5-flash | 3.8 |
| Embeddings | llm/v1/embeddings | | text-embedding-004 | 3.11 |
Advanced text generation
Support for Gemini Vertex function calling to allow Gemini Vertex models to use external tools and APIs:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Function calling | llm/v1/chat | gemini-2.5-flash | 3.8 |
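On the llm/v1/chat route, function calling uses the OpenAI-compatible tools format, which AI Gateway translates into a Gemini generateContent call with function declarations. A sketch of such a request body — the tool name and schema here are hypothetical, not part of any real API:

```python
import json

# OpenAI-compatible chat request with one tool declaration.
# "get_weather" and its parameters are illustrative placeholders.
request_body = {
    "model": "gemini-2.5-flash",
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# Serialize for sending to the AI Gateway route
payload = json.dumps(request_body)
```

The gateway handles the translation to Gemini's function-declaration format, so clients keep using the OpenAI request shape.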
Processing
Support for Gemini Vertex file operations, batch operations, assistants, and response handling:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Files¹ | llm/v1/files | n/a | 3.11 |
| Batches | llm/v1/batches | n/a | 3.13 |
¹ Gemini Vertex does not have a dedicated Files API. File storage uses Google Cloud Storage, similar to AWS S3.
Image
Support for Gemini Vertex image generation and editing capabilities:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Generations | image/v1/images/generations | gemini-2.5-flash-preview-image-generation | 3.11 |
| Edits | image/v1/images/edits | gemini-2.5-flash-preview-image-generation | 3.11 |
For requests with large payloads, consider increasing config.max_request_body_size to three times the raw binary size. Supported image sizes and formats vary by model. Refer to your provider’s documentation for allowed dimensions and requirements.
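One plausible reading of the 3× guidance: binaries embedded in JSON requests are base64-encoded, which inflates them by roughly a third, and tripling the raw size leaves headroom for that plus JSON overhead. A quick check of the inflation factor:

```python
import base64

raw = bytes(3_000_000)           # a ~3 MB binary payload
encoded = base64.b64encode(raw)  # base64 representation, as embedded in JSON
ratio = len(encoded) / len(raw)
print(round(ratio, 2))           # 1.33 -- so 3x the raw size is a safe ceiling
```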
Video
Support for Gemini Vertex video generation capabilities:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Generations | video/v1/videos/generations | veo-3.1-generate-001 | 3.13 |
For requests with large payloads (video generation), consider increasing config.max_request_body_size to three times the raw binary size.
Gemini Vertex base URL
The base URL is https://aiplatform.googleapis.com/; the request path ({route_type_path}) is determined by the capability.
AI Gateway uses this URL automatically. You only need to configure a URL if you’re using a self-hosted or Gemini Vertex-compatible endpoint, in which case set the upstream_url plugin option.
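As a sketch, an override for a self-hosted endpoint could look like the following; the URL is a hypothetical placeholder and the exact field layout may vary by Gateway version, so verify against your plugin schema:

```yaml
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      model:
        provider: gemini
        options:
          upstream_url: https://vertex-proxy.internal.example  # hypothetical self-hosted endpoint
```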
Supported native LLM formats for Gemini Vertex
By default, the AI Proxy plugin uses OpenAI-compatible request formats. Set config.llm_format to a native format to use Gemini Vertex-specific APIs and features.
The following native Gemini Vertex APIs are supported:
| LLM format | Supported APIs |
|---|---|
| gemini | |
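Opting into the native format is a single option inside the plugin config; a minimal sketch (verify the option name against your Gateway version's plugin schema):

```yaml
config:
  llm_format: gemini   # switch from the default OpenAI-compatible format
```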
Configure Gemini Vertex with AI Proxy
To use Gemini Vertex with AI Gateway, configure the AI Proxy or AI Proxy Advanced plugin.
Here’s a minimal configuration for chat completions:
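A sketch of such a configuration, assuming service-account authentication and illustrative project/region values; field names can differ across Gateway versions, so verify against the plugin schema:

```yaml
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      auth:
        gcp_use_service_account: true        # authenticate with a GCP service account
      model:
        provider: gemini
        name: gemini-2.5-flash
        options:
          gemini:
            api_endpoint: aiplatform.googleapis.com  # Vertex endpoint
            project_id: my-gcp-project               # illustrative project ID
            location_id: us-central1                 # illustrative region
```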
For more configuration options and examples, see: