# Hugging Face provider

You can proxy requests to Hugging Face AI models through AI Gateway using the AI Proxy and AI Proxy Advanced plugins. This reference documents all supported AI capabilities, configuration requirements, and provider-specific details needed for proper integration.
## Upstream paths
AI Gateway automatically routes requests to the appropriate Hugging Face API endpoints. The following table shows the upstream paths used for each capability.
| Capability | Upstream path or API |
|---|---|
| Chat completions | /v1/chat/completions |
| Embeddings | /hf-inference/models/{model_name}/pipeline/feature-extraction |
| Video generations | /v1/videos |
## Supported capabilities
The following tables show the AI capabilities supported by the Hugging Face provider when used with the AI Proxy or AI Proxy Advanced plugin.
Set the plugin's route_type based on the capability you want to use. See the tables below for supported route types.
### Text generation
Support for Hugging Face basic text generation capabilities, including chat completions and embeddings:
| Capability | Route type | Streaming | Model example | Min version |
|---|---|---|---|---|
| Chat completions | llm/v1/chat | ✅ | Use the model name for the specific LLM provider | 3.9 |
| Embeddings | llm/v1/embeddings | ❌ | Use the embedding model name | 3.11 |
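With route_type set to llm/v1/chat, clients send requests to the gateway in the OpenAI-compatible chat format. A minimal request body might look like the following sketch (the message content is illustrative, and the model is typically pinned in the plugin configuration rather than the request):

```json
{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Summarize this paragraph." }
  ]
}
```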
### Video
Support for Hugging Face video generation capabilities:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Generations | video/v1/videos/generations | Use the video generation model name | 3.13 |
For requests with large payloads (such as video generation), consider increasing config.max_request_body_size to three times the raw binary size.
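As a rough illustration of that sizing rule: if raw video payloads are around 8 MiB, the limit could be set to roughly three times that. The value below is an assumption for illustration, not a recommended default:

```yaml
plugins:
  - name: ai-proxy
    config:
      # ~3x an assumed 8 MiB raw binary payload
      max_request_body_size: 25165824
```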
## Hugging Face base URL
The base URL is https://api-inference.huggingface.co/{route_type_path}, where {route_type_path} is determined by the capability.
AI Gateway uses this URL automatically. You only need to configure a URL if you’re using a self-hosted or Hugging Face-compatible endpoint, in which case set the upstream_url plugin option.
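For example, a sketch pointing the plugin at a hypothetical self-hosted endpoint (the URL and model name are placeholders, and this assumes upstream_url sits under the model options as in other AI Proxy provider configurations):

```yaml
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      model:
        provider: huggingface
        name: my-model
        options:
          upstream_url: https://my-hf-endpoint.example.com/v1/chat/completions
```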
## Supported native LLM formats for Hugging Face
By default, the AI Proxy plugin uses OpenAI-compatible request formats. Set config.llm_format to a native format to use Hugging Face-specific APIs and features.
The following native Hugging Face APIs are supported:
| LLM format | Supported APIs |
|---|---|
| huggingface | |
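To opt into the native Hugging Face format, set config.llm_format on the plugin. A minimal fragment might look like this sketch:

```yaml
plugins:
  - name: ai-proxy
    config:
      llm_format: huggingface
      route_type: llm/v1/chat
```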
## Configure Hugging Face with AI Proxy
To use Hugging Face with AI Gateway, configure the AI Proxy or AI Proxy Advanced plugin.
Here’s a minimal configuration for chat completions:
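A sketch of such a configuration in declarative (YAML) form, assuming bearer-token authentication against the Hugging Face Inference API; the token placeholder and model name are illustrative:

```yaml
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      auth:
        header_name: Authorization
        header_value: Bearer <your-huggingface-token>
      model:
        provider: huggingface
        name: meta-llama/Llama-3.1-8B-Instruct
```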
For more configuration options and examples, see the AI Proxy and AI Proxy Advanced plugin documentation.