Cohere’s document-grounded chat filters candidate documents and generates an answer in a single API call. You send a query together with a list of candidate documents. The model selects the relevant documents, generates an answer using only those documents, and returns citations that link answer segments to their sources. This replaces a multi-step RAG pipeline with one request.
The following script sends a query with five candidate documents to Cohere’s chat endpoint. Three documents discuss the health benefits of green tea. Two are intentionally irrelevant (the Eiffel Tower and Python programming).
The script attempts to show which documents the model used by comparing the documents field in the response to the input documents. This demonstrates whether Cohere’s document-grounded chat filters out irrelevant documents automatically.
Create the script:
cat > grounded-chat-demo.py << 'EOF'
#!/usr/bin/env python3
"""Demonstrate document filtering in Cohere grounded chat"""
import requests

# Endpoint used in this guide
CHAT_URL = "http://localhost:8000/rerank"

print("Cohere Document Filtering Demo")
print("=" * 60)

query = "What are the health benefits of drinking green tea?"
documents = [
    {"text": "Green tea contains powerful antioxidants called catechins that may help reduce inflammation and protect cells from damage."},
    {"text": "The Eiffel Tower is a wrought-iron lattice tower located in Paris, France, and is one of the most recognizable structures in the world."},
    {"text": "Studies suggest that regular green tea consumption may boost metabolism and support weight management."},
    {"text": "Python is a high-level programming language known for its simplicity and readability, widely used in data science and web development."},
    {"text": "Green tea has been associated with improved brain function and may reduce the risk of neurodegenerative diseases."}
]

print(f"\nQuery: {query}\n")

# Show input documents
print("--- INPUT: All Candidate Documents ---")
for idx, doc in enumerate(documents, 1):
    print(f"{idx}. {doc['text']}")

# Send request
response = requests.post(
    CHAT_URL,
    headers={"Content-Type": "application/json"},
    json={
        "model": "command-a-03-2025",
        "query": query,
        "documents": documents,
        "return_documents": True
    }
)
# Stop early on an HTTP error instead of parsing an error body
response.raise_for_status()
result = response.json()

# Extract document IDs that were used
used_doc_ids = set()
if 'documents' in result:
    for doc in result['documents']:
        # Map returned docs back to original indices
        for idx, orig_doc in enumerate(documents):
            if doc['text'] == orig_doc['text']:
                used_doc_ids.add(idx)

# Show relevant documents
print("\n--- OUTPUT: Relevant Documents (Used in answer) ---")
if 'documents' in result:
    for doc in result['documents']:
        print(f"✓ {doc['text']}")

# Show filtered documents
print("\n--- FILTERED OUT: Irrelevant Documents ---")
for idx, doc in enumerate(documents):
    if idx not in used_doc_ids:
        print(f"✗ {doc['text']}")

# Show answer with citations
print("\n--- GENERATED ANSWER ---")
print(result.get('text', ''))
if 'citations' in result:
    print("\n--- CITATIONS ---")
    for citation in result['citations']:
        print(f"- \"{citation['text']}\" → {citation['document_ids']}")

print("\n" + "=" * 60)
EOF
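The citations the script prints can also be grouped by source document, which makes it easier to see how much of the answer each document supports. The following is a minimal sketch, not part of the script above. It assumes each citation is a dictionary with the text and document_ids fields that the script reads. The function name and the sample document IDs (doc_0, doc_2, doc_4) are hypothetical.

```python
from collections import defaultdict


def group_citations_by_document(citations):
    """Map each document ID to the answer snippets that cite it."""
    grouped = defaultdict(list)
    for citation in citations:
        # A single citation may point at several source documents.
        for doc_id in citation.get("document_ids", []):
            grouped[doc_id].append(citation["text"])
    return dict(grouped)


# Hypothetical citations, shaped like the fields the script prints.
sample_citations = [
    {"text": "catechins", "document_ids": ["doc_0"]},
    {"text": "boost metabolism", "document_ids": ["doc_2"]},
    {"text": "antioxidants", "document_ids": ["doc_0", "doc_4"]},
]

grouped = group_citations_by_document(sample_citations)
for doc_id in sorted(grouped):
    print(f"{doc_id}: {grouped[doc_id]}")
```

A document that appears in the response but has no entry in the grouped result was returned without being cited, which is worth checking when you interpret the script's output.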
Before relying on this output, verify that the return_documents parameter actually returns only the filtered subset of documents rather than every input document. Check Cohere’s API documentation, or run the script and compare the returned documents with the inputs.
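One way to run that check is to compare the returned documents with the inputs and report whether anything was dropped. This is a sketch under stated assumptions: the function name check_filtering is hypothetical, the returned documents are assumed to be dictionaries with a text field (as the script expects), and the sample data stands in for a real response.

```python
def check_filtering(input_docs, returned_docs):
    """Compare returned documents with the inputs by exact text match."""
    input_texts = [doc["text"] for doc in input_docs]
    returned_texts = {doc["text"] for doc in returned_docs}
    kept = [text for text in input_texts if text in returned_texts]
    dropped = [text for text in input_texts if text not in returned_texts]
    return {
        "kept": kept,
        "dropped": dropped,
        # True only if at least one input document is missing from the response.
        "filtered": len(dropped) > 0,
    }


# Hypothetical data standing in for the script's inputs and a response.
sample_inputs = [
    {"text": "Green tea contains catechins."},
    {"text": "The Eiffel Tower is in Paris."},
]
sample_returned = [{"text": "Green tea contains catechins."}]

report = check_filtering(sample_inputs, sample_returned)
print("Filtering observed:", report["filtered"])
for text in report["dropped"]:
    print("Dropped:", text)
```

If the filtered flag is False for a query with clearly irrelevant candidates, the response is returning every input document, and the script's FILTERED OUT section will be empty for that reason and not because all documents were relevant.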