Retrievers
RAG retrieval methodologies — choose how relevant chunks are found for each query.
BaseRetriever ABC
All retrievers implement BaseRetriever. The retrieve() method returns raw results; query() and query_with_sources() also run the generator.
from abc import ABC, abstractmethod
from cognity_ai.models import RetrievalResult

class BaseRetriever(ABC):
    @abstractmethod
    def retrieve(self, query: str, top_k: int = 10) -> list[RetrievalResult]:
        """Return ranked list of relevant chunks; no answer generation."""
        ...

    def query(self, question: str, top_k: int = 10) -> str:
        """Retrieve context + generate a natural-language answer."""
        ...

    def query_with_sources(
        self, question: str, top_k: int = 10
    ) -> dict:
        """
        Returns:
            {
                "answer": str,
                "sources": list[dict],  # doc_id, page, score
                "chunks": list[RetrievalResult]
            }
        """
        ...
# RetrievalResult lives in cognity_ai.models; its fields are shown here for reference
from dataclasses import dataclass

@dataclass
class RetrievalResult:
    chunk_id: str
    doc_id: str
    text: str
    score: float            # relevance score (higher = more relevant)
    metadata: dict          # page_number, section, source_path, etc.
    retrieval_channel: str  # "vector" | "graph" | "community" | "bridge"
Overview Table
| Key | Requires | Default? | Best For |
|---|---|---|---|
| hybrid_graph | Graph store + vector store | YES (graph available) | Multi-hop reasoning, entity-rich knowledge bases |
| naive | Vector store only | YES (no graph) | Quick setup, unstructured documents |
| vector_only | Vector store only | Opt-in | Pure semantic similarity search |
| graph_only | Graph store only | Opt-in | Graph-centric structured queries |
| parent_child | Vector store + ParentChildChunker | Opt-in | Long documents, context preservation |
| multi_query | Vector store + LLM | Opt-in | Complex or ambiguous questions |
| microsoft_graphrag | MS GraphRAG store | Opt-in | MS-style global/local graph search |
| adaptive | All (uses subset per query) | Opt-in | Unknown query type, general-purpose routing |
Methodologies
hybrid_graph
The flagship retriever. Runs four independent retrieval channels (vector, graph, community, bridge) in parallel and fuses their results with Reciprocal Rank Fusion (RRF). Requires a graph store (Neo4j, Memgraph, etc.).
Config: top_k_per_channel=10, rrf_k=60, graph_hops=2
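As a rough illustration of the fusion step, the sketch below applies the standard RRF formula, score = sum over channels of 1 / (rrf_k + rank), to per-channel rankings. The function name `reciprocal_rank_fusion` and the sample chunk IDs are hypothetical; this is not the library's internal implementation, only the textbook technique it names.

```python
from collections import defaultdict

def reciprocal_rank_fusion(
    rankings: dict[str, list[str]], rrf_k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each appearance adds 1 / (rrf_k + rank), rank starting at 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] += 1.0 / (rrf_k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical per-channel rankings (best hit first)
channels = {
    "vector": ["c1", "c2", "c3"],
    "graph": ["c2", "c4"],
    "community": ["c5", "c2"],
    "bridge": ["c1"],
}
fused = reciprocal_rank_fusion(channels, rrf_k=60)
for chunk_id, score in fused:
    print(f"{chunk_id}: {score:.4f}")
```

A chunk that appears in several channels (here `c2`) outranks one that tops a single channel, which is the behavior that makes RRF useful for combining heterogeneous scores.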
naive
Pure embedding similarity search: single-channel vector retrieval. The fastest and simplest option. No graph store required. Auto-selected when no graph store is configured or available.
vector_only
Identical to naive but explicitly forces vector-only retrieval even when a graph store is configured. Use when you want the speed of vector search while keeping the graph store available for other operations.
graph_only
Graph traversal only, with no vector search. Extracts entities from the query, traverses the knowledge graph, and returns linked chunks. Best for entity-lookup questions like "What did Company X acquire?" where graph structure is the answer.
parent_child
Small child chunks are retrieved by embedding similarity (precise match), then the corresponding large parent chunks are returned as context to the LLM. Best for long documents where you need pinpoint retrieval but broad surrounding context. Requires ingestion with chunker="parent_child".
# Must pair chunker + retriever
rag = RAGLibrary(chunker="parent_child", rag_method="parent_child")
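The child-to-parent expansion can be sketched as follows. This is a minimal illustration, assuming each child chunk records its parent's ID; the helper `expand_to_parents` and the sample data are hypothetical and do not reflect how ParentChildChunker stores the mapping.

```python
def expand_to_parents(child_hits, child_to_parent, parents, top_k=3):
    """Map ranked child hits to unique parents, keeping each parent's best child score."""
    best = {}
    for child_id, score in child_hits:
        parent_id = child_to_parent[child_id]
        if parent_id not in best or score > best[parent_id]:
            best[parent_id] = score
    # Rank parents by their best-matching child and return the parent text as context
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    return [(parent_id, parents[parent_id], score) for parent_id, score in ranked]

# Hypothetical index: two children share parent p1
child_to_parent = {"c1": "p1", "c2": "p1", "c3": "p2"}
parents = {"p1": "Full text of section 1 ...", "p2": "Full text of section 2 ..."}
child_hits = [("c3", 0.91), ("c1", 0.88), ("c2", 0.80)]  # (child_id, similarity)

context = expand_to_parents(child_hits, child_to_parent, parents)
for parent_id, text, score in context:
    print(parent_id, f"{score:.2f}", text)
```

Deduplicating on the parent ID keeps the LLM context from containing the same large section twice when several of its children match.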
multi_query
Generates N rephrased query variants using the LLM, runs vector retrieval for each, deduplicates the results, and re-ranks them. Improves recall for ambiguous, multi-faceted, or poorly worded questions. Slower, because each query triggers an extra LLM call plus N retrieval passes.
Config: n_queries=3, dedup_threshold=0.95
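The merge-and-deduplicate step might look like the sketch below. It assumes, without confirmation from the library, that dedup_threshold is a cosine-similarity cutoff between chunk embeddings; the helpers `cosine` and `merge_and_dedup` and the toy two-dimensional embeddings are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def merge_and_dedup(result_lists, dedup_threshold=0.95):
    """Merge hits from all query variants, dropping repeats and near-duplicates.

    Each hit is (chunk_id, score, embedding). Assumption: two chunks count as
    duplicates when their IDs match or their cosine similarity >= dedup_threshold.
    """
    candidates = sorted(
        (hit for hits in result_lists for hit in hits),
        key=lambda hit: hit[1],
        reverse=True,
    )
    kept = []
    for chunk_id, score, emb in candidates:
        is_duplicate = any(
            chunk_id == kept_id or cosine(emb, kept_emb) >= dedup_threshold
            for kept_id, _, kept_emb in kept
        )
        if not is_duplicate:
            kept.append((chunk_id, score, emb))
    return kept

# Hypothetical hits from two rephrased variants of the same question
variant_a = [("c1", 0.90, [1.0, 0.0]), ("c2", 0.70, [0.0, 1.0])]
variant_b = [("c1", 0.85, [1.0, 0.0]), ("c3", 0.80, [0.99, 0.05])]
merged = merge_and_dedup([variant_a, variant_b])
print([chunk_id for chunk_id, _, _ in merged])
```

Processing candidates in descending score order means the highest-scoring member of each duplicate group is the one that survives.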
microsoft_graphrag
Wraps the official microsoft/graphrag library. Supports two search modes:
- Local search — entity neighborhood traversal (factual, specific queries)
- Global search — community summary aggregation (broad, thematic queries)
Requires cognity-ai[microsoft-graphrag] and the MicrosoftGraphRAGStore graph store.
adaptive
Classifies the query type using the LLM and routes to the best retriever:
- Factual / entity lookup → hybrid_graph
- Broad / thematic → community search channel
- Simple semantic → naive
- Ambiguous / complex → multi_query
Adds one LLM call per query for classification. Best for general-purpose assistants where query types are unpredictable.
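The routing table above can be sketched as a simple dispatch. Everything here is illustrative: the label names, the `route` helper, the fallback to naive for unknown labels, and the keyword classifier (a stand-in for the LLM classification call) are assumptions, and "community" is only a placeholder for the community search channel.

```python
from typing import Callable

# Query class -> retrieval target, mirroring the list above (label names are hypothetical)
ROUTES = {
    "factual": "hybrid_graph",
    "thematic": "community",  # placeholder for the community search channel
    "semantic": "naive",
    "complex": "multi_query",
}

def route(question: str, classify: Callable[[str], str], default: str = "naive") -> str:
    """Classify the question, then look up its target; unknown labels use `default`."""
    return ROUTES.get(classify(question), default)

def keyword_classifier(question: str) -> str:
    """Crude stand-in for the LLM classifier, for demonstration only."""
    q = question.lower()
    if q.startswith(("who", "when", "where")):
        return "factual"
    if "overview" in q or "themes" in q:
        return "thematic"
    if len(q.split()) > 15:
        return "complex"
    return "semantic"

print(route("Who founded the company?", keyword_classifier))
print(route("Give a broad overview of the report", keyword_classifier))
print(route("boiling point of water", keyword_classifier))
```

Passing the classifier in as a callable keeps the routing logic testable without an LLM; in practice the callable would wrap the one classification call per query.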
Auto-Fallback Logic
cognity-ai degrades gracefully when a required backend is unavailable:
- Graph store not configured or unavailable → falls back to naive (warning logged)
- graphrag package not installed → falls back to hybrid_graph (warning logged)
- Anthropic selected as the provider → embedder falls back to sentence_transformers (Anthropic has no embeddings API)
Call rag.health_report() after init to see which retrievers and channels are active given your current backend configuration.
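The retriever fallback rules can be sketched as a small resolution function. This is an assumption-laden illustration, not the library's code: the name `resolve_method`, its boolean flags, and the chaining of the two rules (microsoft_graphrag to hybrid_graph to naive when neither backend is present) are hypothetical.

```python
import logging

logger = logging.getLogger("fallback_sketch")

def resolve_method(requested: str, graph_available: bool, graphrag_installed: bool) -> str:
    """Return the retriever key that would actually run, logging each downgrade."""
    method = requested
    if method == "microsoft_graphrag" and not graphrag_installed:
        logger.warning("graphrag package not installed; falling back to hybrid_graph")
        method = "hybrid_graph"
    # Assumption: the rules chain, so a downgraded request is checked again here
    if method == "hybrid_graph" and not graph_available:
        logger.warning("graph store unavailable; falling back to naive")
        method = "naive"
    return method

print(resolve_method("microsoft_graphrag", graph_available=True, graphrag_installed=False))
print(resolve_method("hybrid_graph", graph_available=False, graphrag_installed=True))
```

Methods that the rules do not mention, such as multi_query, pass through unchanged in this sketch.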
Per-Query Method Override
The default rag_method can be overridden for any individual query without changing the pipeline configuration:
from cognity_ai import RAGLibrary
rag = RAGLibrary(rag_method="hybrid_graph") # default
# Override for a specific query
answer = rag.query("List all companies mentioned", method="graph_only")
summary = rag.query("Give a broad thematic overview", method="microsoft_graphrag")
precise = rag.query("What is the boiling point of water?", method="naive")
detailed = rag.query_with_sources("Who founded Anthropic?", method="hybrid_graph")
# Retrieve without generating an answer
chunks = rag.retrieve("transformer architecture", top_k=5, method="multi_query")
for chunk in chunks:
    print(f"[{chunk.score:.3f}] {chunk.text[:120]}...")
Direct Usage
Instantiate and use a retriever directly without RAGLibrary:
from cognity_ai.retrievers import HybridGraphRetriever, NaiveRetriever
from cognity_ai.stores.vector import ChromaStore
from cognity_ai.stores.graph import Neo4jStore
from cognity_ai.embedders import GeminiEmbedder
from cognity_ai.generators import GeminiGenerator

embedder = GeminiEmbedder(api_key="AIza...")
vector_store = ChromaStore(persist_directory=".chroma")
graph_store = Neo4jStore(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password",
)

# 4-channel hybrid retrieval
retriever = HybridGraphRetriever(
    vector_store=vector_store,
    graph_store=graph_store,
    embedder=embedder,
    generator=GeminiGenerator(api_key="AIza..."),
    top_k_per_channel=10,
    rrf_k=60,
    graph_hops=2,
)

# Retrieve only
results = retriever.retrieve("Who founded the company?")
for r in results:
    print(f"[{r.retrieval_channel}] score={r.score:.3f}: {r.text[:80]}")

# Retrieve + generate answer
answer = retriever.query("What are the main products?")

# Retrieve + answer + sources
result = retriever.query_with_sources("Summarize the Q3 results")
print(result["answer"])
for src in result["sources"]:
    print(f"  {src['doc_id']} p.{src['page']} (score={src['score']:.3f})")

# Naive retriever (no graph needed)
naive = NaiveRetriever(
    vector_store=vector_store,
    embedder=embedder,
    generator=GeminiGenerator(api_key="AIza..."),
)
results = naive.retrieve("climate change impacts", top_k=5)