Architecture

How cognity-ai is structured — from file ingestion through graph-augmented retrieval to answer synthesis.

Overview

cognity-ai is a plugin-based, modular RAG library. Every layer of the stack — loaders, chunkers, embedders, vector stores, graph stores, retrievers, generators — is swappable via string keys registered in a central PluginRegistry.

You configure cognity-ai with a LibraryConfig object. The ComponentFactory reads those keys and instantiates the correct implementations, applying a smart fallback chain when a preferred backend is unavailable (e.g., Neo4j offline falls back to NetworkX; a missing Anthropic API key falls back to sentence_transformers).

Plugin-based · Smart defaults · Auto-detect backends · Zero vendor lock-in · Incremental ingestion
ℹ️ Design principle: All public behaviour is accessed through the RAGLibrary facade in cognity_ai/library.py. Internal components are wired together by ComponentFactory; you never instantiate them directly unless you are writing a plugin.

Ingestion Pipeline

When you call rag.ingest() or rag.ingest_dir(), a document travels through the following stages before landing in the vector store and graph store.

File / Directory
      ↓
LoaderFactory → Format Loader (PDF, DOCX, XLSX, PPTX, HTML, CSV, JSON, Image...)
      ↓
HybridPageIndex  (structural page numbers + regex fallback)
      ↓
Chunker          (sentence | fixed | semantic | recursive | parent-child | hybrid)
      ↓
NLP Extractor    (spaCy entities, relations, SVO triples)
      ↓ [optional]
LLM Augmenter    (fill semantic gaps via LLM)
      ↓
Embedder         (gemini | openai | vertex_ai | bedrock | cohere | sentence_transformers | ollama)
      ↓
┌──────────────────────┐
│ Vector Store         │  ── upsert chunks + community embeddings
│ (chroma/qdrant/...)  │
└──────────────────────┘
┌──────────────────────┐
│ Graph Store          │  ── upsert entities, relations, chunks
│ (neo4j/networkx/...) │
└──────────────────────┘
      ↓
Hash Store       (SHA-256 → skip unchanged docs on re-ingest)
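
The final stage above, skipping unchanged documents on re-ingest, can be sketched roughly as follows. This is a minimal in-memory illustration, not cognity-ai's actual hash store: the `HashStore` class and its `needs_ingest` method are hypothetical names, and a real store would persist digests between runs.

```python
import hashlib


class HashStore:
    """Hypothetical in-memory stand-in for the hash store stage."""

    def __init__(self) -> None:
        # Maps a document identifier to the SHA-256 hex digest last seen.
        self._digests: dict[str, str] = {}

    def needs_ingest(self, doc_id: str, content: bytes) -> bool:
        """Return True if the content is new or changed, and record its digest."""
        digest = hashlib.sha256(content).hexdigest()
        if self._digests.get(doc_id) == digest:
            return False  # unchanged since the last ingest, so skip it
        self._digests[doc_id] = digest
        return True


store = HashStore()
first = store.needs_ingest("report.pdf", b"version 1")   # new document
second = store.needs_ingest("report.pdf", b"version 1")  # same bytes again
third = store.needs_ingest("report.pdf", b"version 2")   # content changed
```

Because the digest is computed over the raw bytes, any edit to a file triggers a full re-ingest of that file, while untouched files cost only one hash computation.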

Loaders

The LoaderFactory selects a format loader based on the file extension. Each loader returns a list of Document objects with raw text, metadata, and (for paginated formats) a page map.

Format                       Loader class   Extra dep
PDF                          PDFLoader      cognity-ai[pdf]
DOCX / DOC                   DocxLoader     cognity-ai[office]
XLSX / XLS                   XlsxLoader     cognity-ai[office]
PPTX / PPT                   PptxLoader     cognity-ai[office]
HTML                         HTMLLoader     built-in
CSV                          CSVLoader      built-in
JSON / YAML                  JSONLoader     built-in
Images (PNG, JPG, WEBP...)   ImageLoader    cognity-ai[ocr]
Audio (MP3, WAV, M4A...)     AudioLoader    cognity-ai[audio]
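
Extension-based selection of this kind can be pictured as a simple lookup. The sketch below is an assumption for illustration only: the mapping uses the loader class names from the table above as plain strings, while `LOADERS_BY_EXTENSION` and `select_loader` are hypothetical names, not the LoaderFactory API.

```python
from pathlib import Path

# Extension -> loader class name, following the table above.
# The dispatch logic itself is a hypothetical illustration.
LOADERS_BY_EXTENSION = {
    ".pdf": "PDFLoader",
    ".docx": "DocxLoader",
    ".doc": "DocxLoader",
    ".xlsx": "XlsxLoader",
    ".xls": "XlsxLoader",
    ".pptx": "PptxLoader",
    ".ppt": "PptxLoader",
    ".html": "HTMLLoader",
    ".csv": "CSVLoader",
    ".json": "JSONLoader",
    ".yaml": "JSONLoader",
    ".png": "ImageLoader",
    ".jpg": "ImageLoader",
    ".webp": "ImageLoader",
    ".mp3": "AudioLoader",
    ".wav": "AudioLoader",
    ".m4a": "AudioLoader",
}


def select_loader(path: str) -> str:
    """Pick a loader name from the file extension, ignoring case."""
    suffix = Path(path).suffix.lower()
    try:
        return LOADERS_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"No loader registered for {suffix!r}") from None


pdf_loader = select_loader("Annual-Report.PDF")
slides_loader = select_loader("decks/q3/slides.pptx")
```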

Chunking Strategies

Six chunking strategies are available and configurable via LibraryConfig.chunker. The default is hybrid, which combines sentence boundaries with a fixed token ceiling.

sentence | fixed | semantic | recursive | parent-child | hybrid (default)
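
To make the idea behind the hybrid strategy concrete, here is a rough sketch of packing whole sentences into chunks under a token ceiling. It is an approximation under stated assumptions: tokens are counted as whitespace-separated words, sentences are split with a simple regex, and `hybrid_chunk` is a hypothetical function name rather than the library's chunker.

```python
import re


def hybrid_chunk(text: str, max_tokens: int = 50) -> list[str]:
    """Group whole sentences into chunks of at most max_tokens words.

    A single sentence longer than max_tokens is kept whole in this sketch.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for sentence in sentences:
        n_tokens = len(sentence.split())
        # Close the current chunk if adding this sentence would exceed the ceiling.
        if current and current_len + n_tokens > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks


chunks = hybrid_chunk(
    "One two three. Four five six. Seven eight nine ten.", max_tokens=6
)
```

With a ceiling of six words, the first two sentences share a chunk and the third starts a new one, so no chunk boundary falls mid-sentence.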

Retrieval & Query

At query time, cognity-ai routes the question through either the full HybridGraphRetriever (4-channel) or the lighter NaiveRetriever (vector-only). An optional AdaptiveRetriever can classify queries and choose the right path automatically.

Query String
      ↓
 ┌──────────────────────────────────────────────────┐
 │  AdaptiveRetriever (optional query classifier)   │
 └───────────┬──────────────────────────┬───────────┘
             ↓                          ↓
   HybridGraphRetriever          NaiveRetriever
   ┌──────────────────┐           (vector only)
   │ CH1: Vector      │
   │ CH2: Graph sub   │
   │ CH3: Community   │
   │ CH4: Bridge      │
   └─────────┬────────┘
             ↓
     RRF (Reciprocal Rank Fusion)
             ↓
     Generator (LLM answer synthesis)
             ↓
     Answer + Sources

The Four Retrieval Channels

Channel         What it retrieves                                         Best for
CH1 Vector      Top-k semantically similar chunks from the vector store   Precise factual questions
CH2 Graph sub   1-hop and 2-hop entity subgraph around query entities     Relational / multi-hop reasoning
CH3 Community   Community summary chunks that cover entity clusters       Broad "tell me about X" summaries
CH4 Bridge      Chunks that bridge two otherwise disconnected subgraphs   Cross-domain synthesis
💡 RRF weighting: Reciprocal Rank Fusion merges the four channel result lists into a single ranked list with no channel-specific score normalisation required. Channel weights are configurable via LibraryConfig.retriever_weights.
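
For readers unfamiliar with RRF, the sketch below shows the standard formula with optional per-channel weights: each chunk scores the sum of weight / (k + rank) over the channels that returned it. The function name, the dict-of-lists input shape, and the way weights are passed are assumptions for illustration; k = 60 is the constant commonly used with RRF, not a documented cognity-ai default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse ranked ID lists: score(d) = sum over channels of w / (k + rank)."""
    weights = weights or {}
    scores = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        w = weights.get(channel, 1.0)  # unweighted channels count once
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += w / (k + rank)
    # Highest score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical per-channel results (best match first).
rankings = {
    "vector": ["c1", "c2", "c3"],
    "graph": ["c2", "c4"],
    "community": ["c5", "c2"],
    "bridge": ["c4", "c1"],
}
fused = reciprocal_rank_fusion(rankings)
bridge_heavy = reciprocal_rank_fusion(rankings, weights={"bridge": 3.0})
```

Only ranks enter the formula, which is why raw similarity scores from different channels never need to be put on a common scale. In the example, `c2` appears in three channels and ends up first in the unweighted fusion; tripling the bridge weight moves `c4` to the top instead.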

Plugin Registry

PluginRegistry is a central dictionary that maps string keys to implementation classes. ComponentFactory reads LibraryConfig, looks up the requested key, and instantiates the class — applying a fallback chain when the preferred backend is unavailable.

python
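# Import path assumed from the package layout (cognity_ai/registry.py)
from cognity_ai.registry import PluginRegistry

# All registries follow the same pattern: a string key mapped to a class.
# MyEmbedder, MyVectorStore, etc. stand for your own plugin classes.
PluginRegistry.register_embedder("my_embedder", MyEmbedder)
PluginRegistry.register_vector_store("my_store", MyVectorStore)
PluginRegistry.register_graph_store("my_graph", MyGraphStore)
PluginRegistry.register_retriever("my_retriever", MyRetriever)
PluginRegistry.register_generator("my_gen", MyGenerator)
PluginRegistry.register_chunker("my_chunker", MyChunker)
# Loaders also take the file extension they handle
PluginRegistry.register_loader("my_fmt", ".myfmt", MyLoader)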
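
As a mental model for how a string-keyed registry and a factory lookup fit together, consider the toy version below. It is a self-contained sketch, not the PluginRegistry source: `MiniRegistry`, `LengthEmbedder`, and the `create` method are hypothetical names invented for this illustration.

```python
class MiniRegistry:
    """Toy registry mapping (component kind, string key) to a class."""

    _plugins: dict[tuple[str, str], type] = {}

    @classmethod
    def register(cls, kind: str, key: str, impl: type) -> None:
        cls._plugins[(kind, key)] = impl

    @classmethod
    def create(cls, kind: str, key: str, **kwargs):
        """Look up the class for a key and instantiate it, as a factory would."""
        try:
            impl = cls._plugins[(kind, key)]
        except KeyError:
            raise KeyError(f"No {kind} registered under {key!r}") from None
        return impl(**kwargs)


class LengthEmbedder:
    """Dummy embedder: every dimension holds the length of the text."""

    def __init__(self, dim: int = 3) -> None:
        self.dim = dim

    def embed(self, text: str) -> list[float]:
        return [float(len(text))] * self.dim


MiniRegistry.register("embedder", "my_embedder", LengthEmbedder)
embedder = MiniRegistry.create("embedder", "my_embedder", dim=2)
vector = embedder.embed("hello")
```

The point of the indirection is that configuration only ever carries the string key, so swapping an implementation means changing one registered name rather than any calling code.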

Fallback Chains

ComponentFactory tests availability at instantiation time. If the preferred backend raises an ImportError or a connection error, it steps to the next option in the fallback chain.

Graph store — when Neo4j is unreachable:

neo4j → memgraph → arangodb → networkx (in-memory)

Embedder — when Anthropic API key is missing:

anthropic → openai → sentence_transformers (local)

Vector store — when Pinecone is unavailable:

pinecone → qdrant → chroma (local)
ℹ️ Disabling fallback: Set LibraryConfig.strict_backends = True to raise an error immediately instead of falling back. Useful in production to avoid silent degradation.
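
The fallback behaviour described in this section, including the strict mode, might look something like the following. This is a sketch under assumptions: `build_with_fallback`, `BackendUnavailable`, and the stub connection functions are hypothetical and only mimic the documented behaviour (try each backend in order, treat ImportError and connection errors as "unavailable", fail fast when strict).

```python
class BackendUnavailable(Exception):
    """Raised when no backend in the chain could be instantiated."""


def build_with_fallback(chain, factories, strict=False):
    """Return (key, instance) for the first backend in chain that starts up."""
    errors = []
    for key in chain:
        try:
            return key, factories[key]()
        except (ImportError, ConnectionError) as exc:
            if strict:
                # Strict mode: fail on the preferred backend, no fallback.
                raise BackendUnavailable(f"{key} unavailable: {exc}") from exc
            errors.append(f"{key}: {exc}")
    raise BackendUnavailable("; ".join(errors))


# Stub factories simulating an environment where only NetworkX works.
def connect_neo4j():
    raise ConnectionError("server unreachable")


def connect_memgraph():
    raise ImportError("driver not installed")


def connect_arangodb():
    raise ImportError("driver not installed")


def make_networkx():
    return {}  # stand-in for an in-memory graph


factories = {
    "neo4j": connect_neo4j,
    "memgraph": connect_memgraph,
    "arangodb": connect_arangodb,
    "networkx": make_networkx,
}
chain = ["neo4j", "memgraph", "arangodb", "networkx"]
chosen, graph_store = build_with_fallback(chain, factories)
```

Calling the same function with `strict=True` raises on the first failure (here, Neo4j) rather than quietly returning the in-memory store, which is the trade-off the strict setting is meant to expose.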

Multimodal Path

cognity-ai handles non-text content embedded inside documents. When a DOCX, PPTX, or PDF file contains embedded images, those images are extracted as raw bytes and routed through the OCR subsystem before the text pipeline continues.

DOCX / PPTX / PDF (contains embedded images)
      ↓
Format Loader: extracts image bytes at each page/slide position
      ↓
OCR Subsystem
  ┌────────────────────────────────────┐
  │ 1. gemini_vision (multimodal LLM)  │
  │ 2. openai_vision (GPT-4o)          │
  │ 3. anthropic_vision (Claude)       │
  │ 4. aws_textract                    │
  │ 5. azure_vision                    │
  │ 6. tesseract (local fallback)      │
  └────────────────────────────────────┘
      ↓
OCR text: injected into the chunk at the image's original position
      ↓
Rejoins the normal ingestion pipeline (NLP → Embed → Store)
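
The key step in this path is that OCR output replaces each image at the position where the image appeared, so reading order is preserved. A minimal sketch of that splice, assuming a page is represented as an ordered list of text runs (`str`) and image payloads (`bytes`); `inline_ocr` and `fake_ocr` are hypothetical names, and the OCR call is stubbed out rather than wired to any real provider.

```python
from typing import Callable, Union

Block = Union[str, bytes]  # str = text run, bytes = embedded image


def inline_ocr(blocks: list[Block], ocr: Callable[[bytes], str]) -> str:
    """Replace each image block with its OCR text, keeping the original order."""
    parts: list[str] = []
    for block in blocks:
        if isinstance(block, bytes):
            text = ocr(block).strip()
            if text:  # drop images that yield no text
                parts.append(text)
        else:
            parts.append(block)
    return "\n".join(parts)


def fake_ocr(image: bytes) -> str:
    """Stub provider; a real one would send the bytes to an OCR backend."""
    return "Figure 1: sales by region"


page = ["Quarterly results", b"<image bytes>", "Revenue grew in Q3."]
merged = inline_ocr(page, fake_ocr)
```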

Audio Transcription

Audio files and video files with audio tracks are routed through the transcription subsystem before ingestion. Three providers are supported, tried in order of configuration priority:

AWS Transcribe · Google STT · OpenAI Whisper

Transcribed text is treated as a first-class document and passes through the same chunking, extraction, and embedding stages as any text file.

💡 Experimental flag: Multimodal RAG (image embedding, video frame extraction, cross-modal retrieval) is available under LibraryConfig.experimental_multimodal = True. API stability is not guaranteed across minor versions.

Package Structure

The cognity_ai/ package has the following top-level layout. Each sub-package contains an __init__.py that re-exports the public interface and a base.py that defines the abstract base class for that component type.

cognity_ai/
├── library.py        # RAGLibrary facade — the only class users touch directly
├── registry.py       # PluginRegistry — global map of string keys to classes
├── factory.py        # ComponentFactory — wires LibraryConfig into live objects
├── models/           # Pydantic dataclasses: Document, Entity, Chunk, RetrievalResult
├── config/           # LibraryConfig + typed provider config dataclasses
├── loaders/          # File format loaders (PDF, DOCX, XLSX, PPTX, HTML, CSV, JSON...)
├── ocr/              # OCR providers (Gemini Vision, GPT-4o, Claude, Tesseract...)
├── chunkers/         # Chunking strategies (sentence, fixed, semantic, hybrid...)
├── page_index/       # Page/slide/sheet number extraction strategies
├── extractors/       # NLP + LLM knowledge extraction (entities, relations, triples)
├── embedders/        # Embedding providers (Gemini, OpenAI, Cohere, ST, Ollama...)
├── generators/       # LLM answer generators (Gemini, OpenAI, Anthropic, Bedrock...)
├── stores/vector/    # Vector store backends (Chroma, Qdrant, Pinecone, FAISS...)
├── stores/graph/     # Graph store backends (Neo4j, NetworkX, Memgraph, ArangoDB...)
├── retrievers/       # RAG methodologies (hybrid-graph, naive, adaptive, dense...)
├── pipeline/         # Ingestion orchestration + incremental knowledge updater
├── multimodal/       # Experimental: image / video / audio RAG
└── utils/            # RRF, SHA-256 hashing, token counting, async helpers

Dependency Groups

cognity-ai uses pyproject.toml optional dependency groups so you only install what you need:

cognity-ai[pdf] · cognity-ai[office] · cognity-ai[nlp] · cognity-ai[openai] · cognity-ai[anthropic] · cognity-ai[gemini] · cognity-ai[neo4j] · cognity-ai[qdrant] · cognity-ai[audio] · cognity-ai[ocr] · cognity-ai[all]

Next Steps

Now that you understand the architecture, here's where to go next.