Architecture

How cognity-ai is structured — from file ingestion through graph-augmented retrieval to answer synthesis.

Overview

cognity-ai is a plugin-based, modular RAG library. Every layer of the stack — loaders, chunkers, embedders, vector stores, graph stores, retrievers, generators — is swappable via string keys registered in a central PluginRegistry.

You configure cognity-ai with a LibraryConfig object. The ComponentFactory reads those keys and instantiates the correct implementations, applying a smart fallback chain when a preferred backend is unavailable (e.g., Neo4j offline falls back to NetworkX; a missing Anthropic API key falls back to sentence_transformers).

Plugin-based · Smart defaults · Auto-detect backends · Zero vendor lock-in · Incremental ingestion

ℹ️ Design principle: All public behaviour is accessed through the RAGLibrary facade in cognity_ai/library.py. Internal components are wired together by ComponentFactory; you never instantiate them directly unless you are writing a plugin.

Ingestion Pipeline

When you call rag.ingest() or rag.ingest_dir(), a document travels through the following stages before landing in the vector store and graph store.

File / Directory
      ↓
LoaderFactory ► Format Loader (PDF, DOCX, XLSX, PPTX, HTML, CSV, JSON, Image...)
      ↓
HybridPageIndex  (structural page numbers + regex fallback)
      ↓
Chunker          (sentence | fixed | semantic | recursive | parent-child | hybrid)
      ↓
NLP Extractor    (spaCy entities, relations, SVO triples)
      ↓ [optional]
LLM Augmenter    (fill semantic gaps via LLM)
      ↓
Embedder         (gemini | openai | vertex_ai | bedrock | cohere | sentence_transformers | ollama)
      ↓
┌────────────────────────┐
│  Vector Store          │  ◄── upsert chunks + community embeddings
│  (chroma/qdrant/...)   │
└────────────────────────┘
┌────────────────────────┐
│  Graph Store           │  ◄── upsert entities, relations, chunks
│  (neo4j/networkx/...)  │
└────────────────────────┘
      ↓
Hash Store       (SHA-256 → skip unchanged docs on re-ingest)
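
The final stage, skipping unchanged documents on re-ingest, can be illustrated with a minimal sketch. It assumes an in-memory dictionary keyed by document ID; the `HashStore` class and its `should_ingest` method are hypothetical names chosen for illustration, not cognity-ai's actual API.

```python
import hashlib


class HashStore:
    """Hypothetical in-memory hash store; a real one would persist its digests."""

    def __init__(self) -> None:
        self._digests: dict[str, str] = {}

    def should_ingest(self, doc_id: str, content: bytes) -> bool:
        """Return True when the document is new or its content has changed."""
        digest = hashlib.sha256(content).hexdigest()
        if self._digests.get(doc_id) == digest:
            return False  # same SHA-256 as last time: skip re-ingestion
        self._digests[doc_id] = digest
        return True


store = HashStore()
first = store.should_ingest("report.pdf", b"quarterly numbers")
second = store.should_ingest("report.pdf", b"quarterly numbers")
third = store.should_ingest("report.pdf", b"quarterly numbers, revised")
```

Here `first` and `third` are True and `second` is False. A production pipeline would probably record the digest only after the vector and graph stores have been updated successfully, so that a failed ingest is retried.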

Loaders

The LoaderFactory selects a format loader based on the file extension. Each loader returns a list of Document objects with raw text, metadata, and (for paginated formats) a page map.

Format	Loader class	Extra dependency
PDF	`PDFLoader`	`cognity-ai[pdf]`
DOCX / DOC	`DocxLoader`	`cognity-ai[office]`
XLSX / XLS	`XlsxLoader`	`cognity-ai[office]`
PPTX / PPT	`PptxLoader`	`cognity-ai[office]`
HTML	`HTMLLoader`	built-in
CSV	`CSVLoader`	built-in
JSON / YAML	`JSONLoader`	built-in
Images (PNG, JPG, WEBP...)	`ImageLoader`	`cognity-ai[ocr]`
Audio (MP3, WAV, M4A...)	`AudioLoader`	`cognity-ai[audio]`
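
Extension-based selection can be sketched as a simple lookup. The mapping below mirrors the table above, but the dictionary and the `select_loader` function are illustrative assumptions; the real LoaderFactory presumably returns loader instances rather than class names.

```python
from pathlib import Path

# Extension -> loader class name, following the table above.
# Illustrative only: the real LoaderFactory may cover more extensions.
LOADERS_BY_EXTENSION = {
    ".pdf": "PDFLoader",
    ".docx": "DocxLoader",
    ".doc": "DocxLoader",
    ".xlsx": "XlsxLoader",
    ".xls": "XlsxLoader",
    ".pptx": "PptxLoader",
    ".ppt": "PptxLoader",
    ".html": "HTMLLoader",
    ".csv": "CSVLoader",
    ".json": "JSONLoader",
    ".yaml": "JSONLoader",
    ".png": "ImageLoader",
    ".jpg": "ImageLoader",
    ".webp": "ImageLoader",
    ".mp3": "AudioLoader",
    ".wav": "AudioLoader",
    ".m4a": "AudioLoader",
}


def select_loader(path: str) -> str:
    """Pick a loader name from the file extension, ignoring case."""
    suffix = Path(path).suffix.lower()
    try:
        return LOADERS_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"No loader registered for {suffix!r}") from None


pdf_loader = select_loader("Annual-Report.PDF")
deck_loader = select_loader("slides/deck.pptx")
```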

Chunking Strategies

Six chunking strategies are available and configurable via LibraryConfig.chunker. The default is hybrid, which combines sentence boundaries with a fixed token ceiling.

sentence · fixed · semantic · recursive · parent-child · hybrid (default)
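
To make "sentence boundaries with a fixed token ceiling" concrete, here is a minimal sketch of that idea. It is not the library's hybrid chunker: the function name `hybrid_chunks` is hypothetical, a regular expression stands in for real sentence segmentation, and whitespace-separated words stand in for tokenizer tokens.

```python
import re


def hybrid_chunks(text: str, max_tokens: int = 50) -> list[str]:
    """Pack whole sentences into chunks of at most max_tokens whitespace tokens."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []  # tokens of the chunk being built

    for sentence in sentences:
        tokens = sentence.split()
        # Flush the current chunk if this sentence would exceed the ceiling.
        if current and len(current) + len(tokens) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        # A single sentence longer than the ceiling is cut at fixed size.
        while len(tokens) > max_tokens:
            chunks.append(" ".join(tokens[:max_tokens]))
            tokens = tokens[max_tokens:]
        current.extend(tokens)

    if current:
        chunks.append(" ".join(current))
    return chunks


chunks = hybrid_chunks(
    "One two three. Four five six. Seven eight nine ten.", max_tokens=6
)
# chunks == ["One two three. Four five six.", "Seven eight nine ten."]
```

Sentences are kept whole whenever they fit, and the ceiling only forces a mid-sentence cut when one sentence alone exceeds it.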

Retrieval & Query

At query time, cognity-ai routes the question through either the full HybridGraphRetriever (4-channel) or the lighter NaiveRetriever (vector-only). An optional AdaptiveRetriever can classify queries and choose the right path automatically.

Query String
      ↓
 ┌───────────────────────────────────────────────────┐
 │   AdaptiveRetriever (optional query classifier)   │
 └──────────┬───────────────────────────────┬────────┘
            ↓                               ↓
   HybridGraphRetriever               NaiveRetriever
  ┌──────────────────┐                (vector only)
  │ CH1: Vector      │
  │ CH2: Graph sub   │
  │ CH3: Community   │
  │ CH4: Bridge      │
  └────────┬─────────┘
           ↓
    RRF (Reciprocal Rank Fusion)
           ↓
    Generator (LLM answer synthesis)
           ↓
    Answer + Sources

The Four Retrieval Channels

Channel	What it retrieves	Best for
CH1 Vector	Top-k semantically similar chunks from the vector store	Precise factual questions
CH2 Graph sub	1-hop and 2-hop entity subgraph around query entities	Relational / multi-hop reasoning
CH3 Community	Community summary chunks that cover entity clusters	Broad "tell me about X" summaries
CH4 Bridge	Chunks that bridge two otherwise disconnected subgraphs	Cross-domain synthesis

💡 RRF weighting: Reciprocal Rank Fusion merges the four channel result lists into a single ranked list using ranks alone, so no channel-specific score normalisation is required. Channel weights are configurable via LibraryConfig.retriever_weights.
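
A minimal sketch of weighted Reciprocal Rank Fusion follows. Each chunk scores the sum, over channels, of weight / (k + rank), with the conventional constant k = 60. The function name `weighted_rrf`, the channel names used as dictionary keys, and the shape of the weights mapping are assumptions for illustration; the actual format of LibraryConfig.retriever_weights may differ.

```python
from collections import defaultdict
from typing import Optional


def weighted_rrf(
    rankings: dict[str, list[str]],
    weights: Optional[dict[str, float]] = None,
    k: int = 60,
) -> list[tuple[str, float]]:
    """Fuse per-channel ranked ID lists into one list, best score first."""
    scores: dict[str, float] = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        weight = 1.0 if weights is None else weights.get(channel, 1.0)
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            # Only the rank matters, so raw similarity scores never mix.
            scores[chunk_id] += weight / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


rankings = {
    "vector": ["c1", "c2", "c3"],
    "graph": ["c2", "c4"],
    "community": ["c5"],
    "bridge": ["c2", "c1"],
}
fused = weighted_rrf(rankings)
fused_ids = [chunk_id for chunk_id, _ in fused]
# fused_ids == ["c2", "c1", "c5", "c4", "c3"]
```

With equal weights, `c2` wins because three channels rank it near the top. Passing `{"community": 3.0}` as weights lifts `c5` to first place, which shows how a weight biases the fusion toward one channel.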

Plugin Registry

PluginRegistry is a central dictionary that maps string keys to implementation classes. ComponentFactory reads LibraryConfig, looks up the requested key, and instantiates the class — applying a fallback chain when the preferred backend is unavailable.

python

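# PluginRegistry lives in cognity_ai/registry.py
from cognity_ai.registry import PluginRegistry

# All registries follow the same pattern: a string key mapped to a class.
# MyEmbedder, MyVectorStore, etc. stand for your own plugin classes.
PluginRegistry.register_embedder("my_embedder", MyEmbedder)
PluginRegistry.register_vector_store("my_store", MyVectorStore)
PluginRegistry.register_graph_store("my_graph", MyGraphStore)
PluginRegistry.register_retriever("my_retriever", MyRetriever)
PluginRegistry.register_generator("my_gen", MyGenerator)
PluginRegistry.register_chunker("my_chunker", MyChunker)
# Loaders additionally take the file extension they handle
PluginRegistry.register_loader("my_fmt", ".myfmt", MyLoader)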
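
For readers unfamiliar with the pattern, the sketch below shows how a key-to-class registry of this kind can work. `MiniRegistry`, `create_embedder`, and `LengthEmbedder` are hypothetical stand-ins written for this example; they do not reproduce PluginRegistry's internals.

```python
class MiniRegistry:
    """Illustrative stand-in for a plugin registry; not the library's code."""

    _embedders: dict[str, type] = {}

    @classmethod
    def register_embedder(cls, key: str, impl: type) -> None:
        """Map a string key to an implementation class."""
        cls._embedders[key] = impl

    @classmethod
    def create_embedder(cls, key: str, **kwargs):
        """Look up the class for a key and instantiate it."""
        try:
            impl = cls._embedders[key]
        except KeyError:
            known = ", ".join(sorted(cls._embedders)) or "none"
            raise KeyError(f"Unknown embedder {key!r}; registered: {known}") from None
        return impl(**kwargs)


class LengthEmbedder:
    """Toy embedder: maps text to a one-dimensional vector (its scaled length)."""

    def __init__(self, scale: float = 1.0) -> None:
        self.scale = scale

    def embed(self, text: str) -> list[float]:
        return [len(text) * self.scale]


MiniRegistry.register_embedder("toy", LengthEmbedder)
embedder = MiniRegistry.create_embedder("toy", scale=0.5)
vector = embedder.embed("hello")  # [2.5]
```

Because configuration refers to components only by string key, swapping an implementation means registering a different class under that key; calling code does not change.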

Fallback Chains

ComponentFactory tests availability at instantiation time. If the preferred backend raises an ImportError or a connection error, it steps to the next option in the fallback chain.

Graph store — when Neo4j is unreachable:

neo4j → memgraph → arangodb → networkx (in-memory)

Embedder — when Anthropic API key is missing:

anthropic → openai → sentence_transformers (local)

Vector store — when Pinecone is unavailable:

pinecone → qdrant → chroma (local)

ℹ️ Disabling fallback: Set LibraryConfig.strict_backends = True to raise an error immediately instead of falling back. This is useful in production to avoid silent degradation.
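
The fallback behaviour described in this section can be sketched as a loop over candidate backends. This is an illustration under stated assumptions, not ComponentFactory's implementation: `first_available`, the zero-argument factory callables, and the `strict` parameter (standing in for strict_backends) are all hypothetical.

```python
from typing import Callable


def first_available(
    chain: list[str],
    factories: dict[str, Callable[[], object]],
    strict: bool = False,
) -> tuple[str, object]:
    """Return (key, backend) for the first backend that instantiates."""
    errors: list[str] = []
    for key in chain:
        try:
            return key, factories[key]()
        except (ImportError, ConnectionError) as exc:
            if strict:
                raise  # strict mode: fail on the preferred backend
            errors.append(f"{key}: {exc}")
    raise RuntimeError("No backend available: " + "; ".join(errors))


def neo4j_offline():
    raise ConnectionError("neo4j is unreachable")


def driver_missing():
    raise ImportError("client library is not installed")


factories = {
    "neo4j": neo4j_offline,
    "memgraph": driver_missing,
    "arangodb": driver_missing,
    "networkx": dict,  # stand-in for an in-memory graph object
}
chain = ["neo4j", "memgraph", "arangodb", "networkx"]
key, backend = first_available(chain, factories)  # key == "networkx"
```

Calling `first_available(chain, factories, strict=True)` re-raises the ConnectionError from the first entry, which mirrors the "fail fast" behaviour that strict mode is meant to provide.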

Multimodal Path

cognity-ai handles non-text content embedded inside documents. When a DOCX, PPTX, or PDF file contains embedded images, those images are extracted as raw bytes and routed through the OCR subsystem before the text pipeline continues.

DOCX / PPTX / PDF (contains embedded images)
      ↓
Format Loader — extracts image bytes at each page/slide position
      ↓
OCR Subsystem
  ┌────────────────────────────────────┐
  │ 1. gemini_vision (multimodal LLM)  │
  │ 2. openai_vision (GPT-4o)          │
  │ 3. anthropic_vision (Claude)       │
  │ 4. aws_textract                    │
  │ 5. azure_vision                    │
  │ 6. tesseract (local fallback)      │
  └────────────────────────────────────┘
      ↓
OCR text — injected into the chunk at the image's original position
      ↓
Rejoins the normal ingestion pipeline (NLP → Embed → Store)
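
One way to picture the injection step is as a splice into the extracted text at each image's character offset. The sketch below assumes that loaders report image positions as character offsets and that OCR output is wrapped in an `[image: ...]` marker; both are illustrative assumptions, as is the function name `inject_ocr_text`.

```python
def inject_ocr_text(text: str, images: list[tuple[int, str]]) -> str:
    """Insert OCR text at each image's character offset in the extracted text."""
    # Work from the last offset to the first so earlier offsets stay valid.
    for offset, ocr_text in sorted(images, key=lambda item: item[0], reverse=True):
        text = f"{text[:offset]}[image: {ocr_text}] {text[offset:]}"
    return text


page_text = "Revenue grew. See chart. Costs fell."
merged = inject_ocr_text(page_text, [(14, "Q3 revenue +12%")])
# merged == "Revenue grew. [image: Q3 revenue +12%] See chart. Costs fell."
```

Keeping the OCR text next to the sentences that surround the image means the chunker sees it in context instead of as a detached appendix.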

Audio Transcription

Audio files and video files with audio tracks are routed through the transcription subsystem before ingestion. Three providers are supported, tried in order of configuration priority:

AWS Transcribe · Google STT · OpenAI Whisper

Transcribed text is treated as a first-class document and passes through the same chunking, extraction, and embedding stages as any text file.

💡 Experimental flag: Multimodal RAG (image embedding, video frame extraction, cross-modal retrieval) is enabled by setting LibraryConfig.experimental_multimodal = True. API stability is not guaranteed across minor versions.

Package Structure

The top-level layout of the cognity_ai/ package is shown below. Each sub-package contains an __init__.py that re-exports the public interface and a base.py that defines the abstract base class for that component type.

cognity_ai/
├── library.py        # RAGLibrary facade — the only class users touch directly
├── registry.py       # PluginRegistry — global map of string keys to classes
├── factory.py        # ComponentFactory — wires LibraryConfig into live objects
├── models/           # Pydantic dataclasses: Document, Entity, Chunk, RetrievalResult
├── config/           # LibraryConfig + typed provider config dataclasses
├── loaders/          # File format loaders (PDF, DOCX, XLSX, PPTX, HTML, CSV, JSON...)
├── ocr/              # OCR providers (Gemini Vision, GPT-4o, Claude, Tesseract...)
├── chunkers/         # Chunking strategies (sentence, fixed, semantic, hybrid...)
├── page_index/       # Page/slide/sheet number extraction strategies
├── extractors/       # NLP + LLM knowledge extraction (entities, relations, triples)
├── embedders/        # Embedding providers (Gemini, OpenAI, Cohere, ST, Ollama...)
├── generators/       # LLM answer generators (Gemini, OpenAI, Anthropic, Bedrock...)
├── stores/vector/    # Vector store backends (Chroma, Qdrant, Pinecone, FAISS...)
├── stores/graph/     # Graph store backends (Neo4j, NetworkX, Memgraph, ArangoDB...)
├── retrievers/       # RAG methodologies (hybrid-graph, naive, adaptive, dense...)
├── pipeline/         # Ingestion orchestration + incremental knowledge updater
├── multimodal/       # Experimental: image / video / audio RAG
└── utils/            # RRF, SHA-256 hashing, token counting, async helpers

Dependency Groups

cognity-ai uses pyproject.toml optional dependency groups so you only install what you need:

cognity-ai[pdf] · cognity-ai[office] · cognity-ai[nlp] · cognity-ai[openai] · cognity-ai[anthropic] · cognity-ai[gemini] · cognity-ai[neo4j] · cognity-ai[qdrant] · cognity-ai[audio] · cognity-ai[ocr] · cognity-ai[all]

Next Steps

Now that you understand the architecture, here's where to go next.

🚀 Get Started: Install cognity-ai and run your first ingestion and query pipeline in minutes.

📖 API Reference: Full documentation for every class, method, and configuration parameter.