Changelog

A full history of releases, new features, and breaking changes across all cognity-ai versions.

v2.0.0 2026-03-30 ✓ Latest
Core
  • Complete rewrite as a modular, plugin-based library distributed as the cognity-ai package. Every component layer is swappable via string keys in PluginRegistry.
  • pyproject.toml optional dependency groups — install only what you need: cognity-ai[pdf], cognity-ai[openai], cognity-ai[neo4j], cognity-ai[audio], cognity-ai[all], and more.
LLM Providers
  • 8 LLM generator + embedder providers: Gemini, Vertex AI, OpenAI, Azure OpenAI, Anthropic Claude, AWS Bedrock, Cohere, Ollama.
Vector Stores
  • 8 vector store backends: ChromaDB (default, local), Qdrant, Pinecone, FAISS, Weaviate, Milvus, pgvector, Azure AI Search.
Graph Stores
  • 5 graph store backends: Neo4j, Microsoft GraphRAG, Memgraph, ArangoDB, NetworkX (in-memory fallback).
Ingestion
  • 13 file format loaders: PDF, DOCX, XLSX, PPTX, HTML, CSV, JSON, YAML, plain text, PNG, JPG, WEBP, audio (MP3/WAV/M4A).
  • 6 chunking strategies: sentence, fixed-token, semantic, recursive, parent-child, and hybrid (default).
  • 3 page index strategies: structural (native page numbers), regex fallback, and hybrid combining both.
  • SHA-256 hash store — unchanged documents are detected and skipped on re-ingest with zero API calls.
Retrieval
  • 8 RAG retrieval methodologies: HybridGraph (4-channel), Naive (vector-only), Dense, Sparse, Ensemble, Parent-child, Multi-query, and Adaptive (query-classifier-based routing).
Multimodal
  • 6 OCR providers with multimodal LLM vision: Gemini Vision, GPT-4o Vision, Anthropic Claude Vision, AWS Textract, Azure Computer Vision, Tesseract (local fallback).
  • Experimental multimodal RAG for image, video, and audio — enabled via LibraryConfig.experimental_multimodal = True.
  • Audio transcription providers: AWS Transcribe, Google Speech-to-Text, OpenAI Whisper.

v1.0.0 2025-12-01 Legacy
Initial Release
  • Initial hybrid_rag implementation — a single-file proof-of-concept with a hardcoded pipeline.
  • Fixed provider stack: Gemini (embedder + generator), Neo4j (graph store), ChromaDB (vector store), spaCy (NLP extraction). No plugin system or swappable backends.
  • Text-only ingestion — PDF and plain text files only. No image, audio, or Office format support.
  • 4-channel hybrid retrieval with RRF — vector, graph subgraph, community, and bridge channels, fused with Reciprocal Rank Fusion.

Next Steps

Ready to upgrade or get started with v2.0.0?