Generators

LLM generation providers take a question and retrieved context and produce a grounded natural-language answer.

BaseGenerator ABC

All LLM providers implement the BaseGenerator abstract base class. The core contract is a single generate() call that combines a user question with retrieved context passages.

python
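from abc import ABC, abstractmethod


class BaseGenerator(ABC):
    """Interface that every LLM provider implements."""

    @abstractmethod
    def generate(self, question: str, context: str) -> str:
        """Answer `question` grounded in the retrieved `context`."""
        ...

    def generate_with_system(
        self, question: str, context: str, system_prompt: str
    ) -> str:
        """Like generate(), but with a custom system prompt."""
        ...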

| Method | Signature | Description |
| --- | --- | --- |
| generate | (question: str, context: str) -> str | Core RAG generation call. context contains the retrieved passages; the model answers question grounded in that context. |
| generate_with_system | (question: str, context: str, system_prompt: str) -> str | Same as generate, but prepends a custom system prompt. Useful for domain-specific personas or format constraints. Provided as a default mixin; override it for providers that handle system prompts differently. |
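
To make the contract concrete, here is a minimal sketch of a custom provider. The BaseGenerator below is a local stand-in that mirrors the documented signatures so the snippet runs on its own. The default body of generate_with_system (folding the system prompt into the context) and the TemplateGenerator class are assumptions for illustration, not the library's actual implementation.

```python
from abc import ABC, abstractmethod


class BaseGenerator(ABC):
    """Local stand-in mirroring the documented interface."""

    @abstractmethod
    def generate(self, question: str, context: str) -> str:
        """Answer `question` using only the retrieved `context`."""

    def generate_with_system(
        self, question: str, context: str, system_prompt: str
    ) -> str:
        # Assumed default: put the system prompt in front of the context and
        # delegate to generate(). A real provider may use a native system role.
        return self.generate(question, f"{system_prompt}\n\n{context}")


class TemplateGenerator(BaseGenerator):
    """Hypothetical provider that returns the prompt it would send to a model."""

    def generate(self, question: str, context: str) -> str:
        return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    gen = TemplateGenerator()
    print(
        gen.generate_with_system(
            question="What is the refund window?",
            context="Refunds are accepted within 30 days.",
            system_prompt="Answer in one sentence.",
        )
    )
```

Because only generate() is abstract, a subclass that implements it inherits generate_with_system for free and can override it later if the provider has a dedicated system-prompt field.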

Providers

Choose a generation provider by passing its key to the llm= argument of RAGLibrary(). The default provider is gemini.

| Key | Class | Install extra | Default model | Notes |
| --- | --- | --- | --- | --- |
| gemini (default) | GeminiGenerator | cognity-ai[gemini] | gemini-2.0-flash | Fast, cheap, multimodal |
| vertex_ai | VertexAIGenerator | cognity-ai[vertex-ai] | gemini-1.5-pro | Google Cloud enterprise |
| openai | OpenAIGenerator | cognity-ai[openai] | gpt-4o | High quality; also gpt-4o-mini |
| azure_openai | AzureOpenAIGenerator | cognity-ai[azure] | Azure GPT-4o | Azure-hosted GPT models |
| anthropic | AnthropicGenerator | cognity-ai[anthropic] | claude-3-5-sonnet-20241022 | Strong reasoning; also claude-3-7-sonnet |
| bedrock | BedrockGenerator | cognity-ai[bedrock] | anthropic.claude-3-5-sonnet-20241022-v2:0 | AWS-native; also Titan, Llama, Mistral |
| cohere | CohereGenerator | cognity-ai[cohere] | command-r-plus | Optimized for retrieval-augmented generation |
| ollama | OllamaGenerator | cognity-ai[ollama] | llama3 | Local; any Ollama-supported model |
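
Note that the provider key and the install extra are not always the same string (vertex_ai uses vertex-ai, azure_openai uses azure). The sketch below copies the mapping from the table and builds the matching pip command using standard pip extras syntax. The install_command helper is hypothetical and not part of the library.

```python
# Provider key -> pip extra, copied from the providers table.
PROVIDER_EXTRAS = {
    "gemini": "gemini",
    "vertex_ai": "vertex-ai",
    "openai": "openai",
    "azure_openai": "azure",
    "anthropic": "anthropic",
    "bedrock": "bedrock",
    "cohere": "cohere",
    "ollama": "ollama",
}


def install_command(provider_key: str) -> str:
    """Return the pip command for a provider key (hypothetical helper)."""
    if provider_key not in PROVIDER_EXTRAS:
        known = ", ".join(sorted(PROVIDER_EXTRAS))
        raise ValueError(
            f"Unknown provider {provider_key!r}; expected one of: {known}"
        )
    # Quote the requirement so shells do not expand the square brackets.
    return f'pip install "cognity-ai[{PROVIDER_EXTRAS[provider_key]}]"'


if __name__ == "__main__":
    print(install_command("azure_openai"))
```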

Per-Provider Examples

Quick setup snippets for common providers via the RAGLibrary constructor.

python
from cognity_ai import RAGLibrary

# Anthropic (Claude)
rag = RAGLibrary(llm="anthropic", anthropic_api_key="sk-ant-...")

# Azure OpenAI
rag = RAGLibrary(
    llm="azure_openai",
    azure_openai_endpoint="https://myinstance.openai.azure.com/",
    azure_openai_key="...",
)

# Bedrock (AWS)
rag = RAGLibrary(llm="bedrock", aws_region="us-east-1")

# Per-query LLM override (retriever stays same, only generation changes)
answer = rag.query("Explain this", method="naive")
ℹ️ AWS credential chain: The bedrock provider uses the standard boto3 credential resolution order: environment variables (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY) first, then ~/.aws/credentials, and finally the IAM role attached to the container or instance. No explicit API key is required when running inside AWS with a role attached.
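
When debugging a Bedrock setup it can help to know which source is likely to win. The standard-library sketch below is a deliberately simplified model of the boto3 lookup: it checks environment variables, then the shared credentials file, and otherwise assumes an attached IAM role. It skips the other steps boto3 performs (profiles, SSO, assume-role configuration), and likely_credential_source is a hypothetical helper, not a boto3 or library function.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def likely_credential_source(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> str:
    """Guess which credential source a default boto3 session would pick."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    # 1. Environment variables are consulted first.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"

    # 2. Then the shared credentials file.
    if (home / ".aws" / "credentials").is_file():
        return "shared credentials file"

    # 3. Otherwise only a container or instance role can supply credentials.
    return "IAM role (container or instance metadata), if one is attached"


if __name__ == "__main__":
    print(likely_credential_source())
```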

Custom Model Selection

For fine-grained control over model version, token limits, and temperature, pass a typed config object instead of a key string.

python
from cognity_ai import RAGLibrary
from cognity_ai.config import LibraryConfig, AnthropicConfig

config = LibraryConfig(
    llm="anthropic",
    anthropic=AnthropicConfig(
        api_key="sk-ant-...",
        model="claude-3-7-sonnet-20250219",
        max_tokens=8192,
        temperature=0.0,
    ),
)
rag = RAGLibrary(config=config)

Each provider has a corresponding config dataclass (GeminiConfig, OpenAIConfig, BedrockConfig, etc.) with provider-specific fields. Refer to the cognity_ai.config module for the full set of available fields per provider.
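
Hardcoding an API key in a config object is rarely desirable outside of examples. The sketch below shows one way to build the config from an environment variable and validate it early. The AnthropicConfig here is a local stand-in containing only the four fields used in the example above. Its default values, the validation rules, the ANTHROPIC_API_KEY variable name, and the anthropic_config_from_env helper are all assumptions for illustration; the real dataclass in cognity_ai.config may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AnthropicConfig:
    """Stand-in dataclass; defaults here are assumed, not the library's."""

    api_key: str
    model: str = "claude-3-5-sonnet-20241022"
    max_tokens: int = 1024
    temperature: float = 0.0

    def __post_init__(self) -> None:
        # Fail at construction time rather than on the first API call.
        if not self.api_key:
            raise ValueError("api_key must not be empty")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        if not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0.0 and 1.0")


def anthropic_config_from_env(var: str = "ANTHROPIC_API_KEY") -> AnthropicConfig:
    """Build a config whose key comes from the environment (hypothetical)."""
    return AnthropicConfig(
        api_key=os.environ.get(var, ""),
        model="claude-3-7-sonnet-20250219",
        max_tokens=8192,
        temperature=0.0,
    )
```

With the real library, the resulting object would be passed as the anthropic= field of LibraryConfig, as in the example above.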

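💡 Temperature 0.0 for RAG: Setting temperature=0.0 is strongly recommended for production RAG pipelines. It makes decoding effectively greedy, which minimizes run-to-run variation and makes answers easier to evaluate and debug. Hosted models can still vary slightly between calls, so treat outputs as near-deterministic rather than guaranteed identical.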
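
One way to check how repeatable a configuration actually is: call the generator several times with the same inputs and count the distinct answers. The sketch below does this for any callable with the generate(question, context) shape. The helper names are hypothetical, and the two demo functions are fakes standing in for a real provider.

```python
from collections import Counter
from itertools import cycle
from typing import Callable

GenerateFn = Callable[[str, str], str]


def answer_distribution(
    generate: GenerateFn, question: str, context: str, runs: int = 5
) -> Counter:
    """Call `generate` `runs` times and count each distinct answer."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    return Counter(generate(question, context) for _ in range(runs))


def is_repeatable(
    generate: GenerateFn, question: str, context: str, runs: int = 5
) -> bool:
    """True when every run produced exactly the same answer."""
    return len(answer_distribution(generate, question, context, runs)) == 1


if __name__ == "__main__":
    # Fake generators: one stable, one that alternates between phrasings.
    def stable(question: str, context: str) -> str:
        return f"Based on the context: {context}"

    variants = cycle(["30 days.", "Thirty days."])

    def unstable(question: str, context: str) -> str:
        return next(variants)

    print(is_repeatable(stable, "Refund window?", "Refunds within 30 days."))
    print(is_repeatable(unstable, "Refund window?", "Refunds within 30 days."))
```

Running the same check against a real provider at temperature=0.0 and at a higher temperature gives a quick, empirical sense of how much variation the setting removes.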