API Reference

Generators

LLM generation providers — take a question and retrieved context, and produce a grounded natural-language answer.

BaseGenerator ABC

All LLM providers implement the BaseGenerator abstract base class. The core contract is a single generate() call that combines a user question with retrieved context passages.

python

from abc import ABC, abstractmethod

class BaseGenerator(ABC):
    @abstractmethod
    def generate(self, question: str, context: str) -> str: ...

    def generate_with_system(
        self, question: str, context: str, system_prompt: str
    ) -> str: ...

Method	Signature	Description
`generate`	`(question: str, context: str) -> str`	Core RAG generation call. `context` contains retrieved passages; the model answers `question` grounded in that context.
`generate_with_system`	`(question: str, context: str, system_prompt: str) -> str`	Same as `generate`, but applies a custom system prompt. Useful for domain-specific personas or format constraints. The base class supplies a default implementation; override it for providers that handle system prompts differently.
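
The contract above can be sketched with a toy provider. This is a minimal illustration, not the library's code: `BaseGenerator` is redeclared locally so the snippet runs on its own, `TemplateGenerator` is a hypothetical subclass, and the default body of `generate_with_system` (folding the system prompt into the context) is an assumption about how such a default might behave.

```python
from abc import ABC, abstractmethod


class BaseGenerator(ABC):
    """Local stand-in mirroring the interface documented above."""

    @abstractmethod
    def generate(self, question: str, context: str) -> str: ...

    def generate_with_system(
        self, question: str, context: str, system_prompt: str
    ) -> str:
        # Assumed default: place the system prompt ahead of the context.
        return self.generate(question, f"{system_prompt}\n\n{context}")


class TemplateGenerator(BaseGenerator):
    """Hypothetical provider that formats a prompt instead of calling an LLM."""

    def generate(self, question: str, context: str) -> str:
        return f"Context:\n{context}\n\nQuestion: {question}"


gen = TemplateGenerator()
plain = gen.generate("What is RAG?", "RAG combines retrieval with generation.")
styled = gen.generate_with_system(
    "What is RAG?",
    "RAG combines retrieval with generation.",
    "Answer in one sentence.",
)
```

A subclass only has to implement `generate`; it inherits `generate_with_system` unless the provider needs to pass the system prompt through a dedicated channel.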

Providers

Choose a generation provider by passing its key to the llm= argument of RAGLibrary(). The default provider is gemini.

Key	Class	Install extra	Default model	Notes
`gemini` (default)	`GeminiGenerator`	`cognity-ai[gemini]`	`gemini-2.0-flash`	Fast, cheap, multimodal
`vertex_ai`	`VertexAIGenerator`	`cognity-ai[vertex-ai]`	`gemini-1.5-pro`	Google Cloud enterprise
`openai`	`OpenAIGenerator`	`cognity-ai[openai]`	`gpt-4o`	High quality; also `gpt-4o-mini`
`azure_openai`	`AzureOpenAIGenerator`	`cognity-ai[azure]`	Azure GPT-4o	Azure-hosted GPT models
`anthropic`	`AnthropicGenerator`	`cognity-ai[anthropic]`	`claude-3-5-sonnet-20241022`	Strong reasoning; also `claude-3-7-sonnet`
`bedrock`	`BedrockGenerator`	`cognity-ai[bedrock]`	`anthropic.claude-3-5-sonnet-20241022-v2:0`	AWS-native; also Titan, Llama, Mistral
`cohere`	`CohereGenerator`	`cognity-ai[cohere]`	`command-r-plus`	Optimized for retrieval-augmented generation
`ollama`	`OllamaGenerator`	`cognity-ai[ollama]`	`llama3`	Local; any Ollama-supported model
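
Because the provider key and the install extra do not always match (for example `vertex_ai` vs. `vertex-ai`, `azure_openai` vs. `azure`), a small lookup can help avoid typos. The sketch below copies the mapping from the table; `INSTALL_EXTRAS` and `install_command` are hypothetical helper names, not part of the library.

```python
# Provider key -> install extra, copied from the provider table above.
INSTALL_EXTRAS = {
    "gemini": "cognity-ai[gemini]",
    "vertex_ai": "cognity-ai[vertex-ai]",
    "openai": "cognity-ai[openai]",
    "azure_openai": "cognity-ai[azure]",
    "anthropic": "cognity-ai[anthropic]",
    "bedrock": "cognity-ai[bedrock]",
    "cohere": "cognity-ai[cohere]",
    "ollama": "cognity-ai[ollama]",
}


def install_command(provider_key: str) -> str:
    """Return a pip command for the extra that backs `provider_key`."""
    try:
        extra = INSTALL_EXTRAS[provider_key]
    except KeyError:
        known = ", ".join(sorted(INSTALL_EXTRAS))
        raise ValueError(
            f"unknown provider {provider_key!r}; expected one of: {known}"
        ) from None
    # Quote the requirement so shells do not expand the square brackets.
    return f'pip install "{extra}"'
```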

Per-Provider Examples

Quick setup snippets for common providers via the RAGLibrary constructor.

python

from cognity_ai import RAGLibrary

# Anthropic (Claude)
rag = RAGLibrary(llm="anthropic", anthropic_api_key="sk-ant-...")

# Azure OpenAI
rag = RAGLibrary(
    llm="azure_openai",
    azure_openai_endpoint="https://myinstance.openai.azure.com/",
    azure_openai_key="...",
)

# Bedrock (AWS)
rag = RAGLibrary(llm="bedrock", aws_region="us-east-1")

# Per-query LLM override (retriever stays same, only generation changes)
answer = rag.query("Explain this", method="naive")

ℹ️

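AWS credential chain: The bedrock provider relies on the standard boto3 credential resolution. Among the sources relevant here, boto3 checks environment variables (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY) first, then the shared credentials file (~/.aws/credentials), and falls back to the IAM role attached to the instance or container last. No explicit API key is required when running inside AWS with a role attached.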
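
When a Bedrock call fails with a credentials error, it can help to see which of the sources named above are actually present. The following is a simplified, stdlib-only sketch: `available_credential_sources` is a hypothetical helper, it does not call boto3, and it cannot detect an IAM role (that requires querying the instance or container metadata service).

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def available_credential_sources(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> list[str]:
    """List the locally visible credential sources (IAM roles are not checked)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    sources = []
    # Both variables must be set for static environment credentials.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment")
    # Shared credentials file written by `aws configure`.
    if (home / ".aws" / "credentials").is_file():
        sources.append("shared-credentials-file")
    return sources
```

An empty list on an EC2 instance or ECS task usually just means the role is the only source in play.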

Custom Model Selection

For fine-grained control over model version, token limits, and temperature, pass a typed config object instead of a key string.

python

from cognity_ai import RAGLibrary
from cognity_ai.config import LibraryConfig, AnthropicConfig

config = LibraryConfig(
    llm="anthropic",
    anthropic=AnthropicConfig(
        api_key="sk-ant-...",
        model="claude-3-7-sonnet-20250219",
        max_tokens=8192,
        temperature=0.0,
    ),
)
rag = RAGLibrary(config=config)

Each provider has a corresponding config dataclass (GeminiConfig, OpenAIConfig, BedrockConfig, etc.) with provider-specific fields. Refer to the cognity_ai.config module for the full set of available fields per provider.
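
To give a rough idea of what such a config object carries, here is a stand-in dataclass limited to the four fields used in the example above. `AnthropicConfigSketch` and `config_from_env` are hypothetical names, the defaults and validation rules are assumptions, and the real `AnthropicConfig` may differ. Reading the key from the `ANTHROPIC_API_KEY` environment variable (the name used by Anthropic's own SDK) avoids hard-coding secrets.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AnthropicConfigSketch:
    """Stand-in, NOT the library class; fields taken from the example above."""

    api_key: str
    model: str = "claude-3-5-sonnet-20241022"  # default model from the provider table
    max_tokens: int = 8192                     # assumed default for illustration
    temperature: float = 0.0

    def __post_init__(self) -> None:
        if not self.api_key:
            raise ValueError("api_key must not be empty")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        # Anthropic's API accepts temperatures from 0.0 to 1.0.
        if not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0.0 and 1.0")


def config_from_env(
    env: Optional[Mapping[str, str]] = None,
) -> AnthropicConfigSketch:
    """Build the stand-in config from ANTHROPIC_API_KEY instead of a literal."""
    env = os.environ if env is None else env
    return AnthropicConfigSketch(api_key=env.get("ANTHROPIC_API_KEY", ""))
```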

💡

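Temperature 0.0 for RAG: Setting temperature=0.0 is strongly recommended for production RAG pipelines. It minimizes sampling variation, so responses are far more repeatable and easier to evaluate and debug. Most providers do not guarantee bit-identical output even at temperature 0.0, so treat results as near-deterministic.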
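
The effect of temperature can be illustrated with a temperature-scaled softmax over toy logits. This is a generic sketch of the sampling math, not any provider's implementation; `softmax_with_temperature` and the logit values are made up for illustration. As the temperature approaches zero, nearly all probability mass moves to the highest-scoring token, which is why providers generally treat 0.0 as greedy decoding.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return token probabilities for `logits` scaled by `temperature` (> 0)."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; 0.0 is handled as argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # toy scores for three candidate tokens
warm = softmax_with_temperature(logits, 1.0)   # noticeably spread out
cold = softmax_with_temperature(logits, 0.05)  # almost all mass on token 0
```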