Generators
LLM generation providers take a question and retrieved context and produce a grounded natural-language answer.
BaseGenerator ABC
All LLM providers implement the BaseGenerator abstract base class. The core contract is a single generate() call that combines a user question with retrieved context passages.
from abc import ABC, abstractmethod

class BaseGenerator(ABC):
    @abstractmethod
    def generate(self, question: str, context: str) -> str: ...

    def generate_with_system(
        self, question: str, context: str, system_prompt: str
    ) -> str: ...
| Method | Signature | Description |
|---|---|---|
| generate | (question: str, context: str) -> str | Core RAG generation call. context contains retrieved passages; the model answers question grounded in that context. |
| generate_with_system | (question: str, context: str, system_prompt: str) -> str | Same as generate but prepends a custom system prompt. Useful for domain-specific personas or format constraints. Provided as a default mixin; override for providers that handle system prompts differently. |
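The contract above can be sketched as a tiny custom provider. This is a minimal illustration, not the library's code: BaseGenerator is redefined locally so the snippet runs on its own, EchoGenerator is a hypothetical class, and the default body of generate_with_system (folding the system prompt into the context) is an assumption about how such a default might behave.

```python
from abc import ABC, abstractmethod


class BaseGenerator(ABC):
    """Local stand-in for the library's ABC so this sketch is self-contained."""

    @abstractmethod
    def generate(self, question: str, context: str) -> str:
        """Answer `question` grounded in `context`."""

    def generate_with_system(
        self, question: str, context: str, system_prompt: str
    ) -> str:
        # Assumed default: prepend the system prompt to the context and
        # delegate to generate(). Real providers may pass it separately.
        return self.generate(question, f"{system_prompt}\n\n{context}")


class EchoGenerator(BaseGenerator):
    """Hypothetical provider that calls no model; handy in unit tests."""

    def generate(self, question: str, context: str) -> str:
        return f"Q: {question}\nContext: {context}"


gen = EchoGenerator()
answer = gen.generate_with_system(
    "What is RAG?", "RAG combines retrieval with generation.", "Be concise."
)
```

Because only generate is abstract, a subclass gets generate_with_system for free and overrides it only when the provider has a dedicated system-prompt channel.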
Providers
Choose a generation provider by passing its key to the llm= argument of RAGLibrary(). The default provider is gemini.
| Key | Class | Install extra | Default model | Notes |
|---|---|---|---|---|
| gemini (default) | GeminiGenerator | cognity-ai[gemini] | gemini-2.0-flash | Fast, cheap, multimodal |
| vertex_ai | VertexAIGenerator | cognity-ai[vertex-ai] | gemini-1.5-pro | Google Cloud enterprise |
| openai | OpenAIGenerator | cognity-ai[openai] | gpt-4o | High quality; also gpt-4o-mini |
| azure_openai | AzureOpenAIGenerator | cognity-ai[azure] | Azure GPT-4o | Azure-hosted GPT models |
| anthropic | AnthropicGenerator | cognity-ai[anthropic] | claude-3-5-sonnet-20241022 | Strong reasoning; also claude-3-7-sonnet |
| bedrock | BedrockGenerator | cognity-ai[bedrock] | anthropic.claude-3-5-sonnet-20241022-v2:0 | AWS-native; also Titan, Llama, Mistral |
| cohere | CohereGenerator | cognity-ai[cohere] | command-r-plus | Optimized for retrieval-augmented generation |
| ollama | OllamaGenerator | cognity-ai[ollama] | llama3 | Local; any Ollama-supported model |
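The provider key and the install extra are not always the same string (vertex_ai installs vertex-ai, azure_openai installs azure). A small helper like the following, which is a hypothetical convenience and not part of the library, can turn a key from the table into the matching pip command. The mapping is copied from the table above; the function name is an assumption.

```python
# Install extras copied from the provider table above.
PROVIDER_EXTRAS = {
    "gemini": "gemini",
    "vertex_ai": "vertex-ai",
    "openai": "openai",
    "azure_openai": "azure",
    "anthropic": "anthropic",
    "bedrock": "bedrock",
    "cohere": "cohere",
    "ollama": "ollama",
}


def install_command(provider_key: str) -> str:
    """Return the pip command that installs the extra for a provider key."""
    try:
        extra = PROVIDER_EXTRAS[provider_key]
    except KeyError:
        known = ", ".join(sorted(PROVIDER_EXTRAS))
        raise ValueError(
            f"Unknown provider {provider_key!r}; expected one of: {known}"
        ) from None
    # Quote the requirement so shells such as zsh do not expand the brackets.
    return f'pip install "cognity-ai[{extra}]"'


command = install_command("azure_openai")
```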
Per-Provider Examples
Quick setup snippets for common providers via the RAGLibrary constructor.
from cognity_ai import RAGLibrary
# Anthropic (Claude)
rag = RAGLibrary(llm="anthropic", anthropic_api_key="sk-ant-...")
# Azure OpenAI
rag = RAGLibrary(
    llm="azure_openai",
    azure_openai_endpoint="https://myinstance.openai.azure.com/",
    azure_openai_key="...",
)
# Bedrock (AWS)
rag = RAGLibrary(llm="bedrock", aws_region="us-east-1")
# Per-query LLM override (the retriever stays the same; only generation changes)
answer = rag.query("Explain this", method="naive")
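The bedrock provider uses the standard boto3 credential resolution order: explicitly passed credentials first, then environment variables (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY), then the shared files (~/.aws/credentials and ~/.aws/config), and finally container or IAM instance role credentials. No explicit API key is required when running inside AWS with a role attached.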
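When a Bedrock call fails with a credentials error, it helps to know which source is likely to be picked up. The sketch below is a simplified, stdlib-only approximation of the main steps in boto3's lookup, not boto3's actual logic: it ignores profiles, SSO, and assume-role configuration. The function name is hypothetical; AWS_SHARED_CREDENTIALS_FILE is the standard variable for relocating the credentials file.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def likely_credential_source(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> str:
    """Guess which credential source would probably be used first (simplified)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    # 1. Static keys in the environment take precedence over files and roles.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"

    # 2. Shared credentials file (its path can be overridden via the environment).
    cred_file = Path(
        env.get("AWS_SHARED_CREDENTIALS_FILE", home / ".aws" / "credentials")
    )
    if cred_file.is_file():
        return f"shared credentials file ({cred_file})"

    # 3. Otherwise only container or instance role credentials remain.
    return "IAM role (container or instance metadata), if available"


print(likely_credential_source())
```

Passing env and home explicitly makes the helper easy to exercise in tests without touching the real environment.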
Custom Model Selection
For fine-grained control over model version, token limits, and temperature, pass a typed config object instead of a key string.
from cognity_ai import RAGLibrary
from cognity_ai.config import LibraryConfig, AnthropicConfig

config = LibraryConfig(
    llm="anthropic",
    anthropic=AnthropicConfig(
        api_key="sk-ant-...",
        model="claude-3-7-sonnet-20250219",
        max_tokens=8192,
        temperature=0.0,
    ),
)
rag = RAGLibrary(config=config)
Each provider has a corresponding config dataclass (GeminiConfig, OpenAIConfig, BedrockConfig, etc.) with provider-specific fields. Refer to the cognity_ai.config module for the full set of available fields per provider.
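temperature=0.0 is strongly recommended for production RAG pipelines. It minimizes sampling randomness, so answers are far more repeatable and easier to evaluate and debug. Hosted models can still vary slightly between calls, so treat the output as near-deterministic rather than guaranteed identical.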
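One way to check how repeatable a configuration really is: call the generator several times with the same inputs and count the distinct answers. The helper below is a hypothetical sketch that accepts any callable with the generate(question, context) signature; stub_generate stands in for a real provider so the snippet runs without network access.

```python
from typing import Callable, List


def distinct_answers(
    generate: Callable[[str, str], str],
    question: str,
    context: str,
    runs: int = 3,
) -> List[str]:
    """Call `generate` `runs` times and return the distinct answers, in first-seen order."""
    seen: List[str] = []
    for _ in range(runs):
        # Strip surrounding whitespace so trivial padding differences are ignored.
        answer = generate(question, context).strip()
        if answer not in seen:
            seen.append(answer)
    return seen


def stub_generate(question: str, context: str) -> str:
    """Stand-in for a real provider; always returns the same text."""
    return f"{question} -> {context}"


variants = distinct_answers(stub_generate, "What is RAG?", "retrieved passage")
```

A result with more than one entry signals that the pipeline still has variation worth accounting for in evaluation.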