Retrievers¶
Sikka Agent's Retrievers module enables semantic search and intelligent information retrieval, allowing agents to find relevant information based on meaning rather than just keywords.
Overview¶
Effective retrieval is critical for grounding AI responses in accurate, relevant information. The Retrievers module provides:
- Semantic search using vector embeddings
- Contextual information retrieval from various sources
- Hybrid search combining semantic and keyword approaches
- Reranking capabilities for improved relevance
- Multiple embedding model options (OpenAI, Sentence Transformers, Ollama, etc.)
These capabilities support Retrieval-Augmented Generation (RAG) patterns, in which retrieved passages are supplied to the agent so that its responses are grounded in specific information.
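As a rough illustration of the hybrid idea listed above, the sketch below blends a cosine-similarity score over embeddings with a simple keyword-overlap score. It does not use Sikka Agent's API: the function names, the `alpha` weight, and the toy two-dimensional vectors are all assumptions made for the example, and a real system would obtain the vectors from an embedder.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of distinct query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def hybrid_rank(query, query_vec, docs, alpha=0.7):
    """Rank (text, vector) pairs by a weighted mix of both scores.

    alpha weights the semantic score; (1 - alpha) weights the keyword score.
    """
    scored = []
    for text, vec in docs:
        semantic = cosine_similarity(query_vec, vec)
        lexical = keyword_overlap(query, text)
        scored.append((text, alpha * semantic + (1 - alpha) * lexical))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Toy vectors stand in for real embeddings.
docs = [
    ("Cooking pasta at home", [0.0, 1.0]),
    ("Neural networks learn weights", [0.9, 0.1]),
]
ranked = hybrid_rank("neural networks", [1.0, 0.0], docs)
print(ranked[0][0])
```

Setting `alpha` to 1.0 reduces this to pure semantic search, and 0.0 to pure keyword matching.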
Embedding Models¶
Sikka Agent provides built-in support for various embedding models through the embedder module:
Base Embedder¶
The foundational class for all embedding models in Sikka Agent.
from sikkaagent.retrievers.embedder.base import Embedder

class CustomEmbedder(Embedder):
    dimensions: int = 768  # Set appropriate dimensions for your model

    def get_embedding(self, text: str) -> list[float]:
        # Implement embedding logic
        pass

    def get_embedding_and_usage(self, text: str) -> tuple[list[float], dict | None]:
        # Implement embedding with usage tracking
        pass
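To make the two methods concrete, here is a minimal sketch of a toy embedder that hashes each word into a fixed-size vector. It deliberately does not import sikkaagent so that it runs on its own; in practice the same two methods would live on a subclass of Embedder as in the skeleton above. The class name, the hashing scheme, and the "tokens" usage key are hypothetical, and the resulting vectors carry no real semantic meaning.

```python
from __future__ import annotations

import hashlib
import math


class HashingEmbedder:
    """Toy stand-in mirroring the get_embedding / get_embedding_and_usage pair."""

    def __init__(self, dimensions: int = 16) -> None:
        self.dimensions = dimensions

    def get_embedding(self, text: str) -> list[float]:
        # Count each word in a bucket chosen by a stable hash of the word.
        vector = [0.0] * self.dimensions
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % self.dimensions
            vector[index] += 1.0
        # Normalize to unit length so that dot product equals cosine similarity.
        norm = math.sqrt(sum(v * v for v in vector))
        if norm == 0.0:
            return vector
        return [v / norm for v in vector]

    def get_embedding_and_usage(self, text: str) -> tuple[list[float], dict | None]:
        # Hypothetical usage record: a whitespace word count.
        usage = {"tokens": len(text.split())}
        return self.get_embedding(text), usage


embedder = HashingEmbedder(dimensions=8)
vector, usage = embedder.get_embedding_and_usage("hello world")
print(len(vector), usage)
```

Because the hash is deterministic, the same text always maps to the same vector, which makes a class like this convenient for unit tests that should not download a model.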
SentenceTransformerEmbedder¶
A wrapper for Sentence Transformers models that creates text embeddings.
Parameters¶
Parameter | Type | Description | Default |
---|---|---|---|
id | str | Name of pre-trained model | "sentence-transformers/all-MiniLM-L6-v2" |
dimensions | int | Embedding dimensions | 384 |
sentence_transformer_client | SentenceTransformer | Optional pre-initialized client | None |
Returns¶
- get_embedding(): Returns a vector embedding for the text
- get_embedding_and_usage(): Returns the embedding together with usage information
Code Example¶
from sikkaagent.retrievers.embedder.sentence_transformer import SentenceTransformerEmbedder
# Create embeddings model with specific configuration
embedder = SentenceTransformerEmbedder(
    id="sentence-transformers/multi-qa-mpnet-base-dot-v1"
)
# Create embeddings for a query
query_embedding = embedder.get_embedding("How do neural networks work?")
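One detail worth checking when switching models: multi-qa-mpnet-base-dot-v1 produces 768-dimensional vectors, while the documented default for dimensions is 384. Whether the wrapper adjusts dimensions automatically is not stated here, so a small guard like the one below can catch a mismatch before vectors are written to a store. The helper name is hypothetical and is not part of Sikka Agent.

```python
def check_dimensions(embedding: list[float], expected: int) -> None:
    """Raise ValueError if the vector length differs from the expected size."""
    if len(embedding) != expected:
        raise ValueError(
            f"expected {expected} dimensions, got {len(embedding)}"
        )


# A placeholder vector stands in for the result of embedder.get_embedding(...).
placeholder = [0.0] * 768
check_dimensions(placeholder, 768)  # passes silently
```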
OpenAIEmbedder¶
A wrapper for OpenAI's embedding models.
Parameters¶
Parameter | Type | Description | Default |
---|---|---|---|
id | str | Model ID | "text-embedding-3-small" |
dimensions | int | Embedding dimensions | 1536 |
encoding_format | str | Output format | "float" |
api_key | str | OpenAI API key | None |
organization | str | OpenAI organization | None |
base_url | str | API base URL | None |
Returns¶
- get_embedding(): Returns a vector embedding for the text
- get_embedding_and_usage(): Returns the embedding together with token usage information
Code Example¶
from sikkaagent.retrievers.embedder.openai import OpenAIEmbedder
# Create OpenAI embedder
embedder = OpenAIEmbedder(
    id="text-embedding-3-small",
    dimensions=1536,
    api_key="your-api-key"
)
# Generate embedding
embedding = embedder.get_embedding("What is machine learning?")
# Get embedding with usage statistics
embedding, usage = embedder.get_embedding_and_usage("What is machine learning?")
print(f"Usage: {usage}")  # usage is a dict of usage details, or None
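When many texts are embedded, it can help to total the usage across calls. The sketch below assumes that the usage value is either None or a flat dict of numeric counters, which matches the `dict | None` return type of the base class but is otherwise an assumption; the key names in the example are hypothetical and the helper is not part of Sikka Agent.

```python
from __future__ import annotations


def add_usage(total: dict[str, int], usage: dict | None) -> dict[str, int]:
    """Add the numeric counters of one usage record into a running total."""
    if usage is None:
        return total
    for key, value in usage.items():
        # Skip non-numeric entries such as model names.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            total[key] = total.get(key, 0) + value
    return total


# Hypothetical usage records standing in for get_embedding_and_usage() results.
total: dict[str, int] = {}
add_usage(total, {"prompt_tokens": 5, "total_tokens": 5})
add_usage(total, None)
add_usage(total, {"prompt_tokens": 3, "total_tokens": 3, "model": "example"})
print(total)
```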
Other Embedders¶
Sikka Agent also provides:
- OllamaEmbedder: For using Ollama's embedding models
- FastEmbedEmbedder: For using the FastEmbed library
Rerankers¶
Sikka Agent provides reranking capabilities through the reranker module to improve retrieval quality:
Base Reranker¶
The foundational class for all rerankers in Sikka Agent.