Retrievers

Sikka Agent's Retrievers module enables semantic search and intelligent information retrieval, allowing agents to find relevant information based on meaning rather than just keywords.

Overview

Effective retrieval is critical for grounding AI responses in accurate, relevant information. The Retrievers module provides:

  • Semantic search using vector embeddings
  • Contextual information retrieval from various sources
  • Hybrid search combining semantic and keyword approaches
  • Reranking capabilities for improved relevance
  • Multiple embedding model options (OpenAI, Sentence Transformers, Ollama, etc.)

These capabilities support Retrieval-Augmented Generation (RAG) patterns, in which retrieved passages are added to the prompt so that agent responses draw on specific source material.
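To make the hybrid search idea concrete, here is a minimal, library-independent sketch. It does not use Sikka Agent's API: the function names (`cosine_similarity`, `keyword_score`, `hybrid_search`), the `alpha` weighting, and the hand-made three-dimensional vectors are all illustrative assumptions, and a real setup would take its vectors from one of the embedders described below.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_score(query: str, text: str) -> float:
    """Fraction of distinct query terms that also appear in the text."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms)


def hybrid_search(query, query_vector, corpus, alpha=0.7, top_k=2):
    """Rank (text, vector) pairs by a weighted mix of both scores.

    alpha weights the semantic score; (1 - alpha) weights the keyword score.
    """
    scored = []
    for text, vector in corpus:
        semantic = cosine_similarity(query_vector, vector)
        lexical = keyword_score(query, text)
        scored.append((alpha * semantic + (1 - alpha) * lexical, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hand-made 3-dimensional vectors stand in for real embeddings (assumption).
corpus = [
    ("neural networks learn weights", [0.9, 0.1, 0.0]),
    ("bread recipes for beginners", [0.0, 0.2, 0.9]),
    ("training deep networks", [0.8, 0.3, 0.1]),
]
results = hybrid_search("how neural networks learn", [1.0, 0.0, 0.0], corpus)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

A linear blend is only one way to fuse the two signals; rank-based fusion is a common alternative when the two scores are on different scales.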

Embedding Models

Sikka Agent provides built-in support for various embedding models through the embedder module:

Base Embedder

The foundational class for all embedding models in Sikka Agent.

from sikkaagent.retrievers.embedder.base import Embedder

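class CustomEmbedder(Embedder):
    dimensions: int = 768  # Set appropriate dimensions for your model

    def get_embedding(self, text: str) -> list[float]:
        # Implement embedding logic: return a vector of length `dimensions`
        raise NotImplementedError

    def get_embedding_and_usage(self, text: str) -> tuple[list[float], dict | None]:
        # Implement embedding with usage tracking: return (vector, usage),
        # where usage may be None if the model reports no usage data
        raise NotImplementedError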
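As an illustration of what the two methods might contain, the sketch below implements a toy hashing embedder. To stay runnable without Sikka Agent installed it is a plain class rather than an `Embedder` subclass; `HashingEmbedder` and its bucket scheme are hypothetical, and the vectors it produces capture token overlap only, not meaning.

```python
import hashlib
import math


class HashingEmbedder:
    """Toy embedder: hashes each token into one of `dimensions` buckets.

    Hypothetical stand-in; in Sikka Agent this would subclass Embedder.
    """

    def __init__(self, dimensions: int = 64):
        self.dimensions = dimensions

    def get_embedding(self, text: str) -> list[float]:
        vector = [0.0] * self.dimensions
        for token in text.lower().split():
            # A stable hash keeps embeddings identical across runs.
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            bucket = int.from_bytes(digest[:4], "big") % self.dimensions
            vector[bucket] += 1.0
        # L2-normalize so dot products behave like cosine similarity.
        norm = math.sqrt(sum(v * v for v in vector))
        if norm == 0.0:
            return vector
        return [v / norm for v in vector]

    def get_embedding_and_usage(self, text: str):
        # A local toy model has no token accounting, so usage is None.
        return self.get_embedding(text), None


embedder = HashingEmbedder(dimensions=64)
vector = embedder.get_embedding("How do neural networks work?")
print(len(vector))
```

Returning `None` for usage matches the `dict | None` return type in the skeleton above.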

SentenceTransformerEmbedder

A wrapper for Sentence Transformers models that creates text embeddings.

Parameters

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| id | str | Name of pre-trained model | "sentence-transformers/all-MiniLM-L6-v2" |
| dimensions | int | Embedding dimensions | 384 |
| sentence_transformer_client | SentenceTransformer | Optional pre-initialized client | None |

Returns

  • get_embedding(): Returns a vector embedding for text
  • get_embedding_and_usage(): Returns embeddings with usage information

Code Example

from sikkaagent.retrievers.embedder.sentence_transformer import SentenceTransformerEmbedder

# Create embeddings model with specific configuration
embedder = SentenceTransformerEmbedder(
    id="sentence-transformers/multi-qa-mpnet-base-dot-v1",
    dimensions=768  # This model outputs 768-dimensional vectors, not the default 384
)

# Create embeddings for a query
query_embedding = embedder.get_embedding("How do neural networks work?")
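Once a query embedding exists, retrieval comes down to scoring it against document embeddings. The sketch below shows one way to do that with any object that exposes `get_embedding()`. The `top_k` helper and the `StubEmbedder` class are hypothetical and exist only so the example runs without downloading a model; dot product is used as the score on the assumption that it suits the model chosen above, and cosine similarity is a common alternative.

```python
def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(embedder, query: str, documents: list[str], k: int = 3):
    """Return the k highest-scoring (score, document) pairs for the query."""
    query_vector = embedder.get_embedding(query)
    scored = [
        (dot(query_vector, embedder.get_embedding(doc)), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


class StubEmbedder:
    """Hypothetical stand-in: counts occurrences of three fixed words."""

    VOCAB = ["neural", "networks", "bread"]

    def get_embedding(self, text: str) -> list[float]:
        words = text.lower().split()
        return [float(words.count(term)) for term in self.VOCAB]


documents = [
    "neural networks use layers",
    "bread needs yeast",
    "networks of roads",
]
hits = top_k(StubEmbedder(), "how do neural networks work", documents, k=2)
for score, doc in hits:
    print(score, doc)
```

Replacing `StubEmbedder()` with the `embedder` created above should work unchanged, since only `get_embedding()` is called.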

OpenAIEmbedder

A wrapper for OpenAI's embedding models.

Parameters

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| id | str | Model ID | "text-embedding-3-small" |
| dimensions | int | Embedding dimensions | 1536 |
| encoding_format | str | Output format | "float" |
| api_key | str | OpenAI API key | None |
| organization | str | OpenAI organization | None |
| base_url | str | API base URL | None |

Returns

  • get_embedding(): Returns a vector embedding for text
  • get_embedding_and_usage(): Returns embeddings with token usage information

Code Example

from sikkaagent.retrievers.embedder.openai import OpenAIEmbedder

# Create OpenAI embedder
embedder = OpenAIEmbedder(
    id="text-embedding-3-small",
    dimensions=1536,
    api_key="your-api-key"
)

# Generate embedding
embedding = embedder.get_embedding("What is machine learning?")

# Get embedding with usage statistics (usage may be None if none is reported)
embedding, usage = embedder.get_embedding_and_usage("What is machine learning?")
print(f"Usage: {usage}")
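When embedding many texts, it can be useful to total the usage reported by each call. The helper below is a sketch under stated assumptions: `embed_many` is a hypothetical function, the usage value is treated as a flat dictionary (or `None`), and the `"total_tokens"` key emitted by the `FakeEmbedder` stand-in is illustrative, since the exact keys returned by `OpenAIEmbedder` are not specified here.

```python
from collections import Counter


def embed_many(embedder, texts: list[str]):
    """Embed each text and sum any numeric usage fields that come back."""
    totals = Counter()
    embeddings = []
    for text in texts:
        embedding, usage = embedder.get_embedding_and_usage(text)
        embeddings.append(embedding)
        # usage may be None; only numeric fields can be summed.
        for key, value in (usage or {}).items():
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                totals[key] += value
    return embeddings, dict(totals)


class FakeEmbedder:
    """Hypothetical stand-in that reports one 'total_tokens' per word."""

    def get_embedding_and_usage(self, text: str):
        word_count = len(text.split())
        return [float(word_count)], {"total_tokens": word_count}


embeddings, totals = embed_many(
    FakeEmbedder(),
    ["What is machine learning?", "Define overfitting"],
)
print(totals)
```

The same helper should accept a real embedder in place of `FakeEmbedder()`, as long as it returns an `(embedding, usage)` pair.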

Other Embedders

Sikka Agent also provides:

  • OllamaEmbedder: For using Ollama's embedding models
  • FastEmbedEmbedder: For using the FastEmbed library

Rerankers

Sikka Agent provides reranking capabilities through the reranker module to improve retrieval quality:

Base Reranker

The foundational class for all rerankers in Sikka Agent.

from sikkaagent.retrievers.reranker.base import Reranker
from sikkaagent.document import Document

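class CustomReranker(Reranker):
    def rerank(self, query: str, documents: list[Document]) -> list[Document]:
        # Implement reranking logic: score each document against the query
        # and return the documents ordered from most to least relevant.
        # Placeholder: keep the incoming order until real scoring is added.
        sorted_documents = list(documents)
        return sorted_documents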
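For a sense of what a `rerank()` body could look like, the sketch below orders documents by how many query terms they share. It is deliberately simple and self-contained: the `Document` dataclass is a hypothetical stand-in for `sikkaagent.document.Document` (whose real fields may differ, including the `content` attribute assumed here), and `KeywordOverlapReranker` is an illustrative name, not a class shipped with Sikka Agent.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for sikkaagent.document.Document."""

    content: str


class KeywordOverlapReranker:
    """Orders documents by the number of query terms they contain."""

    def rerank(self, query: str, documents: list[Document]) -> list[Document]:
        query_terms = set(query.lower().split())

        def overlap(doc: Document) -> int:
            return len(query_terms & set(doc.content.lower().split()))

        # sorted() is stable, so ties keep their original retrieval order.
        return sorted(documents, key=overlap, reverse=True)


candidates = [
    Document("cooking pasta at home"),
    Document("neural networks and backpropagation"),
    Document("how networks route packets"),
]
reranked = KeywordOverlapReranker().rerank(
    "how do neural networks work", candidates
)
for doc in reranked:
    print(doc.content)
```

Production rerankers typically score each query-document pair with a dedicated model; the interface, a query plus a list of documents in and a reordered list out, stays the same.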