Storages
For more detailed usage information please refer to our cookbook: Memory Cookbook
1. Concept
The Storage module in Sikka Agent provides flexible backends for persisting data, from simple in-memory caches to sophisticated vector databases. These storage options enable agents to maintain context, knowledge bases, and conversation history across sessions.
Sikka Agent's storage system consists of two main types:
- Basic Storage: Simple key-value storage for general-purpose data persistence
- Vector Storage: Specialized storage for vector embeddings that enables semantic search
2. Get Started
2.1 Basic Usage
Here's a quick example of how to use the InMemoryStorage class:
from sikkaagent.agents.base import BaseMessage
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage
from sikkaagent.utils.enums import RoleType
# Create in-memory storage
storage = InMemoryStorage()
# Create memory with the storage
memory = Memory(storage=storage)
# Add data to memory
message = BaseMessage(content="Hello, world!", role_name="User")
memory.add_message(message, RoleType.USER)
# Get context
context_messages, token_count = memory.get_context()
# Clear storage
storage.clear()
Using Qdrant vector storage:
from sikkaagent.memories import Memory
from sikkaagent.storages import Qdrant
from sikkaagent.utils.enums import Distance
# Create Qdrant storage
storage = Qdrant(
    collection_name="conversation_memory",  # Name of the collection
    vector_size=384,  # Dimension of the vectors
    distance=Distance.COSINE  # Distance metric
)
# Use with Memory
memory = Memory(storage=storage)
3. Core Components
3.1 BaseStorage
The abstract base class for all basic storage implementations.
Methods:
- save(records): Save records to storage
- load(): Load records from storage
- clear(): Clear all records from storage
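To make the three-method interface concrete, here is a minimal sketch of a file-backed storage that exposes save, load, and clear. It deliberately does not import sikkaagent: StorageInterface is a stand-in for BaseStorage, and JsonFileStorage, its append-on-save behavior, and the JSON Lines file format are assumptions made for illustration rather than part of the library.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, List


class StorageInterface(ABC):
    """Hypothetical stand-in for the save/load/clear interface."""

    @abstractmethod
    def save(self, records: List[Dict[str, Any]]) -> None: ...

    @abstractmethod
    def load(self) -> List[Dict[str, Any]]: ...

    @abstractmethod
    def clear(self) -> None: ...


class JsonFileStorage(StorageInterface):
    """Hypothetical backend that keeps one JSON object per line in a file."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def save(self, records: List[Dict[str, Any]]) -> None:
        # Append so that records from earlier calls are preserved (an assumption).
        with self.path.open("a", encoding="utf-8") as f:
            for record in records:
                f.write(json.dumps(record) + "\n")

    def load(self) -> List[Dict[str, Any]]:
        if not self.path.exists():
            return []
        with self.path.open("r", encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def clear(self) -> None:
        # missing_ok avoids an error when nothing has been saved yet.
        self.path.unlink(missing_ok=True)


tmp_dir = tempfile.TemporaryDirectory()
storage = JsonFileStorage(Path(tmp_dir.name) / "records.jsonl")
storage.save([{"key": "value"}])
storage.save([{"key": "other"}])
loaded = storage.load()       # both records, in insertion order
storage.clear()
after_clear = storage.load()  # empty once the file is removed
tmp_dir.cleanup()
```

A real backend would subclass BaseStorage instead of the stand-in; check the library source for the exact record type and for whether save is expected to append or overwrite.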
3.2 BaseVectorStorage
The abstract base class for all vector storage implementations.
Methods:
- add(records): Add vector records to storage
- delete(ids): Delete vectors by IDs
- status(): Get status of the vector database
- query(query): Search for similar vectors
- clear(): Remove all vectors from storage
- load(): Load the collection from cloud service
- client: Property that provides access to the underlying database client
3.3 VectorRecord
Encapsulates information about a vector's unique identifier and its payload.
Attributes:
- vector: The numerical representation of the vector
- id: A unique identifier for the vector
- payload: Additional metadata related to the vector
3.4 VectorDBQuery
Represents a query to a vector database.
Attributes:
- query_vector: The numerical representation of the query vector
- top_k: The number of top similar vectors to retrieve
3.5 VectorDBQueryResult
Encapsulates the result of a query against a vector database.
Attributes:
- record: The target vector record
- similarity: The similarity score between the query vector and the record
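The record, query, and result types above fit together in a simple pipeline: records go in, a query vector is scored against each of them, and the top_k best matches come back with their similarity. The sketch below illustrates that flow with a brute-force in-memory store. Record, Query, QueryResult, and ToyVectorStore are hypothetical stand-ins that mirror the attribute names listed above; they are not the library's classes, and cosine similarity is just one possible scoring choice.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Record:  # stand-in for VectorRecord
    vector: List[float]
    id: str
    payload: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Query:  # stand-in for VectorDBQuery
    query_vector: List[float]
    top_k: int = 1


@dataclass
class QueryResult:  # stand-in for VectorDBQueryResult
    record: Record
    similarity: float


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Brute-force store: every query scans all records."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def add(self, records: List[Record]) -> None:
        for record in records:
            self._records[record.id] = record

    def delete(self, ids: List[str]) -> None:
        for record_id in ids:
            self._records.pop(record_id, None)

    def query(self, query: Query) -> List[QueryResult]:
        scored = [
            QueryResult(r, cosine_similarity(query.query_vector, r.vector))
            for r in self._records.values()
        ]
        # Highest similarity first, then keep only the top_k results.
        scored.sort(key=lambda res: res.similarity, reverse=True)
        return scored[: query.top_k]

    def clear(self) -> None:
        self._records.clear()


store = ToyVectorStore()
store.add([
    Record(vector=[1.0, 0.0], id="a", payload={"content": "x-axis"}),
    Record(vector=[0.0, 1.0], id="b", payload={"content": "y-axis"}),
    Record(vector=[1.0, 1.0], id="c", payload={"content": "diagonal"}),
])
results = store.query(Query(query_vector=[0.9, 0.1], top_k=2))
top_ids = [res.record.id for res in results]  # ["a", "c"]
```

Production vector databases replace the linear scan with an approximate nearest-neighbor index, but the shape of the inputs and outputs stays the same.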
4. Storage Implementations
4.1 InMemoryStorage
The simplest storage backend, which keeps all data in process memory. It is the default storage option and suits most use cases that do not need persistence.
from sikkaagent.storages import InMemoryStorage
# Create in-memory storage
storage = InMemoryStorage()
# Save records
storage.save([{"key": "value"}])
# Load records
records = storage.load()
# Clear storage
storage.clear()
4.2 Qdrant (New Implementation)
The new preferred implementation of Qdrant vector database storage.
from sikkaagent.storages import Qdrant
from sikkaagent.utils.enums import Distance
# Create Qdrant storage
storage = Qdrant(
    collection_name="conversation_memory",  # Name of the collection
    vector_size=384,  # Dimension of the vectors
    distance=Distance.COSINE,  # Distance metric
    url=None,  # URL for remote Qdrant server (None for in-memory)
    api_key=None  # API key for remote Qdrant server
)
# Use with embeddings and vectors
# (Implementation details may vary from the legacy QdrantStorage)
4.3 Other Vector Database Options
Sikka Agent also provides implementations for other vector databases in the vectordb package:
4.3.1 ChromaDB
from sikkaagent.storages.vectordb.chroma.chromadb import ChromaDb
# Create ChromaDB storage
storage = ChromaDb(
    collection_name="conversation_memory",  # Name of the collection
    vector_size=384,  # Dimension of the vectors
    persist_directory=None  # Directory for persistent storage (None for in-memory)
)
4.3.2 PineconeDB
from sikkaagent.storages.vectordb.pineconedb.pineconedb import PineconeDb
# Create PineconeDB storage
storage = PineconeDb(
    index_name="conversation_memory",  # Name of the index
    vector_size=384,  # Dimension of the vectors
    api_key="your-api-key",  # Pinecone API key
    environment="us-west1-gcp"  # Pinecone environment
)
4.3.3 SingleStore
from sikkaagent.storages.vectordb.singlestore.singlestore import SingleStore
# Create SingleStore storage
storage = SingleStore(
    table_name="conversation_memory",  # Name of the table
    vector_size=384,  # Dimension of the vectors
    connection_string="mysql://user:password@host:port/database"  # Connection string
)
4.4 QdrantStorage (Legacy)
The legacy implementation of Qdrant vector database storage.
from sikkaagent.storages import QdrantStorage, VectorDBQuery, VectorRecord
from sikkaagent.utils.enums import VectorDistance
# Create Qdrant storage
storage = QdrantStorage(
    vector_dim=384,  # Dimension of the vectors
    collection_name="conversation_memory",  # Name of the collection
    distance=VectorDistance.COSINE,  # Distance metric
    path=None,  # Path for local storage (None for in-memory)
    delete_collection_on_del=False  # Whether to delete collection on object destruction
)
# Add vectors
records = [
    VectorRecord(
        vector=[0.1, 0.2, 0.3, ...],  # 384-dimensional vector
        id="record_1",
        payload={"content": "Hello, world!"}
    )
]
storage.add(records)
# Query vectors
query = VectorDBQuery(
    query_vector=[0.1, 0.2, 0.3, ...],  # 384-dimensional vector
    top_k=3
)
results = storage.query(query)
# Get status
status = storage.status()
print(f"Vector dimension: {status.vector_dim}")
print(f"Vector count: {status.vector_count}")
# Clear storage
storage.clear()
5. Integration with Other Modules
5.1 With Memory
Storage backends are primarily used with the Memory system:
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage
# Create memory with storage
memory = Memory(storage=InMemoryStorage())
5.2 With ChatAgent
The ChatAgent class has special handling for the memory parameter:
from sikkaagent.agents import ChatAgent
from sikkaagent.models import ModelConfigure
from sikkaagent.utils.enums import ModelPlatformType
from sikkaagent.storages import InMemoryStorage
# Create agent with storage directly
agent = ChatAgent(
    model=ModelConfigure(
        model="llama3.1:8b",
        model_platform=ModelPlatformType.OLLAMA
    ),
    memory=InMemoryStorage(),  # Automatically wrapped in Memory
    system_prompt="You are a helpful assistant."
)
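The "automatically wrapped" behavior can be pictured as a small coercion step at construction time. The sketch below shows one way such a step could work; it is an illustration only, not the library's actual code. Storage and Memory here are hypothetical stand-ins defined locally, and as_memory is an invented helper name.

```python
class Storage:
    """Hypothetical stand-in for a storage backend such as InMemoryStorage."""

    def __init__(self) -> None:
        self.records = []


class Memory:
    """Hypothetical stand-in for the Memory class."""

    def __init__(self, storage: Storage) -> None:
        self.storage = storage


def as_memory(memory):
    """Return a Memory, wrapping a bare storage backend if necessary."""
    if isinstance(memory, Memory):
        return memory  # already a Memory: use it unchanged
    if isinstance(memory, Storage):
        return Memory(storage=memory)  # bare storage: wrap it
    raise TypeError(f"expected Memory or Storage, got {type(memory).__name__}")


raw_storage = Storage()
wrapped = as_memory(raw_storage)
existing = Memory(Storage())
unchanged = as_memory(existing)
```

With a step like this, passing either a storage backend or a ready-made Memory gives the agent the same kind of object to work with.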
6. Advanced Topics
6.1 Vector Database Configuration
When using vector databases, you can configure various parameters:
from sikkaagent.storages import Qdrant
from sikkaagent.utils.enums import Distance
# Create Qdrant with custom configuration
storage = Qdrant(
    collection_name="custom_collection",
    vector_size=768,  # Larger dimension for more complex embeddings
    distance=Distance.EUCLIDEAN,  # Different distance metric
    url="https://your-qdrant-server.com",  # Remote server
    api_key="your-api-key"  # Authentication
)
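The choice of distance metric changes which vectors count as "nearest". The following self-contained sketch, which uses only the standard library and no Sikka Agent classes, compares cosine and Euclidean distance on the same data. The vectors and candidate names are made up for illustration.

```python
import math
from typing import Callable, List


def cosine_distance(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return 1.0 - dot / norm


def euclidean_distance(a: List[float], b: List[float]) -> float:
    return math.dist(a, b)


query = [1.0, 0.0]
candidates = {
    # Same direction as the query but ten times longer.
    "same_direction_longer": [10.0, 0.0],
    # Close to the query in space but pointing 45 degrees away.
    "nearby_other_direction": [1.0, 1.0],
}


def nearest(metric: Callable[[List[float], List[float]], float]) -> str:
    """Return the name of the candidate with the smallest distance to the query."""
    return min(candidates, key=lambda name: metric(query, candidates[name]))


cosine_winner = nearest(cosine_distance)        # ignores vector length
euclidean_winner = nearest(euclidean_distance)  # sensitive to vector length
```

Cosine distance picks the vector that points the same way regardless of its length, while Euclidean distance picks the vector that is closest in space. If the embedding model documents the metric it was trained with, that is usually the safer setting.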
6.2 Filtering Vector Queries
You can filter vector queries based on metadata:
from sikkaagent.storages import QdrantStorage, VectorDBQuery
# `storage` is the QdrantStorage instance created in section 4.4
# Create query
query = VectorDBQuery(
    query_vector=[0.1, 0.2, 0.3, ...],
    top_k=5
)
# Execute query with filter
results = storage.query(
    query=query,
    filter_conditions={"category": "conversation"}
)
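Conceptually, a metadata filter narrows the candidate set before the similarity ranking is applied. The sketch below illustrates that idea in plain Python, under the assumption that each filter condition is an exact match on a payload key; the library's real filter semantics may differ. The helpers matches and filtered_top_k and the sample records are hypothetical.

```python
from typing import Any, Dict, List


def matches(payload: Dict[str, Any], filter_conditions: Dict[str, Any]) -> bool:
    """True when every condition key is present in the payload with an equal value."""
    return all(
        key in payload and payload[key] == value
        for key, value in filter_conditions.items()
    )


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def filtered_top_k(
    records: List[Dict[str, Any]],
    query_vector: List[float],
    top_k: int,
    filter_conditions: Dict[str, Any],
) -> List[Dict[str, Any]]:
    # Step 1: drop records whose payload does not satisfy the filter.
    candidates = [r for r in records if matches(r["payload"], filter_conditions)]
    # Step 2: rank the survivors by similarity (dot product here) and truncate.
    candidates.sort(key=lambda r: dot(query_vector, r["vector"]), reverse=True)
    return candidates[:top_k]


records = [
    {"id": "r1", "vector": [1.0, 0.0], "payload": {"category": "conversation"}},
    {"id": "r2", "vector": [0.9, 0.1], "payload": {"category": "document"}},
    {"id": "r3", "vector": [0.0, 1.0], "payload": {"category": "conversation"}},
]
hits = filtered_top_k(
    records,
    query_vector=[1.0, 0.0],
    top_k=2,
    filter_conditions={"category": "conversation"},
)
hit_ids = [r["id"] for r in hits]
```

Note that r2 is excluded even though it is more similar to the query than r3: the filter takes precedence over the ranking.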
6.3 Performance Considerations
- Storage Selection: Choose the appropriate storage backend based on your needs:
  - InMemoryStorage for simple applications without persistence
  - Vector database storage for semantic search capabilities
- Vector Dimensions: Be consistent with embedding dimensions across your application
- Persistence Strategy: Plan for data persistence based on application needs:
  - Ephemeral: Use in-memory storage for testing or stateless applications
  - Persistent: Use file-based or database storage for production
7. Best Practices
- Storage Selection: Choose the right storage type for your use case:
  - InMemoryStorage: For temporary, session-based data
  - QdrantStorage/Qdrant: For semantic search and RAG applications
- Vector Dimensions: Ensure your embedding model's output dimension matches the vector_size/vector_dim parameter
- Collection Naming: Use descriptive collection names to organize your vector data
- Error Handling: Implement proper error handling for storage operations, especially for remote vector databases
- Cleanup: Use the clear() method to clean up storage when needed, and consider setting delete_collection_on_del=True on QdrantStorage for temporary collections
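Two of the practices above, checking vector dimensions and handling errors from remote databases, can be sketched as small helpers wrapped around storage calls. The code below is a minimal, library-independent illustration: validate_dimensions, with_retries, DimensionMismatchError, and the simulated flaky_add operation are all hypothetical names, and the exception types a real backend raises will depend on its client library.

```python
import time
from typing import Callable, List, Sequence, Tuple, Type, TypeVar

T = TypeVar("T")


class DimensionMismatchError(ValueError):
    """Raised when a vector does not match the configured dimension."""


def validate_dimensions(vectors: Sequence[Sequence[float]], expected_dim: int) -> None:
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise DimensionMismatchError(
                f"vector {index} has {len(vector)} dimensions, expected {expected_dim}"
            )


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Run operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


# Catch a dimension mismatch before it reaches the database.
try:
    validate_dimensions([[0.1, 0.2]], expected_dim=3)
    mismatch_message = None
except DimensionMismatchError as exc:
    mismatch_message = str(exc)

# Simulate a remote call that fails twice and then succeeds.
calls = {"count": 0}


def flaky_add() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated outage")
    return "stored"


result = with_retries(flaky_add, attempts=3, base_delay=0.0)
```

In practice, operation would be a closure around a call such as storage.add(records), and retry_on would list the transient exception types of the chosen backend.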