Memory¶
1. Concept¶
The Memory module in Sikka Agent provides a flexible system for maintaining conversation context across interactions. It enables agents to remember previous exchanges, prioritize important information, and manage token usage efficiently. The memory system consists of two main components:
- Memory: The core class that manages conversation history, prioritizes messages, and handles token limits.
- Storage: Backend implementations that store and retrieve the actual data.
2. Get Started¶
2.1 Basic Usage¶
Here's a quick example of how to use the `Memory` class:

```python
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage
from sikkaagent.agents.base import BaseMessage
from sikkaagent.utils.enums import RoleType

# Create a memory with in-memory storage
memory = Memory(storage=InMemoryStorage())

# Add a system message
system_message = BaseMessage(content="You are a helpful assistant.", role_name="System")
memory.add_message(system_message, RoleType.SYSTEM)

# Add a user message
user_message = BaseMessage(content="Hello, can you help me with Python?", role_name="User")
memory.add_message(user_message, RoleType.USER)

# Add an assistant message
assistant_message = BaseMessage(
    content="Of course! I'd be happy to help with Python. What specific question do you have?",
    role_name="Assistant",
)
memory.add_message(assistant_message, RoleType.ASSISTANT)

# Get context for the next interaction
context_messages, token_count = memory.get_context()
print(f"Retrieved {len(context_messages)} messages with {token_count} tokens")
```
Using `Memory` with a `ChatAgent`:

```python
from sikkaagent.storages import InMemoryStorage
from sikkaagent.agents import ChatAgent
from sikkaagent.models import ModelConfigure
from sikkaagent.utils.enums import ModelPlatformType

# Initialize model
model = ModelConfigure(
    model="llama3.1:8b",
    model_platform=ModelPlatformType.OLLAMA
)

# Create agent with memory
agent = ChatAgent(
    model=model,
    memory=InMemoryStorage(),  # Automatically wrapped in Memory
    system_prompt="You are a helpful assistant."
)

# First interaction
response = agent.step("My name is Alex and I'm learning about AI agents.")
print(response.msgs[0].content)

# Second interaction - the agent remembers the user's name
response = agent.step("What was my name again?")
print(response.msgs[0].content)  # The agent recalls that the user's name is Alex
```
3. Core Components¶
3.1 MemoryRecord¶
The basic data unit in Sikka Agent's memory system.
Attributes:¶
- `message`: The main content of the record (a `BaseMessage`)
- `role_type`: The role type of the message (USER, ASSISTANT, SYSTEM, etc.)
- `uuid`: A unique identifier for the record
- `extra_info`: Additional key-value pairs for extra information
Methods:¶
- `from_message()`: Static method to construct a `MemoryRecord` from a dictionary
- `to_message()`: Convert the `MemoryRecord` to a dictionary for serialization
- `to_openai_message()`: Convert the record to an OpenAI-compatible message format
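Concretely, the record can be pictured as a plain dataclass. The sketch below is illustrative, not Sikka Agent's actual implementation: the field names follow the attribute list above, but `BaseMessage` is simplified to a string and the role is a plain string rather than a `RoleType` enum.

```python
import uuid as uuid_lib
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MemoryRecordSketch:
    """Illustrative stand-in for MemoryRecord (not the real class)."""

    message: str      # stands in for a BaseMessage
    role_type: str    # e.g. "user", "assistant", "system"
    uuid: str = field(default_factory=lambda: str(uuid_lib.uuid4()))
    extra_info: dict[str, Any] = field(default_factory=dict)

    def to_openai_message(self) -> dict[str, str]:
        # OpenAI's chat format expects {"role": ..., "content": ...}
        return {"role": self.role_type, "content": self.message}


record = MemoryRecordSketch(message="Hello", role_type="user")
print(record.to_openai_message())  # {'role': 'user', 'content': 'Hello'}
```

Each record carrying its own `uuid` lets storage backends deduplicate and address individual records without relying on position.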
3.2 ContextRecord¶
The result of memory retrieval, used to prioritize messages for context creation.
Attributes:¶
- `memory_record`: A `MemoryRecord`
- `score`: A float value representing the relevance or importance of the record
3.3 Memory¶
The central component of Sikka Agent's memory system.
Attributes:¶
- `storage`: Storage backend for memory (implements the `MemoryStorage` protocol)
- `window_size`: Maximum number of messages to keep (optional)
- `keep_rate`: Priority factor for recent messages (0.0-1.0, default: 0.9)
- `token_counter`: Function to count tokens in messages
- `token_limit`: Maximum tokens for context (default: 4096)
Methods:¶
- `add_message(message, role_type)`: Add a message to memory
- `get_context()`: Get context messages within the token limit
- `clear()`: Remove all messages from memory
4. Storage Options¶
Sikka Agent provides several storage backends for different use cases:
4.1 InMemoryStorage¶
The simplest storage backend that keeps all data in memory. This is the default storage option and is suitable for most use cases.
```python
from sikkaagent.storages import InMemoryStorage
from sikkaagent.memories import Memory

# Create in-memory storage
storage = InMemoryStorage()

# Create memory with the storage
memory = Memory(storage=storage)
```
4.2 Vector Database Storage¶
For advanced retrieval capabilities and semantic search, Sikka Agent provides several vector database storage options.
4.2.1 Qdrant¶
```python
from sikkaagent.storages import Qdrant
from sikkaagent.memories import Memory
from sikkaagent.utils.enums import Distance

# Create Qdrant storage
storage = Qdrant(
    collection_name="conversation_memory",  # Name of the collection
    vector_size=384,                        # Dimension of the vectors
    distance=Distance.COSINE,               # Distance metric
    url=None,                               # URL for remote Qdrant server (None for in-memory)
    api_key=None                            # API key for remote Qdrant server
)

# Create memory with Qdrant storage
memory = Memory(storage=storage)
```
4.2.2 Other Vector Database Options¶
Sikka Agent also provides implementations for other vector databases:
- ChromaDB
- PineconeDB
- SingleStore
5. Advanced Topics¶
5.1 Memory Configuration¶
5.1.1 Windowed Memory¶
Limit memory to a fixed number of recent messages:
```python
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage

# Create memory with window size limit
memory = Memory(
    storage=InMemoryStorage(),
    window_size=10  # Only keep the last 10 messages
)

# When conversation exceeds the window size,
# oldest messages are removed automatically
```
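The pruning behavior can be sketched with the standard library alone: a `collections.deque` with `maxlen` drops the oldest entries automatically, which mirrors what `window_size` does. This is a sketch of the concept, not Sikka Agent's internals.

```python
from collections import deque

# A windowed buffer: once full, appending evicts the oldest entry.
window = deque(maxlen=3)
for i in range(5):
    window.append(f"message {i}")

print(list(window))  # ['message 2', 'message 3', 'message 4']
```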
5.1.2 Token-Limited Memory¶
Limit memory based on token count to stay within model context limits:
```python
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage
from sikkaagent.models import ModelConfigure
from sikkaagent.utils.enums import ModelPlatformType

# Initialize model for token counting
model = ModelConfigure(
    model="llama3.1:8b",
    model_platform=ModelPlatformType.OLLAMA
)

# Create memory with token limit
memory = Memory(
    storage=InMemoryStorage(),
    token_counter=model.token_counter,
    token_limit=4000  # Maximum tokens for context
)

# Memory will automatically prioritize and prune messages
# to stay within the 4000 token limit
```
5.2 Message Prioritization¶
The Memory class uses a scoring mechanism to prioritize messages when the token limit is reached:
- System messages always have the highest priority (score = 1.0)
- Other messages get decreasing scores based on recency and the `keep_rate` parameter
- When the token limit is reached, messages with higher scores are kept
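One plausible reading of this scheme is a geometric decay with distance from the most recent message. The exact formula is not documented here, so treat the `score` function below as an assumption for illustration only:

```python
keep_rate = 0.9

messages = [
    ("system", "You are a helpful assistant."),
    ("user", "First question"),
    ("assistant", "First answer"),
    ("user", "Most recent question"),
]

def score(index: int, total: int, role: str) -> float:
    # System messages are pinned at the maximum score; other messages
    # decay geometrically with distance from the most recent message.
    if role == "system":
        return 1.0
    return keep_rate ** (total - 1 - index)

scores = [round(score(i, len(messages), role), 3) for i, (role, _) in enumerate(messages)]
print(scores)  # [1.0, 0.81, 0.9, 1.0]
```

Under this reading, a lower `keep_rate` makes older messages fall off faster, matching the comment in the example below.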
```python
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage

# Create memory with custom keep_rate
memory = Memory(
    storage=InMemoryStorage(),
    keep_rate=0.8  # Lower value gives less priority to older messages
)
```
5.3 Performance Considerations¶
- Storage Selection: Choose the appropriate storage backend based on your needs:
    - `InMemoryStorage` for simple applications without persistence
    - Vector database storage for semantic search capabilities
- Token Management: Use `window_size` and `token_limit` to manage token usage in long conversations
- System Messages: System messages are always kept in context with maximum priority, so keep them concise
- Multiple Memories: Create separate memory instances for different conversation contexts
6. Implementation Details¶
6.1 Memory Protocol¶
The `MemoryStorage` protocol defines the interface that all storage backends must implement:
```python
from typing import Any, Protocol


class MemoryStorage(Protocol):
    """Protocol for memory storage implementations"""

    def save(self, records: list[dict[str, Any]]) -> None:
        """Save records to storage"""
        ...

    def load(self) -> list[dict[str, Any]]:
        """Load records from storage"""
        ...

    def clear(self) -> None:
        """Clear all records from storage"""
        ...
```
6.2 Context Creation Process¶
The `get_context()` method follows these steps:
- Load all records from storage
- Apply the window size limit if specified
- Convert records to `MemoryRecord` objects
- Score messages based on recency (using `keep_rate`)
- Prioritize system messages (always kept with full score)
- Select messages to include based on scores and the token limit
- Return the selected messages in chronological order
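The steps above can be sketched as a greedy selection under a token budget. Everything below (the tuple record shape, the word-count token counter, and the hard-coded scores) is illustrative, not Sikka Agent's actual code:

```python
# Each record: (index, role, text, score). Scores follow the
# prioritization scheme described earlier (system pinned at 1.0).
records = [
    (0, "system", "You are a helpful assistant.", 1.0),
    (1, "user", "Tell me about Python.", 0.81),
    (2, "assistant", "Python is a programming language.", 0.9),
    (3, "user", "What about its typing?", 1.0),
]

def count_tokens(text: str) -> int:
    return len(text.split())  # crude stand-in for a real token counter

def get_context(records, token_limit):
    selected, used = [], 0
    # Walk records from highest score down, keeping those that still fit.
    for rec in sorted(records, key=lambda r: r[3], reverse=True):
        tokens = count_tokens(rec[2])
        if used + tokens <= token_limit:
            selected.append(rec)
            used += tokens
    # Return the survivors in chronological order.
    return sorted(selected, key=lambda r: r[0]), used

context, tokens = get_context(records, token_limit=12)
print([r[1] for r in context], tokens)
```

With a budget of 12 "tokens", the system message and the most recent user message survive; the lower-scored middle turns are pruned first.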
6.3 Image Handling¶
The Memory system has special handling for messages containing images:
- Images in messages are preserved during serialization/deserialization
- When converting to OpenAI format, images are properly encoded as base64 strings
- Image detail level can be specified in the original message
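The OpenAI-compatible image format referred to here is an `image_url` content part carrying a base64 data URL. A minimal sketch of the encoding step (the fake image bytes and the message shape are illustrative):

```python
import base64

# Fake image bytes stand in for a real file read; the data-URL shape
# matches the OpenAI image_url content format.
image_bytes = b"\x89PNG fake image data"
encoded = base64.b64encode(image_bytes).decode("ascii")

message_content = {
    "type": "image_url",
    "image_url": {"url": f"data:image/png;base64,{encoded}", "detail": "low"},
}
print(message_content["image_url"]["url"][:22])  # data:image/png;base64,
```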
7. Best Practices¶
- System Messages: Keep system messages concise as they always have maximum priority
- Token Efficiency: Use `window_size` and `token_limit` to manage token usage in long conversations
- Prioritization: Adjust `keep_rate` (default 0.9) to change the priority emphasis on recent messages
- Topic Transitions: Use `memory.clear()` to reset conversation context when changing topics
- Debugging: Inspect memory contents with `context_messages, _ = memory.get_context()`
- Storage Selection: Choose the appropriate storage backend based on your needs:
    - `InMemoryStorage` for simple applications
    - Vector database storage for semantic search capabilities