Memory¶
1. Concept¶
The Memory module in Sikka Agent provides a flexible system for maintaining conversation context across interactions. It enables agents to remember previous exchanges, prioritize important information, and manage token usage efficiently. The memory system consists of two main components:
- Memory: The core class that manages conversation history, prioritizes messages, and handles token limits.
- Storage: Backend implementations that store and retrieve the actual data.
2. Get Started¶
2.1 Basic Usage¶
Here's a quick example of how to use the `Memory` class:

```python
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage
from sikkaagent.agents.base import BaseMessage
from sikkaagent.utils.enums import RoleType

# Create a memory with in-memory storage
memory = Memory(storage=InMemoryStorage())

# Add a system message
system_message = BaseMessage(content="You are a helpful assistant.", role_name="System")
memory.add_message(system_message, RoleType.SYSTEM)

# Add a user message
user_message = BaseMessage(content="Hello, can you help me with Python?", role_name="User")
memory.add_message(user_message, RoleType.USER)

# Add an assistant message
assistant_message = BaseMessage(
    content="Of course! I'd be happy to help with Python. What specific question do you have?",
    role_name="Assistant",
)
memory.add_message(assistant_message, RoleType.ASSISTANT)

# Get context for the next interaction
context_messages, token_count = memory.get_context()
print(f"Retrieved {len(context_messages)} messages with {token_count} tokens")
```
Using `Memory` with a `ChatAgent`:

```python
from sikkaagent.storages import InMemoryStorage
from sikkaagent.agents import ChatAgent
from sikkaagent.models import ModelConfigure
from sikkaagent.utils.enums import ModelPlatformType

# Initialize model
model = ModelConfigure(
    model="llama3.1:8b",
    model_platform=ModelPlatformType.OLLAMA
)

# Create agent with memory
agent = ChatAgent(
    model=model,
    memory=InMemoryStorage(),  # Automatically wrapped in Memory
    system_prompt="You are a helpful assistant."
)

# First interaction
response = agent.step("My name is Alex and I'm learning about AI agents.")
print(response.msgs[0].content)

# Second interaction - the agent remembers the user's name
response = agent.step("What was my name again?")
print(response.msgs[0].content)  # The agent recalls that the user's name is Alex
```
3. Core Components¶
3.1 MemoryRecord¶
The basic data unit in Sikka Agent's memory system.
Attributes:¶
- `message`: The main content of the record (a `BaseMessage`)
- `role_type`: The role type of the message (USER, ASSISTANT, SYSTEM, etc.)
- `uuid`: A unique identifier for the record
- `extra_info`: Additional key-value pairs for extra information
Methods:¶
- `from_message()`: Static method to construct a `MemoryRecord` from a dictionary
- `to_message()`: Convert the `MemoryRecord` to a dictionary for serialization
- `to_openai_message()`: Convert the record to an OpenAI-compatible message format
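Concretely, the record can be pictured as a plain dataclass. The sketch below is illustrative, not Sikka Agent's actual implementation: the field names follow the attribute list above, but `BaseMessage` is simplified to a string and the role is a plain string rather than a `RoleType` enum.

```python
import uuid as uuid_lib
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MemoryRecordSketch:
    """Illustrative stand-in for MemoryRecord (not the real class)."""

    message: str      # stands in for a BaseMessage
    role_type: str    # e.g. "user", "assistant", "system"
    uuid: str = field(default_factory=lambda: str(uuid_lib.uuid4()))
    extra_info: dict[str, Any] = field(default_factory=dict)

    def to_openai_message(self) -> dict[str, str]:
        # OpenAI's chat format expects {"role": ..., "content": ...}
        return {"role": self.role_type, "content": self.message}


record = MemoryRecordSketch(message="Hello", role_type="user")
print(record.to_openai_message())  # {'role': 'user', 'content': 'Hello'}
```

Each record carrying its own `uuid` lets storage backends deduplicate and address individual records without relying on position.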
3.2 ContextRecord¶
The result of memory retrieval, used to prioritize messages for context creation.
Attributes:¶
- `memory_record`: A `MemoryRecord`
- `score`: A float value representing the relevance or importance of the record
3.3 Memory¶
The central component of Sikka Agent's memory system.
Attributes:¶
- `storage`: Storage backend for memory (implements the `MemoryStorage` protocol)
- `window_size`: Maximum number of messages to keep (optional)
- `keep_rate`: Priority factor for recent messages (0.0-1.0, default: 0.9)
- `token_counter`: Function to count tokens in messages
- `token_limit`: Maximum tokens for context (default: 4096)
Methods:¶
- `add_message(message, role_type)`: Add a message to memory
- `get_context()`: Get context messages within the token limit
- `clear()`: Remove all messages from memory
4. Storage Options¶
Sikka Agent provides several storage backends for different use cases:
4.1 InMemoryStorage¶
The simplest storage backend that keeps all data in memory. This is the default storage option and is suitable for most use cases.
```python
from sikkaagent.storages import InMemoryStorage
from sikkaagent.memories import Memory

# Create in-memory storage
storage = InMemoryStorage()

# Create memory with the storage
memory = Memory(storage=storage)
```
4.2 Vector Database Storage¶
For advanced retrieval capabilities and semantic search, Sikka Agent provides several vector database storage options.
4.2.1 Qdrant¶
```python
from sikkaagent.storages import Qdrant
from sikkaagent.memories import Memory
from sikkaagent.utils.enums import Distance

# Create Qdrant storage
storage = Qdrant(
    collection_name="conversation_memory",  # Name of the collection
    vector_size=384,                        # Dimension of the vectors
    distance=Distance.COSINE,               # Distance metric
    url=None,                               # URL for remote Qdrant server (None for in-memory)
    api_key=None                            # API key for remote Qdrant server
)

# Create memory with Qdrant storage
memory = Memory(storage=storage)
```
4.2.2 Other Vector Database Options¶
Sikka Agent also provides implementations for other vector databases:
- ChromaDB
- PineconeDB
- SingleStore
5. Advanced Topics¶
5.1 Memory Configuration¶
5.1.1 Windowed Memory¶
Limit memory to a fixed number of recent messages:
```python
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage

# Create memory with window size limit
memory = Memory(
    storage=InMemoryStorage(),
    window_size=10  # Only keep the last 10 messages
)

# When conversation exceeds the window size,
# oldest messages are removed automatically
```
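The pruning behavior can be sketched with the standard library alone: a `collections.deque` with `maxlen` drops the oldest entries automatically, which mirrors what `window_size` does. This is a sketch of the concept, not Sikka Agent's internals.

```python
from collections import deque

# A windowed buffer: once full, appending evicts the oldest entry.
window = deque(maxlen=3)
for i in range(5):
    window.append(f"message {i}")

print(list(window))  # ['message 2', 'message 3', 'message 4']
```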
5.1.2 Token-Limited Memory¶
Limit memory based on token count to stay within model context limits:
```python
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage
from sikkaagent.models import ModelConfigure
from sikkaagent.utils.enums import ModelPlatformType

# Initialize model for token counting
model = ModelConfigure(
    model="llama3.1:8b",
    model_platform=ModelPlatformType.OLLAMA
)

# Create memory with token limit
memory = Memory(
    storage=InMemoryStorage(),
    token_counter=model.token_counter,
    token_limit=4000  # Maximum tokens for context
)

# Memory will automatically prioritize and prune messages
# to stay within the 4000 token limit
```
5.2 Message Prioritization¶
The Memory class uses a scoring mechanism to prioritize messages when the token limit is reached:
- System messages always have the highest priority (score = 1.0)
- Other messages get decreasing scores based on recency and the `keep_rate` parameter
- When the token limit is reached, messages with higher scores are kept
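One plausible reading of this scheme is a geometric decay with distance from the most recent message. The exact formula is not documented here, so treat the `score` function below as an assumption for illustration only:

```python
keep_rate = 0.9

messages = [
    ("system", "You are a helpful assistant."),
    ("user", "First question"),
    ("assistant", "First answer"),
    ("user", "Most recent question"),
]

def score(index: int, total: int, role: str) -> float:
    # System messages are pinned at the maximum score; other messages
    # decay geometrically with distance from the most recent message.
    if role == "system":
        return 1.0
    return keep_rate ** (total - 1 - index)

scores = [round(score(i, len(messages), role), 3) for i, (role, _) in enumerate(messages)]
print(scores)  # [1.0, 0.81, 0.9, 1.0]
```

Under this reading, a lower `keep_rate` makes older messages fall off faster, matching the comment in the example below.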
```python
from sikkaagent.memories import Memory
from sikkaagent.storages import InMemoryStorage

# Create memory with custom keep_rate
memory = Memory(
    storage=InMemoryStorage(),
    keep_rate=0.8  # Lower value gives less priority to older messages
)
```
5.3 Performance Considerations¶
- Storage Selection: Choose the appropriate storage backend based on your needs:
    - `InMemoryStorage` for simple applications without persistence
    - Vector database storage for semantic search capabilities
- Token Management: Use `window_size` and `token_limit` to manage token usage in long conversations
- System Messages: System messages are always kept in context with maximum priority, so keep them concise
- Multiple Memories: Create separate memory instances for different conversation contexts
6. Implementation Details¶
6.1 Memory Protocol¶
The `MemoryStorage` protocol defines the interface that all storage backends must implement:
```python
from typing import Any, Protocol


class MemoryStorage(Protocol):
    """Protocol for memory storage implementations"""

    def save(self, records: list[dict[str, Any]]) -> None:
        """Save records to storage"""
        ...

    def load(self) -> list[dict[str, Any]]:
        """Load records from storage"""
        ...

    def clear(self) -> None:
        """Clear all records from storage"""
        ...
```
6.2 Context Creation Process¶
The `get_context()` method follows these steps:
- Load all records from storage
- Apply the window size limit if specified
- Convert records to `MemoryRecord` objects
- Score messages based on recency (using `keep_rate`)
- Prioritize system messages (always kept with full score)
- Select messages to include based on scores and the token limit
- Return the selected messages in chronological order
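The steps above can be sketched as a greedy selection under a token budget. Everything below (the tuple record shape, the word-count token counter, and the hard-coded scores) is illustrative, not Sikka Agent's actual code:

```python
# Each record: (index, role, text, score). Scores follow the
# prioritization scheme described earlier (system pinned at 1.0).
records = [
    (0, "system", "You are a helpful assistant.", 1.0),
    (1, "user", "Tell me about Python.", 0.81),
    (2, "assistant", "Python is a programming language.", 0.9),
    (3, "user", "What about its typing?", 1.0),
]

def count_tokens(text: str) -> int:
    return len(text.split())  # crude stand-in for a real token counter

def get_context(records, token_limit):
    selected, used = [], 0
    # Walk records from highest score down, keeping those that still fit.
    for rec in sorted(records, key=lambda r: r[3], reverse=True):
        tokens = count_tokens(rec[2])
        if used + tokens <= token_limit:
            selected.append(rec)
            used += tokens
    # Return the survivors in chronological order.
    return sorted(selected, key=lambda r: r[0]), used

context, tokens = get_context(records, token_limit=12)
print([r[1] for r in context], tokens)
```

With a budget of 12 "tokens", the system message and the most recent user message survive; the lower-scored middle turns are pruned first.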
6.3 Image Handling¶
The Memory system has special handling for messages containing images:
- Images in messages are preserved during serialization/deserialization
- When converting to OpenAI format, images are properly encoded as base64 strings
- Image detail level can be specified in the original message
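The OpenAI-compatible image format referred to here is an `image_url` content part carrying a base64 data URL. A minimal sketch of the encoding step (the fake image bytes and the message shape are illustrative):

```python
import base64

# Fake image bytes stand in for a real file read; the data-URL shape
# matches the OpenAI image_url content format.
image_bytes = b"\x89PNG fake image data"
encoded = base64.b64encode(image_bytes).decode("ascii")

message_content = {
    "type": "image_url",
    "image_url": {"url": f"data:image/png;base64,{encoded}", "detail": "low"},
}
print(message_content["image_url"]["url"][:22])  # data:image/png;base64,
```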
7. Best Practices¶
- System Messages: Keep system messages concise as they always have maximum priority
- Token Efficiency: Use `window_size` and `token_limit` to manage token usage in long conversations
- Prioritization: Adjust `keep_rate` (default 0.9) to change the priority emphasis on recent messages
- Topic Transitions: Use `memory.clear()` to reset conversation context when changing topics
- Debugging: Inspect memory contents with `context_messages, _ = memory.get_context()`
- Storage Selection: Choose the appropriate storage backend based on your needs:
    - `InMemoryStorage` for simple applications
    - Vector database storage for semantic search capabilities