Memory
Short-term memory, conversation stores, vector stores, and user memory.
Short-term memory
ShortTermMemory
class ShortTermMemory(max_tokens: int = 40000)
Sliding-window conversation buffer for a single agent run.
Stores the ordered message history and automatically trims to fit within a
token budget when requested. Uses the heuristic chars / 4 ≈ tokens
when no token-counting transport is available.
Example:
memory = ShortTermMemory(max_tokens=8000)
memory.add_user("Hello, how are you?")
memory.add_assistant(completion)
msgs = memory.messages()  # trimmed to budget
Parameters:
| Name | Type | Description |
|---|---|---|
| max_tokens | int | Maximum number of tokens to retain in the window. Defaults to 40000. |
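The chars / 4 heuristic can be sketched as follows. This is a minimal illustration rather than the library's implementation: estimate_tokens and estimate_history are hypothetical helper names, messages are assumed to be plain dicts with a "content" key, and rounding down is an assumption.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: about one token per four characters."""
    # Integer division keeps the estimate cheap; real tokenizers will differ.
    return len(text) // 4


def estimate_history(messages: list[dict]) -> int:
    """Sum the estimate over the text content of each message."""
    return sum(estimate_tokens(str(m.get("content", ""))) for m in messages)


history = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "Fine, thanks. You?"},
]
total = estimate_history(history)
```

Such an estimate is only a fallback; when the transport can count tokens, that count should be preferred.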
ShortTermMemory.add_user
def add_user(self, content: str | list[Any]) -> None
Append a user message to the buffer.
Parameters:
| Name | Type | Description |
|---|---|---|
| content | str \| list[Any] | Plain text string or list of content blocks. |
ShortTermMemory.add_assistant
def add_assistant(self, completion: Any) -> None
Append an assistant completion to the buffer.
Accepts a Completion dataclass (with .content and
.tool_calls attributes) or a plain dict.
Parameters:
| Name | Type | Description |
|---|---|---|
| completion | Any | A Completion object or {"role": "assistant", "content": "..."} dict. |
ShortTermMemory.add_tool_result
def add_tool_result(self, result: Any) -> None
Append a tool result message to the buffer.
Accepts a ToolResult dataclass or a plain dict.
Parameters:
| Name | Type | Description |
|---|---|---|
| result | Any | A ToolResult object or dict. |
ShortTermMemory.set_summary
def set_summary(self, text: str) -> None
Store text as the conversation summary.
Called by the runner after a summarisation LLM call completes.
The summary is persisted via snapshot() / restore() so
resumed sessions carry it forward.
Parameters:
| Name | Type | Description |
|---|---|---|
| text | str | Compressed summary of older conversation turns. |
ShortTermMemory.messages_to_summarize
def messages_to_summarize(self, keep_recent: int = 6) -> list[Any]
Return the slice of messages that should be compressed.
Returns the oldest (total - keep_recent) non-system messages.
System messages are excluded because they are already managed
separately (they are never dropped by messages() either).
Parameters:
| Name | Type | Description |
|---|---|---|
| keep_recent | int | Number of most-recent non-system messages to preserve verbatim. Defaults to 6 (≈ 3 user/assistant pairs). |
Returns: list[Any] — List of messages to feed to the summarisation LLM call.
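The slice described above can be sketched as a standalone function. This is an illustration under stated assumptions, not the library's code: messages are assumed to be dicts with a "role" key, and the function name merely mirrors the method.

```python
def messages_to_summarize(messages: list[dict], keep_recent: int = 6) -> list[dict]:
    """Return the oldest (total - keep_recent) non-system messages."""
    non_system = [m for m in messages if m.get("role") != "system"]
    # Never go negative: with fewer than keep_recent messages nothing is summarised.
    cutoff = max(len(non_system) - keep_recent, 0)
    return non_system[:cutoff]


history = [{"role": "system", "content": "You are helpful."}]
for i in range(5):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

old = messages_to_summarize(history, keep_recent=6)
```

With ten non-system messages and keep_recent=6, the four oldest ones are selected and the system message is left out.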
ShortTermMemory.trim_to_recent
def trim_to_recent(self, keep_recent: int = 6) -> None
Drop all but the most-recent keep_recent non-system messages.
Called by the runner after the summarisation call so the buffer
only holds recent turns while the older context lives in
self._summary.
Parameters:
| Name | Type | Description |
|---|---|---|
| keep_recent | int | Number of most-recent non-system messages to keep. Defaults to 6. |
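A rough sketch of the summarise-then-trim step follows. SketchMemory is a hypothetical stand-in, not the real ShortTermMemory; it assumes dict messages and assumes that system messages survive the trim in their original positions.

```python
from typing import Any, Optional


class SketchMemory:
    """Hypothetical stand-in for the buffer described above."""

    def __init__(self) -> None:
        self.buffer: list[dict[str, Any]] = []
        self.summary: Optional[str] = None

    def set_summary(self, text: str) -> None:
        self.summary = text

    def trim_to_recent(self, keep_recent: int = 6) -> None:
        # Positions of non-system messages, oldest first.
        idx = [i for i, m in enumerate(self.buffer) if m["role"] != "system"]
        drop = set(idx[: max(len(idx) - keep_recent, 0)])
        # Keep system messages and the newest keep_recent other messages.
        self.buffer = [m for i, m in enumerate(self.buffer) if i not in drop]


memory = SketchMemory()
memory.buffer.append({"role": "system", "content": "You are helpful."})
for i in range(5):
    memory.buffer.append({"role": "user", "content": f"question {i}"})
    memory.buffer.append({"role": "assistant", "content": f"answer {i}"})

# The runner would obtain this text from a summarisation LLM call.
memory.set_summary("Summary of the first two exchanges.")
memory.trim_to_recent(keep_recent=6)
```

After the trim the buffer holds the system message plus the last three exchanges, while the older turns are represented only by the summary.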
ShortTermMemory.messages
def messages(self) -> list[Any]
Return the current message list, trimmed to fit the token window.
The trim is applied to a copy; the internal buffer is NOT
modified. Call trim_to_fit() explicitly to mutate the buffer.
Returns: list[Message] — Ordered list of messages within the token budget.
ShortTermMemory.trim_to_fit
def trim_to_fit(self, max_tokens: int) -> None
Drop oldest non-system messages until the token estimate fits.
Unlike messages() this mutates the internal buffer.
Parameters:
| Name | Type | Description |
|---|---|---|
| max_tokens | int | Target token budget. |
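The difference between a trimmed view and an in-place trim can be sketched as below. The helper names estimate and trimmed_copy are hypothetical, messages are assumed to be dicts, and the chars / 4 estimate stands in for whatever counter the library actually uses.

```python
from typing import Any


def estimate(messages: list[dict[str, Any]]) -> int:
    """chars / 4 heuristic summed over message contents."""
    return sum(len(str(m["content"])) // 4 for m in messages)


def trimmed_copy(messages: list[dict[str, Any]], max_tokens: int) -> list[dict[str, Any]]:
    """Drop the oldest non-system messages from a copy until it fits."""
    result = list(messages)  # shallow copy: the caller's list keeps its length
    while estimate(result) > max_tokens:
        oldest = next((i for i, m in enumerate(result) if m["role"] != "system"), None)
        if oldest is None:
            break  # only system messages remain; nothing more can be dropped
        del result[oldest]
    return result


buffer = [
    {"role": "system", "content": "s" * 40},
    {"role": "user", "content": "a" * 40},
    {"role": "assistant", "content": "b" * 40},
    {"role": "user", "content": "c" * 40},
]
view = trimmed_copy(buffer, max_tokens=25)       # like messages(): buffer untouched
length_after_view = len(buffer)
buffer[:] = trimmed_copy(buffer, max_tokens=25)  # like trim_to_fit(): buffer mutated
```

Each message here is estimated at 10 tokens, so a budget of 25 leaves the system message and the newest user message.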
ShortTermMemory.clear
def clear(self) -> None
Clear all messages from the buffer.
Returns: None
ShortTermMemory.snapshot
def snapshot(self) -> Any
Return a deep copy of the current memory state.
The returned object includes both the message list and the conversation summary (if any). It is independent of the internal buffer, so mutating it does not affect the memory.
The format is a dict with "messages" and "summary" keys
so that resumed sessions carry the summary forward. Old snapshots
that are plain list objects are still accepted by restore()
for backward compatibility.
Returns: dict[str, Any] — Snapshot dict {"messages": [...], "summary": str | None}.
ShortTermMemory.restore
def restore(self, data: Any) -> None
Restore the memory buffer from a snapshot.
Accepts both the new dict snapshot format ({"messages": [...], "summary": ...}) and the legacy plain list format produced by
older versions of snapshot().
Parameters:
| Name | Type | Description |
|---|---|---|
| data | Any | Snapshot produced by snapshot(), or a plain list of message objects for backward compatibility. |
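Handling both snapshot formats can be sketched as a small normalisation step. normalise_snapshot is a hypothetical helper, not part of the documented API, and raising TypeError for other inputs is an assumption.

```python
import copy
from typing import Any


def normalise_snapshot(data: Any) -> dict[str, Any]:
    """Accept the dict format or a legacy plain list and return the dict format."""
    if isinstance(data, dict):
        return {
            "messages": copy.deepcopy(data.get("messages", [])),
            "summary": data.get("summary"),
        }
    if isinstance(data, list):
        # Legacy snapshots carried no summary.
        return {"messages": copy.deepcopy(data), "summary": None}
    raise TypeError(f"unsupported snapshot type: {type(data).__name__}")


legacy = [{"role": "user", "content": "hi"}]
modern = {"messages": [{"role": "user", "content": "hi"}], "summary": "greeting"}

from_legacy = normalise_snapshot(legacy)
from_modern = normalise_snapshot(modern)
```

A restore() built on such a step would then assign the messages and the summary from one consistent shape.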
Conversation store
ConversationStore
class ConversationStore
Protocol for persisting and retrieving full conversation histories.
Keyed by an arbitrary string conversation_id (typically a session or
user identifier).
ConversationStore.load
def load(self, conversation_id: str) -> list[Any]
Load the message history for conversation_id.
Parameters:
| Name | Type | Description |
|---|---|---|
| conversation_id | str | Unique conversation / session identifier. |
Returns: list[Message] — Ordered list of Message objects (empty list when not
found).
ConversationStore.save
def save(self, conversation_id: str, messages: list[Any]) -> None
Persist the message history for conversation_id.
Overwrites any existing history for that ID.
Parameters:
| Name | Type | Description |
|---|---|---|
| conversation_id | str | Unique conversation / session identifier. |
| messages | list[Any] | Ordered list of Message objects to persist. |
ConversationStore.delete
def delete(self, conversation_id: str) -> None
Delete the history for conversation_id.
Parameters:
| Name | Type | Description |
|---|---|---|
| conversation_id | str | Unique conversation / session identifier. |
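A custom backend only needs to provide these three methods. The sketch below expresses the shape with typing.Protocol; ConversationStoreSketch and DictStore are hypothetical names, and the methods are written synchronously to match the signatures listed here, although the real methods may be coroutines.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ConversationStoreSketch(Protocol):
    """Structural type mirroring the three documented methods."""

    def load(self, conversation_id: str) -> list[Any]: ...
    def save(self, conversation_id: str, messages: list[Any]) -> None: ...
    def delete(self, conversation_id: str) -> None: ...


class DictStore:
    """Minimal backend that satisfies the protocol without inheriting from it."""

    def __init__(self) -> None:
        self._data: dict[str, list[Any]] = {}

    def load(self, conversation_id: str) -> list[Any]:
        return list(self._data.get(conversation_id, []))  # empty list when not found

    def save(self, conversation_id: str, messages: list[Any]) -> None:
        self._data[conversation_id] = list(messages)  # overwrites any existing history

    def delete(self, conversation_id: str) -> None:
        self._data.pop(conversation_id, None)


store = DictStore()
store.save("session-abc", [{"role": "user", "content": "hi"}])
store.save("session-abc", [{"role": "user", "content": "hello again"}])
loaded = store.load("session-abc")
```

Because the protocol is structural, any class with matching methods can be passed where a store is expected.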
InMemoryConversationStore
class InMemoryConversationStore()
In-memory store for full conversation histories.
Implements the ConversationStore protocol. Each conversation is keyed
by an arbitrary string identifier (typically a user ID or session UUID).
Deep copies are used on both load and save so that the caller
cannot inadvertently mutate stored data.
Example:
store = InMemoryConversationStore()
await store.save("session-abc", messages)
loaded = await store.load("session-abc")
InMemoryConversationStore.load
def load(self, conversation_id: str) -> Any
Load the conversation snapshot for conversation_id.
Returns an empty list when the conversation does not exist. This preserves
backward compatibility: callers that check if prior: still work on empty lists.
Parameters:
| Name | Type | Description |
|---|---|---|
| conversation_id | str | Unique conversation identifier. |
Returns: dict[str, Any] | list[Any] — A deep copy of the stored snapshot. When the snapshot was
created by ShortTermMemory.snapshot() this is a
{"messages": [...], "summary": ...} dict; for legacy plain
lists the raw list is returned.
InMemoryConversationStore.save
def save(self, conversation_id: str, snapshot: Any) -> None
Persist the conversation snapshot for conversation_id.
Overwrites any existing entry for that identifier. A deep copy is stored to prevent the caller from mutating the stored data.
Accepts both the new dict snapshot format
({"messages": [...], "summary": ...}) produced by
ShortTermMemory.snapshot() and the legacy plain list[Message]
format so that code written against the old API continues to work.
Plain lists are automatically normalised to the dict format so that
load() always returns a consistent shape.
Parameters:
| Name | Type | Description |
|---|---|---|
| conversation_id | str | Unique conversation identifier. |
| snapshot | Any | Snapshot dict or message list to persist. |
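Why deep copies matter on both save and load can be seen in a small sketch. CopyingStore is a hypothetical stand-in that assumes dict messages and normalises plain lists to the dict format on save; it is not the library's class.

```python
import copy
from typing import Any


class CopyingStore:
    """Hypothetical store that deep-copies on the way in and on the way out."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}

    def save(self, conversation_id: str, snapshot: Any) -> None:
        if isinstance(snapshot, list):
            # Normalise legacy lists so stored entries share one shape.
            snapshot = {"messages": snapshot, "summary": None}
        self._data[conversation_id] = copy.deepcopy(snapshot)

    def load(self, conversation_id: str) -> Any:
        return copy.deepcopy(self._data.get(conversation_id, []))


store = CopyingStore()
messages = [{"role": "user", "content": "hi"}]
store.save("session-abc", messages)

messages[0]["content"] = "changed after save"    # does not reach the store
loaded = store.load("session-abc")
loaded["messages"].append({"role": "user", "content": "extra"})  # nor does this
reloaded = store.load("session-abc")
```

A shallow copy would not be enough here, because the nested message dicts would still be shared with the caller.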
InMemoryConversationStore.delete
def delete(self, conversation_id: str) -> None
Delete the history for conversation_id.
Silently does nothing when the conversation does not exist.
Parameters:
| Name | Type | Description |
|---|---|---|
| conversation_id | str | Unique conversation identifier. |
InMemoryConversationStore.list_conversations
def list_conversations(self) -> list[str]
Return a sorted list of all stored conversation identifiers.
Returns: list[str] — Sorted list of conversation IDs.
InMemoryConversationStore.clear
def clear(self) -> None
Remove all stored conversation histories.
Returns: None
Vector store
InMemoryVectorStore
class InMemoryVectorStore()
In-memory vector store using TF-IDF cosine similarity.
Implements the MemoryStore protocol. Suitable for development and
testing; no external dependencies required.
Example:
store = InMemoryVectorStore()
doc_id = await store.upsert("The quick brown fox", metadata={"tag": "test"})
results = await store.search("quick fox", k=3)
InMemoryVectorStore.upsert
def upsert(self, content: str, id: str | None = None, metadata: dict[str, Any] | None = None, embedding: list[float] | None = None) -> str
Insert or update a document.
Parameters:
| Name | Type | Description |
|---|---|---|
| content | str | The text content to store. |
| id | str \| None | Optional stable identifier. A UUID4 is generated when None. |
| metadata | dict[str, Any] \| None | Optional key/value metadata dict. |
| embedding | list[float] \| None | Pre-computed embedding vector as a list of floats. When provided it is used directly (after L2-normalisation); the TF-IDF computation is skipped. Must be dense and compatible with the cosine-similarity computation. |
Returns: str — The document's identifier.
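L2-normalisation of a supplied embedding can be sketched as follows. l2_normalise and dot are hypothetical helpers; returning a zero vector unchanged is an assumption about edge-case handling.

```python
import math


def l2_normalise(vec: list[float]) -> list[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)  # a zero vector has no direction to preserve
    return [x / norm for x in vec]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


unit = l2_normalise([3.0, 4.0])
same_direction = l2_normalise([6.0, 8.0])
similarity = dot(unit, same_direction)
```

Once every stored vector has unit length, cosine similarity reduces to a plain dot product, which is presumably why normalisation happens at insert time.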
InMemoryVectorStore.search
def search(self, query: str, k: int = 5, filter: dict[str, Any] | None = None) -> list[MemoryResult]
Search for documents semantically similar to query.
Parameters:
| Name | Type | Description |
|---|---|---|
| query | str | Natural-language query string. |
| k | int | Maximum number of results to return. |
| filter | dict[str, Any] \| None | Optional metadata filter. Only documents whose metadata contains all specified key/value pairs are returned. |
Returns: list[MemoryResult] — Up to k results ordered by descending cosine similarity.
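A TF-IDF cosine search with a metadata filter can be sketched in plain Python. Everything here is an assumption for illustration: whitespace tokenisation, the smoothed IDF formula, filtering before scoring, and documents as dicts with "content" and "metadata" keys. The library's weighting and result type may differ.

```python
import math
from collections import Counter
from typing import Any, Optional


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(docs: list[dict[str, Any]], query: str, k: int = 5,
           filter: Optional[dict[str, Any]] = None) -> list[tuple[float, dict[str, Any]]]:
    wanted = filter or {}
    # Keep only documents whose metadata contains every requested key/value pair.
    candidates = [d for d in docs
                  if all(d["metadata"].get(key) == value for key, value in wanted.items())]
    if not candidates:
        return []
    tokenised = [tokenize(d["content"]) for d in candidates]
    n = len(tokenised)
    df = Counter(term for doc in tokenised for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}  # smoothed IDF
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in tokenised]
    # Query terms unseen in the corpus carry no weight.
    q_vec = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = sorted(((cosine(q_vec, v), d) for v, d in zip(vectors, candidates)),
                    key=lambda pair: pair[0], reverse=True)
    return scored[:k]


corpus = [
    {"content": "the quick brown fox", "metadata": {"tag": "test"}},
    {"content": "a slow green turtle", "metadata": {"tag": "test"}},
    {"content": "quick fox", "metadata": {"tag": "other"}},
]
results = search(corpus, "quick fox", k=3)
filtered = search(corpus, "quick fox", k=1, filter={"tag": "test"})
```

Because the score depends only on shared terms, a query that shares no words with a document scores 0.0 for it, which is the main limitation compared with embedding-based search.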
InMemoryVectorStore.get
def get(self, id: str) -> MemoryResult | None
Retrieve a document by its identifier.
Parameters:
| Name | Type | Description |
|---|---|---|
| id | str | Document identifier. |
Returns: MemoryResult | None — The MemoryResult, or None when not found.
InMemoryVectorStore.delete
def delete(self, ids: list[str]) -> None
Delete documents by their identifiers.
Parameters:
| Name | Type | Description |
|---|---|---|
| ids | list[str] | List of document identifiers to remove. |
InMemoryVectorStore.clear
def clear(self) -> None
Remove all documents from the store.
Returns: None
User memory
MemoryFact
A single persisted fact about a user.
Facts are stored with a confidence score [0.0–1.0] that can decay over time if not reinforced.
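One way confidence decay and reinforcement could work is sketched below. The field names, the exponential half-life formula, and the reinforcement boost are all assumptions made for illustration; the actual MemoryFact fields and decay rule are not specified here.

```python
from dataclasses import dataclass


@dataclass
class FactSketch:
    """Hypothetical stand-in for a stored fact; not the real MemoryFact."""

    text: str
    confidence: float  # expected to stay within [0.0, 1.0]

    def decayed(self, days_idle: float, half_life_days: float = 30.0) -> float:
        # Assumed rule: confidence halves every half_life_days without reinforcement.
        factor = 0.5 ** (days_idle / half_life_days)
        return max(0.0, min(1.0, self.confidence * factor))

    def reinforce(self, boost: float = 0.2) -> None:
        # Seeing the fact again raises confidence, capped at 1.0.
        self.confidence = min(1.0, self.confidence + boost)


fact = FactSketch(text="prefers tea", confidence=0.8)
after_month = fact.decayed(days_idle=30.0)
fact.reinforce()
```

Under these assumed defaults a fact left untouched for one half-life keeps half of its confidence.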
UserMemoryStore
Protocol for user-level persistent memory stores.
InMemoryUserMemoryStore
In-process UserMemoryStore for testing and development.
Uses simple substring matching for search (no vector similarity).
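Substring matching can be sketched as below. SubstringFactStore and its method names are hypothetical, facts are reduced to plain strings, and case-insensitive matching is an assumption.

```python
class SubstringFactStore:
    """Hypothetical per-user fact store searched by substring."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = {}

    def add(self, user_id: str, fact: str) -> None:
        self._facts.setdefault(user_id, []).append(fact)

    def search(self, user_id: str, query: str, top_k: int = 5) -> list[str]:
        needle = query.lower()
        # A fact matches when the query text appears anywhere inside it.
        hits = [f for f in self._facts.get(user_id, []) if needle in f.lower()]
        return hits[:top_k]


store = SubstringFactStore()
store.add("user-1", "Prefers green tea in the morning")
store.add("user-1", "Works as a data engineer")
store.add("user-2", "Allergic to tea tree oil")

tea_facts = store.search("user-1", "tea")
no_match = store.search("user-1", "coffee")
```

Unlike vector similarity, this finds only literal matches, so a query for "beverage" would miss the tea fact.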
@remember decorator
remember
Opt an @agent() class into automatic user memory extraction/injection.
Must be applied BELOW @agent():
@agent(model="claude-haiku-4-5")
@remember(store="user_memory", extract=True, inject=True, top_k=5)
class PersonalAssistant: ...
When inject=True, relevant memories are prepended to the system prompt before each LLM call.
When extract=True, new facts are extracted from each conversation turn and stored in the UserMemoryStore.
Parameters:
| Name | Type | Description |
|---|---|---|
| store | str \| None | DI token name for UserMemoryStore (None = auto-inject). |
| extract | bool | Extract new facts after each turn. |
| inject | bool | Inject relevant memories before each turn. |
| top_k | int | Number of memories to inject. |
| extraction_model | str \| None | Model for fact extraction (defaults to agent model). |
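Python applies stacked decorators from the bottom up, which is one plausible reason for the ordering requirement: the lower decorator runs first, so the upper one can see what it attached. The sketch below uses hypothetical stand-ins for agent and remember (the attribute names and the prompt format are invented for illustration) and is not the library's implementation.

```python
def remember(store=None, extract=True, inject=True, top_k=5):
    """Stand-in: records the memory settings on the class."""
    def wrap(cls):
        cls._memory_config = {"store": store, "extract": extract,
                              "inject": inject, "top_k": top_k}
        return cls
    return wrap


def agent(model):
    """Stand-in: registers the class and checks for memory settings."""
    def wrap(cls):
        cls._model = model
        # Only settings attached before this point are visible here.
        cls._memory_enabled = hasattr(cls, "_memory_config")
        return cls
    return wrap


@agent(model="claude-haiku-4-5")
@remember(store="user_memory", extract=True, inject=True, top_k=5)
class PersonalAssistant:
    pass


@remember(store="user_memory")
@agent(model="claude-haiku-4-5")
class WrongOrder:
    pass


def inject_memories(system_prompt: str, memories: list[str], top_k: int) -> str:
    """Prepend up to top_k memories to the system prompt."""
    chosen = memories[:top_k]
    if not chosen:
        return system_prompt
    bullet_list = "\n".join(f"- {m}" for m in chosen)
    return f"Known facts about the user:\n{bullet_list}\n\n{system_prompt}"


prompt = inject_memories("Be brief.", ["likes tea", "lives in Oslo"], top_k=1)
```

In this sketch PersonalAssistant ends up with memory enabled, while WrongOrder does not, because its agent stand-in ran before the settings were attached.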