Short-Term Memory API Reference¶
This document is a reference for PenguiFlow’s built-in short-term memory (STM) API used by ReactPlanner.
If you’re looking for practical usage and examples, start with:
- Memory guide
- docs/migration/MEMORY_ADOPTION.md
Import Paths¶
- Planner: from penguiflow.planner import ReactPlanner
- Memory types: from penguiflow.planner.memory import ...
Core Types¶
ConversationTurn¶
Atomic unit of STM. It represents a full user → assistant exchange.
Fields:
- user_message: str
- assistant_response: str
- trajectory_digest: TrajectoryDigest | None
- artifacts_shown: dict[str, Any]
- artifacts_hidden_refs: list[str]
- ts: float
TrajectoryDigest¶
Compressed record of tool usage for a turn (optional). This is intended to preserve “what happened” without storing full tool payloads.
Fields:
- tools_invoked: list[str]
- observations_summary: str
- reasoning_summary: str | None
- artifacts_refs: list[str]
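As a reading aid, the two core types can be pictured as plain dataclasses. The sketch below is a stand-in that mirrors the field lists above; it is not the library's own definition. The default values, the interpretation of ts as a Unix timestamp, and the sample tool name are assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class TrajectoryDigest:
    """Stand-in mirroring the documented fields (not the library class)."""

    tools_invoked: list[str] = field(default_factory=list)
    observations_summary: str = ""
    reasoning_summary: Optional[str] = None
    artifacts_refs: list[str] = field(default_factory=list)


@dataclass
class ConversationTurn:
    """Stand-in for one full user -> assistant exchange."""

    user_message: str
    assistant_response: str
    trajectory_digest: Optional[TrajectoryDigest] = None
    artifacts_shown: dict[str, Any] = field(default_factory=dict)
    artifacts_hidden_refs: list[str] = field(default_factory=list)
    ts: float = field(default_factory=time.time)  # assumed: Unix timestamp


turn = ConversationTurn(
    user_message="What were Q3 sales?",
    assistant_response="Q3 sales were flat compared with Q2.",
    trajectory_digest=TrajectoryDigest(
        tools_invoked=["sales_lookup"],  # hypothetical tool name
        observations_summary="One row returned for Q3.",
    ),
)
```

In real code, import the types from penguiflow.planner.memory instead of redefining them.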
Configuration¶
ShortTermMemoryConfig¶
Top-level configuration passed into ReactPlanner(short_term_memory=...).
Key fields:
- strategy: "none" | "truncation" | "rolling_summary"
- budget: MemoryBudget
- isolation: MemoryIsolation
- summarizer_model: str | None
- include_trajectory_digest: bool
- recovery_backlog_limit: int
- retry_attempts: int
- retry_backoff_base_s: float
- degraded_retry_interval_s: float
- token_estimator: Callable[[str], int] | None
Callbacks (best-effort, non-blocking):
- on_turn_added(turn)
- on_summary_updated(old, new)
- on_health_changed(old, new)
MemoryBudget¶
Controls how much STM content is retained and what happens under overflow.
- full_zone_turns: number of recent turns kept as full messages
- summary_max_tokens: max tokens allowed for the rolling summary
- total_max_tokens: overall cap for the memory payload
- overflow_policy: "truncate_summary" | "truncate_oldest" | "error"
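A toy model can make the difference between the overflow policies concrete. The sketch below is not PenguiFlow's implementation: apply_budget, the Budget class, the whitespace-based token estimate, and the stand-in MemoryBudgetExceeded class are all hypothetical, and the trimming behavior shown for each policy is an assumed reading of the policy names.

```python
from dataclasses import dataclass


class MemoryBudgetExceeded(Exception):
    """Stand-in for the library error of the same name."""


@dataclass
class Budget:
    """Hypothetical stand-in carrying two of the MemoryBudget fields."""

    total_max_tokens: int
    overflow_policy: str  # "truncate_summary" | "truncate_oldest" | "error"


def estimate(text: str) -> int:
    """Crude token estimate: one token per whitespace-separated word."""
    return len(text.split())


def apply_budget(
    turns: list[str], summary: str, budget: Budget
) -> tuple[list[str], str]:
    """Return (turns, summary) adjusted according to the overflow policy."""

    def total(ts: list[str], s: str) -> int:
        return sum(estimate(t) for t in ts) + estimate(s)

    if total(turns, summary) <= budget.total_max_tokens:
        return turns, summary
    if budget.overflow_policy == "error":
        raise MemoryBudgetExceeded("memory payload exceeds total_max_tokens")
    if budget.overflow_policy == "truncate_summary":
        # Assumed behavior: keep turns intact, cut the summary to the room left.
        room = max(budget.total_max_tokens - total(turns, ""), 0)
        return turns, " ".join(summary.split()[:room])
    if budget.overflow_policy == "truncate_oldest":
        # Assumed behavior: drop the oldest turns until the payload fits.
        kept = list(turns)
        while kept and total(kept, summary) > budget.total_max_tokens:
            kept.pop(0)
        return kept, summary
    raise ValueError(f"unknown overflow_policy: {budget.overflow_policy!r}")
```

With turns ["a b c", "d e f"], summary "s1 s2 s3 s4", and a cap of 8 tokens, this sketch shortens the summary to "s1 s2" under "truncate_summary", drops the first turn under "truncate_oldest", and raises under "error".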
MemoryIsolation¶
Defines how to derive a session key from tool_context.
- tenant_key, user_key, session_key: dotted paths looked up in tool_context
- require_explicit_key: if True, STM is fail-closed when no key can be resolved
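To illustrate what a dotted-path lookup means here, the following is a minimal sketch. lookup_dotted is a hypothetical helper, not a PenguiFlow function, and the shape of tool_context is invented for the example; the library's own resolver may treat missing keys or non-mapping values differently.

```python
from typing import Any, Mapping, Optional


def lookup_dotted(context: Mapping[str, Any], path: str) -> Optional[Any]:
    """Follow a dotted path such as "auth.user.id" through nested mappings."""
    current: Any = context
    for part in path.split("."):
        if not isinstance(current, Mapping) or part not in current:
            return None  # the path cannot be resolved
        current = current[part]
    return current


# Hypothetical tool_context layout, for illustration only.
tool_context = {
    "auth": {"tenant": "acme", "user": {"id": "u-42"}},
    "session_id": "s-1",
}

tenant = lookup_dotted(tool_context, "auth.tenant")
user = lookup_dotted(tool_context, "auth.user.id")
session = lookup_dotted(tool_context, "session_id")
missing = lookup_dotted(tool_context, "auth.missing")  # unresolved path
```

An unresolved path is the case that require_explicit_key governs: with require_explicit_key=True, STM fails closed instead of falling back to some shared key.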
MemoryKey¶
Explicit composite key used for STM isolation:
- tenant_id
- user_id
- session_id
MemoryKey.composite() returns the storage key format: "{tenant}:{user}:{session}".
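The key format can be reproduced with a small stand-in class. This sketch mirrors the documented fields and the "{tenant}:{user}:{session}" format; it is not the library's class, and the sample IDs are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryKey:
    """Stand-in mirroring the documented fields and storage key format."""

    tenant_id: str
    user_id: str
    session_id: str

    def composite(self) -> str:
        # Join the three parts with colons, as described above.
        return f"{self.tenant_id}:{self.user_id}:{self.session_id}"


key = MemoryKey(tenant_id="acme", user_id="u-42", session_id="s-1")
print(key.composite())  # acme:u-42:s-1
```

In this sketch an ID that itself contains a colon would make the composite string ambiguous. Whether the library validates or escapes IDs is not covered here.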
Public Protocols¶
ShortTermMemory¶
Minimal protocol required by ReactPlanner:
- health -> MemoryHealth
- add_turn(turn) -> Awaitable[None]
- get_llm_context() -> Awaitable[Mapping[str, Any]]
- estimate_tokens() -> int
- flush() -> Awaitable[None]
Optional extensions (duck-typed):
- persist(store, key): called when the planner finishes successfully
- hydrate(store, key): called before injecting memory
- to_dict() / from_dict(state): for custom persistence backends
- get_artifact(ref): for artifact lookups (optional)
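The required members listed above can be sketched as a typing.Protocol together with a toy implementation. Everything named here is hypothetical: ShortTermMemoryLike restates the documented members and is not the library's protocol class, ListMemory is an illustration only, the "healthy" string is a placeholder for a MemoryHealth value, and turns are plain dicts, not ConversationTurn objects.

```python
import asyncio
from typing import Any, Mapping, Protocol


class ShortTermMemoryLike(Protocol):
    """Hypothetical Protocol restating the documented required members."""

    @property
    def health(self) -> Any: ...
    async def add_turn(self, turn: Any) -> None: ...
    async def get_llm_context(self) -> Mapping[str, Any]: ...
    def estimate_tokens(self) -> int: ...
    async def flush(self) -> None: ...


class ListMemory:
    """Toy in-memory implementation: keeps every turn, no summarization."""

    def __init__(self) -> None:
        self._turns: list[dict[str, str]] = []
        self.health = "healthy"  # assumed placeholder for a MemoryHealth value

    async def add_turn(self, turn: Mapping[str, str]) -> None:
        self._turns.append({"user": turn["user"], "assistant": turn["assistant"]})

    async def get_llm_context(self) -> Mapping[str, Any]:
        return {"conversation_memory": {"recent_turns": list(self._turns)}}

    def estimate_tokens(self) -> int:
        # Crude estimate: one token per whitespace-separated word.
        return sum(len(v.split()) for t in self._turns for v in t.values())

    async def flush(self) -> None:
        return None  # nothing is buffered in this toy version


async def demo() -> Mapping[str, Any]:
    memory: ShortTermMemoryLike = ListMemory()
    await memory.add_turn({"user": "hi there", "assistant": "hello"})
    await memory.flush()
    return await memory.get_llm_context()


context = asyncio.run(demo())
```

Because the optional extensions are duck-typed, a custom class like this one can leave out persist, hydrate, to_dict, from_dict, and get_artifact.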
Default Implementation¶
DefaultShortTermMemory¶
In-memory implementation that supports:
- truncation strategy
- rolling summary strategy (background LLM summarization when wired by ReactPlanner)
- failure → retry → degraded health transitions
- optional persistence via save_memory_state / load_memory_state on the store
The planner will create a DefaultShortTermMemory per MemoryKey when configured with ShortTermMemoryConfig.
Note:
- If you instantiate DefaultShortTermMemory directly with strategy="rolling_summary", you must provide a summarizer callable.
What Gets Injected Into llm_context¶
STM injects a JSON-friendly patch into llm_context:
{
  "conversation_memory": {
    "recent_turns": [{"user": "...", "assistant": "..."}],
    "pending_turns": [{"user": "...", "assistant": "..."}],
    "summary": "<session_summary>...</session_summary>"
  }
}
Notes:
- pending_turns and summary are present only when the strategy is rolling_summary and memory is healthy.
- In degraded mode, STM falls back to recent_turns only.
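The two notes above amount to a simple rule for which keys appear in the patch. The sketch below expresses that rule. build_patch is a hypothetical helper, and reducing memory health to a single healthy flag is a simplification; it is not how the library represents MemoryHealth.

```python
from typing import Any


def build_patch(
    recent_turns: list[dict[str, str]],
    pending_turns: list[dict[str, str]],
    summary: str,
    strategy: str,
    healthy: bool,
) -> dict[str, Any]:
    """Assemble a conversation_memory patch following the notes above."""
    memory: dict[str, Any] = {"recent_turns": recent_turns}
    # pending_turns and summary only accompany a healthy rolling summary.
    if strategy == "rolling_summary" and healthy:
        memory["pending_turns"] = pending_turns
        memory["summary"] = summary
    return {"conversation_memory": memory}


recent = [{"user": "hi", "assistant": "hello"}]
pending = [{"user": "and then?", "assistant": "next step"}]
text = "<session_summary>greeting</session_summary>"

healthy_patch = build_patch(recent, pending, text, "rolling_summary", healthy=True)
degraded_patch = build_patch(recent, pending, text, "rolling_summary", healthy=False)
```

Code that reads the patch should therefore treat pending_turns and summary as optional keys.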
ReactPlanner Integration Surface¶
ReactPlanner(..., short_term_memory=...)¶
Accepts either:
- ShortTermMemoryConfig (recommended)
- a custom ShortTermMemory implementation
ReactPlanner.run(..., memory_key=...) and ReactPlanner.resume(..., memory_key=...)¶
- If memory_key is provided, STM is scoped to it.
- If omitted, the planner may derive it from tool_context using MemoryIsolation.
- If no key is available and require_explicit_key=True, STM behaves as disabled for that call.
Errors¶
MemoryBudgetExceeded¶
Raised when budgets overflow and overflow_policy="error".
This is intended for “hard bound” environments where exceeding budgets is an operational error rather than a truncation signal.