ReactPlanner configuration (production patterns)
What it is / when to use it
This page is a configuration playbook for ReactPlanner in production services.
Use it when you need:
- a safe “default” configuration you can ship,
- concrete examples for multi-tenant memory + guardrails + steering,
- background task wiring (tasks.* tools and tool-initiated background jobs),
- a clear contract for what goes into `llm_context` vs `tool_context`.
Non-goals / boundaries
- This page is not a full API reference for every argument on `ReactPlanner`.
- This page does not document ToolNode transports in depth (see Tooling).
- This page does not define a single “best” architecture; it gives known-good starting points.
Contract surface
Planner construction vs per-call inputs
Construction time (static policy + defaults):
- LLM integration: `llm="..."` / `llm_client=...` / `use_native_llm=True`
- tool catalog: `catalog=[NodeSpec, ...]` (or `nodes=[Node, ...]` + `registry=ModelRegistry`)
- budgets: `max_iters`, `deadline_s`, `hop_budget`, `token_budget`
- safety: `tool_policy`, `guardrail_gateway`, `observation_guardrail`
- memory: `short_term_memory=ShortTermMemoryConfig(...)`
- background tasks: `background_tasks=BackgroundTasksConfig(...)`
- observability: `event_callback`, `stream_final_response`
Per call (tenant/session scoped):
- `query`
- `llm_context`: JSON-only, LLM-visible context
- `tool_context`: privileged tool runtime context (clients/secrets/callbacks)
- `memory_key`: explicit memory isolation key (`MemoryKey`)
- `tool_visibility`: dynamic per-call tool filtering (multi-tenant allowlists)
- `steering`: runtime control plane (`SteeringInbox`)
Warning
Treat llm_context and tool_context as a security boundary:
secrets and privileged objects belong in tool_context only.
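One way to enforce this boundary is to validate `llm_context` before every call. The sketch below is not part of the library: `check_llm_context` and the `SUSPICIOUS_KEY_PARTS` denylist are hypothetical names, and the check assumes that "JSON-only" means "survives `json.dumps`".

```python
from __future__ import annotations

import json
from typing import Any

# Hypothetical denylist; tune it to the secret names your service actually uses.
SUSPICIOUS_KEY_PARTS = ("secret", "token", "password", "api_key")


def check_llm_context(llm_context: dict[str, Any]) -> dict[str, Any]:
    """Return llm_context unchanged if it is JSON-only and has no secret-like keys."""
    try:
        # json.dumps raises TypeError for clients, callbacks, and other live objects.
        json.dumps(llm_context)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"llm_context must be JSON-serializable: {exc}") from exc

    def walk(value: Any, path: str) -> None:
        if isinstance(value, dict):
            for key, child in value.items():
                lowered = str(key).lower()
                if any(part in lowered for part in SUSPICIOUS_KEY_PARTS):
                    raise ValueError(f"secret-like key in llm_context: {path}{key}")
                walk(child, f"{path}{key}.")
        elif isinstance(value, list):
            for index, child in enumerate(value):
                walk(child, f"{path}{index}.")

    walk(llm_context, "")
    return llm_context
```

Calling it as `llm_context=check_llm_context({...})` at the request boundary makes a misplaced client or credential fail loudly before anything reaches the model. A key-name denylist is only a heuristic; it does not inspect values.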
Required tool_context keys (common patterns)
There is no single required key, but many subsystems use conventional keys:
- `session_id: str`: enables per-session dispatch (planner concurrency hotfix) and is required by tasks.* tools.
- `tenant_id` / `user_id: str`: used by memory isolation when deriving keys.
- `task_service: TaskService`: required for tasks.* tools and tool-initiated background spawns.
Operational defaults (enterprise-safe baseline)
These defaults are intended to be safe in multi-tenant services:
- `temperature=0.0`, `json_schema_mode=True`
- `max_iters=8` (lower if you have strict latency SLOs)
- `deadline_s` set per request (or set on the planner for a global ceiling)
- `llm_timeout_s` aggressively bounded; keep `llm_max_retries` small
- `tool_policy` deny-by-default for write/external tools unless explicitly enabled per tenant
- memory isolation: pass `memory_key=MemoryKey(...)` explicitly per call
- observation clamp: keep `ObservationGuardrailConfig()` enabled (default)
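These defaults can be kept in one place and merged with overrides that may only tighten budgets. The sketch below assumes the keyword names listed above; `BASELINE`, `CEILINGS`, and `planner_kwargs` are hypothetical names, and the timeout and retry values mirror the multi-tenant recipe on this page rather than any library default.

```python
from __future__ import annotations

from typing import Any

# Starting points taken from this page, not library defaults.
BASELINE: dict[str, Any] = {
    "temperature": 0.0,
    "json_schema_mode": True,
    "max_iters": 8,
    "llm_timeout_s": 45.0,
    "llm_max_retries": 2,
}

# Budgets that an override may lower but never raise.
CEILINGS = ("max_iters", "llm_timeout_s", "llm_max_retries")


def planner_kwargs(overrides: dict[str, Any] | None = None) -> dict[str, Any]:
    """Return baseline kwargs with overrides applied, clamping budgets to the baseline."""
    merged = {**BASELINE, **(overrides or {})}
    for name in CEILINGS:
        merged[name] = min(merged[name], BASELINE[name])
    return merged
```

A latency-sensitive service might then construct its planner with `**planner_kwargs({"max_iters": 4})`, while a misconfigured override such as `max_iters=50` is clamped back to the baseline.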
Configuration recipes
1) Minimal “local” planner (no memory, no steering)
```python
from __future__ import annotations

from pydantic import BaseModel

from penguiflow import ModelRegistry, Node
from penguiflow.catalog import build_catalog, tool
from penguiflow.planner import ReactPlanner, ToolContext


class EchoArgs(BaseModel):
    payload: dict


class EchoOut(BaseModel):
    payload: dict


@tool(desc="Echo input", side_effects="pure")
async def echo(args: EchoArgs, ctx: ToolContext) -> EchoOut:
    del ctx  # unused in this demo tool
    return EchoOut(payload=args.payload)


def build_planner() -> ReactPlanner:
    registry = ModelRegistry()
    registry.register("echo", EchoArgs, EchoOut)
    catalog = build_catalog([Node(echo, name="echo")], registry)
    return ReactPlanner(llm="gpt-4o-mini", catalog=catalog)
```
2) Multi-tenant service baseline (memory + strict budgets)
This pattern:
- uses explicit memory keys (fail-closed isolation),
- enforces budgets and timeouts,
- keeps secrets out of `llm_context`,
- emits events for observability.
```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any

from pydantic import BaseModel  # required by the argument/output models below

from penguiflow import ModelRegistry, Node
from penguiflow.catalog import build_catalog, tool
from penguiflow.planner import MemoryKey, PlannerEvent, ReactPlanner, ToolContext
from penguiflow.planner.memory import MemoryBudget, MemoryIsolation, ShortTermMemoryConfig
from penguiflow.planner.models import ObservationGuardrailConfig, ToolPolicy


class GetStatusArgs(BaseModel):
    pass


class GetStatusOut(BaseModel):
    ok: bool


@tool(desc="Read-only demo tool", side_effects="read", tags=["read"])
async def get_status(args: GetStatusArgs, ctx: ToolContext) -> GetStatusOut:
    del args, ctx  # unused in this demo tool
    return GetStatusOut(ok=True)


def build_planner() -> ReactPlanner:
    registry = ModelRegistry()
    registry.register("get_status", GetStatusArgs, GetStatusOut)
    catalog = build_catalog([Node(get_status, name="get_status")], registry)

    stm = ShortTermMemoryConfig(
        strategy="rolling_summary",
        budget=MemoryBudget(full_zone_turns=5, summary_max_tokens=800, total_max_tokens=8000),
        # Fail closed: no memory is read or written without an explicit key.
        isolation=MemoryIsolation(require_explicit_key=True),
        summarizer_model="gpt-4.1-mini",
    )

    def on_event(e: PlannerEvent) -> None:
        # Ship this to your structured logger / traces.
        _ = e.to_payload()

    return ReactPlanner(
        llm="gpt-4o-mini",
        catalog=catalog,
        short_term_memory=stm,
        observation_guardrail=ObservationGuardrailConfig(
            max_observation_chars=50_000,
            auto_artifact_threshold=20_000,
        ),
        tool_policy=ToolPolicy(
            allowed_tools={"get_status"},
        ),
        temperature=0.0,
        json_schema_mode=True,
        max_iters=8,
        llm_timeout_s=45.0,
        llm_max_retries=2,
        event_callback=on_event,
    )


async def handle_request(
    planner: ReactPlanner,
    *,
    tenant_id: str,
    user_id: str,
    session_id: str,
    query: str,
    tool_context: Mapping[str, Any],
) -> Any:
    # Secrets/clients live in tool_context; never in llm_context.
    result = await planner.run(
        query,
        tool_context={**dict(tool_context), "session_id": session_id},
        memory_key=MemoryKey(tenant_id=tenant_id, user_id=user_id, session_id=session_id),
        llm_context={"ui": {"locale": "en-US"}},
    )
    return result
```
3) Native LLM layer + guardrails (recommended for “real” production)
Use this when you want:
- more predictable structured output handling across providers,
- native reasoning streaming (where supported),
- a centralized guardrail policy pack.
This example shows end-to-end wiring for:
- native LLM (`use_native_llm=True`)
- a minimal guardrail gateway (tool allowlist + secret redaction)
- background task tools (tasks.*) in the catalog
- a steering inbox attached per call
```python
from __future__ import annotations

from pydantic import BaseModel

from penguiflow import ModelRegistry, Node
from penguiflow.catalog import build_catalog, tool
from penguiflow.planner import ReactPlanner, ToolContext
from penguiflow.planner.guardrails import (
    GatewayConfig,
    GuardrailGateway,
    RuleRegistry,
    SecretRedactionRule,
    ToolAllowlistRule,
)
from penguiflow.planner.guardrails.async_eval import AsyncRuleEvaluator
from penguiflow.planner.models import BackgroundTasksConfig
from penguiflow.sessions.task_tools import build_task_tool_specs
from penguiflow.steering import SteeringInbox
from penguiflow.steering.guard_inbox import InMemoryGuardInbox


class PingArgs(BaseModel):
    pass


class PingOut(BaseModel):
    ok: bool


@tool(desc="Health check", side_effects="read", tags=["read"])
async def ping(args: PingArgs, ctx: ToolContext) -> PingOut:
    del args, ctx  # unused in this demo tool
    return PingOut(ok=True)


def build_enterprise_planner() -> ReactPlanner:
    # App tools
    registry = ModelRegistry()
    registry.register("ping", PingArgs, PingOut)
    app_catalog = build_catalog([Node(ping, name="ping")], registry)

    # Background task meta-tools (tasks.*)
    task_catalog = build_task_tool_specs()

    # Guardrails (minimal policy pack)
    rules = RuleRegistry()
    rules.register(
        ToolAllowlistRule(
            allowed_tools=frozenset(
                {
                    "ping",
                    # tasks.* tools (if you include them in the catalog)
                    "tasks.spawn",
                    "tasks.list",
                    "tasks.get",
                    "tasks.cancel",
                    "tasks.apply_patch",
                }
            )
        )
    )
    rules.register(SecretRedactionRule())

    inbox = InMemoryGuardInbox(AsyncRuleEvaluator(rules))
    gateway = GuardrailGateway(
        registry=rules,
        guard_inbox=inbox,
        config=GatewayConfig(mode="enforce"),
    )

    return ReactPlanner(
        llm={"model": "openai/gpt-4o-mini"},
        use_native_llm=True,
        catalog=[*task_catalog, *app_catalog],
        background_tasks=BackgroundTasksConfig(enabled=True),
        guardrail_gateway=gateway,
        stream_final_response=True,
    )


async def handle_interactive_request(planner: ReactPlanner) -> None:
    steering = SteeringInbox()
    # Provide a TaskService implementation via tool_context["task_service"] in real deployments.
    await planner.run(
        "Check health; if slow, spawn in background.",
        tool_context={"session_id": "sess_123"},
        steering=steering,
    )
```
See Native LLM layer, Guardrails, Steering, and Background tasks for operational guidance and failure modes.
Failure modes & recovery
“Everything is serialized” (low throughput)
Symptoms
- concurrent requests stall behind each other even for different users
Likely cause
- no `session_id` is present, so the planner falls back to a global lock to protect internal mutable state
Fix
- pass `tool_context["session_id"]` (or use `MemoryKey.session_id`) for every call.
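The effect of a missing `session_id` can be reproduced with a toy model. The sketch below is not the planner's implementation; `RunGate` and `peak_concurrency` are hypothetical names that only illustrate the pattern described above: one lock per session when an id is present, one shared lock when it is not.

```python
from __future__ import annotations

import asyncio


class RunGate:
    """Illustrative lock selection: one lock per session, one shared lock otherwise."""

    def __init__(self) -> None:
        self._global = asyncio.Lock()
        self._per_session: dict[str, asyncio.Lock] = {}
        self.active = 0
        self.peak = 0  # highest number of runs that overlapped

    def _lock_for(self, session_id: str | None) -> asyncio.Lock:
        if session_id is None:
            return self._global  # every call without a session_id queues here
        return self._per_session.setdefault(session_id, asyncio.Lock())

    async def run(self, session_id: str | None) -> None:
        async with self._lock_for(session_id):
            self.active += 1
            self.peak = max(self.peak, self.active)
            await asyncio.sleep(0.01)  # stand-in for LLM and tool latency
            self.active -= 1


async def peak_concurrency(session_ids: list[str | None]) -> int:
    """Run one simulated call per entry and report how many overlapped at most."""
    gate = RunGate()
    await asyncio.gather(*(gate.run(sid) for sid in session_ids))
    return gate.peak
```

In this model, `asyncio.run(peak_concurrency([None, None, None]))` never exceeds one overlapping run, while three distinct session ids overlap fully. Two calls that share a session id still serialize, which is the intended protection.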
Memory configured but never appears
See Memory (key derivation vs explicit keys, health states, budgets).
Background tasks don’t spawn
See Background tasks (requires `task_service` in `tool_context` and `BackgroundTasksConfig.enabled=True`).
Observability
At minimum, record:
- `PlannerEvent.event_type` and `latency_ms` (LLM and tools),
- finish reasons (`answer_complete` / `no_path` / `budget_exhausted`),
- pause/resume and guardrail decisions (redacted).
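A structured-logging callback for these fields might look like the sketch below. It assumes only that an event exposes `to_payload()` returning a dict, as the multi-tenant recipe on this page uses it; `REDACTED_FIELDS`, `event_record`, and the logger name are hypothetical, and the actual payload keys depend on the library version.

```python
from __future__ import annotations

import json
import logging
from typing import Any, Protocol

logger = logging.getLogger("planner.events")

# Hypothetical field names to mask; align them with your own payload shape.
REDACTED_FIELDS = frozenset({"tool_context", "resume_token"})


class HasPayload(Protocol):
    def to_payload(self) -> dict[str, Any]: ...


def event_record(event: HasPayload) -> str:
    """Render one planner event as a single JSON log line with sensitive fields masked."""
    payload = event.to_payload()
    safe = {
        key: ("[redacted]" if key in REDACTED_FIELDS else value)
        for key, value in payload.items()
    }
    # default=str keeps logging from failing on non-JSON values such as datetimes.
    return json.dumps(safe, default=str, sort_keys=True)


def on_event(event: HasPayload) -> None:
    """Callable intended to be passed as event_callback."""
    logger.info(event_record(event))
```

Masking happens only at the top level of the payload here; nested structures would need a recursive pass.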
Security / multi-tenancy notes
- Prefer per-tenant `tool_visibility` / `tool_policy` instead of stuffing “available tools” into the prompt.
- Treat `resume_token` and any steering control messages as authorization capabilities.
- Use guardrails to prevent secret leakage in streamed output (see Guardrails).
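Per-tenant tool filtering usually starts from a deny-by-default entitlement lookup. The sketch below is a minimal illustration with hypothetical data: `READ_ONLY_TOOLS`, `TENANT_EXTRA_TOOLS`, and `allowed_tools_for` are not library names, and how the resulting set is handed to `tool_visibility` or `tool_policy` depends on the API described in the relevant reference pages.

```python
from __future__ import annotations

# Hypothetical tenant entitlements; in practice these would come from your config store.
READ_ONLY_TOOLS = frozenset({"get_status", "ping"})
TENANT_EXTRA_TOOLS: dict[str, frozenset[str]] = {
    "tenant_a": frozenset({"tasks.spawn", "tasks.list"}),
}


def allowed_tools_for(tenant_id: str, catalog_tools: frozenset[str]) -> frozenset[str]:
    """Return the tools this tenant may see; unknown tenants get read-only tools only."""
    granted = READ_ONLY_TOOLS | TENANT_EXTRA_TOOLS.get(tenant_id, frozenset())
    # Intersect with the catalog so a stale entitlement cannot name a missing tool.
    return granted & catalog_tools
```

Because the decision is made in code rather than in the prompt, a tenant cannot talk the model into calling a tool that was never made visible to it.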
Runnable examples
- Guardrails examples: `uv run python examples/guardrails/huggingface/flow.py`
- ToolNode integrations: see `examples/` (each example is runnable via `uv run python ...`)
Troubleshooting checklist
- Is `tool_context["session_id"]` set for every call?
- Are you keeping `llm_context` JSON-only?
- Are memory keys explicit (`memory_key=MemoryKey(...)`) in multi-tenant services?
- Are `event_callback` logs being recorded and searchable?
- If using tasks.* tools, is `tool_context["task_service"]` configured?