Deployment

What it is / when to use it

This section explains how to run PenguiFlow as a long-lived production system.

Use this page to decide which deployment shape you want, and which operational “contracts” you must implement.

These docs do not replace your org’s platform standards (Kubernetes, service mesh, secrets, CI/CD).
PenguiFlow does not ship a full “platform” (no built-in job queue, metrics exporter, or authz layer).
“Distributed execution” is opt-in; you can run a single-process deployment safely with the core runtime.

Payload-only is fastest to start and works well for single-tenant pipelines.
Envelopes (Message(payload=..., headers=Headers(...), trace_id=...)) are the production-friendly default:
per-trace cancellation and deadlines,
deterministic correlation (fetch(trace_id=...)),
streaming chunks with inherited metadata,
multi-tenant isolation via Headers.tenant.
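
To see why trace-scoped correlation matters once requests overlap, here is a minimal standard-library sketch. It is a model of the idea, not PenguiFlow's implementation: Envelope and ScopedResults are hypothetical stand-ins for an envelope message and a trace-aware result sink, and every name in the block is an assumption made for illustration.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass(frozen=True)
class Envelope:
    """Hypothetical stand-in for an envelope; not PenguiFlow's Message class."""

    payload: str
    tenant: str
    trace_id: str = field(default_factory=lambda: uuid4().hex)


class ScopedResults:
    """Keeps one result queue per trace so a caller only sees its own output."""

    def __init__(self) -> None:
        self._by_trace = defaultdict(asyncio.Queue)

    async def publish(self, result: Envelope) -> None:
        await self._by_trace[result.trace_id].put(result)

    async def fetch(self, trace_id: str) -> Envelope:
        return await self._by_trace[trace_id].get()


async def main():
    results = ScopedResults()
    first_request = Envelope(payload="report for acme", tenant="acme")
    second_request = Envelope(payload="report for globex", tenant="globex")

    # Results arrive in the opposite order from the requests.
    await results.publish(second_request)
    await results.publish(first_request)

    # Each caller still receives the result that belongs to its own trace.
    first = await results.fetch(first_request.trace_id)
    second = await results.fetch(second_request.trace_id)
    return first.payload, second.payload


payloads = asyncio.run(main())
print(payloads)
```

With a single shared queue and an unscoped fetch, the first caller would have received the second caller's result here.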

Every node has a NodePolicy that defines:

Edges are bounded queues (queue_maxsize) so the system applies backpressure instead of unbounded buffering.
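
The backpressure effect of a bounded edge can be reproduced with a plain asyncio.Queue. This sketch assumes nothing about PenguiFlow's internals; producer, consumer, and the maxsize of 2 are illustrative choices. It shows that a full queue suspends the producer instead of letting the backlog grow. In asyncio, a maxsize of zero or less means the queue is unbounded.

```python
import asyncio


async def producer(queue: asyncio.Queue, count: int, depths: list) -> None:
    for item in range(count):
        # put() suspends while the queue is full, so the producer
        # slows down to the consumer's pace.
        await queue.put(item)
        depths.append(queue.qsize())
    await queue.put(None)  # sentinel: no more items


async def consumer(queue: asyncio.Queue, seen: list) -> None:
    while True:
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(0)  # stand-in for real work
        seen.append(item)


async def main(maxsize: int = 2, count: int = 10):
    queue = asyncio.Queue(maxsize=maxsize)
    depths, seen = [], []
    await asyncio.gather(
        producer(queue, count, depths),
        consumer(queue, seen),
    )
    return max(depths), seen


peak_depth, consumed = asyncio.run(main())
print(peak_depth, consumed)
```

The observed queue depth never exceeds maxsize, which is the property that keeps memory bounded when a downstream node is slow.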

In production you often need one (or both):

structured events (FlowEvent) for debugging and metrics, and/or
a StateStore to persist StoredEvent (audit history, distributed pause/resume, replay-friendly event capture).
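
As a rough illustration of what persisting events per trace involves, here is a small in-memory sketch. It does not use or reproduce PenguiFlow's StateStore or StoredEvent types: the Event fields and the save_event and load_history method names are assumptions chosen for the example, and a real deployment would back this with a durable database.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical persisted event; field names are assumptions."""

    trace_id: str
    kind: str
    node_name: str
    ts: float


class InMemoryEventStore:
    """Append-only event log keyed by trace_id (illustration only)."""

    def __init__(self) -> None:
        self._events = defaultdict(list)

    def save_event(self, event: Event) -> None:
        self._events[event.trace_id].append(event)

    def load_history(self, trace_id: str) -> list:
        # Return events in timestamp order, whatever order they arrived in.
        return sorted(self._events.get(trace_id, []), key=lambda e: e.ts)


store = InMemoryEventStore()
store.save_event(Event("trace-1", "node_success", "parse", ts=2.0))
store.save_event(Event("trace-1", "node_start", "parse", ts=1.0))
store.save_event(Event("trace-2", "node_start", "parse", ts=1.5))

history = [event.kind for event in store.load_history("trace-1")]
print(history)
```

Keying by trace_id keeps audit queries and replay scoped to a single request, which also makes it easier to enforce tenant boundaries on reads.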

If you use ReactPlanner and ToolNode integrations, deployment must include:

Requests get “mixed up”: you are calling unscoped fetch() while multiple traces run concurrently. Fix: give every request a unique trace_id and retrieve its result with fetch(trace_id=...).
Workers “hang”: no egress message reaches Rookery. Fix: ensure at least one egress node exists and emits/returns a final value.
Memory growth: unbounded edges (queue_maxsize <= 0) or large payloads in events. Fix: keep queues bounded; use artifacts/resources for large blobs.
Retry storms: retries amplify load on downstream dependencies. Fix: reduce retries, add timeouts, and implement idempotency on side-effecting nodes.
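
The retry-storm advice (bounded retries, timeouts, idempotency) can be sketched with the standard library alone. PaymentGateway, call_with_retries, and the idempotency key are hypothetical names for this example, not PenguiFlow APIs. The first call times out after its side effect has already landed, and the retry is harmless because the dependency deduplicates on the key.

```python
import asyncio


class PaymentGateway:
    """Hypothetical side-effecting dependency that is slow on its first call."""

    def __init__(self) -> None:
        self.charges = {}
        self.calls = 0

    async def charge(self, idempotency_key: str, amount: int) -> str:
        self.calls += 1
        already_applied = idempotency_key in self.charges
        # Apply the side effect at most once per key.
        self.charges.setdefault(idempotency_key, amount)
        if self.calls == 1:
            # The charge was recorded, but the reply is too slow to arrive.
            await asyncio.sleep(0.2)
        return "duplicate-ignored" if already_applied else "charged"


async def call_with_retries(gateway, key, amount, *, timeout_s=0.05, max_retries=2):
    for attempt in range(max_retries + 1):
        try:
            return await asyncio.wait_for(
                gateway.charge(key, amount), timeout=timeout_s
            )
        except asyncio.TimeoutError:
            if attempt == max_retries:
                raise  # retry budget exhausted: surface the failure
            await asyncio.sleep(0.01 * 2**attempt)  # exponential backoff


async def main():
    gateway = PaymentGateway()
    status = await call_with_retries(gateway, "order-42", 100)
    return status, gateway.calls, gateway.charges


status, calls, charges = asyncio.run(main())
print(status, calls, charges)
```

Without the idempotency key, the retried call would have charged the order a second time. The small max_retries value caps how much extra load a failing dependency receives.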

PenguiFlow emits FlowEvent around node execution and control-plane behavior (timeouts, retries, cancellation, deadline skips).

Operationally:

Always set Headers.tenant and keep tenant boundaries consistent across a trace.
Never store secrets in message payloads or meta if you persist events/logs.
Treat trace_id as an authorization surface (don’t allow cross-tenant fetch/cancel).
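
One way to treat trace_id as an authorization surface is to record which tenant started each trace and check that record before any fetch or cancel. The sketch below is an assumption about how an application layer might do this; TraceRegistry and TraceAccessError are hypothetical and are not part of PenguiFlow.

```python
class TraceAccessError(PermissionError):
    """Raised when a tenant asks about a trace it does not own."""


class TraceRegistry:
    """Hypothetical ownership map: which tenant started which trace."""

    def __init__(self) -> None:
        self._owner = {}

    def register(self, trace_id: str, tenant: str) -> None:
        self._owner[trace_id] = tenant

    def authorize(self, trace_id: str, tenant: str) -> None:
        owner = self._owner.get(trace_id)
        # Unknown and foreign traces fail the same way, so a caller
        # cannot probe for valid trace IDs belonging to other tenants.
        if owner is None or owner != tenant:
            raise TraceAccessError("trace not found for this tenant")


registry = TraceRegistry()
registry.register("trace-a1", tenant="acme")
registry.authorize("trace-a1", tenant="acme")  # same tenant: allowed

try:
    registry.authorize("trace-a1", tenant="globex")
    cross_tenant_allowed = True
except TraceAccessError:
    cross_tenant_allowed = False

print(cross_tenant_allowed)
```

The check belongs at the API boundary, before the request reaches any call that fetches results or cancels a trace.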

uv run python examples/quickstart/flow.py
uv run python examples/roadmap_status_updates/flow.py

Need job workers: start with Worker integration.
Need production hardening (limits, rollout, multi-tenant defaults): use Production deployment.
Need to understand runtime behavior: see Core runtime.