# Logging

## What it is / when to use it
PenguiFlow uses standard Python logging and emits structured runtime events (FlowEvent) that you can capture and forward to your logging/telemetry stack.
Use this page when you want:

- consistent JSON logs for production,
- trace-correlated debugging (`trace_id`),
- a reliable way to observe node lifecycle (success/error/timeout/retry).
## Non-goals / boundaries
- PenguiFlow does not ship a vendor-specific logging integration (Datadog, ELK, Splunk).
- The library does not automatically redact secrets/PII; you must avoid logging sensitive payloads.
- This page is about logs; metrics and alerting are covered separately.
## Contract surface

### `configure_logging(...)`

`configure_logging` configures the `penguiflow` logger (and its children) with either:

- structured JSON lines (`structured=True`), or
- human-readable logs, optionally appending `extra={...}` fields (`include_extras=True`).
```python
from penguiflow import configure_logging

configure_logging(level="INFO", structured=True)
```
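If you want to see what JSON-lines output looks like in principle (or need something similar for a logger outside penguiflow), a minimal formatter built on the stdlib is a reasonable sketch. The field names below (`ts`, `level`, `logger`, `msg`) are illustrative only, not penguiflow's actual schema:

```python
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonLineFormatter())
logger = logging.getLogger("example.structured")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("hello %s", "world")  # emits a single JSON line
```

One JSON object per line keeps the output greppable and trivially parseable by log shippers, which is the same property `structured=True` gives you.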
### Runtime `FlowEvent`

The runtime emits `FlowEvent` objects with:

- `event_type` (e.g. `node_success`, `node_error`, `node_timeout`, `node_retry`, `deadline_skip`, …),
- queue depth (incoming/outgoing),
- latency (ms) when applicable,
- trace-level inflight/pending counts when a `trace_id` exists.

You can:

- log them directly (`event.to_payload()`), and/or
- derive metrics (`event.metric_samples()`).
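The log-the-payload / derive-metrics split can be sketched with a stand-in event object. `FakeFlowEvent` below is not penguiflow's class; its attributes (`event_type`, `node_name`, `latency_ms`) and method bodies are assumptions mirroring the documented `to_payload()` / `metric_samples()` surface:

```python
from __future__ import annotations

import json
import logging
from dataclasses import dataclass


@dataclass
class FakeFlowEvent:
    """Stand-in with a FlowEvent-like surface; not penguiflow's real class."""

    event_type: str
    node_name: str
    latency_ms: float | None = None

    def to_payload(self) -> dict:
        # Flatten to a JSON-safe dict, in the spirit of event.to_payload().
        payload = {"event_type": self.event_type, "node_name": self.node_name}
        if self.latency_ms is not None:
            payload["latency_ms"] = self.latency_ms
        return payload

    def metric_samples(self) -> list[tuple[str, float]]:
        # Derive numeric samples, in the spirit of event.metric_samples().
        if self.latency_ms is None:
            return []
        return [(f"{self.event_type}.latency_ms", self.latency_ms)]


logger = logging.getLogger("penguiflow.flow")
event = FakeFlowEvent("node_success", "nop", latency_ms=12.5)

logger.info(json.dumps(event.to_payload()))  # log the event directly...
samples = event.metric_samples()             # ...and/or feed a metrics client
```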
### `log_flow_events(...)` middleware

`log_flow_events` is a ready-to-use middleware that logs node lifecycle events with structured `extra` payloads.

```python
import logging

from penguiflow import create, log_flow_events

flow = create(
    ...,
    middlewares=[log_flow_events(logging.getLogger("penguiflow.flow"))],
)
```
## Operational defaults (recommended)

- Turn on JSON logs in production: `configure_logging(structured=True)`.
- Log with stable identifiers: `trace_id` (correlation), `node_name`/`node_id`, `tenant` (if multi-tenant).
- Avoid logging large payloads; prefer artifacts/resources and log references.
- Keep log volume bounded (sampling or level tuning) for `node_start` events if needed.
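One stdlib way to bound volume is a `logging.Filter` that passes only a fraction of high-frequency records. The sketch below assumes the event type reaches the record via `extra={"event_type": ...}`, which is an illustration, not a documented contract:

```python
from __future__ import annotations

import logging
import random


class SampleNodeStart(logging.Filter):
    """Pass a fraction of node_start records; pass everything else through."""

    def __init__(self, rate: float) -> None:
        super().__init__()
        self.rate = rate

    def filter(self, record: logging.LogRecord) -> bool:
        # `event_type` is assumed to arrive on the record via `extra={...}`.
        if getattr(record, "event_type", None) == "node_start":
            return random.random() < self.rate
        return True


handler = logging.StreamHandler()
handler.addFilter(SampleNodeStart(rate=0.1))  # keep roughly 10% of node_start logs
```

Attaching the filter to the handler (rather than the logger) keeps sampling local to one output destination.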
## Failure modes & recovery

- No `FlowEvent`s in logs: you didn't attach middleware. Fix: pass `middlewares=[log_flow_events(...)]` to `create(...)`.
- Duplicate logs: you configured both root logger handlers and `configure_logging`, or you enabled propagation incorrectly. Fix: use one logging entrypoint; keep the `penguiflow` logger configured once per process.
- Logs missing `trace_id`: you're using payload-only messages or not setting/propagating trace ids. Fix: use envelopes (`Message`) and trace-scoped emit/fetch.
- Sensitive data leaked: you logged raw tool outputs/payloads. Fix: redact at boundaries; log only ids + summaries.
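Redacting at the boundary can be as simple as logging ids plus a bounded summary instead of the payload itself. `summarize` and `SENSITIVE_KEYS` below are hypothetical helpers for your application code, not part of penguiflow:

```python
from __future__ import annotations

import logging

logger = logging.getLogger("penguiflow.flow")

SENSITIVE_KEYS = {"api_key", "password", "token"}  # extend for your app


def summarize(payload: dict, max_len: int = 80) -> str:
    """Return a short, redacted description of a payload, safe to log."""
    safe = {k: ("<redacted>" if k in SENSITIVE_KEYS else v) for k, v in payload.items()}
    text = repr(safe)
    return text[:max_len] + ("…" if len(text) > max_len else "")


payload = {"doc_id": "abc123", "token": "s3cr3t", "body": "x" * 500}
logger.info(
    "tool output",
    extra={"doc_id": payload["doc_id"], "summary": summarize(payload)},
)
```

Logging only `doc_id` plus a truncated, redacted summary keeps the record useful for debugging while keeping secrets and large bodies out of your log store.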
## Observability (what to log)

Minimum recommended log events:

- runtime:
    - `node_error`, `node_failed`, `node_timeout` at WARNING/ERROR,
    - `node_success` at INFO (optionally sampled),
    - `node_retry` at INFO,
    - `trace_cancel_start`/`trace_cancel_finish` at INFO,
    - `deadline_skip` at INFO.
- application:
    - job/request received (with `trace_id`, `tenant`),
    - job/request completed (latency + status),
    - external dependency failures (tool boundary).
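For the application-side events, a stdlib `LoggerAdapter` is one way to stamp every record with the stable identifiers recommended above. `ContextAdapter` and the logger/field values here are illustrative, not penguiflow APIs:

```python
from __future__ import annotations

import logging


class ContextAdapter(logging.LoggerAdapter):
    """Bind identifiers once, merging them with any per-call extra fields."""

    def process(self, msg, kwargs):
        kwargs["extra"] = {**self.extra, **kwargs.get("extra", {})}
        return msg, kwargs


base = logging.getLogger("app.jobs")
log = ContextAdapter(base, {"trace_id": "tr-123", "tenant": "acme"})

log.info("job received")
log.info("job completed", extra={"latency_ms": 42, "status": "ok"})
```

A custom `process` is used because the default `LoggerAdapter` replaces the call-site `extra` with the bound context instead of merging the two.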
## Security / multi-tenancy notes

- Never include secrets in log fields. Assume logs are broadly accessible in an enterprise.
- Do not tag metrics by `trace_id` (cardinality explosion); for logs, `trace_id` is correct and expected.
- Treat `trace_id` as sensitive if it can be used to fetch/cancel work in your app.
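A small guard that strips unbounded-cardinality keys before fields reach a metrics client keeps `trace_id` in logs but out of metric labels. The key set below is an example, not an exhaustive list:

```python
from __future__ import annotations

HIGH_CARDINALITY = {"trace_id", "request_id"}  # never use these as metric tags


def metric_tags(fields: dict[str, str]) -> dict[str, str]:
    """Drop unbounded-cardinality keys; keep bounded ones (tenant, node_name)."""
    return {k: v for k, v in fields.items() if k not in HIGH_CARDINALITY}


tags = metric_tags({"trace_id": "tr-123", "tenant": "acme", "node_name": "nop"})
# `tags` is now safe to attach to a counter or histogram
```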
## Runnable example: structured FlowEvent logging

```python
from __future__ import annotations

import asyncio
import logging

from penguiflow import Node, NodePolicy, configure_logging, create, log_flow_events


async def nop(msg, _ctx):
    return msg


async def main() -> None:
    configure_logging(structured=True)
    logger = logging.getLogger("penguiflow.flow")

    node = Node(nop, name="nop", policy=NodePolicy(timeout_s=5, max_retries=0))
    flow = create(node.to(), middlewares=[log_flow_events(logger)])
    flow.run()

    await flow.emit({"hello": "world"})
    _ = await flow.fetch()
    await flow.stop()


if __name__ == "__main__":
    asyncio.run(main())
```
## Troubleshooting checklist

- Need production event patterns: see Telemetry patterns.
- Need dashboards + alerts: see Metrics & alerts.