Skills (playbooks and retrieval)¶

What it is / when to use it¶

“Skills” are named playbooks (human-authored or curated) that the planner can:

retrieve automatically as “relevant skills” at the start of a run,
discover by capability via skill_search,
fetch in full via skill_get,
list/paginate via skill_list.

Skills are designed for enterprise usage where you want:

standardized, reviewable operational procedures (“how we do X here”),
reuse across agents without copying prompt text,
a safe way to expose internal process knowledge without adding new tools.

They now support two authoring/delivery modes:

static local skill packs loaded into the SQLite store
runtime-provided skills supplied by the host app via Python APIs

Non-goals / boundaries¶

Skills are not “tools”: they do not execute anything on their own.
Skills are not long-term memory; they are curated documents stored in a local skill store.
Skills are not guaranteed to be correct for every environment; treat them like runbooks that must be reviewed and maintained.

Contract surface¶

Enabling skills¶

Enable skills by passing SkillsConfig(enabled=True, ...) into ReactPlanner(...):

ReactPlanner(..., skills=SkillsConfig(...))

Runtime-backed skills are also supported via Python-only extension points:

ReactPlanner(..., skills_provider=my_provider)
ReactPlanner(..., skills_provider_factory=my_factory)

When enabled, ReactPlanner:

creates a LocalSkillProvider for configured packs,
optionally composes it with your runtime provider,
loads configured skill packs into a local SQLite skill store,
injects skill discovery guidance into the system prompt,
makes these always-visible tools available:
skill_search, skill_get, skill_list

If skills_provider or skills_provider_factory is supplied and skills is omitted, PenguiFlow internally enables the skills subsystem with default settings. Passing a runtime provider while skills.enabled=False is a configuration error.

Skills configuration: `SkillsConfig`¶

Key knobs (from penguiflow.skills.models.SkillsConfig):

enabled: master switch
cache_dir: where the skill store SQLite DB lives
max_tokens: cap for how much skill context can be injected per run
top_k: how many relevant skills to retrieve automatically
summarize: whether to summarize skill payloads to fit the budget
redact_pii: redact PII in skill text before injection (recommended)
skill_packs: list of SkillPackConfig(name, path, format?, scope_mode?, ...)
directory: optional “known skills” directory rendering (useful for discoverability)
proposal: nested config for the opt-in skill_propose drafting tool
scope_mode: default scope for learned skills (packs can declare scope too)

Runtime providers¶

Custom providers implement the SkillProvider protocol from penguiflow.skills.provider.

skills_provider: inject a concrete provider instance
skills_provider_factory: build a provider per planner instance/fork
if both a local pack provider and a runtime provider are configured, PenguiFlow composes them with deterministic precedence
runtime/custom provider entries win on skill-name collisions; local packs remain fallback

This keeps host-controlled skill storage, tenancy, and review workflows outside the core planner while still exposing the standard skills tools and pre-flight retrieval flow.

Applicability filtering¶

Skills can now declare optional applicability metadata:

required_tool_names
required_namespaces
required_tags

These fields are enforced against the request’s allowed capability set, after tool policy / visibility filtering. A skill is considered applicable only if every populated requirement set is satisfied.

Applicability filtering is applied consistently to:

pre-flight relevant-skill injection
skill_search
skill_get
skill_list
skill directory rendering

This is useful for persona-style assistants where the skill should only appear when the matching capability is active, for example email skills that only surface when mail tools are allowed.

Using skills to steer rich output¶

Skills are also a strong way to improve rich-output behavior without overloading the base prompt catalog.

Typical uses:

teach the model when to choose render_report vs render_grid vs render_tabs
enforce a build-first / artifact_ref composition workflow for complex outputs
provide tenant- or persona-specific conventions for titles, captions, layout order, and chart selection
document how new custom renderers should be used after you add them

See Rich output with skills for the advanced guide.

Skill pack formats¶

The local pack loader supports:

Markdown: *.skill.md (YAML frontmatter)
YAML: *.skill.yaml / *.skill.yml
JSON: *.skill.json
JSONL: *.skill.jsonl

For Markdown, the file must contain YAML frontmatter with at least:

trigger: str (when to use it)
steps: list[str] (the playbook)

Automatic injection and directory blocks¶

When enabled, the runtime may inject (as prompt metadata):

<skills_context>: “relevant skills” (bounded by token budget)
<skill_directory>: a compact directory of known skills (optional)

These are LLM-visible; do not put secrets in skills.

Tool filtering interactions¶

Skill retrieval is tool-aware:

skill text can be redacted to avoid naming disallowed tools
if tool_search is available, skills can be rewritten to say “use tool_search” instead of naming a forbidden tool
applicability checks use allowed tools/namespaces/tags, including deferred-but-allowed tools

This helps prevent capability leakage when tool visibility differs by tenant/user.

Operational defaults (recommended)¶

Keep skills enabled only when you have curated content:
SkillsConfig(enabled=True, redact_pii=True, top_k=6, max_tokens=2000)
Store skill packs in-repo and version them like code.
Use scope keys in tool_context for multi-tenant deployments:
tenant_id, project_id (used by the provider’s scoping filter)
Enable the directory for discoverability in interactive UIs:
SkillsDirectoryConfig(enabled=True, include_fields=["name","title","trigger"])
Use runtime providers for per-user or per-persona skills; keep persistence and approval in the host app.

Drafting skills with `skill_propose`¶

skill_propose is an opt-in built-in tool for drafting a structured skill from freeform source material.

enable it with SkillsConfig(enabled=True, proposal={"enabled": True})
it returns a typed draft payload only
it does not save, persist, or publish skills
persistence/review remains the responsibility of the host app

Typical host workflow:

user or agent gathers source material
planner calls skill_propose
host app reviews or stores the returned draft

This keeps PenguiFlow’s default behavior safe and opt-in while still supporting agent-assisted skill authoring.

Failure modes & recovery¶

`skill_search is not configured` / `skill_get is not configured`¶

Likely cause

SkillsConfig.enabled=False, or no provider was created

Fix

enable skills on the planner and ensure packs exist.

`skill_propose is not configured`¶

Likely cause

skills.proposal.enabled=False

Fix

enable proposal drafting explicitly:
SkillsConfig(enabled=True, proposal={"enabled": True})

Skill pack missing / empty¶

If a pack path doesn’t exist or contains no valid skills, it will be ignored.

Fix

verify the pack path on the worker (container path differs from your laptop)
validate YAML frontmatter (trigger non-empty, steps non-empty)

Skills mention tools the user is not allowed to use¶

Fix

ensure tool visibility/policy is configured correctly for that tenant
keep redact_pii=True and avoid hard-coding sensitive tool names in skills
prefer writing skills to say “use tool_search for capability X” unless a tool is truly stable/public

Observability¶

Useful planner events:

skill_pack_loaded (pack name, skill count)
skills_retrieved (count, token estimates, was_summarized)
skill_search_query, skill_get, skill_list
skill_propose
skill_directory_rendered (entry count)

See Planner observability.

Security / multi-tenancy notes¶

Treat skills as LLM-visible: never store secrets, credentials, or internal-only identifiers.
Scope skills per tenant/project when applicable (use tenant_id/project_id in tool_context).
Treat cache_dir as a data store (permissions, backups, and lifecycle matter if you have learned skills).
For runtime providers, enforce host-side ACLs before returning skills to the planner.

Runnable example: load a temporary skill pack and call skill_search/skill_get¶

This example writes a .skill.md file to a temporary folder, enables skills, and uses a scripted client to exercise the skill tools.

from __future__ import annotations

import asyncio
import json
import tempfile
from collections.abc import Callable, Mapping, Sequence
from pathlib import Path
from typing import Any

from penguiflow.planner import PlannerFinish, ReactPlanner
from penguiflow.planner.models import JSONLLMClient
from penguiflow.skills.models import SkillPackConfig, SkillsConfig


class ScriptedClient(JSONLLMClient):
    def __init__(self) -> None:
        self._step = 0

    async def complete(
        self,
        *,
        messages: Sequence[Mapping[str, str]],
        response_format: Mapping[str, Any] | None = None,
        stream: bool = False,
        on_stream_chunk: Callable[[str, bool], None] | None = None,
    ) -> str:
        del messages, response_format, stream, on_stream_chunk
        self._step += 1
        if self._step == 1:
            return json.dumps({"next_node": "skill_search", "args": {"query": "incident", "limit": 5}}, ensure_ascii=False)
        if self._step == 2:
            return json.dumps({"next_node": "skill_get", "args": {"names": ["pack.demo.handle_incident"]}}, ensure_ascii=False)
        return json.dumps({"next_node": "final_response", "args": {"answer": "done"}}, ensure_ascii=False)


async def main() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "handle_incident.skill.md").write_text(
            """---
name: pack.demo.handle_incident
title: Handle a production incident
trigger: When an on-call incident is declared and you need a repeatable response.
task_type: domain
steps:
  - Confirm impact and affected tenants/projects.
  - Identify the failing dependency and roll back if needed.
  - Mitigate user impact first, then diagnose root cause.
failure_modes:
  - If metrics are missing, check telemetry pipeline health first.
---
""",
            encoding="utf-8",
        )

        planner = ReactPlanner(
            llm_client=ScriptedClient(),
            catalog=[],  # skills tools are injected automatically when skills.enabled=True
            skills=SkillsConfig(
                enabled=True,
                cache_dir=str(root / ".cache"),
                skill_packs=[SkillPackConfig(name="demo", path=str(root))],
            ),
        )

        result = await planner.run("demo", tool_context={"session_id": "demo", "tenant_id": "t1", "project_id": "p1"})
        assert isinstance(result, PlannerFinish)
        print(result.reason)


if __name__ == "__main__":
    asyncio.run(main())

Troubleshooting checklist¶

Did you pass skills=SkillsConfig(enabled=True, ...)?
Are your skill packs present on the runtime filesystem?
Are you scoping skill access with tenant_id / project_id in multi-tenant deployments?
Are you recording skills_retrieved and skill_* events?

Skills (playbooks and retrieval)¶

What it is / when to use it¶

Non-goals / boundaries¶

Contract surface¶

Enabling skills¶

Skills configuration: SkillsConfig¶

Runtime providers¶

Applicability filtering¶

Using skills to steer rich output¶

Skill pack formats¶

Automatic injection and directory blocks¶

Tool filtering interactions¶

Operational defaults (recommended)¶

Drafting skills with skill_propose¶

Failure modes & recovery¶

skill_search is not configured / skill_get is not configured¶

skill_propose is not configured¶

Skill pack missing / empty¶

Skills mention tools the user is not allowed to use¶

Observability¶

Security / multi-tenancy notes¶

Runnable example: load a temporary skill pack and call skill_search/skill_get¶

Troubleshooting checklist¶

Skills configuration: `SkillsConfig`¶

Drafting skills with `skill_propose`¶

`skill_search is not configured` / `skill_get is not configured`¶

`skill_propose is not configured`¶