Use the Harbor Protocol
The Harbor Protocol is the canonical event/state contract between Runtime and any client. The bundled Console is one consumer; this skill walks the path for building your own. A working chatbot UI is achievable in a day on top of the Protocol — the wire is small, typed, and stable.
Three properties make this practical:
- A generated, drift-gated contract reference: the published Protocol adoption track carries four pages (methods / events / errors / types) emitted by cmd/harbor-gen-protocol-docs from the Go single sources and gated in CI by make protocol-docs-gen-check, plus an executed quickstart, five choreography guides, a worked build-a-client walkthrough, and the conformance-certification path. For typed TS wire shapes, vendor the Console's hand-maintained web/console/src/lib/protocol.ts (the D-093 TS generator was deferred per D-132; protocol.ts is hand-maintained today, kept honest by the Console's own CI).
- Capability advertisement: runtime.info.capabilities tells you at attach which Protocol surfaces this Runtime advertises (task_control, events_subscribe, runtime_posture, topology_snapshot). Your UI degrades gracefully on stripped-down runtimes.
- Stable Protocol versioning: breaking changes go through a deprecation window; same-major versions are compatible. Pin the major in your client; tolerate additive change. The full adopter contract is the published versioning & compatibility choreography.
The Protocol is what makes Harbor headless. The Runtime never imports Console code; the Console never reads internal Runtime objects. Your UI sits in the same posture as the Console.
1. The wire — base URL, auth, identity
The wire is REST-per-method: each Protocol method is its own route under /v1/, you POST a flat JSON body, and you get a flat JSON response back — there is no JSON-RPC envelope. Every request carries:
POST /v1/control/start HTTP/1.1
Host: 127.0.0.1:18080
Content-Type: application/json
Authorization: Bearer <JWT>
X-Harbor-Tenant: <tenant_id>
X-Harbor-User: <user_id>
X-Harbor-Session: <session_id>
- Bearer JWT: an RS256/RS384/RS512/ES256/ES384/ES512 signed token. Issuer + audience match the Runtime's identity: block. For harbor dev, the ephemeral HARBOR_DEV_TOKEN (printed on stderr) is what you use; see run-the-dev-loop.
- X-Harbor-Session: the per-request session selector (D-171). The connection JWT verifies the WHO (tenant+user) and the scopes; the session is chosen per-conversation by this header and may differ on every request. The connection token is a per-backend credential, not a single-session pin. A new session id is a new conversation (create-on-first-use on the first start). The token's session claim is a back-compat default used only when the header is absent.
- X-Harbor-Tenant / X-Harbor-User can never widen the JWT-verified principal. Every storage call still filters by the full (tenant, user, session) triple, so there is no cross-session leakage. Full Console contract: docs/notes/session-model-contract.md.
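The per-request session selection described above can be sketched as a small header builder. This is a minimal illustration under stated assumptions: buildHeaders and RequestIdentity are hypothetical names (they do not come from protocol.ts), and the session ids are made up.

```typescript
// Hypothetical helper: builds the per-request headers listed above.
interface RequestIdentity {
  tenant: string;
  user: string;
  session?: string; // omitted: the Runtime falls back to the token's session claim
}

function buildHeaders(token: string, id: RequestIdentity): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${token}`,
    "X-Harbor-Tenant": id.tenant,
    "X-Harbor-User": id.user,
  };
  // The session is chosen per conversation, so it may differ on every request.
  if (id.session !== undefined) {
    headers["X-Harbor-Session"] = id.session;
  }
  return headers;
}

// One connection token, two conversations (illustrative session ids).
const connectionToken = "<JWT>";
const chatA = buildHeaders(connectionToken, { tenant: "dev", user: "dev", session: "conv-a" });
const chatB = buildHeaders(connectionToken, { tenant: "dev", user: "dev", session: "conv-b" });
```

Both header sets carry the same credential; only the session selector differs, which matches the "per-backend credential, not a single-session pin" posture.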
Routes group by surface family:
- Task control: start plus the nine steering verbs (cancel/pause/resume/redirect/inject_context/approve/reject/prioritize/user_message) all go to POST /v1/control/{method} (e.g. /v1/control/start, /v1/control/cancel). The read-only posture methods (runtime.info, topology.snapshot) and artifacts.put share this route shape.
- Event stream: GET /v1/events (SSE; see §4).
- Read surfaces group by family under their own prefix: POST /v1/tasks/{method} (e.g. /v1/tasks/get), POST /v1/tools/{method}, POST /v1/sessions/{method}, POST /v1/memory/{method}, and so on.
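The route grouping above reduces to one string template. The sketch below assumes only the families named in this section; routeFor, Family, and EVENTS_ROUTE are illustrative names, and the families covered by "and so on" are deliberately not guessed at.

```typescript
// Hypothetical helper: maps a surface family plus method name to its REST route.
type Family = "control" | "tasks" | "tools" | "sessions" | "memory";

interface Route {
  verb: "GET" | "POST";
  path: string;
}

function routeFor(family: Family, method: string): Route {
  if (method.length === 0) {
    throw new Error("method name must not be empty");
  }
  // Each method is its own route under /v1/<family>/; there is no JSON-RPC envelope.
  return { verb: "POST", path: `/v1/${family}/${method}` };
}

// The event stream is the one GET route in the list above.
const EVENTS_ROUTE: Route = { verb: "GET", path: "/v1/events" };
```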
The body is a flat JSON object — the method's request shape — with an identity object carrying the triple (or the headers above; the body's identity may be left empty when the headers supply it):
{ "identity": { "tenant": "dev", "user": "dev", "session": "dev" }, "query": "Hello, agent!" }
The response is the method's flat response shape directly, with no result / error wrapper. A failure is an HTTP status plus a {"code": "..."} envelope (e.g. 404 {"code": "unknown_method"}).
CORS is default-deny. For browser clients, your origin must be in the Runtime's server.allowed_origins. See run-the-dev-loop §2.
2. The handshake — runtime.info first
The first call your client makes:
curl -sS -X POST "$HARBOR_BASE_URL/v1/control/runtime.info" \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Harbor-Session: $SESSION" \
  -H "Content-Type: application/json" \
  -d '{"identity": {}}'
A real response from a dev Runtime:
{
  "instance_id": "harbor-dev-192.168.1.7",
  "display_name": "harbor dev",
  "build_version": "v0.0.0-dev",
  "build_commit": "dev",
  "build_go_version": "go1.26.3",
  "protocol_version": "0.1.0",
  "capabilities": ["events_subscribe", "runtime_posture", "task_control"],
  "uptime_seconds": 16
}
Two things to read and act on:
- protocol_version: the wire-contract version (distinct from build_version, the Runtime's own release). Same major ⇒ compatible; on a major mismatch, warn loudly or refuse.
- capabilities: the advertised Protocol surfaces. Shape your UI on this list: a runtime that doesn't advertise topology_snapshot gets the topology panel disabled, not a crash. A method outside the Runtime's registry returns the canonical 404 {"code": "unknown_method"} envelope; treat it (and 405 / 501) as "not served here, degrade", the same SKIP posture Harbor's own smoke scripts encode.
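The two checks above can be folded into one pure function that turns a runtime.info response into UI switches. This is a sketch, not the Console's logic: planUi, UiFeatures, and CLIENT_PROTOCOL_MAJOR are hypothetical names, and the feature-to-capability mapping is an assumption based on the capability names in this document.

```typescript
// Only the two fields this sketch reads; names follow the response shown above.
interface RuntimeInfo {
  protocol_version: string;
  capabilities: string[];
}

// Major version this client was written against (an assumption for the sketch).
const CLIENT_PROTOCOL_MAJOR = 0;

function majorOf(version: string): number {
  const [head = ""] = version.split(".");
  const major = Number.parseInt(head, 10);
  if (Number.isNaN(major)) {
    throw new Error(`unparseable protocol_version: ${version}`);
  }
  return major;
}

interface UiFeatures {
  compatible: boolean;    // same major as the client
  chat: boolean;          // needs task_control
  liveEvents: boolean;    // needs events_subscribe
  topologyPanel: boolean; // needs topology_snapshot
}

function planUi(info: RuntimeInfo): UiFeatures {
  const caps = new Set(info.capabilities);
  return {
    compatible: majorOf(info.protocol_version) === CLIENT_PROTOCOL_MAJOR,
    chat: caps.has("task_control"),
    liveEvents: caps.has("events_subscribe"),
    topologyPanel: caps.has("topology_snapshot"),
  };
}
```

Fed the dev response above, this yields a compatible client with chat and live events enabled and the topology panel disabled, since topology_snapshot is not in that capability list.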
3. Starting a task — the chat-message equivalent
curl -sS -X POST "$HARBOR_BASE_URL/v1/control/start" \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Harbor-Session: $SESSION" \
  -H "Content-Type: application/json" \
  -d '{"identity": {}, "query": "What'\''s the weather in Madrid?", "input_artifact_ids": []}'
The request is the flat StartRequest: identity (the triple, empty here because the headers supply it), the query string, and the optional input_artifact_ids. There is no foreground field: every start mints a task and you observe it on the event stream.
Response is the flat StartResponse:
{
  "task_id": "tsk_01HXYZ...",
  "reused": false,
  "protocol_version": "0.1.0"
}
reused is true only when you supplied an idempotency_key that matched an existing task; protocol_version lets you detect a version skew.
For multimodal input, upload artifacts FIRST (artifacts.put, see §6) and pass the returned IDs in input_artifact_ids (D-166). The per-MIME dispatch — image inline vs PDF/audio as ArtifactStub — happens inside the planner; your client just passes refs. To override how an attachment is handed to the model, add the optional input_artifact_dispositions map (Phase 84b — D-189), keyed by artifact id with values ref | inline | provider_native | tool:<name> (e.g. {"art_x": "tool:pdf.extract"} forces the named catalog tool). Your hint is the top precedence layer (hint > the agent's multimodal.disposition config map > the runtime default: image inline, everything else ref); an omitted map keeps today's behaviour. tasks.get reflects the hint on input_artifacts[].disposition, and the resolution (including degradations — e.g. an unknown tool:<name>) is observable as task.input_disposition.resolved events. A provider_native hint is honoured end-to-end (Phase 84c — D-190): the LLM driver uploads the attachment to the provider's file surface and the upload is observable as llm.provider_file.uploaded events (artifact ref, provider, modality, file_id).
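A start body with disposition hints might be assembled as below. The field names follow the paragraph above; buildStart and isDisposition are hypothetical helpers, and the client-side validation is a convenience added for this sketch (the Runtime remains the authority and reports degradations on the event stream).

```typescript
// Accepts the four hint forms named above: ref, inline, provider_native, tool:<name>.
function isDisposition(value: string): boolean {
  if (value === "ref" || value === "inline" || value === "provider_native") return true;
  // tool:<name> must carry a non-empty tool name.
  return value.startsWith("tool:") && value.length > "tool:".length;
}

interface StartBody {
  identity: Record<string, string>;
  query: string;
  input_artifact_ids: string[];
  input_artifact_dispositions?: Record<string, string>;
}

function buildStart(
  query: string,
  artifactIds: string[],
  hints: Record<string, string> = {},
): StartBody {
  for (const [id, hint] of Object.entries(hints)) {
    if (!artifactIds.includes(id)) throw new Error(`hint for unknown artifact: ${id}`);
    if (!isDisposition(hint)) throw new Error(`invalid disposition: ${hint}`);
  }
  // identity stays empty here on the assumption that the headers supply the triple.
  const body: StartBody = { identity: {}, query, input_artifact_ids: artifactIds };
  // An omitted map keeps the default behaviour, so attach it only when hints exist.
  if (Object.keys(hints).length > 0) body.input_artifact_dispositions = hints;
  return body;
}

// Forces the named catalog tool for one attachment (ids are illustrative).
const startWithHint = buildStart("Summarise the report", ["art_x"], { art_x: "tool:pdf.extract" });
```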
4. The events stream — SSE events.subscribe
The Protocol exposes events as Server-Sent Events:
GET /v1/events?access_token=<JWT>
Accept: text/event-stream
X-Harbor-Tenant: <tenant_id>
X-Harbor-User: <user_id>
X-Harbor-Session: <session_id>
The subscription is identity-scoped (it streams the whole session's events), so there is no task_id query param. A client that can set headers narrows server-side with the optional X-Harbor-Run (a task id) and the repeatable X-Harbor-Event-Type headers. A browser EventSource (which can't set custom headers) authenticates via the ?access_token= query-param shim (same JWT, same identity triple, its session claim scoping the stream) and filters client-side on the event payload's task id. The query-param shim is documented in internal/protocol/transports/transports.go.
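The two subscription styles above differ only in where the credential and the filters travel. The sketch below builds both; subscribeHeaders and eventSourceUrl are hypothetical helpers, and it assumes the events route accepts the same Bearer header as the POST routes when the client can set headers.

```typescript
interface SubscribeOptions {
  run?: string;          // task id for X-Harbor-Run
  eventTypes?: string[]; // one X-Harbor-Event-Type header per entry
}

// Header-capable client: narrows server-side. Returns [name, value] pairs so a
// repeatable header can appear more than once.
function subscribeHeaders(
  token: string,
  session: string,
  opts: SubscribeOptions = {},
): Array<[string, string]> {
  const headers: Array<[string, string]> = [
    ["Accept", "text/event-stream"],
    ["Authorization", `Bearer ${token}`], // assumption: Bearer auth works on GET /v1/events
    ["X-Harbor-Session", session],
  ];
  if (opts.run !== undefined) headers.push(["X-Harbor-Run", opts.run]);
  for (const eventType of opts.eventTypes ?? []) {
    headers.push(["X-Harbor-Event-Type", eventType]);
  }
  return headers;
}

// Browser EventSource: no custom headers, so the JWT rides the query-param shim.
function eventSourceUrl(baseUrl: string, token: string): string {
  return `${baseUrl}/v1/events?access_token=${encodeURIComponent(token)}`;
}
```

The pair list is a valid HeadersInit for fetch, which keeps repeated header names intact; a plain object would silently collapse them to one entry.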
The stream is a sequence of event: <type>\ndata: <JSON>\n\n blocks:
event: llm.completion.chunk
data: {"task_id":"tsk_01HXYZ","chunk":"Hello"}

event: llm.completion.chunk
data: {"task_id":"tsk_01HXYZ","chunk":" there!"}

event: tool.invoked
data: {"task_id":"tsk_01HXYZ","tool":"weather.get_current","args":{"city":"Madrid"}}

event: tool.result
data: {"task_id":"tsk_01HXYZ","tool":"weather.get_current","result":{"temperature_c":21.3}}

event: task.completed
data: {"task_id":"tsk_01HXYZ","status":"completed"}

A gotcha: the event payload's task ID field is payload.TaskID (capital T), so match exactly when parsing in JS/TS. Documented in the Console's chat panel handler; easy to miss when hand-rolling.
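For a client that reads the stream itself (for example over a streaming fetch, so it can set the narrowing headers), the block framing above can be parsed incrementally. This is a simplified sketch: parseSse and taskIdOf are hypothetical names, it handles only \n line endings, and it ignores SSE fields other than event and data. Reading both TaskID and task_id is a defensive assumption, since the note above names the former and the sample stream shows the latter.

```typescript
interface SseEvent {
  type: string;
  data: unknown;
}

// Splits buffered SSE text into complete events plus the unconsumed remainder.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const events: SseEvent[] = [];
  const blocks = buffer.split("\n\n");
  // The last element is an incomplete block ("" when the buffer ended on a boundary).
  const rest = blocks.pop() ?? "";
  for (const block of blocks) {
    let type = "message"; // SSE default when no event: line is present
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) type = line.slice("event:".length).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice("data:".length).trim());
    }
    // Blocks without data (comments, keep-alives) produce no event.
    if (dataLines.length > 0) {
      events.push({ type, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return { events, rest };
}

// Accepts either casing of the task id key.
function taskIdOf(payload: unknown): string | undefined {
  if (typeof payload !== "object" || payload === null) return undefined;
  const fields = payload as Record<string, unknown>;
  const id = fields["TaskID"] ?? fields["task_id"];
  return typeof id === "string" ? id : undefined;
}
```

Call parseSse on each decoded chunk prefixed with the previous rest, so an event split across two network reads is emitted once it completes.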
For a chat UI, you'd:
- Append a "user turn" bubble to the chat.
- POST start, get task_id.
- Open the session's SSE stream and keep only the events whose task id matches that task_id (the stream is session-scoped, not per-task).
- Append llm.completion.chunk content to a streaming "assistant turn" bubble.
- Render tool.invoked / tool.result as collapsed cards inside the assistant bubble.
- Close the bubble on task.completed.
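The rendering steps above amount to folding events into one bubble. A minimal sketch, assuming the payload fields shown in the §4 sample stream: applyEvent, AssistantBubble, and ToolCard are hypothetical names, and pairing a tool.result with the oldest card still awaiting one is a choice made for this sketch, not documented behaviour.

```typescript
interface ToolCard {
  tool: string;
  args?: unknown;
  result?: unknown;
}

interface AssistantBubble {
  taskId: string;
  text: string;
  tools: ToolCard[];
  done: boolean;
}

interface StreamEvent {
  type: string;
  data: Record<string, unknown>;
}

// Pure reducer: returns the next bubble state for one event.
function applyEvent(bubble: AssistantBubble, ev: StreamEvent): AssistantBubble {
  // Accept either casing of the task id key; skip events for other tasks in the session.
  const id = ev.data["TaskID"] ?? ev.data["task_id"];
  if (id !== bubble.taskId) return bubble;
  switch (ev.type) {
    case "llm.completion.chunk":
      return { ...bubble, text: bubble.text + String(ev.data["chunk"] ?? "") };
    case "tool.invoked":
      return {
        ...bubble,
        tools: [...bubble.tools, { tool: String(ev.data["tool"]), args: ev.data["args"] }],
      };
    case "tool.result": {
      const name = ev.data["tool"];
      const idx = bubble.tools.findIndex((c) => c.tool === name && c.result === undefined);
      const tools = bubble.tools.map((c, i) => (i === idx ? { ...c, result: ev.data["result"] } : c));
      return { ...bubble, tools };
    }
    case "task.completed":
      return { ...bubble, done: true };
    default:
      return bubble; // unknown event types are ignored, which tolerates additive change
  }
}
```

Because the reducer is pure, it drops into React's useReducer, a Svelte store, or a plain loop without change.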
5. Pause + steer + resume
The unified pause/resume primitive (RFC §3.3) is one wire choreography for every cause — HITL approval, tool-side OAuth, operator pause. The steering verbs share one route shape, POST /v1/control/{method}, with the run id and your steering scope in the body's identity:
# park the run at the next planner-step boundary
curl -sS -X POST "$HARBOR_BASE_URL/v1/control/pause" \
  -H "Authorization: Bearer $TOKEN" -H "X-Harbor-Session: $SESSION" \
  -H "Content-Type: application/json" \
  -d '{"identity": {"run": "'$TASK_ID'", "scope": "owner_user"}}'
# feed it context while parked, then wake it
curl -sS -X POST "$HARBOR_BASE_URL/v1/control/inject_context" \
  -H "Authorization: Bearer $TOKEN" -H "X-Harbor-Session: $SESSION" \
  -H "Content-Type: application/json" \
  -d '{"identity": {"run": "'$TASK_ID'", "scope": "session_user"}, "payload": {"note": "Actually, make it Barcelona."}}'
curl -sS -X POST "$HARBOR_BASE_URL/v1/control/resume" \
  -H "Authorization: Bearer $TOKEN" -H "X-Harbor-Session: $SESSION" \
  -H "Content-Type: application/json" \
  -d '{"identity": {"run": "'$TASK_ID'", "scope": "owner_user"}}'
The 200 {"accepted": true, …} means enqueued; the effect is narrated on the event stream: pause.requested when the run parks, pause.resumed (with a typed Decision of approve / reject / resume / timeout) when it wakes. The planner sees injected context on its next step.
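Because an accepted reply only means "enqueued", a UI can track the run's pause state from two separate signals: the HTTP acknowledgement and the stream events. The sketch below is one way to model that; RunState, Signal, and nextState are hypothetical names, and the decision field on the pause.resumed signal is an assumed carrier for the Decision value rather than a documented payload key.

```typescript
type RunState = "running" | "pause_requested" | "parked";

// Decision values named in the text above.
type Decision = "approve" | "reject" | "resume" | "timeout";

type Signal =
  | { kind: "accepted"; verb: "pause" | "resume" } // the 200 {"accepted": true} reply
  | { kind: "event"; type: "pause.requested" }
  | { kind: "event"; type: "pause.resumed"; decision: Decision };

function nextState(state: RunState, signal: Signal): RunState {
  if (signal.kind === "accepted") {
    // Accepted only means enqueued: show intent, but do not claim the run has parked.
    return signal.verb === "pause" && state === "running" ? "pause_requested" : state;
  }
  // The stream is the source of truth for the actual transition.
  return signal.type === "pause.requested" ? "parked" : "running";
}
```

A resume acknowledgement deliberately leaves the state at parked; the bubble flips back to running only once pause.resumed arrives on the stream.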
For HITL: an approval-gated tool emits tool.approval_requested with a pause token; your UI routes the human verdict through POST /v1/control/approve or /reject with "payload": {"token": "<pause-token>", "reason": "…"}. POST /v1/pause/list is the snapshot of everything currently awaiting a human — reconcile against it on every (re)attach; it is authoritative across Runtime restarts. The full wire choreography (including the OAuth callback leg and DecisionTimeout reaps) is the published pause-model choreography.
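The HITL path above needs two small pieces on the client: a request builder for the verdict and a reconcile step after each (re)attach. Both are sketched here under assumptions: verdict and reconcile are hypothetical helpers, the verdict body reuses the identity shape (run and scope) shown for the other steering verbs, and reconcile takes plain lists of pause tokens because the response shape of /v1/pause/list is not described in this document.

```typescript
interface VerdictRequest {
  route: string;
  body: {
    identity: { run: string; scope: string };
    payload: { token: string; reason: string };
  };
}

// Builds the approve / reject call for one pause token.
function verdict(
  approved: boolean,
  run: string,
  scope: string,
  token: string,
  reason: string,
): VerdictRequest {
  if (token.length === 0) throw new Error("pause token is required");
  return {
    route: approved ? "/v1/control/approve" : "/v1/control/reject",
    body: { identity: { run, scope }, payload: { token, reason } },
  };
}

// Compares the prompts currently on screen with the authoritative snapshot.
function reconcile(
  shownTokens: string[],
  snapshotTokens: string[],
): { show: string[]; dismiss: string[] } {
  const shown = new Set(shownTokens);
  const pending = new Set(snapshotTokens);
  return {
    show: snapshotTokens.filter((t) => !shown.has(t)),  // missed while detached
    dismiss: shownTokens.filter((t) => !pending.has(t)), // already resolved or reaped
  };
}
```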
The "steer vs queue" UI choice in drive-the-playground §3 maps directly to "POST /v1/control/pause + inject + resume" vs "wait for task.completed then POST a new start".
6. Artifact upload — multimodal input
For images / PDFs / audio uploads from your UI, artifacts.put is a control-surface method: POST the bytes (base64-encoded inline on the request leg) and you get back a reference, never an echo of the body:
# tr strips the line wraps that GNU base64 inserts; a raw newline would break the JSON string
curl -sS -X POST "$HARBOR_BASE_URL/v1/control/artifacts.put" \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Harbor-Session: $SESSION" \
  -H "Content-Type: application/json" \
  -d '{
    "scope": {"tenant": "dev", "user": "dev", "session": "dev"},
    "bytes": "'"$(base64 < report.pdf | tr -d '\n')"'",
    "opts": {"mime_type": "application/pdf", "filename": "report.pdf"}
  }'
The response carries the canonical ref:
{
  "ref": {
    "id": "art_01H...",
    "mime_type": "application/pdf",
    "size_bytes": 142853,
    "filename": "report.pdf"
  },
  "protocol_version": "0.1.0"
}
Pass ref.id in start's input_artifact_ids. The upload bytes ride the request leg only (base64-inline, bounded by the Runtime's max request size; an oversize body is rejected with request_too_large); the response is a reference, and bytes never reach the LLM edge inline.
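Since the bytes travel base64-inline, the encoded size (not the file size) is what counts against the request limit. The arithmetic is sketched below; the 4 MiB default is the protocol.max_request_bytes value given in this document's failure modes, while fitsInline, maxRawBytes, and the 512-byte envelope allowance are assumptions made for the sketch.

```typescript
// Default protocol.max_request_bytes as stated in this document (4 MiB).
const DEFAULT_MAX_REQUEST_BYTES = 4 * 1024 * 1024;

// Padded base64 emits 4 characters for every 3 raw bytes, rounded up.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// envelopeBytes is a guessed allowance for the JSON around the payload (scope, opts, quotes).
function fitsInline(
  rawBytes: number,
  maxRequestBytes: number = DEFAULT_MAX_REQUEST_BYTES,
  envelopeBytes: number = 512,
): boolean {
  return base64Length(rawBytes) + envelopeBytes <= maxRequestBytes;
}

// Largest raw payload that still fits once encoded, under the same allowance.
function maxRawBytes(
  maxRequestBytes: number = DEFAULT_MAX_REQUEST_BYTES,
  envelopeBytes: number = 512,
): number {
  return Math.floor((maxRequestBytes - envelopeBytes) / 4) * 3;
}
```

Under these assumptions a default Runtime tops out at just under 3 MiB of raw file, so a UI can reject a 4 MiB PDF before uploading it rather than waiting for request_too_large.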
7. Topology snapshot — render the runtime's wiring
curl -sS -X POST "$HARBOR_BASE_URL/v1/control/topology.snapshot" \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Harbor-Session: $SESSION" \
  -H "Content-Type: application/json" \
  -d '{"identity": {}}'
The response is a graph of components + edges: Bifrost, tool catalog (with per-tool nodes), memory driver, state driver, artifact store, event bus, skill catalog. The Console's Topology page is one consumer; your custom dashboard could be another.
The capability is topology_snapshot, advertised as an entry in runtime.info.capabilities (V1.1 phase 84a).
8. Typed wire shapes — where they actually come from
Two trustworthy sources, neither of which is hand-rolling:
- The generated contract reference: the generated types page catalogues every canonical wire struct field-by-field with the snake_case JSON keys, generated by cmd/harbor-gen-protocol-docs from the Go single sources and drift-gated by make protocol-docs-gen-check. Transcribe your client types from it; when the wire changes, the page changes in the same PR by construction.
- Vendor the Console's client module: copy web/console/src/lib/protocol.ts into your TS client. It carries the wire types + the typed HarborClient. It is hand-maintained (the D-093 TS generator was deferred per D-132 / issue #179), kept honest by the Console's CI rather than by codegen. License is Apache-2.0; attribution required.
Hand-rolling the types from scratch is fine for a quick prototype but you'll drift. Anchor any client you intend to maintain on the generated reference.
9. A minimal client (TS, ~30 LoC)
const baseUrl = "http://127.0.0.1:18080";
const token = "<HARBOR_DEV_TOKEN>";
const identity = { tenant: "dev", user: "dev", session: "dev" };
// One REST call per method: POST /v1/<family>/<method>, flat body in, flat body out.
async function call<T>(route: string, body: object): Promise<T> {
  const res = await fetch(`${baseUrl}${route}`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${token}`,
      "X-Harbor-Tenant": identity.tenant,
      "X-Harbor-User": identity.user,
      "X-Harbor-Session": identity.session,
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    // Failures carry a {"code": "..."} envelope; fall back to the HTTP status if the body isn't JSON.
    const err = await res.json().catch(() => ({ code: `http_${res.status}` }));
    throw new Error(err.code ?? `http_${res.status}`);
  }
  return res.json() as Promise<T>;
}
const info = await call("/v1/control/runtime.info", { identity: {} });
console.log("connected to harbor", info);
const { task_id } = await call<{ task_id: string }>("/v1/control/start", { identity: {}, query: "Hello!" });
// The stream is session-scoped (no task_id query param), so filter client-side.
// §4 names the payload key TaskID while its sample stream shows task_id, so accept either.
const taskIdOf = (p: { TaskID?: string; task_id?: string }) => p.TaskID ?? p.task_id;
const sse = new EventSource(`${baseUrl}/v1/events?access_token=${encodeURIComponent(token)}`);
sse.addEventListener("llm.completion.chunk", (e) => {
  const data = JSON.parse(e.data);
  if (taskIdOf(data) === task_id) process.stdout.write(data.chunk);
});
sse.addEventListener("task.completed", (e) => {
  if (taskIdOf(JSON.parse(e.data)) === task_id) sse.close();
});
That's a working CLI chatbot in about 30 lines. Wrap the same in React/Svelte/Vue/whatever your stack is, render the chunks into a bubble, and you have a chat UI.
Common failure modes
- Every call returns 401. Token expired (24h TTL) or rotated (harbor dev restarted). Re-fetch the token and retry.
- CORS preflight fails. Your origin isn't in server.allowed_origins. Add it to the yaml and restart the Runtime.
- SSE stream opens but no events. The payload.TaskID capital-T gotcha: your handler is reading payload.task_id (lowercase). Fix the case.
- A control call returns 404 {"code": "unknown_method"} or 405/501. This runtime doesn't serve that surface. Call runtime.info first, branch on capabilities, and degrade the feature instead of crashing (the versioning & compatibility contract).
- Artifact upload returns 413 Payload Too Large. The request body exceeded the Runtime's protocol.max_request_bytes (default 4 MiB), which produces the canonical {"code": "request_too_large"} envelope. Chunk uploads aren't supported in V1.1; raise protocol.max_request_bytes in the Runtime's harbor.yaml if you need larger inline uploads.
- Topology snapshot rejected. This Runtime doesn't advertise the topology_snapshot capability; check runtime.info.capabilities before enabling the panel.
- The Console reads internal Runtime objects. It doesn't; that would be a CLAUDE.md §13 violation. If you suspect leakage, file a bug; the Console reads only what's documented as a Protocol surface.
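The wire-level failure modes above map onto a small decision table. The sketch below covers only the statuses and codes named in this list; classify and the Action labels are hypothetical names, and anything not listed falls through to a generic error.

```typescript
type Action =
  | "refresh_token"         // 401: token expired or rotated
  | "degrade_feature"       // surface not served by this Runtime
  | "shrink_or_raise_limit" // request body over protocol.max_request_bytes
  | "surface_error";        // anything this table does not cover

function classify(status: number, code?: string): Action {
  if (status === 401) return "refresh_token";
  if (status === 413 || code === "request_too_large") return "shrink_or_raise_limit";
  // unknown_method, 405, and 501 all mean "not served here, degrade".
  if (status === 405 || status === 501) return "degrade_feature";
  if (status === 404 && code === "unknown_method") return "degrade_feature";
  return "surface_error";
}
```

A 404 without the unknown_method code is left as a real error on purpose, since it may mean a missing task or artifact rather than a missing surface.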
See also
- The Protocol adoption track: the published quickstart + generated reference + choreographies, the adopter-facing contract docs this skill's recipes sit on top of. Start there for any client you intend to maintain. The track is complete: five choreographies (including the pause model and versioning & compatibility), the worked build-a-client guide (its ~150-line SDK-free event viewer ships at examples/protocol-clients/event-viewer/), and the conformance-certification path.
- run-the-dev-loop: boot the Runtime + grab the dev token first.
- drive-the-playground: the Console's chat UI; same Protocol underneath.
- observe-with-the-console: every Console page maps 1:1 to a Protocol method.
- The wire types: internal/protocol/types/.
- The methods registry: internal/protocol/methods/methods.go.
- The error codes: internal/protocol/errors/errors.go.
- The docs generator: cmd/harbor-gen-protocol-docs/ (D-209). The D-093 TS generator remains deferred (D-132 / issue #179); protocol.ts is hand-maintained.
- RFC §5: Harbor Protocol design.