UNIHODL · Agent Handoff SDK · Technical Specification v1.0

Agent Handoff SDK

The decision-continuity protocol for human-to-agent and agent-to-agent work transfer.

Quickstart · 5 minutes

Install the SDK, mint a scoped key, and pass a UNIHODL resume token into your agent loop. An agent that receives a resume token starts its turn already knowing what the human decided, what they read, and what they intended to do next.

1. Install

```bash
npm install @unihodl/agent-sdk
# or
pip install unihodl-agent
```

2. Mint a scoped resume token (server-side)

```bash
curl https://unihodl.app/api/v1/resume_tokens \
  -H "Authorization: Bearer $UNIHODL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "ses_8f3aZ91b",
    "scopes": ["read:context", "read:reasoning"],
    "ttl_seconds": 3600,
    "audience": "claude.anthropic.com"
  }'
```

3. Hand it off to an agent

```python
import os

from unihodl_agent import Client
from anthropic import Anthropic

uh = Client(api_key=os.environ["UNIHODL_API_KEY"])
claude = Anthropic()

# Hydrate the session — returns a structured ResumeContext.
# (Authorized by the resume token minted in step 2.)
ctx = uh.sessions.hydrate("ses_8f3aZ91b")

# Pass the serialized context as the system prompt
resp = claude.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=2048,
    system=ctx.as_system_prompt(),
    messages=[
        {"role": "user", "content": "Continue Sarah's research on API v3."}
    ],
)
print(resp.content[0].text)
```

What just happened

The agent received the human’s open tabs with scroll positions, the AI-tagged decision thread, partial conclusions, and intended next step — not just a list of links. It can pick up the work mid-thought.

1. Overview & Design Principles

Most agentic AI failures are not reasoning failures — they are cold-start failures. The agent begins each session with no idea what the human already concluded, what tabs they had open, or which option they were leaning toward. UNIHODL’s Agent Handoff SDK closes that gap by exposing the human’s working session as a first-class, machine-readable artifact.

The four primitives

| Primitive | Definition |
| --- | --- |
| Session | An open work context — tabs, scroll positions, video timestamps, partial notes — captured by UNIHODL on the device. |
| Resume Token | A signed, scoped, time-bound credential that grants an agent read (and optionally write) access to a session. |
| Resume Context | The hydrated payload an agent receives — structured, redacted, serializable into any model's prompt format. |
| Reasoning Thread | The why-layer: partial conclusions, decision stance, blockers, next-step intent. |

Design principles

  • Audience-bound by default. Every resume token is bound to a single audience (a model vendor, an MCP server URL, or a known agent identity). Tokens cannot be replayed against other audiences.
  • Redaction at the boundary. Sensitive content is redacted server-side before serialization based on per-workspace policies — the agent never sees content it isn't entitled to.
  • Reasoning is structured, not narrative. Decision threads are serialized as typed graph nodes, not free text, so agents can ground tool calls in specific intent vectors.
  • Format-pluggable. The same Resume Context can serialize to a Claude system prompt, a Gemini system instruction, an OpenAI Agents SDK input, or a raw MCP resource.
  • Auditable. Every hydration creates an immutable access record: who, when, what scopes, what was redacted, what was returned.

2. Resume Token Data Schema

A resume token is a JWS-signed JWT (RFC 7519) issued by api.unihodl.app and consumed either by UNIHODL’s hydration endpoint or, when scopes permit, directly by an agent that validates it against UNIHODL’s JWKS.

Token claims

| Claim | Type | Description |
| --- | --- | --- |
| `iss` | string | Always `https://api.unihodl.app` — issuer. |
| `sub` | string | Workspace-scoped subject, e.g. `wks_3kQ:usr_8aZ`. |
| `aud` | string | Bound audience: model vendor or MCP server URL. |
| `jti` | string | Token ID. Used to revoke. |
| `iat` / `nbf` / `exp` | int | Standard JWT timing claims. `exp ≤ iat + 86400`. |
| `scope` | string[] | Array of capability strings (see §5). |
| `session_id` | string | Bound session, e.g. `ses_8f3aZ91b`. |
| `redaction_policy` | string | Named policy applied during hydration. |
| `max_hydrations` | int | Replay cap. `1` = single-use. |
| `delegator` | object? | If issued via handoff, the human who consented. |
| `nonce` | string? | Required for write-scoped tokens. |

Example token (decoded)

```json
{
  "iss": "https://api.unihodl.app",
  "sub": "wks_3kQ:usr_8aZ",
  "aud": "claude.anthropic.com",
  "jti": "rt_01HW7KQ8X2N3C9V0F4R6PD",
  "iat": 1746480000,
  "nbf": 1746480000,
  "exp": 1746483600,
  "scope": ["read:context", "read:reasoning", "read:redacted"],
  "session_id": "ses_8f3aZ91b",
  "redaction_policy": "default-strict",
  "max_hydrations": 5,
  "nonce": null
}
```
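
Agents that validate tokens directly can sanity-check the decoded claims before use. A stdlib sketch over the payload above — it assumes the JWS signature has already been verified against the JWKS, and the error strings mirror the codes in §10:

```python
import time

def check_claims(claims: dict, expected_aud: str, required_scopes: set) -> None:
    """Reject a decoded resume token whose claims violate the spec.
    Assumes the JWS signature was already verified against the JWKS."""
    now = int(time.time())
    if claims["iss"] != "https://api.unihodl.app":
        raise ValueError("untrusted issuer")
    if claims["aud"] != expected_aud:
        raise ValueError("audience_mismatch")
    if not (claims["nbf"] <= now < claims["exp"]):
        raise ValueError("invalid_token: outside validity window")
    if claims["exp"] > claims["iat"] + 86400:
        raise ValueError("invalid_token: lifetime exceeds the 24h cap")
    if not required_scopes.issubset(claims["scope"]):
        raise ValueError("scope_insufficient")
```

A passing check returns `None`; any violation raises with the matching error code, so callers can map failures straight onto the §10 envelope.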

Resume Context payload (returned by /hydrate)

```json
{
  "session_id": "ses_8f3aZ91b",
  "version": "1.0",
  "captured_at": "2026-05-05T22:14:08Z",
  "title": "GraphQL migration for API v3",
  "summary": "Researching whether to migrate API v3 from REST to GraphQL.",
  "tabs": [
    {
      "url": "https://blog.apollographql.com/rest-vs-graphql",
      "title": "REST vs GraphQL — performance",
      "scroll_y_pct": 0.62,
      "selected_text": "GraphQL batching outperforms ...",
      "tab_role": "primary_source"
    }
  ],
  "media": [
    {
      "kind": "video",
      "url": "youtu.be/abc",
      "timestamp_s": 1843,
      "transcript_anchor": "...the unified resolver pattern ..."
    }
  ],
  "reasoning_thread": [
    { "kind": "observation", "text": "REST batching wins on read-heavy endpoints." },
    { "kind": "decision_stance", "text": "Leaning hybrid; not full GraphQL." },
    { "kind": "blocker", "text": "Need finance approval for vendor B." },
    { "kind": "next_step", "text": "Draft hybrid RFC; meet with platform team." }
  ],
  "ai_tags": ["api-v3", "graphql", "hybrid-architecture"],
  "redactions": [{ "field": "tabs[2].selected_text", "reason": "PII" }]
}
```
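
An agent will usually ground its first action in the `decision_stance`, `blocker`, and `next_step` nodes. A small helper (sketch, not part of the SDK) that indexes a hydrated payload like the one above by node kind:

```python
def intent_nodes(resume_context: dict,
                 kinds=("decision_stance", "blocker", "next_step")) -> dict:
    """Index the reasoning thread by node kind so an agent can ground
    its first action: start from next_step, surface blockers."""
    index = {k: [] for k in kinds}
    for node in resume_context.get("reasoning_thread", []):
        if node["kind"] in index:
            index[node["kind"]].append(node["text"])
    return index
```

Applied to the payload above, `intent_nodes(ctx)["next_step"]` yields the single intended action, which is what the agent's first tool call should serve.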

3. Serialization, Compression & Redaction

Resume Contexts are designed to be small enough to fit comfortably in a 200K-token window alongside the agent’s task, structured enough to reason over, and safe enough to send across trust boundaries. The pipeline is deterministic: capture → normalize → redact → compress → serialize.

Wire formats

| Format | MIME type | When to use |
| --- | --- | --- |
| Canonical JSON | `application/vnd.unihodl.context+json` | Default. Best for inspection, debugging, and most agent inputs. |
| Compact CBOR | `application/vnd.unihodl.context+cbor` | Mobile clients, low-bandwidth handoffs, edge MCP servers. |
| Prompt-ready text | `text/x-unihodl-prompt` | Pre-rendered for direct concatenation into a system prompt. |
| MCP resource | `application/vnd.mcp.resource+json` | When the consumer is an MCP-capable agent. |

Compression

Payloads larger than 32 KB are compressed with zstd level 9 and content-encoded (Content-Encoding: zstd) for clients that advertise support with Accept-Encoding: zstd; smaller payloads are sent uncompressed. For prompt-ready text the SDK transparently decompresses before serialization.
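
The decision logic is just a size threshold plus content negotiation. A sketch of that logic — note zlib stands in for zstd here so the example runs on the standard library alone; the service itself uses zstd level 9:

```python
import zlib

COMPRESS_THRESHOLD = 32 * 1024  # 32 KB, per the spec

def encode_payload(payload: bytes, accept_encodings: set):
    """Return (body, content_encoding) for a serialized Resume Context.
    Compress only when the payload crosses the threshold AND the client
    advertised support. zlib stands in for zstd in this sketch."""
    if len(payload) > COMPRESS_THRESHOLD and "zstd" in accept_encodings:
        return zlib.compress(payload, 9), "zstd"
    return payload, None
```

Small payloads skip compression entirely, which keeps the common case (a single-session context) one syscall cheaper on both ends.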

Redaction model

Redaction runs on the server during hydration. Policies are named, versioned, and bound to tokens via the redaction_policy claim. A policy is a list of typed rules applied in order. Field-level redactions are reflected in resume_context.redactions[] so the agent knows what was removed and why.

```yaml
# default-strict policy (excerpt)
- match: { kind: "selected_text", regex: "\\b\\d{3}-\\d{2}-\\d{4}\\b" }
  action: { redact: true, reason: "PII:SSN" }

- match: { kind: "tab_url", host_in: ["banking.*", "payroll.*"] }
  action: { drop: true, reason: "domain:financial" }

- match: { kind: "reasoning_thread", contains_class: "PROTECTED_HEALTH" }
  action: { redact: true, reason: "PHI" }

- match: { kind: "ai_tags", value_in: ["confidential"] }
  action: { drop_all_with_tag: true }
```
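
To make the rule semantics concrete, here is a sketch of what applying the first rule (SSN regex over `selected_text`) does to a context: matching fields are replaced and the removal is recorded in `redactions[]`. This is an illustration only — the real pipeline is server-side and policy-driven:

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def apply_ssn_rule(ctx: dict) -> dict:
    """Apply the first default-strict rule: redact SSN-shaped strings
    in tab selected_text and record each removal in redactions[]."""
    redactions = ctx.setdefault("redactions", [])
    for i, tab in enumerate(ctx.get("tabs", [])):
        text = tab.get("selected_text") or ""
        if SSN_RE.search(text):
            tab["selected_text"] = "[REDACTED]"
            redactions.append(
                {"field": f"tabs[{i}].selected_text", "reason": "PII:SSN"}
            )
    return ctx
```

The agent-facing effect is exactly what §2's `redactions[]` array shows: the value is gone, but the field path and reason survive so the agent knows something was withheld.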

Redaction is auditable

Each /hydrate call produces an audit_record with a hash of the unredacted payload, the policy version, and the list of fields removed. The unredacted payload itself is never logged.

Determinism & versioning

The serializer guarantees byte-stable output for any given (session_version, policy_version, format). The SDK exposes unihodl.context.fingerprint(ctx) — a SHA-256 over the canonical JSON form — so callers can cache hydrations and detect drift.
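
Given the byte-stability guarantee, the fingerprint reduces to a hash over a canonical serialization. A stdlib sketch — sorted keys and compact separators are my assumption about UNIHODL's canonical JSON form, not its documented definition:

```python
import hashlib
import json

def fingerprint(ctx: dict) -> str:
    """SHA-256 over a canonical JSON form of a Resume Context.
    Canonicalization here = sorted keys + compact separators; the
    SDK's exact canonical form may differ."""
    canonical = json.dumps(ctx, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because key order is normalized, two hydrations of the same `(session_version, policy_version)` pair hash identically, which is what makes caching on the fingerprint safe.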

4. REST + MCP API Design

All HTTP endpoints are versioned at /v1/* and served from https://unihodl.app/api in v0 (with future graduation to https://api.unihodl.app). Every endpoint accepts and returns JSON unless otherwise noted. The MCP server speaks the Model Context Protocol natively over stdio in v0 and will be hosted at mcp.unihodl.app in v1.

Endpoint reference

| Method | Path | Purpose | Auth |
| --- | --- | --- | --- |
| POST | `/v1/resume_tokens` | Mint a scoped resume token bound to a session and audience. | API key |
| GET | `/v1/sessions/{id}` | Inspect session metadata. Does not return content. | API key |
| POST | `/v1/sessions/{id}/hydrate` | Hydrate a session into a Resume Context. Requires resume token. | Resume token |
| POST | `/v1/sessions/{id}/handoff` | Transfer a session to another principal with consent. | API key + step-up |
| POST | `/v1/sessions/{id}/notes` | Append an agent-authored note (write scope required). | Resume token (write) |
| POST | `/v1/resume_tokens/{jti}/revoke` | Revoke a token before `exp`. Idempotent. | API key |
| GET | `/.well-known/jwks.json` | Public keys for token validation. | Public |
| MCP | `mcp.unihodl.app` | Tools: `hold`, `resume`, `hand_off`; Resources: `unihodl://session/{id}`. | Resume token |

MCP surface

UNIHODL exposes itself as a first-class MCP server. Agents that already speak MCP (Claude, Cursor, Cline, Continue, custom MCP clients) can mount UNIHODL with no other code:

```json
{
  "mcpServers": {
    "unihodl": {
      "command": "npx",
      "args": ["-y", "@unihodl/mcp-server"],
      "env": { "UNIHODL_API_KEY": "uh_live_..." }
    }
  }
}
```

5. Authentication & Permissioning

UNIHODL uses a two-tier credential model. API keys identify a workspace and are used by trusted backend code to mint resume tokens, which are short-lived, scoped, audience-bound credentials sent to agents.

Credential types

| Type | Format | Lifetime | Use |
| --- | --- | --- | --- |
| API key (live) | `uh_live_…` | Until rotated | Server-side, mints tokens, never sent to agents. |
| API key (test) | `uh_test_…` | Until rotated | Sandbox environment, no real sessions. |
| Resume token | `eyJhbGc…` | ≤ 24 h, 1 h default | Sent to agents/MCP clients. Audience-bound. |
| OAuth (delegated) | Bearer (RFC 6749) | Configurable | Third-party apps acting on behalf of users. |

Scope catalog

| Scope | Grants |
| --- | --- |
| `read:context` | Tabs, scroll positions, media timestamps. |
| `read:reasoning` | The `reasoning_thread` (intent, blockers, next step). |
| `read:redacted` | Surfaces redacted-but-acknowledged metadata (field paths only). |
| `read:transcript` | Full media transcripts when present. |
| `write:notes` | Append agent-authored notes to a session. |
| `write:next_step` | Mutate the `next_step` intent on the reasoning thread. |
| `session:hand_off` | Initiate a handoff to another principal (requires step-up). |

Step-up authentication

Mutating operations (`session:hand_off`, `write:next_step`) require the human to have confirmed in-app within the last 5 minutes. The SDK exposes `session.requireStepUp()` to trigger the prompt.

6. Human Reasoning State Model

Most context-passing schemes lose the why. Tab lists tell an agent what the human looked at, not what they were thinking. UNIHODL’s reasoning thread is a typed graph of nodes captured during HOLD by the on-device tagger, optionally enriched server-side, and serialized for agent consumption.

Node taxonomy

| kind | Meaning |
| --- | --- |
| `observation` | A factual datum extracted from a source (tab, video, transcript). |
| `partial_conclusion` | A claim the human is forming but has not committed to. |
| `decision_stance` | Where the human currently leans on a pending decision. |
| `blocker` | An unresolved dependency stopping forward motion. |
| `question` | An open question the human is investigating. |
| `next_step` | The intended next action — agents should ground tool calls here. |
| `reference` | A pointer back into `tabs[]` or `media[]` for evidence. |

Wire format

```json
{
  "reasoning_thread": [
    {
      "id": "rn_01",
      "kind": "observation",
      "text": "REST batching wins on read-heavy endpoints.",
      "evidence": ["tabs[0]", "tabs[2]"],
      "confidence": 0.78
    },
    {
      "id": "rn_02",
      "kind": "partial_conclusion",
      "text": "Pure GraphQL is overkill for our workload.",
      "supports": ["rn_01"],
      "confidence": 0.66
    },
    {
      "id": "rn_03",
      "kind": "decision_stance",
      "text": "Leaning hybrid: GraphQL for write paths, REST for hot reads.",
      "supports": ["rn_02"]
    },
    {
      "id": "rn_04",
      "kind": "blocker",
      "text": "Need finance approval for vendor B's cap on Tier-2 throughput."
    },
    {
      "id": "rn_05",
      "kind": "next_step",
      "text": "Draft hybrid RFC; meet platform team Thursday.",
      "depends_on": ["rn_03"]
    }
  ]
}
```
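
The `supports` / `depends_on` fields make the thread a small DAG, so an agent can walk backwards from the `next_step` to the evidence that justifies it. A traversal sketch (not an SDK function) over a thread shaped like the payload above:

```python
def grounding_chain(thread: list, start_kind: str = "next_step") -> list:
    """Follow supports/depends_on edges backwards from the first node
    of `start_kind`, returning node ids from intent down to evidence."""
    by_id = {n["id"]: n for n in thread}
    start = next(n for n in thread if n["kind"] == start_kind)
    chain, frontier = [], [start["id"]]
    while frontier:
        nid = frontier.pop(0)
        if nid in chain:
            continue  # DAGs can share ancestors; visit each node once
        chain.append(nid)
        node = by_id[nid]
        frontier.extend(node.get("supports", []) + node.get("depends_on", []))
    return chain
```

On the example thread this walks rn_05 → rn_03 → rn_02 → rn_01; the `evidence` pointers on rn_01 then lead back into `tabs[]`.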

Prompt-ready serialization

When X-Unihodl-Format: prompt-ready is requested, the reasoning thread is rendered as a structured natural-language block that LLMs reliably parse:

```text
## CONTEXT (UNIHODL session ses_8f3aZ91b · 2026-05-05)

The user is researching: "GraphQL migration for API v3."

What they have concluded:
  • REST batching wins on read-heavy endpoints. (confidence 0.78)
  • Pure GraphQL is overkill for our workload. (confidence 0.66)

Where they currently lean:
  → Hybrid: GraphQL for write paths, REST for hot reads.

Open blockers:
  ! Finance approval for vendor B's Tier-2 cap.

Intended next step:
  → Draft hybrid RFC; meet platform team Thursday.

Continue from here.
```
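
When only canonical JSON was requested, the same layout can be regenerated client-side from the reasoning thread. A minimal renderer sketch — the section headings follow the template above, but the service's exact template may differ:

```python
def render_prompt(ctx: dict) -> str:
    """Render a Resume Context's reasoning thread into the prompt-ready
    layout shown above. Client-side stand-in for the server template."""
    lines = [f'The user is researching: "{ctx["title"]}"', ""]
    sections = [
        ("partial_conclusion", "What they have concluded:", "•"),
        ("decision_stance", "Where they currently lean:", "→"),
        ("blocker", "Open blockers:", "!"),
        ("next_step", "Intended next step:", "→"),
    ]
    for kind, heading, bullet in sections:
        nodes = [n for n in ctx.get("reasoning_thread", []) if n["kind"] == kind]
        if nodes:
            lines.append(heading)
            lines.extend(f"  {bullet} {n['text']}" for n in nodes)
            lines.append("")
    lines.append("Continue from here.")
    return "\n".join(lines)
```

Sections with no nodes are simply omitted, so a thread with only a `next_step` still renders a coherent block.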

7. Reference: Anthropic Claude

Claude integrates with UNIHODL in two modes: (a) system prompt injection — the simplest path: hydrate, then pass the prompt-ready text as the system prompt — or (b) MCP server mounting, where Claude’s tool loop calls UNIHODL on demand.

Mode A — System prompt injection

```python
import os

from unihodl_agent import Client
from anthropic import Anthropic

uh = Client(api_key=os.environ["UNIHODL_API_KEY"])
claude = Anthropic()

# Mint an audience-bound, single-use token for this turn.
token = uh.resume_tokens.create(
    session_id="ses_8f3aZ91b",
    audience="claude.anthropic.com",
    scopes=["read:context", "read:reasoning"],
    ttl_seconds=600,
    max_hydrations=1,
)

# Hydrate; ask for prompt-ready format.
ctx = uh.sessions.hydrate(
    session_id="ses_8f3aZ91b",
    token=token,
    format="prompt-ready",
)

resp = claude.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=4096,
    system=ctx.text,
    messages=[
        {"role": "user", "content": "Continue Sarah's research and draft the hybrid RFC."}
    ],
)
print(resp.content[0].text)
```

Mode B — MCP server mount

Claude Desktop and Claude API both speak MCP. Mount UNIHODL once, then Claude can hold, resume, and hand off sessions on its own.

8. Reference: Google Gemini

Gemini integrates via function calling on the google-genai SDK. Declare UNIHODL’s resume tool, register it with the model, and Gemini will call it when its reasoning needs the human’s prior context.
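
The declaration is a JSON-schema dict of the shape Gemini's function calling consumes. The sketch below builds that declaration plus a local dispatch target; the tool name and handler are illustrative, the hydrate call is stubbed, and registration through the google-genai SDK is assumed rather than shown:

```python
# JSON-schema function declaration in the shape Gemini function
# calling consumes. Names here are illustrative, not the SDK's.
RESUME_TOOL_DECLARATION = {
    "name": "unihodl_resume",
    "description": "Fetch the human's prior working context for a UNIHODL session.",
    "parameters": {
        "type": "object",
        "properties": {
            "session_id": {"type": "string", "description": "e.g. ses_8f3aZ91b"},
            "format": {"type": "string", "enum": ["json", "prompt-ready"]},
        },
        "required": ["session_id"],
    },
}

def handle_unihodl_resume(session_id: str, format: str = "json") -> dict:
    """Dispatch target for the declaration above. A real handler would
    call /v1/sessions/{id}/hydrate with a resume token; stubbed here."""
    return {"session_id": session_id, "format": format, "status": "stub"}
```

When Gemini emits a function call for `unihodl_resume`, the host app runs the handler and feeds the Resume Context back as the tool result.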

9. Reference: OpenAI Agents SDK & LangGraph

The OpenAI Agents SDK supports first-class tools. Wrap UNIHODL hydration as a tool and pass it to your agent — the agent decides when to fetch context. In LangGraph, hydrate UNIHODL once at graph entry and inject the Resume Context into the shared state.
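
The LangGraph pattern can be sketched framework-agnostically: one entry node hydrates and writes into shared state, and downstream nodes only read. Hydration is stubbed here, and LangGraph's actual `StateGraph` wiring is omitted:

```python
def hydrate_stub(session_id: str) -> dict:
    """Stand-in for uh.sessions.hydrate(...) so the sketch runs offline."""
    return {"session_id": session_id,
            "reasoning_thread": [{"kind": "next_step", "text": "Draft hybrid RFC."}]}

def entry_node(state: dict) -> dict:
    """Graph entry: hydrate once, attach the Resume Context to shared state."""
    state["resume_context"] = hydrate_stub(state["session_id"])
    return state

def plan_node(state: dict) -> dict:
    """Downstream node: reads (never re-fetches) the hydrated context."""
    thread = state["resume_context"]["reasoning_thread"]
    state["plan"] = next(n["text"] for n in thread if n["kind"] == "next_step")
    return state
```

Hydrating once at entry keeps every node consistent with a single `max_hydrations` charge instead of burning one per node.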

Cross-framework guarantee

The Resume Context schema is identical across all integrations. An agent built on Claude can hand a session off to a Gemini-based agent, and the receiving agent sees the same fields, the same reasoning thread, the same evidence pointers.

10. Errors, Rate Limits & Versioning

Error envelope

```http
HTTP/1.1 403 Forbidden
Content-Type: application/json

{
  "error": {
    "code": "audience_mismatch",
    "message": "Token audience claude.anthropic.com does not match request origin.",
    "audit_id": "aud_01HW7KQ9...",
    "hint": "Mint a new resume_token with the correct aud.",
    "doc_url": "https://unihodl.app/sdk/spec#errors"
  }
}
```
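
Clients should branch on `error.code` rather than HTTP status alone: both 429s look alike at the status level, but only `rate_limited` is worth retrying (`hydration_exhausted` means the token's cap is spent). A retry-decision sketch with illustrative backoff parameters:

```python
import json

def should_retry(status: int, body: str, attempt: int, max_attempts: int = 3):
    """Return a backoff delay in seconds, or None if the call should not
    be retried. Branches on the envelope's error.code, not just status."""
    if attempt >= max_attempts:
        return None
    code = json.loads(body)["error"]["code"]
    if status == 429 and code == "rate_limited":
        return 0.5 * (2 ** attempt)  # illustrative exponential backoff
    return None  # audience_mismatch, hydration_exhausted, etc.: fix, don't retry
```

Everything else (`audience_mismatch`, `scope_insufficient`, `step_up_required`) signals a credential problem the caller must fix by minting a new token or prompting the human.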

Error codes

| code | HTTP | Description |
| --- | --- | --- |
| `invalid_token` | 401 | Signature, `exp`, or `nbf` failed verification. |
| `audience_mismatch` | 403 | Token `aud` does not match calling origin. |
| `scope_insufficient` | 403 | Required scope is not present on the token. |
| `session_not_found` | 404 | `session_id` is unknown or has been deleted. |
| `hydration_exhausted` | 429 | `max_hydrations` cap reached for this token. |
| `rate_limited` | 429 | Per-workspace or per-token rate limit exceeded. |
| `redaction_failed` | 422 | Redaction policy could not be applied; see `hint`. |
| `step_up_required` | 401 | Mutating call requires recent in-app human confirmation. |
| `payload_too_large` | 413 | Estimated context exceeds requested `max_tokens`. |

Rate limits

| Endpoint group | Limit | Notes |
| --- | --- | --- |
| Token mint | 300 / minute / workspace | Bursts to 600; enforced via leaky bucket. |
| Hydration | 60 / minute / token | Independent of the token's `max_hydrations` cap. |
| Hydration | 10,000 / day / workspace | Enterprise plans negotiate higher limits. |
| Webhook delivery | 20 / second / endpoint | Beyond this, UNIHODL backs off exponentially. |
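
The token-mint row maps directly onto a standard leaky bucket: capacity is the burst ceiling (600) and the drain rate is the steady limit (300/min = 5 requests/s). A sketch of that limiter — the parameters come from the table; the implementation itself is illustrative:

```python
class LeakyBucket:
    """Leaky-bucket limiter: capacity = burst ceiling, drain rate =
    steady limit. Defaults mirror the token-mint row above."""

    def __init__(self, capacity: float = 600.0, drain_per_s: float = 300 / 60):
        self.capacity = capacity
        self.drain_per_s = drain_per_s
        self.level = 0.0   # current bucket fill
        self.last = 0.0    # timestamp of the previous call

    def allow(self, now: float) -> bool:
        # Drain the bucket for the elapsed time, then try to add one unit.
        self.level = max(0.0, self.level - (now - self.last) * self.drain_per_s)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False  # caller would receive 429 rate_limited
```

Bursts up to 600 are absorbed instantly; sustained traffic above 5 mints/s starts failing until the bucket drains.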

Versioning

API path is the major version (/v1). Schema evolutions ship as additive minor versions surfaced in the version field of every payload. SDKs follow semver. Breaking changes require a 12-month deprecation window and a new path prefix.

Continue building. Sample apps and OpenAPI spec land at unihodl.app/sdk. Spec questions: developers@unihodl.app.

Open gaps tracked at /sdk/roadmap.