Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreigedslcuk5vs6qktyxn2b5nga5a5yk2h2k65ty7wf5vacxh3wm46m",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3moini5z4mkf2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreihwwu3ajlfijjnqvfpzwz5mrxrpr222vgqm4yzruvsptjkwpss7zu"
    },
    "mimeType": "image/webp",
    "size": 277534
  },
  "path": "/admilsoncossa/ai-agent-scopes-and-tool-lifecycles-14if",
  "publishedAt": "2026-06-17T15:25:52.000Z",
  "site": "https://dev.to",
  "tags": [
    "ai",
    "webdev",
    "programming",
    "node",
    "The sixth made it observable",
    "https://www.npmjs.com/package/@workit/core",
    "https://github.com/WorkRuntime/workit",
    "https://github.com/WorkRuntime/workit/blob/main/articles/07-agent-scope-and-tool-lifecycles.md",
    "@workit"
  ],
  "textContent": "_Five articles built the runtime. The sixth made it observable. This one introduces the agent primitive:`runAgent` plus `AgentScope`, with budgets, replayable events, and structured cancellation in the box._\n\nThe whole loop:\n\n\n\n    import { runAgent, AgentToolCalls, OpenAITokens } from \"@workit/core/ai\";\n    import { CostBudget, run } from \"@workit/core\";\n\n    const { result, events } = await runAgent(async (agent, ctx) => {\n      const plan = await agent.tool(\"plan\", goal, planLLM,\n        { tokens: 600, cost: 0.001, retry: 2 });\n\n      for (const step of plan.steps) {\n        await agent.tool(step.tool, step.input, tools[step.tool],\n          { tokens: 1_200, toolCalls: 1, timeout: \"10s\" });\n      }\n\n      return await agent.tool(\"synthesize\", workspace, synthesizeLLM,\n        { tokens: 2_000, cost: 0.004 });\n    });\n\n\nThat call returns two things — `result` (whatever the body returned) and `events` (the **complete, ordered, type-discriminated trace** of the run). No external tracing setup. No DSL. 
The body is plain `async`/`await`, the tools are plain functions, and every `agent.tool(...)` call is a typed primitive whose budget, retry, and timeout policies live at the call site.\n\nThis is the practical lifecycle primitive between _\"I wired up an LLM call and a tool router\"_ and _\"I can explain, bound, cancel, and replay the run.\"_\n\n##  The contract `agent.tool(name, input, fn, opts)`\n\n\n    interface AgentScope {\n      readonly id: string;\n      readonly events: readonly AgentEvent[];\n      tool<I, O>(\n        name: string,\n        input: I,\n        fn:   (input: I, ctx: TaskContext) => O | Promise<O>,\n        opts: AgentToolOptions,\n      ): Promise<O>;\n    }\n\n    // Fields are optional: the examples in this article pass only the ones they need.\n    interface AgentToolOptions {\n      tokens?:    number;       // charged against OpenAITokens budget\n      cost?:      number;       // charged against CostBudget budget\n      toolCalls?: number;       // charged against AgentToolCalls budget\n      retry?:     number | RetryOpts;\n      timeout?:   Duration;\n    }\n\n\nFive things to notice:\n\n  * **The tool function is a plain `(input, ctx) => O | Promise<O>`.** No generators. No effect type. No \"tool description JSON schema\" to feed an LLM -- that's your application's job, not the runtime's.\n  * **Budgets are charged before the call returns.** Overrun rejects synchronously and cancels the owning scope with `CancelReason { kind: \"budget\", budgetKey, limit, spent }`. Runtime budget accounting stops at the cap.\n  * **`retry`/`timeout` are per-tool**, composing with the same engine described in articles 02 and 05.\n  * **`ctx.signal` inside the tool body is linked to the parent scope.** A client disconnect, a fired deadline, or a failed sibling all propagate as aborts into the tool body, so its `fetch` / `db.query` / `provider.call` aborts at the I/O boundary.\n  * **`agent.events` is a readonly buffer** that mirrors the event stream. 
After the run, `events` is a replayable log of the whole loop.\n\n\n\n##  A 50-cent agent with a hard tool-call cap\n\n\n    import { runAgent, AgentToolCalls, OpenAITokens } from \"@workit/core/ai\";\n    import { CostBudget, run } from \"@workit/core\";\n\n    // reactLoop and goal are application-defined.\n    await run.context.with(CostBudget,      { spent: 0, limit: 0.50,    unit: \"USD\" },\n    () => run.context.with(OpenAITokens,    { spent: 0, limit: 100_000, unit: \"tokens\" },\n    () => run.context.with(AgentToolCalls,  { spent: 0, limit: 20,      unit: \"tool_calls\" },\n      () => runAgent(async (agent) => reactLoop(agent, goal)),\n    )));\n\n\nThree caps, three reasons:\n\nBudget | What it bounds | What overrun does\n---|---|---\n`CostBudget` | Aggregate USD across the whole run | Rejects with `BudgetExceededError` and cancels the owning scope. Any in-flight LLM/tool calls see the abort on `ctx.signal`; provider-side billing depends on the provider honoring cancellation.\n`OpenAITokens` | Total tokens across all LLM calls | Same behavior. Use a dedicated key per provider when you want separate caps.\n`AgentToolCalls` | Total tool calls -- fan-out limiter | Stops a runaway agent from invoking tools forever. Bench 19-B caps it at 1 and the second tool call fails closed.\n\n> **Bench `19-agent-scope.mjs`.** Five scenarios -- measured.\n>\n> # | Scenario | Result\n> ---|---|---\n> A | Tool events bracket execution | Single `agent.tool(\"calc\", 3, x => x*x)` call -> 4 events `[agent:started, agent:tool_started, agent:tool_succeeded, agent:completed]`, sequential `seq: [1,2,3,4]`, monotonic `at`, stable `agentId`.\n> B |  `AgentToolCalls` cap hit |  `limit: 1`. 
Second call rejects with `BudgetExceededError`, `budgetKey: \"AgentToolCalls\"`, `limit: 1`.\n> C |  `OpenAITokens` charged via opts |  `{ tokens: 50 }` then `{ tokens: 25 }` -> final `spent: 75` exactly.\n> D | Parent scope cancel during tool |  `ctx.scope.cancel({ kind: \"manual\", tag: \"user-stop\" })` mid-tool -> tool body's `ctx.signal` aborts, outer settles `CancellationError` with `reason.kind: \"manual\"`, `tag: \"user-stop\"`.\n> E | Replayable log, 3-tool run | 8 events: `started -> (tool_started -> tool_succeeded) x 3 -> completed`. Seq `[1..8]`. Same agentId. Tool names captured in order.\n\n##  Replayable events -- the typed trace\n\n\n    type AgentEvent =\n      | { type: \"agent:started\";        seq: number; agentId: string; at: number }\n      | { type: \"agent:tool_started\";   seq: number; agentId: string; tool: string; at: number }\n      | { type: \"agent:tool_succeeded\"; seq: number; agentId: string; tool: string; at: number }\n      | { type: \"agent:tool_failed\";    seq: number; agentId: string; tool: string; error: string; at: number }\n      | { type: \"agent:tool_cancelled\"; seq: number; agentId: string; tool: string; reason: CancelReason; at: number }\n      | { type: \"agent:completed\";      seq: number; agentId: string; at: number }\n      | { type: \"agent:failed\";         seq: number; agentId: string; error: string; at: number };\n\n\nSeven variants. Discriminated by `type`. Every variant carries `seq` and `at`. 
Cancelled events carry the typed `CancelReason`.\n\nWhat you can do with that:\n\n  * **Pivot a dashboard** on `tool` x `type` for failure heatmaps without parsing logs.\n  * **Replay a run** in a test by walking the events array -- you have the order, the names, the timing.\n  * **Audit a charge** by reconstructing the budget timeline from `tool_succeeded` events tagged with the tokens / cost charged at the call site.\n  * **Diff two runs** on the event sequence to see exactly which tool path diverged.\n\n\n\nThe events array on the `AgentRunResult` is `readonly` and mirrors the same event stream that flows through `scope.onEvent(...)` -- so live observers see the same shape the post-run audit log sees.\n\n##  How does this compare?\n\nStack | Tool primitive | Budget primitive | Replayable event log | Scope cancellation | Bundle\n---|---|---|---|---|---\n**WorkIt `runAgent`** | yes, typed `(input, ctx) => O` | yes, `CostBudget` / `OpenAITokens` / `AgentToolCalls` / `createBudget(...)`, composable | yes, `AgentRunResult.events` typed union | yes, `ctx.signal` aborts each tool body | included in `@workit/core/ai` (~8 KB gzip with the rest of `/ai`)\nLangChain agents | yes, but loosely typed; many tools as JSON | no first-class budget primitive | partial, via callbacks | no scope tree | hundreds of KB\nVercel AI SDK | yes, tool schemas | no first-class budget | events on stream | yes, via `AbortSignal`; no scope tree | medium\nMastra | yes, generator-based | partial | yes, trace store | yes | medium\nRoll-your-own with `for`-loop + `fetch` | yes, by definition | DIY | DIY | DIY | minimal, but you wrote the runtime\n\nThe design point: **the agent primitive composes with the same `CancelReason`, `ctx.signal`, `defer`, budget, and `scope.tree()` machinery from articles 01-06**. There is no second runtime. 
You don't choose between \"the agent loop's lifecycle\" and \"the rest of your app's lifecycle\" -- they share one tree.\n\n##  A complete, runnable example\n\n\n    import { runAgent, AgentToolCalls, OpenAITokens } from \"@workit/core/ai\";\n    import { CostBudget, run, renderTree } from \"@workit/core\";\n\n    // `openai` is an application-provided client; it is not defined in this snippet.\n    const tools = {\n      search: async ({ q }, ctx) =>\n        // Encode the query so spaces and reserved characters survive the URL.\n        fetch(`https://api.search.dev/q=${encodeURIComponent(q)}`, { signal: ctx.signal }).then(r => r.json()),\n\n      fetchPage: async ({ url }, ctx) =>\n        fetch(url, { signal: ctx.signal }).then(r => r.text()),\n\n      summarize: async ({ text }, ctx) =>\n        openai.chat({ messages: [{ role: \"user\", content: `tl;dr: ${text}` }] },\n                    { signal: ctx.signal }),\n    };\n\n    const { result, events } = await run.context.with(\n      CostBudget, { spent: 0, limit: 0.50, unit: \"USD\" },\n      () => run.context.with(\n        AgentToolCalls, { spent: 0, limit: 12, unit: \"tool_calls\" },\n        () => runAgent(async (agent) => {\n          const hits = await agent.tool(\"search\",\n            { q: \"structured concurrency typescript\" }, tools.search,\n            { toolCalls: 1, timeout: \"5s\", retry: 2 });\n\n          const docs = await Promise.all(hits.slice(0, 3).map((hit, i) =>\n            agent.tool(`fetchPage[${i}]`,\n              { url: hit.url }, tools.fetchPage,\n              { toolCalls: 1, timeout: \"10s\" })));\n\n          return await agent.tool(\"summarize\",\n            { text: docs.join(\"\\n\\n\") }, tools.summarize,\n            { tokens: 4_000, cost: 0.02, toolCalls: 1, timeout: \"30s\" });\n        }),\n      ),\n    );\n\n    console.log(result);\n    console.log(events.map(e =>\n      `${e.seq.toString().padStart(2)} ${e.type}${\"tool\" in e ? ` (${e.tool})` : \"\"}`,\n    ).join(\"\\n\"));\n\n\nThat's an agent that searches, fetches three pages, summarises, and stops at 50 cents or 12 tool calls -- whichever comes first. 
Cancel the parent scope and every in-flight `fetch` and LLM call aborts at the I/O boundary. No manual `AbortController` plumbing. No \"did I forget to thread the signal?\" No `try/catch` around the agent loop.\n\n##  Receipts\n\n\n    node benchmarks/articles/19-agent-scope.mjs           # 5 contract scenarios\n    node benchmarks/articles/run-all.mjs                  # full 19-bench suite\n\n\nProduction-side gates that back the same surface:\n\nClaim | Evidence\n---|---\nTool events bracket execution with monotonic seq |  `19-agent-scope.mjs` A verifies four ordered events, sequential `seq`, stable `agentId`, and monotonic `at`.\n`AgentToolCalls` overflow rejects with `BudgetExceededError` | Bench 19 B sets `limit: 1`; the second tool call throws with `budgetKey: \"AgentToolCalls\"`.\n`OpenAITokens` consumed via `{ tokens: N }` | Bench 19 C verifies the final token budget `spent` is exactly `75`.\nParent scope cancel propagates into tool body | Bench 19 D verifies the tool body observes abort and the outer scope settles with the original manual reason.\nReplayable, ordered, typed event log | Bench 19 E verifies eight events, sequential `seq`, monotonic `at`, and tool names in call order.\nTool failure surfaces as `agent:tool_failed` | Unit coverage verifies tool errors propagate and are captured in the typed event log.\n\n##  Closing The Series\n\nThe important part is not that WorkIt has an agent helper. The important part is\nthat the agent helper is not a second runtime. Tool calls, token budgets,\ntimeouts, retries, cancellation, progress events, and cleanup all use the same\nownership tree as the rest of the library.\n\nThe public claims behind this series are tracked in\n`evidence/claims.json`, exercised by\n`npm run test:evidence`, and benchmarked by `npm run bench:articles`. 
The prose\nis intentionally not the evidence store; it is the readable path through the\nengineering tradeoffs.\n\n##  Source, Benchmarks, And Evidence\n\n  * NPM: https://www.npmjs.com/package/@workit/core\n  * Source: https://github.com/WorkRuntime/workit\n  * Article source: https://github.com/WorkRuntime/workit/blob/main/articles/07-agent-scope-and-tool-lifecycles.md\n  * Reproduce: `npm run bench:articles` and `npm run test:evidence`\n\n",
  "title": "AI Agent Scopes And Tool Lifecycles"
}