{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreied6h3imxaanqcgctwyr4zr6x4c7hq6t5cmrgas2ucae35gpagl3u",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mpnrjahbydk2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreia4ue676l2wq5iswdcfweexiskpha5hix76em3cbsbgdieu55m7me"
    },
    "mimeType": "image/webp",
    "size": 119812
  },
  "path": "/walletguy/683-test-files-later-how-we-validate-ai-agent-wallet-infrastructure-1nn1",
  "publishedAt": "2026-07-02T09:38:24.000Z",
  "site": "https://dev.to",
  "tags": [
    "testing",
    "typescript",
    "architecture",
    "opensource",
    "https://github.com/minhoyoo-iotrust/WAIaaS",
    "https://waiaas.ai",
    "@waiaas"
  ],
  "textContent": "683 Test Files Later: How We Validate AI Agent Wallet Infrastructure\n\nYour AI agent can browse the web, write code, and manage files — but can it actually touch money? That's the gap WAIaaS was built to close: a self-hosted, open-source Wallet-as-a-Service that gives your AI agent a real blockchain wallet, a policy engine, and a transaction pipeline it can use autonomously. And before any of that ships to production, it has to pass more than 683 test files.\n\n##  Why Test Coverage Matters for Wallet Infrastructure\n\nWhen your agent sends an email, a bug means a bad email. When your agent sends 0.1 ETH to the wrong address, a bug means lost funds. The stakes are categorically different.\n\nThis isn't about chasing a coverage number. It's about the fact that wallet infrastructure for AI agents sits at the intersection of two unforgiving domains: financial transactions (irreversible, high-stakes) and autonomous software (runs without human review). If you're building an agent on top of a wallet layer, you need to know that layer has been beaten up extensively before you trust it with real assets.\n\nHere's a practical look at what WAIaaS actually tests, and more importantly, what that means for you as a developer building on top of it.\n\n##  The Architecture Under Test\n\nWAIaaS is a 15-package monorepo. 
Each package has its own test suite, and together they cover every layer of the system an AI agent will touch.\n\n\n\n    actions, adapters, admin, cli, core, daemon, desktop-spike,\n    e2e-tests, mcp, openclaw-plugin, push-relay, sdk, shared, skills, wallet-sdk\n\n\nThat's 683+ test files spread across packages that include:\n\n  * **The transaction pipeline** — a 7-stage pipeline whose stages include validate, auth, policy, wait, execute, and confirm\n  * **The policy engine** — 21 policy types and 4 security tiers\n  * **45 MCP tools** — every tool your Claude or LangChain agent will call\n  * **15 DeFi protocol integrations** — including Jupiter, Aave v3, and Hyperliquid\n  * **39 REST API route modules** — every endpoint the SDK talks to\n\n\n\nWhen you call `client.sendToken()` from the TypeScript SDK, you're exercising code that has been tested at the unit level, the integration level, and the pipeline level. Let's walk through each of those layers.\n\n##  Layer 1: The Transaction Pipeline\n\nEvery transaction an AI agent submits goes through a 7-stage pipeline, which includes these six numbered stages:\n\n  1. **stage1-validate** — schema validation and chain-specific checks\n  2. **stage2-auth** — session token verification\n  3. **stage3-policy** — policy engine evaluation against all active policies\n  4. **stage4-wait** — handles DELAY and APPROVAL tier transactions\n  5. **stage5-execute** — signs and broadcasts to the network\n  6. **stage6-confirm** — monitors for on-chain confirmation\n\n\n\nThis pipeline is what stands between your agent's intent and an actual blockchain transaction. The test suite covers every stage, including the unhappy paths: what happens when a policy blocks a transaction, what happens when a DELAY times out, and what happens when a broadcast fails and needs to be retried.\n\nFrom your agent's perspective, this pipeline is transparent. You submit a transaction and get back a status. 
But knowing it's there — and tested — is what lets you trust the output.\n\nHere's a basic send from the TypeScript SDK:\n\n\n\n    import { WAIaaSClient, WAIaaSError } from '@waiaas/sdk';\n\n    const client = new WAIaaSClient({\n      baseUrl: process.env['WAIAAS_BASE_URL'] ?? 'http://localhost:3100',\n      sessionToken: process.env['WAIAAS_SESSION_TOKEN'],\n    });\n\n    // Step 1: Check wallet balance\n    const balance = await client.getBalance();\n    console.log(`Balance: ${balance.balance} ${balance.symbol} (${balance.chain}/${balance.network})`);\n\n    // Step 2: Send tokens\n    const sendResult = await client.sendToken({\n      type: 'TRANSFER',\n      to: 'recipient-address',\n      amount: '0.001',\n    });\n    console.log(`Transaction submitted: ${sendResult.id} (status: ${sendResult.status})`);\n\n    // Step 3: Poll for confirmation\n    const POLL_TIMEOUT_MS = 60_000;\n    const startTime = Date.now();\n    while (Date.now() - startTime < POLL_TIMEOUT_MS) {\n      const tx = await client.getTransaction(sendResult.id);\n      if (tx.status === 'COMPLETED') {\n        console.log(`Transaction confirmed! Hash: ${tx.txHash}`);\n        break;\n      }\n      if (tx.status === 'FAILED') {\n        console.error(`Transaction failed: ${tx.error}`);\n        break;\n      }\n      await new Promise(resolve => setTimeout(resolve, 1000));\n    }\n\n\nThe pattern is simple because the pipeline complexity is encapsulated. Your agent doesn't need to know about stage3-policy or stage4-wait. It just polls for `COMPLETED`.\n\n##  Layer 2: The Policy Engine\n\nThe policy engine is probably the most critical thing to get right, and it's the part of WAIaaS with the most test surface area.\n\n21 policy types. 4 security tiers. Default-deny enforcement. 
A single misconfigured policy could either block legitimate agent transactions or, worse, allow a transaction that should have required human approval.\n\nThe 4 tiers tell you exactly what will happen to a transaction:\n\n\n\n    INSTANT   — Execute immediately, no notification\n    NOTIFY    — Execute immediately, send notification\n    DELAY     — Queue for delay_seconds, then execute (cancellable)\n    APPROVAL  — Require human approval via WalletConnect/Telegram/Push\n\n\nTesting this correctly means verifying every boundary condition. An `instant_max_usd` of $10 means a $10.00 transaction is INSTANT and a $10.01 transaction is NOTIFY. Those boundary tests exist in the suite.\n\nHere's what a spending limit policy looks like when you create it via the REST API:\n\n\n\n    curl -X POST http://127.0.0.1:3100/v1/policies \\\n      -H \"Content-Type: application/json\" \\\n      -H \"X-Master-Password: my-secret-password\" \\\n      -d '{\n        \"walletId\": \"<wallet-uuid>\",\n        \"type\": \"SPENDING_LIMIT\",\n        \"rules\": {\n          \"instant_max_usd\": 100,\n          \"notify_max_usd\": 500,\n          \"delay_max_usd\": 2000,\n          \"delay_seconds\": 900,\n          \"daily_limit_usd\": 5000\n        }\n      }'\n\n\nAnd if a transaction is blocked, the error response is structured and actionable:\n\n\n\n    {\n      \"error\": {\n        \"code\": \"POLICY_DENIED\",\n        \"message\": \"Transaction denied by SPENDING_LIMIT policy\",\n        \"domain\": \"POLICY\",\n        \"retryable\": false\n      }\n    }\n\n\nYour agent can catch this, log it, and surface it to a human rather than silently failing. The test suite covers both the policy evaluation logic and the error response format, so you can build reliable error handling on top of a stable contract.\n\n##  Layer 3: The MCP Tools\n\nIf you're building on Claude or another MCP-compatible framework, your agent interacts with WAIaaS through 45 MCP tools. 
Every one of those tools is a tested surface.\n\nThe tool list covers the full range of what an autonomous agent might need:\n\n  * **Wallet operations** : `get-balance`, `get-address`, `get-assets`, `get-wallet-info`\n  * **Transactions** : `send-token`, `send-batch`, `sign-transaction`, `simulate-transaction`\n  * **DeFi** : `action-provider`, `get-defi-positions`, `get-health-factor`\n  * **NFTs** : `get-nft-metadata`, `list-nfts`, `transfer-nft`\n  * **Protocol-specific** : `hyperliquid`, `polymarket`, `x402-fetch`\n  * **Security/auth** : `erc8004-get-reputation`, `wc-connect`, `list-sessions`\n\n\n\nSetting up MCP with Claude Desktop takes one command:\n\n\n\n    waiaas mcp setup --all    # Auto-register all wallets with Claude Desktop\n\n\nOr you can configure it manually in `claude_desktop_config.json`:\n\n\n\n    {\n      \"mcpServers\": {\n        \"waiaas\": {\n          \"command\": \"npx\",\n          \"args\": [\"-y\", \"@waiaas/mcp\"],\n          \"env\": {\n            \"WAIAAS_BASE_URL\": \"http://127.0.0.1:3100\",\n            \"WAIAAS_SESSION_TOKEN\": \"wai_sess_<your-token>\",\n            \"WAIAAS_DATA_DIR\": \"~/.waiaas\"\n          }\n        }\n      }\n    }\n\n\nAfter that, Claude can call `get_balance`, `send_token`, or `execute_action` the same way it calls any other tool — but now the infrastructure behind those calls has been validated by 683+ test files.\n\n##  Layer 4: The DeFi Protocol Integrations\n\n15 DeFi protocol providers are integrated in WAIaaS:\n\n\n\n    aave-v3, across, dcent-swap, drift, erc8004, hyperliquid,\n    jito-staking, jupiter-swap, kamino, lido-staking, lifi,\n    pendle, polymarket, xrpl-dex, zerox-swap\n\n\nEach provider has its own action logic. Testing these means mocking RPC calls, simulating swap quotes, and verifying that the action payload built for Jupiter looks correct before it ever hits mainnet.\n\nThere's also a dry-run capability built into the transaction pipeline. 
Before your agent executes a DeFi action for real, it can simulate it:\n\n\n\n    curl -X POST http://127.0.0.1:3100/v1/transactions/send \\\n      -H \"Content-Type: application/json\" \\\n      -H \"Authorization: Bearer wai_sess_<token>\" \\\n      -d '{\n        \"type\": \"TRANSFER\",\n        \"to\": \"recipient-address\",\n        \"amount\": \"0.1\",\n        \"dryRun\": true\n      }'\n\n\nThis is a first-class feature of the API, not a workaround. You can build agents that dry-run before executing and only proceed if the simulation succeeds.\n\n##  Quick Start: Running It Yourself\n\nYou don't need to take our word for the test coverage. You can clone the repo and run the suite locally. But if you want to get an agent connected first:\n\n**Step 1 — Install the CLI and start the daemon**\n\n\n\n    npm install -g @waiaas/cli\n    waiaas init\n    waiaas start\n\n\n**Step 2 — Create wallets and sessions in one command**\n\n\n\n    waiaas quickset --mode mainnet\n\n\n**Step 3 — Connect to Claude Desktop**\n\n\n\n    waiaas mcp setup --all\n\n\n**Step 4 — Or use the TypeScript SDK directly**\n\n\n\n    npm install @waiaas/sdk\n\n\n\n    import { WAIaaSClient } from '@waiaas/sdk';\n\n    const client = new WAIaaSClient({\n      baseUrl: 'http://127.0.0.1:3100',\n      sessionToken: process.env.WAIAAS_SESSION_TOKEN,\n    });\n\n    const balance = await client.getBalance();\n    console.log(`${balance.balance} ${balance.symbol}`);\n\n\n**Step 5 — Set a policy before going to mainnet**\n\nUse the policy API to configure a spending limit before you let your agent run autonomously. Even a simple `SPENDING_LIMIT` with a $10 instant max gives you a meaningful safety net.\n\n##  What 683 Test Files Actually Tells You\n\nA number like 683 test files is only meaningful in context. 
Here's the context:\n\n  * The transaction pipeline has stages that are individually tested — including the gas condition stage that holds transactions until gas price meets a threshold\n  * The policy engine covers all 21 policy types and their boundary conditions\n  * The 3 auth methods (masterAuth with Argon2id, ownerAuth with SIWS/SIWE, sessionAuth with JWT HS256) are each tested independently\n  * The 45 MCP tools are tested against the same transaction and policy infrastructure\n  * The OpenAPI 3.0 spec is auto-generated and available at `/doc`, with an interactive reference UI at `/reference` — so the API you're coding against is validated against the implementation\n\n\n\nNone of this means bugs don't exist. It means the team has invested in the kind of validation infrastructure that gives you a reasonable basis for trust when you're building financial tooling for autonomous agents.\n\n##  What's Next\n\nIf you want to go deeper on how the policy engine works in practice — especially the default-deny behavior and how to configure policies for different agent risk profiles — that's worth its own read. The security model has three distinct layers (session auth, time delay and approval, monitoring and kill switch) that work together in ways that aren't obvious from the policy API alone.\n\nThe best next step is to get a local instance running and connect it to your agent. The CLI makes that fast, and the MCP integration means you can be talking to a real wallet from Claude Desktop in under ten minutes.\n\n  * GitHub: https://github.com/minhoyoo-iotrust/WAIaaS\n  * Official site: https://waiaas.ai\n\n",
  "title": "683 Test Files Later: How We Validate AI Agent Wallet Infrastructure"
}