{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreib2ow3kjvjmsuv5j7qtrjltxbu43orugwzmj34ceb3a4mvarzfhci",
"uri": "at://did:plc:ewahtt65dq4k2bciturk5do6/app.bsky.feed.post/3mjlmomgrxsj2"
},
"description": "An executive demands an AI strategy in six weeks, but a deck without data is just theory. Learn how one PM used radical constraints to ship a single AI capability in 30 days, proving that real usage data from two users is more valuable than a perfect plan.",
"path": "/shipping-your-first-ai-agent-in-30-days-a-guide-for-product-managers/",
"publishedAt": "2026-04-16T04:56:31.000Z",
"site": "https://www.thoughtmunchies.com",
"tags": [
"Should You Build AI Agent Capabilities? A Decision Framework for Product ManagersYour CEO asks: “Should we become an agent capability?” You have one quarter, one team, finite resources. Build for AI agents or ship features your customers asked for? The five dimensions that determine which path fits your product.Thought Munchies - Systems Thinking for Product & EngineeringPrashant Bhargava",
"Five Dimensions Decision Framework to build Agent Capabilities",
"add constraints to force focus"
],
"textContent": "## How Maya Shipped Her First Agent Capability in 30 Day\n\nMaya had been a PM at DealIO for three years. The product helped sales teams manage contracts. Send them. Track signatures. Close deals. Nothing glamorous, but 40,000 companies used it daily.\n\nShe'd been a PM long enough to recognize the setup. Her CEO's Slack message arrived at 7:43 AM with the subject line \"Thoughts?\" and a link to Apple's Intelligence announcement. No preamble. No context. Just the question hanging there, waiting for an answer she didn't have.\n\nBy 9 AM, her head of sales had forwarded the same link with a different question: \"Do we work with AI agents?\"\n\nBy noon, her calendar showed a new meeting: \"Agent Strategy Discussion\" in six weeks.\n\nSix weeks to present a strategy for something she'd been thinking about for six days.\n\n* * *\n\n## The Constraint Audit\n\nMaya opened a new doc and started listing what she had:\n\n * Engineering team underwater with Q2 roadmap commitments\n * No budget for new headcount\n * 47 product features, any of which could theoretically be \"agent-enabled\"\n * Six weeks until executives expected answers\n\n\n\nWhat she didn't have: time to analyze her way to the right answer.\n\nShould You Build AI Agent Capabilities? A Decision Framework for Product ManagersYour CEO asks: “Should we become an agent capability?” You have one quarter, one team, finite resources. Build for AI agents or ship features your customers asked for? The five dimensions that determine which path fits your product.Thought Munchies - Systems Thinking for Product & EngineeringPrashant Bhargava\n\nShe'd read about the Five Dimensions Decision Framework to build Agent Capabilities—revenue model, customer behavior, risk profile, moat location, time horizon. DealIO checked most of the boxes. Enterprise customers already approved software through IT. Contracts involved sensitive data but read operations were low risk. The pricing model was seat-based, which wasn't perfect for agent usage, but it wasn't a dealbreaker.\n\nThe analysis said \"maybe.\" The only way to turn \"maybe\" into \"yes\" or \"no\" was to ship something real.\n\nMaya sent a message to her senior engineer: \"Got 30 minutes tomorrow? Want to explore something.\"\n\n* * *\n\n## Choosing the Capability\n\nMaya pulled up her customer interview notes from the past quarter. Not new interviews, just the complaints and feature requests she'd collected during normal check-ins.\n\nOne pattern kept appearing: \"Has the NDA been signed yet?\"\n\nSales teams asked it. Account managers asked it. Even customers asked it when they were waiting on contracts from their own vendors.\n\nDealIO had 47 features. Most required the full UI context to make sense—signature workflows, template builders, bulk send operations. But document status was self-contained. The product already had an internal API endpoint that powered the dashboard: `/api/v1/documents/{id}/status`. It returned a simple response: pending, signed, expired.\n\nRead-only. High value to users who asked the question daily. Self-contained. API-ready.\n\nMaya added it to her doc: \"First capability: Get document status.\"\n\nStrategic AI is built from the inside out; Maya’s \"Get Status\" endpoint is the smallest, highest-value doll that makes the entire agentic workflow possible. By isolating this self-contained, API-ready signal, the broad vision of an \"AI Strategy\" finally gains a concrete foundation to stand on.\n\nNow she needed to know if anyone would actually use it.\n\n## Sign up for Thought Munchies\n\nEssays on product decisions, systems thinking, and the human side of building software. Written from the seam between product and engineering.\n\nJoin\n\nEmail sent! Check your inbox to complete your signup.\n\nNo spam. Unsubscribe anytime.\n\n* * *\n\n## Week 1: What Would You Do With This?\n\nMaya opened a new doc. At the top, she wrote: **Desired Outcome: Sales teams close deals faster.**\n\nUnderneath, she started mapping what she knew from past conversations:\n\n * Sales reps spend 30+ min/day checking contract status\n * Deals stall when contracts sit unsigned for weeks\n * No visibility into which contracts need follow-up\n\n\n\nBut these were observations from months ago. She needed fresh data.\n\nMaya scheduled calls with five customers, but not cold interviews. She picked people she'd talked to before—customers who'd complained about contract workflows, who'd asked for automation features, who used Zapier to patch gaps in DealIO.\n\nThe first call went sideways fast.\n\n\"If an AI agent could check document status without you logging in, would you use it?\"\n\nSarah paused. \"I mean... sure? Sounds useful.\"\n\nMaya wrote down \"sure, sounds useful\" and immediately knew it was worthless. That was the answer people gave when they were being polite.\n\nShifting from \"Would you use this?\" to \"What would you do with this?\" moves the conversation from polite hypotheticals to raw workflow pain.\n\nShe tried a different approach.\n\n\"Walk me through what you did this morning when you got to work.\"\n\n\"Checked email, then logged into DealIO to see which contracts came back overnight.\"\n\n\"How long did that take?\"\n\n\"Maybe 30 minutes? I'm checking status on 15-20 contracts, opening each one.\"\n\n\"What happens if you don't check?\"\n\n\"Deals slip. A contract sits pending for two weeks and nobody notices until the customer emails asking why we haven't signed.\"\n\nNow they were talking about a real problem.\n\n\"What if you didn't have to log in? What if you could just ask 'what's pending?' and get a list?\"\n\nSarah's energy shifted. \"Oh. Well, we have an intern who does this every morning and updates a spreadsheet. If an agent could do that...\"\n\nMaya added to her doc:\n\n**Opportunity: Eliminate manual status checking**\n\n**Current solution:** Intern manually checks, updates spreadsheet (30 min/day, 15-20 contracts)\n\n**Assumption:** Users will trust an agent to access contract data\n\nBy the fifth call, Maya had refined her questions. She stopped describing solutions and started asking about desired outcomes.\n\n\"If you could wave a wand and fix one thing about contract workflows, what would it be?\"\n\n\"I'd know instantly which contracts are stuck so I can follow up before deals go cold.\"\n\n\"What would you do with that information?\"\n\n\"I'd trigger Slack reminders when deals stall past three days.\"\n\n\"We'd automate the Monday morning status report.\"\n\n\"I'd build a workflow: check status, and if it's pending after a week, escalate to the sales manager.\"\n\nMaya's doc grew:\n\n**Opportunity: Proactive follow-up on stalled contracts**\n\n**Assumptions to test:**\n\n * Users will grant read access to contract data\n * Status check without UI login provides value\n * Integration with existing tools (Slack, email) is necessary\n * 3-5 day stall threshold triggers action\n\n\n\nFour of the five customers described the same pain: manually checking status was eating 20-30 minutes daily. All four had built workarounds—spreadsheets, interns, calendar reminders.\n\nBut one customer—Alex—said something different: \"Honestly, the bigger problem is that I don't know _why_ contracts are stuck. Status tells me it's pending, but I still have to dig through email to figure out who's blocking it.\"\n\nMaya added that to her doc under a different branch. Not for this iteration, but worth remembering.\n\nShe had her signal. Customer demand existed. They could quantify the value. And she had a clear assumption to test: would customers grant an AI agent read access to their contract data?\n\nOnly one way to find out: ship something and ask.\n\nBy Friday, her doc looked like this:\n\n\n OUTCOME: Sales teams close deals faster\n\n ├─ OPPORTUNITY: Eliminate manual status checking\n │ ├─ Current cost: 30 min/day per person\n │ ├─ Solution: \"Get document status\" capability\n │ └─ Assumption to test: Users will grant read access\n │\n ├─ OPPORTUNITY: Proactive follow-up on stalled contracts\n │ ├─ Current gap: Deals slip when contracts sit unsigned\n │ ├─ Desired behavior: Auto-trigger reminders after 3-5 days\n │ └─ Assumption to test: Integration with Slack/email necessary\n │\n └─ OPPORTUNITY: Understand why contracts stall [FUTURE]\n └─ Current gap: Status doesn't show blocker\n\n\nShe picked the first branch. One capability. Clear outcome. Testable assumption.\n\nThe rest could wait.\n\n* * *\n\n## Week 2: The Engineering Reality Check\n\nJordan pulled up a whiteboard. \"So, document status check via MCP. Let me think through what this needs...\"\n\nThe diagram grew as Jordan sketched: MCP server, API wrapper, auth layer, rate limiting, audit logs, scope validation, error handling, caching, monitoring.\n\n\"Two weeks,\" Jordan said. \"Maybe three if we're being honest.\"\n\nIt was a classic case of the _Curse of Knowledge_. Jordan was applying a hyperscale mental model to a micro-scale problem. Engineers naturally default to building for scale—anticipating multi-tenant complexities and edge cases before there's even a single tenant using the feature.\n\nMaya's stomach dropped. The executive presentation was in six weeks. She needed four weeks to gather usage data or she'd be presenting theory, not evidence.\n\n\"What if we add constraints to force focus?\" Maya asked.\n\nJordan looked up. \"Meaning?\"\n\n\"You're designing for scale. What if we don't? What if we design for... three customers, read-only, one month of runtime?\"\n\nJordan tilted their head, considering. \"So no write operations. No delete. Just reads.\"\n\n\"Just reads.\"\n\n\"And we don't need enterprise-grade auth if it's three hand-picked customers. We can use API keys instead of full OAuth, gate it behind our existing user permissions.\"\n\n\"Right. They already have access to documents through the UI. This just lets them check status via an agent.\"\n\nJordan started erasing parts of the diagram. \"No multi-tenant complexity. No delete permissions to audit. No complex rate limiting—100 calls per minute per customer is fine. We cache status for five minutes so if the API hiccups, the agent gets stale data instead of failing.\"\n\nWhile a system __can__ scale to one million users, Maya needed to ensure that the current design focus must remain pointed at the needs of the first three customers to solve the micro-scale problem effectively.\n\nThe diagram was half the size now.\n\n\"How long?\" Maya asked.\n\n\"One week. Five days if I don't hit surprises.\"\n\nMaya pulled up her calendar. \"You've got five. If it takes seven, we still have three weeks of data before the presentation.\"\n\n\"What about audit logs?\" Jordan asked. \"If we're only tracking reads from three customers, do we really need full audit trails?\"\n\nMaya thought about Sarah from the customer calls. Her team handled enterprise NDAs worth millions.\n\n\"Keep the audit logs,\" Maya said. \"If we're touching customer data, even read-only, we need to show what we accessed and when. That's not negotiable.\"\n\n\"Fair. Everything else though—caching, simple rate limiting, API keys—those are all faster.\"\n\nJordan nodded. \"One endpoint. Read-only. Three customers. I can do five days.\"\n\n\"And when they ask for more?\"\n\n\"Then we have usage data to justify building it right.\"\n\n## Sign up for Thought Munchies\n\nEssays on product decisions, systems thinking, and the human side of building software. Written from the seam between product and engineering.\n\nSubscribe\n\nEmail sent! Check your inbox to complete your signup.\n\nNo spam. Unsubscribe anytime.\n\n* * *\n\n## Week 3: Build and Test Assumptions\n\nDay three, Jordan sent a Slack: \"Basic MCP server works. Can query the endpoint, get status back. Starting on auth.\"\n\nDay five: \"Rate limiter tested. Audit logging schema done. Running final integration tests.\"\n\nDay seven: \"Shipped. Ready for your test customers.\"\n\nMaya connected her Claude to the MCP server and typed: \"What's the status of document ABC-123?\"\n\nThe response came back clean:\n\n\n {\n \"document_id\": \"ABC-123\",\n \"status\": \"pending\",\n \"signed_by\": [],\n \"pending_from\": [\"legal@acmecorp.com\"],\n \"last_updated\": \"2026-04-08T14:30:00Z\"\n }\n\n\nIt worked!\n\nMaya pulled up her assumption list:\n\n * ✓ Internal API is stable enough for agent access\n * ? Users will grant read access to contract data\n * ? Status check without UI login provides value\n * ? Integration with existing tools necessary\n\n\n\nTime to test the ones with question marks.\n\nShe sent Sarah—the customer with the intern checking status every morning—a brief message: \"Remember that contract status capability we talked about? It's live. Want to try it?\"\n\nSarah replied within minutes: \"Yes. How does it work?\"\n\nMaya sent setup instructions and scheduled a 30-minute call for the next morning.\n\nOn the call, Sarah shared her screen. She connected the MCP server to Claude, then typed: \"Show me all pending contracts.\"\n\nThe agent listed 23 documents. Sarah scanned the list. \"Wait, this one's been pending for 11 days? I had no idea.\"\n\nWithin the hour, Sarah had built her first workflow. She set up a morning automation: check all pending contracts, flag anything stuck for more than three days, post summary to Slack.\n\n\"This is incredible,\" Sarah said. Then she paused. \"Also—I love that I can see what the agent checked in the audit log. Makes me trust it.\"\n\nMaya updated her assumptions:\n\n * ✓ Users will grant read access (Sarah granted immediately, no hesitation)\n * ✓ Status check without UI provides value (found 11-day stalled contract)\n * ✓ Audit logs build trust (unexpected benefit)\n\n\n\nBut one assumption remained untested: integration necessity.\n\nTwo days later, Maya invited Alex and David to test. Both had mentioned specific integration needs during the Week 1 calls.\n\nAlex built a workflow within a day. David tested it once, then went quiet.\n\nMaya made a note: \"Follow up with David after Week 4. See what blocked adoption.\"\n\n* * *\n\n## Week 4: Validating Assumptions\n\nThree customers. Four weeks of usage. Maya reviewed her assumption list:\n\n**Tested & Validated:**\n\n * ✓ Users will grant read access (100% permission grant rate)\n * ✓ Status check without UI provides value (2 of 3 use daily)\n * ✓ Audit logs build trust (customers cite in feedback)\n\n\n\n**Partially Validated:**\n\n * ? Integration with existing tools necessary (2 yes, 1 unclear)\n\n\n\nThe metrics told part of the story:\n\n**Sarah:** 127 API calls over 28 days. Built two workflows (morning status check, weekly report). Saved 30 minutes daily. Feedback: \"When are you enabling more tools through the MCP?\"\n\n**Alex:** 89 API calls. Built one workflow: weekly status compilation that used to take an hour manually. Feedback: \"This should integrate with our CRM. Right now I'm copying data between tools.\"\n\n**David:** 1 API call. Day one. Never returned.\n\nBut metrics didn't explain _why_ David stopped using it.\n\nMaya scheduled a call. \"What happened? You tested it once and never came back.\"\n\n\"It works great,\" David said. \"But I need it in my CRM, not in Claude. I'm not switching tools to check contract status. If it could feed data directly into Salesforce, I'd use it constantly.\"\n\nMaya updated her doc:\n\n\n OUTCOME: Sales teams close deals faster\n\n ├─ OPPORTUNITY: Eliminate manual status checking\n │ ├─ Solution shipped: \"Get document status\"\n │ ├─ VALIDATED: Users grant read access (3/3)\n │ ├─ VALIDATED: Standalone value exists (2/3 daily usage)\n │ └─ NEW ASSUMPTION: CRM integration required for universal adoption\n │\n ├─ OPPORTUNITY: Proactive follow-up on stalled contracts\n │ ├─ VALIDATED: Users build automated workflows (2 workflows live)\n │ └─ VALIDATED: Audit logs build trust (cited in feedback)\n\n\nThe pattern was clear: standalone capability had value for some users (Sarah, Alex). But broader adoption required integration depth, not feature breadth.\n\nDuring her next call with Sarah, Maya tested a new question: \"If I could add one thing to this capability, what would have the most impact?\"\n\n\"Honestly? Let me configure the stall threshold. Right now I trigger reminders at 3 days, but for enterprise contracts I'd want 7 days. For NDAs, maybe 24 hours.\"\n\nMaya hadn't considered configurable thresholds. But Sarah had—because she was using the tool daily.\n\nAnother assumption to test: users need customization, not just automation.\n\n* * *\n\n## Week 5: The Presentation\n\nMaya's presentation to the executive team was five slides. Most strategy decks were 40.\n\n**Slide 1: What We Shipped**\nOne capability: \"Get document status\" via MCP server. Read-only access. Three early-access customers.\n\n**Slide 2: Time Investment**\n30 days total. Five days of engineering. One PM (me). One engineer (Jordan).\n\n**Slide 3: Customer Response**\n\n * 2 of 3 customers use it daily\n * Average time saved: 30 minutes per day\n * 3 automated workflows built by customers\n * Permission grant rate: 100% (all three granted access immediately)\n * Feature request signal: 2 customers asked for expanded capabilities\n\n\n\n**Slide 4: What We Learned**\n\n 1. Customer demand is real (not theoretical)\n 2. Integration depth matters more than feature breadth\n 3. Audit logs drive trust (customers cite them in feedback)\n 4. Our API is ready (wrapper needed, not entire rebuild)\n\n\n\n**Slide 5: What's Next**\nRequest: 1 engineer for Q3. Build 3 more read-only capabilities. Add CRM integration. Measure adoption. Report back in September.\n\nThe CEO leaned back. \"Only two users?\"\n\n\"Two who use it every day,\" Maya said. \"That's better than a hundred who tried it once. And they're both asking for more capabilities. We started with permission to read. If we execute well, we earn permission to write.\"\n\n\"What would it take to expand?\"\n\n\"One engineer. Three more capabilities. Revisit in Q3 with real usage data instead of projections.\"\n\nThe CTO nodded. \"I like that you kept it small. Most AI initiatives die in the architecture phase because teams try to build Skynet on day one. This is a real test with real constraints.\"\n\nApproved.\n\n* * *\n\n## Six Months Later\n\nMaya's calendar showed five customer calls scheduled for this week. Not new prospects—existing users. She'd kept the habit from Week 1: regular conversations to understand how they were using the capabilities, what was working, what wasn't.\n\nHer Slack lit up with a message from Sarah.\n\n\"Hey—we've rolled this out to the whole sales team. 15 people using it daily now. Question: can we get write access? I want agents to send reminder emails directly when contracts stall, not just flag them in Slack.\"\n\nMaya smiled. They'd started with an assumption: users will grant read access. Six months later, customers were asking for write permissions. The assumption had evolved: trust builds incrementally.\n\nHer opportunity tree had grown too:\n\n\n OUTCOME: Sales teams close deals faster\n\n ├─ Eliminate manual status checking [SHIPPED]\n │ ├─ Get document status (47 daily users)\n │ └─ CRM integration [IN PROGRESS - Q3]\n │\n ├─ Proactive follow-up on stalled contracts [SHIPPED]\n │ ├─ Configurable thresholds (Sarah's request)\n │ └─ Auto-send reminders [TESTING - needs write access]\n │\n └─ Understand why contracts stall [EXPLORING]\n └─ New assumption: Email thread analysis shows blockers\n\n\nThe metrics told the story:\n\n * 47 customers using agent capabilities daily\n * 8 capabilities live (started with 1)\n * 12 customers upgraded to higher tiers specifically for API access\n * Zero security incidents\n * Audit log access had become a feature customers requested in demos\n\n\n\nThe executive team had asked for a strategy six months ago. Maya had given them something better: Evidence.\n\nShe opened her interview notes from this week. Three customers mentioned the same new pain point: they wanted agents to _create_ contracts from templates, not just check status on existing ones.\n\nMaya added it to the tree. New opportunity. New assumptions to test. Same pattern: talk to customers, identify outcomes, test assumptions, ship small.\n\nShe pulled up her doc from six months ago and added a line at the bottom:\n\n> _\"You don't need a perfect strategy. You need continuous discovery and real usage data.\"_\n\n* * *\n\n## What Made This Work\n\n**Customer conversations weren't one-time research.** Maya talked to customers before building (Week 1), during testing (Week 3), and after shipping (Week 4). Each conversation refined her understanding of what mattered. The customers weren't test subjects—they were collaborators.\n\n**Outcomes drove decisions, not features.** Maya started with \"Sales teams close deals faster\" and worked backward. Every capability mapped to that outcome. When customers requested features (export to CSV, configurable thresholds), she tested whether they advanced the outcome before building.\n\n**Assumptions made learning explicit.** \"Users will grant read access\" was an assumption, not a fact. By naming it, Maya forced _metacognition_ onto the process—she was thinking about her own thinking. When data invalidated an assumption (export to CSV), she could pivot without defending a feature nobody wanted.\n\n**The constraints forced creativity.** Jordan's initial estimate was two to three weeks. By constraining scope—three customers, one month, read-only access—the timeline dropped to five days. Not because they cut corners, but because they cut complexity.\n\n**Trust architecture wasn't optional.** Audit logs, rate limiting, and scope validation took extra time. But customers cited the audit trail as a reason they trusted the capability. Trust wasn't a feature to bolt on later—it was the foundational system requirement.\n\n**Metrics measured outcomes, not activity.** Total API calls: 216 over four weeks. Meaningless. Daily active users: 2 of 3. Time saved: 30 minutes per day. Workflows automated: 3. Those numbers showed value.\n\n**Partial success taught more than total success.** Two customers adopted daily. One tried once and stopped. The difference: integration depth. Standalone capabilities have limits. Customers need tools that fit into existing workflows, not new workflows to adopt.\n\n**Continuous discovery scaled with the product.** Six months later, Maya still scheduled weekly customer calls. As the capability grew (1 feature → 8 features), her opportunity tree grew with it. New customer pain points became new branches to explore.\n\nThe pattern held: talk to customers, identify outcomes, test assumptions, ship small, measure what matters, repeat.\n\nNo perfect strategies. Just continuous learning.\n\n* * *\n\n## Key Frameworks Introduced\n\n**The 30-Day Capability Sprint**\n\n * Week 1: Validate demand with customer interviews\n * Week 2: Technical scoping with constraint-driven design\n * Week 3: Build with trust architecture from day one\n * Week 4: Measure outcomes and gather feedback\n\n\n\n**Four Criteria for First Capability**\n\n 1. Read-only (low risk)\n 2. High value (solves real pain)\n 3. Self-contained (works independently)\n 4. API-ready (or close to it)\n\n\n\n**User-Centric Validation Questions**\n\n * Not: \"Would you use this?\"\n * Instead: \"What would you do with this information?\"\n * Follow-up: \"Walk me through your current process\"\n * Quantify: \"How much time does that take?\"\n\n\n\n**Success Metrics That Matter**\n\n * Daily active users (not total API calls)\n * Time saved per user (quantified outcome)\n * Permission grant rate (trust signal)\n * Feature requests (expansion signal)\n * Workflows automated (integration depth)\n\n\n\n**Trust Architecture Checklist**\n\n * Audit logs (who, what, when, why)\n * Rate limiting (prevent runaway behavior)\n * Scope boundaries (user permissions only)\n * Graceful degradation (cached fallback)\n * Explicit permission grants\n\n",
"title": "Shipping Your First AI Agent in 30 Days: A Guide for Product Managers",
"updatedAt": "2026-05-17T22:48:07.828Z"
}