Raw Record Source

{
  "$type": "site.standard.document",
  "content": {
    "$type": "site.standard.content.markdown",
    "text": "## 2025 Edition\n\nLast year, I shared [some LLM workflows](/useful-llm-tools-2024) I was finding useful. Since then my workflows have changed and I now build many things I would not have attempted before. Here are some agentic workflows I use for work, data, productivity, and research.\n\n## Magical UNIX Pipelines\n\nI mentioned [`llm`](https://llm.datasette.io/) in last [year's post](/useful-llm-tools-2024). My usage has only increased! For any small to medium task that involves text, I use [`llm`](https://llm.datasette.io/) with a custom template. It is amazingly simple and effective.\n\nI've used it to summarize and interrogate YouTube videos, tag documents, extract structured data from many files, generate additional columns for datasets, caption images, and many more little things.\n\nThe most \"unique\" way I'm using it might be what I call **magic brushes**. I've assigned keyboard shortcuts to the most common `llm` templated tasks I rely upon. I have a bunch of scripts that do something like: (1) take whatever is selected, (2) run `llm` with a fixed template (e.g. fix typos, clean format, ...), and (3) replace the selection, create a new terminal, or add to the clipboard the processed output. These are really \"[magic text brushes](https://github.com/davidgasquez/dotfiles/blob/1bf35de417b122161828260cfda6f397b9311980/scripts/magic-brush)\" that I can apply anywhere text can be selected!\n\n## Help Agents Help You\n\n[I've written about this before](https://davidgasquez.com/llm-friendly-projects/#helping-llms-help-you) and it is still something I'm iterating. The goal is to make agents productive in whatever they're doing.\n\n### Files as State\n\nFor any non-obvious task, I keep all kinds of things as plain files, and those files in Git. Prompts, schemas, experiment logs, results, all live next to the code and not in a chat sidebar. 
This helps a lot with keeping track of what I've tried, and it also helps agents plan their next steps.\n\nI usually have a lean `AGENTS.md`, a `CONTEXT.md`, and perhaps a few `PLAN_` files. I rely a lot on [progressive disclosure of context](https://www.humanlayer.dev/blog/writing-a-good-claude-md) so not everything is dumped at once.\n\n### Custom Benchmarks\n\nWhen starting a project, I define something to act as the \"goodness\" metric. It can be some tests, an ML metric like AUC, or anything else that can be used to verify the result. Then, I just prompt the agent to get something working and repeat, even going as far as starting from scratch. The goal is to **exploit the verify/generate asymmetry that happens in some tasks** (checking a candidate is usually much cheaper than producing one).\n\nIt looks similar to a \"manual evolutionary algorithm\" where I'm generating many project candidates, evaluating them, keeping the things I like, and repeating the process. Which brings me to the next point...\n\n### Asynchronous Disposable Projects\n\nI'll often run multiple attempts (sometimes in parallel), and learn what I like and what I want to keep while watching the agents figure out the task. I'll take the learnings and make them part of the next prompt or context.\n\nFor things I only want done once, like writing a quick scraping script, exploring random datasets, generating one-off reports, etc., I've set up an [empty `labs`](https://github.com/davidgasquez/labs) repository. I fire a few asynchronous Codex/Claude agents on the repository, forget about them for a while, and then review the results and continue elsewhere.\n\nI try perhaps 2 or 3 different ideas per day on that repository. Surprisingly, most of them work! Having a sandboxed repository with an LLM in a loop with internet access is very powerful. It makes learning about that weird API/dataset very fast.\n\n### Prototyping\n\nThis approach means I end up rebuilding the same thing over and over as I figure out what works (and what I like) and what doesn't. 
This breadth-first prototyping is amazing for exploration and learning. I used to spend tons of time researching, reading docs, trying out frameworks... Now I just have the agent try all of them and see what sparks joy.\n\nAnother big takeaway is that there is lots of value in exploring different frameworks and common ideas/patterns for a certain task, and then weaving something together that works for you from scratch. A recent personal example has been replacing some [evidence.dev](https://evidence.dev/) and [Observable Framework](https://observablehq.com/framework/) dashboards with a static Astro site. I took away a couple of \"patterns\" I liked and had the agent build something from scratch that worked better for me than the frameworks could.\n\n## Curated Style Guides\n\nA powerful side effect of [maintaining a personal knowledge base](/building-a-pkb) is that I've been passively curating style guides and useful resources for different kinds of tasks for the last few years. I have notes on [good writing](/handbook/writing/), [making checklists](/handbook/checklist/), crafting [good commit messages](/handbook/git/), and many more [random things](/handbook).\n\nThat means that when I need the model to produce a specific kind of artifact, I already have a bunch of links to give it that I've previously verified and enjoyed.\n\nA recent agent pointed me to `clig.dev` as a reference for a small CLI I had to build and produced a lovely CLI in one shot!\n\nI'm now thinking about packaging the handbook [as a skill](https://agentskills.io/home) so agents can discover and use my notes.\n\n## Agentic Data Engineering\n\nFolks always say you spend **80% of the time cleaning data** when working with any dataset. Well, now it might be different! I've used some agentic engineering techniques to automate cleanup scripts, letting me focus more on charting, design, and deeper analysis. 
I've shared before how I do [scrappy data cleaning](/scrappy-data-cleaning) with Claude Code inline mode and how I'm [specializing Codex](/specializing-codex) for different data tasks. Overall, these techniques apply to a few kinds of data work.\n\n### Data Enrichment\n\nYou can run an LLM against every row of a dataset to derive a new column (labels, normalization, location resolution, ...). This is something `llm` with a cheap/local model makes viable.\n\nYou can also convert messy, unstructured documents (like many `text` fields out there) into structured data (JSON or CSV) or derive useful categories or labels from them. Again, `llm` or Codex work well for these kinds of tasks.\n\n### Data Cleaning\n\nMake your agent write cleanup scripts while you focus on verifying the result and writing data tests.\n\n## Agentic Research\n\nFor research tasks, I rely on Markdown files as much as I can. Basically, I use the same approach as when working on other projects.\n\nI write a small document to act as the prompt and context, download related sources as Markdown files, and do a few initial runs to learn what I want and how to guide the agent better.\n\nInside that prompt, there is usually a list of open questions. This helps the LLM fill in the gaps you have.\n\n## Conclusion\n\nFor the projects I'm working on, I think I spend more time tweaking the agent harnesses (skills, commands, prompts, tests, verification, etc.) than working on the projects themselves. Working on harnesses clearly helps me understand the problem better, but I'm not sure if I'm ultimately more efficient.\n\nWhat I'm sure of is that I'm having more fun. I like working on this better than another ETL, dashboard, or data request. It's similar to playing Satisfactory or Factorio. 
I'm not literally programming, but it tickles the same part of the brain while offering a different experience (visual in games, conversational with LLMs).\n\nI still have the _agency_ to bring my taste, context, and knowledge into the process, while letting the agents take care of the parts I _mostly_ know how to do. Removing this friction has made me more curious and \"playful\" when approaching problems.\n\nI've always enjoyed tinkering. I used to spend a lot of time tweaking my dotfiles. Now, I spend time exploring agentic workflows that iterate on the dotfiles. Using coding agents has been [my hammer](https://en.wikipedia.org/wiki/Law_of_the_instrument) this last year and I've had a ton of fun sharpening it.\n\nIf you're interested in these kinds of things, please follow [Simon Willison](https://simonwillison.net/), [Armin Ronacher](https://lucumr.pocoo.org/), [Peter Steinberger](https://steipete.me/), and [Mario Zechner](https://mariozechner.at/).",
    "version": "1.0"
  },
  "description": "2025 Edition Last year, I shared some LLM workflows I was finding useful. Since then my workflows have changed and I now build many things I would not have attempted before. Here are some agentic workflows I use for work, data, productivity, and research. Magical UNIX Pipeline...",
  "path": "/useful-agentic-workflows-2025",
  "publishedAt": "2025-12-13T00:00:00.000Z",
  "site": "at://did:plc:4z5i7njrld66ew36htufcwry/site.standard.publication/3mo43d2tmt2ov",
  "textContent": "2025 Edition\n\nLast year, I shared some LLM workflows I was finding useful. Since then my workflows have changed and I now build many things I would not have attempted before. Here are some agentic workflows I use for work, data, productivity, and research.\n\nMagical UNIX Pipelines\n\nI mentioned llm in last year's post. My usage has only increased! For any small to medium task that involves text, I use llm with a custom template. It is amazingly simple and effective.\n\nI've used it to summarize and interrogate YouTube videos, tag documents, extract structured data from many files, generate additional columns for datasets, caption images, and many more little things.\n\nThe most \"unique\" way I'm using it might be what I call magic brushes. I've assigned keyboard shortcuts to the most common llm templated tasks I rely upon. I have a bunch of scripts that do something like: (1) take whatever is selected, (2) run llm with a fixed template (e.g. fix typos, clean format, ...), and (3) replace the selection, create a new terminal, or add to the clipboard the processed output. These are really \"magic text brushes\" that I can apply anywhere text can be selected!\n\nHelp Agents Help You\n\nI've written about this before and it is still something I'm iterating. The goal is to make agents productive in whatever they're doing.\n\nFiles as State\n\nFor any non-obvious task, I keep all kinds of things as plain files, and those files in Git. Prompts, schemas, experiment logs, results, all live next to the code and not in a chat sidebar. This helps a lot with keeping track of what I've tried, and it also helps agents plan their next steps.\n\nI usually have a lean AGENTS.md, a CONTEXT.md and perhaps a few PLAN files. I rely a lot on progressive disclosure of context so not everything is dumped at once.\n\nCustom Benchmarks\n\nWhen starting with a project, I define something to act as the \"goodness\" metric. 
It can be some tests, an ML metric like AUC, or anything else that can be used to verify the result. Then, I just prompt the agent to get something working and repeat, even going as far as starting from scratch. The goal is to exploit the verify/generate asymmetry that happens in some tasks (checking a candidate is usually much cheaper than producing one).\n\nIt looks similar to a \"manual evolutionary algorithm\" where I'm generating many project candidates, evaluating them, keeping the things I like, and repeating the process. Which brings me to the next point...\n\nAsynchronous Disposable Projects\n\nI'll often run multiple attempts (sometimes in parallel), and learn what I like and what I want to keep while watching the agents figure out the task. I'll take the learnings and make them part of the next prompt or context.\n\nFor things I only want done once, like writing a quick scraping script, exploring random datasets, generating one-off reports, etc., I've set up an empty labs repository. I fire a few asynchronous Codex/Claude agents on the repository, forget about them for a while, and then review the results and continue elsewhere.\n\nI try perhaps 2 or 3 different ideas per day on that repository. Surprisingly, most of them work! Having a sandboxed repository with an LLM in a loop with internet access is very powerful. It makes learning about that weird API/dataset very fast.\n\nPrototyping\n\nThis approach means I end up rebuilding the same thing over and over as I figure out what works (and what I like) and what doesn't. This breadth-first prototyping is amazing for exploration and learning. I used to spend tons of time researching, reading docs, trying out frameworks... Now I just have the agent try all of them and see what sparks joy.\n\nAnother big takeaway is that there is lots of value in exploring different frameworks and common ideas/patterns for a certain task, and then weaving something together that works for you from scratch. 
A recent personal example has been replacing some evidence.dev and Observable Framework dashboards with a static Astro site. I took away a couple of \"patterns\" I liked and had the agent build something from scratch that worked better for me than the frameworks could.\n\nCurated Style Guides\n\nA powerful side effect of maintaining a personal knowledge base is that I've been passively curating style guides and useful resources for different kinds of tasks for the last few years. I have notes on good writing, making checklists, crafting good commit messages, and many more random things.\n\nThat means that when I need the model to produce a specific kind of artifact, I already have a bunch of links to give it that I've previously verified and enjoyed.\n\nA recent agent pointed me to clig.dev as a reference for a small CLI I had to build and produced a lovely CLI in one shot!\n\nI'm now thinking about packaging the handbook as a skill so agents can discover and use my notes.\n\nAgentic Data Engineering\n\nFolks always say you spend 80% of the time cleaning data when working with any dataset. Well, now it might be different! I've used some agentic engineering techniques to automate cleanup scripts, letting me focus more on charting, design, and deeper analysis. I've shared before how I do scrappy data cleaning with Claude Code inline mode and how I'm specializing Codex for different data tasks. Overall, these techniques apply to a few kinds of data work.\n\nData Enrichment\n\nYou can run an LLM against every row of a dataset to derive a new column (labels, normalization, location resolution, ...). This is something llm with a cheap/local model makes viable.\n\nYou can also convert messy, unstructured documents (like many text fields out there) into structured data (JSON or CSV) or derive useful categories or labels from them. 
Again, llm or Codex work well for these kinds of tasks.\n\nData Cleaning\n\nMake your agent write cleanup scripts while you focus on verifying the result and writing data tests.\n\nAgentic Research\n\nFor research tasks, I rely on Markdown files as much as I can. Basically, I use the same approach as when working on other projects.\n\nI write a small document to act as the prompt and context, download related sources as Markdown files, and do a few initial runs to learn what I want and how to guide the agent better.\n\nInside that prompt, there is usually a list of open questions. This helps the LLM fill in the gaps you have.\n\nConclusion\n\nFor the projects I'm working on, I think I spend more time tweaking the agent harnesses (skills, commands, prompts, tests, verification, etc.) than working on the projects themselves. Working on harnesses clearly helps me understand the problem better, but I'm not sure if I'm ultimately more efficient.\n\nWhat I'm sure of is that I'm having more fun. I like working on this better than another ETL, dashboard, or data request. It's similar to playing Satisfactory or Factorio. I'm not literally programming, but it tickles the same part of the brain while offering a different experience (visual in games, conversational with LLMs).\n\nI still have the agency to bring my taste, context, and knowledge into the process, while letting the agents take care of the parts I mostly know how to do. Removing this friction has made me more curious and \"playful\" when approaching problems.\n\nI've always enjoyed tinkering. I used to spend a lot of time tweaking my dotfiles. Now, I spend time exploring agentic workflows that iterate on the dotfiles. Using coding agents has been my hammer this last year and I've had a ton of fun sharpening it.\n\nIf you're interested in these kinds of things, please follow Simon Willison, Armin Ronacher, Peter Steinberger, and Mario Zechner.",
  "title": "Useful Agentic Workflows"
}