Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreifl64vmkvbafmbz2f6c4a6moeb3ar3qbiywnep6y6u7ok6zlklhhe",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mopyd5lox2g2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreiat57oad66fdcnlovjqilcy5tutod7h5vc6ccngipe4hm4c3y27lu"
    },
    "mimeType": "image/webp",
    "size": 71252
  },
  "path": "/iamvvekverma/i-built-a-dead-code-forensics-cli-because-this-file-is-unused-is-never-enough-4klo",
  "publishedAt": "2026-06-20T13:06:10.000Z",
  "site": "https://dev.to",
  "tags": [
    "cli",
    "showdev",
    "sideprojects",
    "tooling",
    "https://github.com/iamvvekverma/fossil",
    "@deprecated"
  ],
  "textContent": "Every senior developer has stared at a file and thought: _should I delete this?_\n\nYou run `vulture`. You run `deadcode`. They both tell you the file has zero import\nsites. You hover over the delete key.\n\nAnd then you don't delete it. Because you don't actually know **why** it's there.\n\n##  The real question isn't \"is it dead?\" — it's \"why is it dead?\"\n\nDead code falls into very different categories:\n\n**Category A** — Accidentally orphaned. A file that got left behind when the\ncalling code was removed. Safe to delete immediately.\n\n**Category B** — Intentionally parked. \"keeping this around until Q2 rollout\ncompletes.\" The holdback condition may or may not still apply.\n\n**Category C** — Dynamically loaded. Appears dead to static analysis but gets\n`importlib.import_module()`'d at runtime. Deleting it breaks production.\n\n**Category D** — Replaced but not removed. The function was superseded by a better\nimplementation, the PR was merged, and nobody cleaned up. Safe to delete, but you\nneed to know _what replaced it_ to be confident.\n\nExisting tools treat all four identically: \"unused.\" That's the gap fossil fills.\n\n##  How fossil works\n\n\n    pip install fossil-code\n    fossil explain src/billing/legacy_processor.py\n\n\nfossil runs five stages in under 3 seconds:\n\n###  Stage 1: Static Analysis\n\nPython's `ast` module builds a symbol table of everything the file exports. Then it\nscans every other file in the repo for references — imports, calls, attribute access,\ndynamic patterns (`importlib`, `getattr`, `__import__`). This is not grep; it's an\nactual AST traversal that understands Python's import semantics.\n\n###  Stage 2: Git History Mining\n\n`GitPython` traverses `git log --follow` for the target file. It walks commits\nnewest-to-oldest, checking at each step whether any other file in the repo was still\nreferencing the target. 
The commit at which the reference count first reaches zero, that is, the one that removed the last remaining reference, is the\n**death commit**.\n\n###  Stage 3: Commit and PR Parsing\n\nThe death commit message is parsed for PR references (`#441`, `PR 441`,\n`pull request 441`). If found, the PR title and merge context are extracted from the\ncommit body or (if a GitHub token is configured) via the GitHub API.\n\n###  Stage 4: Pattern Detection\n\nThe file's current content is scanned for deferred-deletion patterns:\n\n  * `TODO: remove after X`\n  * `keep for now` / `keep around until X`\n  * `DEPRECATED` / `@deprecated`\n  * `will be removed in version X`\n  * `temporary` / `temp fix`\n\n\n\nFor each pattern, fossil attempts to verify the condition: does a PR with that\ndescription exist and is it merged? Has a git tag for that version been created?\nHas the referenced date passed?\n\n###  Stage 5: Confidence Scoring\n\n14 weighted signals are aggregated into a 0–100% score:\n\nSignal | Weight\n---|---\nZero call sites | +30\nNo dynamic references | +20\nDeath commit identified | +15\nTemporary hold resolved | +10\nNo reflection patterns | +10\nFile age > 90 days dead | +8\nPR/migration context found | +7\nDynamic import detected | −30\nReflection detected | −20\nModified < 30 days ago | −20\nUnresolved \"keep for now\" | −15\nLanguage unknown (fallback) | −15\nTest file references | −10\nAmbiguous death commit | −10\n\nThe output is a Rich panel with every signal explained, a risk label, and a\nsuggested action.\n\n##  Scan an entire codebase\n\n\n    fossil scan ./src --threshold 80\n\n\nReturns a ranked table of all dead files above 80% confidence, with their\ndead-since date and risk level. 
Exit code 4 means nothing above the threshold —\nuseful for CI gates.\n\n##  Current state and roadmap\n\n**Live now (v0.2.0):**\n\n  * `fossil explain` — full forensic report\n  * `fossil scan` — directory scan with confidence threshold\n  * `fossil clean` — prioritized deletion backlog\n  * Python deep analysis (AST), text fallback for JS/TS/Java/Go\n  * Pattern detection with condition verification\n  * Local SQLite caching (cache hits < 100ms)\n\n\n\n**Coming next:**\n\n  * GitHub/GitLab API integration (PR title/body fetch, `--yolo` deletion PR creation)\n  * LLM narration for human-readable explanations\n  * tree-sitter for deep multi-language analysis\n\n\n\n##  Install\n\n\n    pip install fossil-code  # Python 3.11+, requires git\n\n\nGitHub: https://github.com/iamvvekverma/fossil\n\nThe confidence scoring weights are a first attempt — I'd genuinely like feedback on\nwhether the calibration holds for your codebase. What signal should weigh more? What\nam I missing?",
  "title": "I built a dead code forensics CLI because \"this file is unused\" is never enough"
}