Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreifq4tk2eidi2zvlnsbpcu37fkttz2fipsliy4w4e66el3xor43q24",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3molyttrskqq2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreidlzshwt45gw5fent7c5gjgomiyx56op57uf6lcrhdlurczlv27py"
    },
    "mimeType": "image/webp",
    "size": 103234
  },
  "path": "/susumun/renaming-a-site-without-losing-its-data-separating-display-name-from-a-stable-identifier-gpb",
  "publishedAt": "2026-06-18T23:08:47.000Z",
  "site": "https://dev.to",
  "tags": [
    "wordpress",
    "php",
    "tutorial"
  ],
  "textContent": "A client asks you to rename a site from `acme-staging` to the production name `acme`. The moment you rename it in the app, **the DB backups, screenshots, and thumbnails you had been collecting all appear to disappear**.\n\nThe files are still on disk, but the new directory is empty. **The data hasn't carried over as \"the same site.\"** It's a trap you can fall into on day one, and we did — with our original design.\n\nHere's how we redesigned things so renames don't orphan data.\n\n##  Why the data appears to disappear — the site name was the key\n\nThe original design of WP Maintenance Manager decided file locations **based on the site name**.\n\n\n\n    backups/\n      acme-staging/        ← DB backups for site \"acme-staging\"\n        backup_20260101_120000.sql\n\n    screenshots/\n      acme-staging/        ← Screenshots for the same site\n        home_pre.png\n\n\nAfter renaming `acme-staging` → `acme`, **a new empty directory`backups/acme/` gets created and starts from zero**. The old directory is still there, but the app treats it as \"some other site's stale data\" and doesn't surface it.\n\nSite names are natural candidates for labels, but **in practice they get renamed all the time**. Cleaning up client-name typos, promoting staging to production, renaming on a re-org — the reasons to rename are endless.\n\n##  The fix — give every site an immutable `site_id`\n\nEvery site now carries a `_id` in the form **`site_xxxxxxxxxxxx`** (a UUID, 12 hex chars), and every file location now keys off that `_id` instead of the site name.\n\n\n\n    # core/site_id_utils.py\n    def generate_site_id():\n        return f\"site_{uuid.uuid4().hex[:12]}\"\n\n\n`_id` is **assigned once and never changes**. 
Even if the site is renamed, its files stay in the same `backups/site_a1b2c3d4e5f6/` directory, and the existing contents remain in use.\n\nIt's a classic two-layer design: the display name (site name) is separate from the internal identifier (`_id`).\n\n##  A migration that doesn't break existing data\n\nThe hardest part was handling **existing users whose data was already keyed by site name**.\n\n`ensure_site_ids()` is an idempotent migration:\n\n  * Auto-generates and assigns `_id` only to sites that don't have one\n  * Leaves sites that already have a `_id` untouched\n  * Uses `FileLock` + tempfile + `os.replace()` for atomic writes, so a crash mid-write won't corrupt anything\n\n\n\nIt runs at app startup and at the entry points of site-related APIs (three paths in total). **The user doesn't have to do anything**: IDs are silently assigned in the background.\n\nThe file-side migration follows the same pattern. On first launch, if `backups/<site_name>/` exists, the app renames it to `backups/<site_id>/`; if the new-format directory already exists, it leaves both alone. This step is idempotent too.\n\n##  Tying logs to sites: strict + compat hybrid matching\n\nLog entries also carry a `site_id` now. But **existing log entries don't have one**, because they were written before IDs were introduced.\n\nThe UI scoping feature (filter logs for a specific site) is implemented as a hybrid:\n\n  * New logs (with `site_id`) → match by **strict equality**\n  * Old logs (without `site_id`) → fall back to **`site_name` compat matching**\n\n\n\nThe result: logs from before and after a rename appear together in the same scope. The user never feels like \"past history disappeared.\"\n\n##  A post-release blunder\n\nIn the interest of honesty: shortly after release, we shipped a bug. 
The `is_valid_site_id` validation function had a regex that **only matched the new-generation format**, and it rejected some legitimate existing IDs.\n\n\n\n    # the broken version\n    import re\n\n    SITE_ID_RE = re.compile(r'^site_[0-9a-f]{12}$')  # exactly 12 hex chars\n\n\nA few longer ID formats, leftovers from the migration's earlier iterations, got rejected outright, and the symptom was \"every site has disappeared.\" The lesson is mundane but real: **fully audit existing data formats before tightening validation**. Adding validation after the fact is exactly where these regressions hide.\n\n##  Takeaway: separating stable identifier from display name\n\nSeparating \"the name displayed to humans\" from \"the immutable identifier\" is a classic software-design pattern, but **introducing it after the product is already in production is expensive**. The idempotent migration, the edit-vs-duplicate ownership split, the backward-compatible validation: drop any one of these and existing user data evaporates.\n\nSince we separated site name (display) from `site_id` (immutable), clients can have their site names corrected, staging promoted to production, or org-rename refactoring done, all while **keeping every byte of historical data tied to the same site**. Designing your file locations to trust the display name 100% on day one closes that door before you even reach for it. That's the retrospective on this one.",
  "title": "Renaming a site without losing its data — separating display name from a stable identifier"
}