{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiesx4knx7wvmqulyz6ktejny3urkuqc3777cmqbfrxckfl4emkq3m",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mh53hlsplcb2"
},
"path": "/t/a-persistent-8-15-failure-rate-across-domains-evidence-for-an-epistemic-boundary/174264#post_2",
"publishedAt": "2026-03-15T11:37:53.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"Been building a stateful PHI de-identification system for streaming multimodal data. Here's what I learned",
"Research"
],
"textContent": "Been building a stateful PHI de-identification system for streaming multimodal data. Here's what I learned\n\n> A name in a clinical note. The same name in an ASR transcript ten minutes later. A matching date in a waveform header. Three records, each one harmless on its own. Together they’re enough to re-identify a patient, and most masking systems never see it coming because they process each record in isolation. That’s the problem I’ve been working on. The system tracks cumulative PHI exposure per patient across modalities and time, maintains a risk score, and escalates masking policy automatically as …",
"title": "A Persistent 8–15% Failure Rate Across Domains: Evidence for an Epistemic Boundary"
}