Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiaf27nca5yxkuraxuelaifdwd5tbutqroplmoxr6naujeajjgeeyy",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mnemap6tebt2"
  },
  "path": "/t/secbert-to-detect-anomalous-log-entries/176237#post_3",
  "publishedAt": "2026-06-03T05:45:20.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "Thanks for your detailed e-mail. It helped.\n\nOnce I ensured the tokenizer configuration was saved natively (not being done in my earlier code) with the exact vocabulary rules and casing preservation parameters matching the training checkpoint, the tokenization alignment was perfectly restored. Now the textual risk assessment scores show up correctly, which were showing up earlier as anomalous for all the log entries.\n\nregards,\n\nVIjay",
  "title": "Secbert to detect anomalous log entries"
}