{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreic5juzmlpezmebcxccb3d5qey2kpjdyi245v55qzwo73w4sjoa4za",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mgrb7sx3av32"
  },
  "path": "/t/about-traning-lora-for-z-image-turbo/173911?page=2#post_23",
  "publishedAt": "2026-03-11T05:21:03.000Z",
  "site": "https://discuss.huggingface.co",
  "textContent": "Yes, the problem was entirely with the captions. Also, I think the CDO becomes 0 when cache text embeddings are enabled. I disabled that and updated the captions from normal fluent English to the exact format I specified. 3k and 4k is the direct sweet spot. I’ll test them now. Since I’ve presented a wide variety of exposures in 45 images, I believe I can even get many challenging exposures. I’ll try and see.\n\nBy the way, I’m not sure if stepping affects learning speed, but I’ve clearly experienced this before. For example, right now at 4.5k stepping, I’m starting to get sweet spots at 3k. If I reduce it to 3.5k stepping, neutral prompts probably won’t work at all. Because I think stepping and LR are a multiplier (I guess I should focus more on the underlying math).\n\nAnd yes, now I’ll also train on the base with the exact same parameters. The previous training I did worked on ZIT.\n\nYes, I’m currently in the character LoRa phase. I’ve created extremely consistent visuals using Face LoRa.",
  "title": "About traning LoRa for Z Image Turbo"
}