{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreihyr3duv3wslvfxlltozc2xpt7y4pbb2p5jekcz2aivm7z2ra2xbi",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mnccgbadugn2"
},
"path": "/t/fine-tuning-an-slm-for-a-low-resource-language/176467#post_1",
"publishedAt": "2026-06-02T08:30:25.000Z",
"site": "https://discuss.huggingface.co",
"textContent": "Hello, I am fine-tuning a small language model (SLM) for an AI festival, and my goal is to make it stronger in one specific language. Unfortunately, few fine-tuning-ready datasets exist for that language, and because of my hardware limitations and internet restrictions, I cannot run continued pretraining on the model. I have two questions: Can I use LoRA to approximate continued pretraining? And how can I build a QA dataset from raw Wikipedia dumps?",
"title": "Fine-Tuning an SLM for a Low-Resource Language"
}