{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreih6v2ir4hxf4dvex2qsf34w2kydkjjrvnippf2dv3cbhwxdlh2uay",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mhpkjqnpbge2"
},
"path": "/t/tressagpt-my-new-from-scratch-model-need-feedbacks/174539#post_1",
"publishedAt": "2026-03-23T04:50:41.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"huggingface.co",
"abhijeetmishra101/tressa_gpt_50M · Hugging Face"
],
"textContent": "Hi all,\n\nFortunately, I was able to write a GPT-based model from scratch. It is available here:\n\nhuggingface.co\n\n### abhijeetmishra101/tressa_gpt_50M · Hugging Face\n\nI am really happy that it can produce coherent sentences, but it cannot yet write fully fledged logical sentences and it sometimes hallucinates. I think I need to write and train a bigger model.\n\nCurrently the responses are not very logical, so I am planning to increase the parameter count and the number of transformer blocks (currently only 6; GPT-2 used 12). The embedding size is also small, just 384, and I feel I need to raise it to at least around 900. Please tell me whether this will work! Right now I think it will.\n\nI am also looking for a job that helps me grow in this field. I trained the model on RunPod for under $20, and I also fine-tuned it on a question-answer dataset.\n\nFeel free to contact me or reply in this thread!\n\nThanks",
"title": "TressaGPT - My new from-scratch model - Need feedback"
}