
TressaGPT - My new from-scratch model - Feedback needed

Hugging Face Forums [Unofficial] March 23, 2026

Hi All,

I was fortunately able to write a GPT-based model from scratch. It is linked below:


abhijeetmishra101/tressa_gpt_50M · Hugging Face


I am really happy that it can produce coherent sentences, but it cannot yet write fully logical sentences and it sometimes hallucinates. I think I need to write and train a bigger model.

Currently the responses are not very logical, so I am planning to increase the parameter count and the number of transformer blocks (currently only 6, whereas GPT-2 used 12). The embedding size is also small, just 384; I feel I need to raise it to at least around 900. Please tell me if this will work! Currently I feel it will.
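
For anyone weighing in on the scale-up, here is a rough sketch of how depth and embedding size translate into parameter count. It assumes a GPT-2-style layout (fused QKV attention, a 4x-wide MLP, learned position embeddings, biases everywhere) together with a vocabulary of 50257 tokens and a context length of 1024. None of those details are stated in this post, so treat the function name, the defaults, and the 12-block, 900-wide configuration as hypothetical.

```python
def estimate_gpt_params(n_layer, d_model, vocab_size=50257, block_size=1024,
                        tied_head=True):
    """Rough parameter count for a GPT-2-style decoder.

    Assumptions (not taken from the post): GPT-2 block layout, vocabulary
    size, and context length. Adjust them to match the real model.
    """
    # Attention: fused QKV projection plus output projection, with biases.
    attn = 4 * d_model * d_model + 4 * d_model
    # MLP: expand to 4 * d_model, then project back, with biases.
    mlp = 8 * d_model * d_model + 5 * d_model
    # Two LayerNorms per block, each with a weight and a bias vector.
    norms = 4 * d_model
    per_block = attn + mlp + norms

    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * d_model + block_size * d_model
    final_norm = 2 * d_model
    # An untied output head adds a second vocab_size x d_model matrix.
    head = 0 if tied_head else vocab_size * d_model

    return n_layer * per_block + embeddings + final_norm + head


if __name__ == "__main__":
    configs = {
        "current (6 blocks, 384 wide)": (6, 384),
        "GPT-2 small (12 blocks, 768 wide)": (12, 768),
        "hypothetical (12 blocks, 900 wide)": (12, 900),
    }
    for name, (n_layer, d_model) in configs.items():
        total = estimate_gpt_params(n_layer, d_model)
        print(f"{name}: ~{total / 1e6:.1f}M parameters")
```

The per-block cost grows with the square of the embedding size but only linearly with the number of blocks, so widening from 384 to about 900 raises the block parameters far more than doubling the depth does. A larger model also generally needs more training tokens and compute, so the training budget would likely have to grow with it.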

I am also looking for a job that helps me grow in this field. I used RunPod to train the model for under $20, trust me. I also fine-tuned it on a question-answer dataset.

Feel free to contact me and reply on this thread!

Thanks
