Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreieiwoqnldjowxnhklqqca2ta56ecn7njpnfsdsaijlex7rxhcfsqa",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mmezhur3d5u2"
  },
  "path": "/t/how-can-a-small-language-model-learn-open-conversation/1381396#post_3",
  "publishedAt": "2026-05-21T17:35:35.000Z",
  "site": "https://community.openai.com",
  "textContent": "So I started with the GPT-2 base model from GitHub, which, despite the claims, was an empty shell. I want to grow it into a larger model overall. The base model was the source for the body upgrade:\n\"source_layers\": 12, (old body)\n\"target_layers\": 24, (new body)\n\"source_embd\": 768,\n\"target_embd\": 1024,\n\"source_heads\": 12,\n\"target_heads\": 16,\n\"source_ctx\": 1024,\n\"target_ctx\": 1024,\nwith a total vocabulary of ~120k tokens overall. I have grounded all the new and old tokens very strongly in definitions and examples. Yet, as my original post stated, I cannot seem to cross that line.\n\nThe goal is for the model to have the same overall communication skill as ChatGPT 3+, yet I cannot get there if the model cannot combine all of its education into a communication matrix.",
  "title": "How can a small language model learn open conversation?"
}