
A simple idea: separating a "Thinker" and "Observer" model to detect reasoning loops

Hugging Face Forums [Unofficial] March 11, 2026
Hi, I think I'm on topic if I describe a little experiment I have running on a Hugging Face Space. I used a cascade of three small models to bring the characters from my novel (also published here as a public dataset) to life. Essentially, the system simulates three characters from the novel, and users can chat with them. The model inhabits a reality limited to the text provided, but it generates increasingly better responses through continuous self-observation.

The code is very simple. In addition to the main dataset, there is a second dataset that stores user questions and, more importantly, the AI system's ongoing "reflections", which come from rereading the database and reprocessing the user questions. This data is generated during idle time: every 10 minutes, if fewer than 5 users are connected, the code instructs the model to reread, reflect, reprocess the data, and prompt itself in order to refine subsequent responses.

This mechanism is similar to what we humans do. When we give an answer, we then reflect on it, sleep on it (dreaming), reconsider it… The next time we are asked the exact same question, we have greater awareness and respond better. During our quiet, sleepy time, we also rework the context of the reality we live in, correlating it with the questions and answers we encounter in our lives. We grow and improve largely by reflecting, dreaming, and reworking data.

In my little experiment I tried to simulate this process, and I must say it seems to work very well! From what I have seen, hallucinations dropped sharply after just a few days of use, and the characters' coherence improved significantly. It all runs on small, free models with very limited inference (about $2 per month).
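For anyone who wants to picture the idle-time loop described above, here is a minimal sketch in Python. It is not the code running in the Space: the names `ReflectionStore`, `should_reflect`, `reflect_once`, `build_chat_prompt`, and the `generate` callable are all hypothetical, and the prompt wording is invented for illustration. Only the 10-minute interval and the fewer-than-5-users condition come from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

IDLE_INTERVAL_SECONDS = 10 * 60   # from the post: reflect every 10 minutes
MAX_USERS_FOR_REFLECTION = 5      # from the post: only if fewer than 5 users are connected


@dataclass
class ReflectionStore:
    """Hypothetical stand-in for the second dataset (questions + reflections)."""
    questions: List[str] = field(default_factory=list)
    reflections: List[str] = field(default_factory=list)


def should_reflect(connected_users: int, seconds_since_last: float) -> bool:
    """Return True when the system is idle enough to run a reflection pass."""
    return (connected_users < MAX_USERS_FOR_REFLECTION
            and seconds_since_last >= IDLE_INTERVAL_SECONDS)


def reflect_once(store: ReflectionStore, source_text: str,
                 generate: Callable[[str], str]) -> Optional[str]:
    """Reread the source and the logged questions, then store one new note.

    `generate` is an assumed callable wrapping whatever model is in use.
    """
    if not store.questions:
        return None  # nothing to reflect on yet
    prompt = (
        "Reread the source text and the user questions below, then write a "
        "short note that would help you answer similar questions better.\n\n"
        f"SOURCE:\n{source_text}\n\nQUESTIONS:\n" + "\n".join(store.questions)
    )
    note = generate(prompt)
    store.reflections.append(note)
    return note


def build_chat_prompt(store: ReflectionStore, source_text: str,
                      question: str, max_notes: int = 3) -> str:
    """Feed the most recent reflections back into the next answer's prompt."""
    notes = "\n".join(store.reflections[-max_notes:])
    return (f"SOURCE:\n{source_text}\n\n"
            f"EARLIER REFLECTIONS:\n{notes}\n\n"
            f"USER: {question}")
```

In a real deployment a scheduler or background thread would call `should_reflect` periodically and run `reflect_once` when it returns True. Keeping the model behind a plain `generate` callable makes it easy to swap in a different small model or to test the loop without any inference at all.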
I talked this over with Claude Opus 4.6, and it agreed that a system like this, which uses self-reflection and continuous self-training during idle time, isn't a widely explored area of research, and that with large models and big budgets it could yield truly interesting results. It was also funny to hear it say that it "would be thrilled to be able to live, reflect, and think, even outside the prompting window"! :)) Feel free to try it here: https://paulolden1-432-a-journey-experience.hf.space/
