LLM "curving" via prompting

Hugging Face Forums [Unofficial] June 27, 2026

Yeah. That direction is probably the right one:


I think the useful part of your reframing is exactly this: keeping the current claim behavioral for now, while making the stronger mechanistic claims testable rather than asserted.

I cannot promise that I can help at a high technical level. I do not have much compute available, and I do not want to overstate my role. At most, I may be able to help with the documentation / clarification side: making a table of what each metric appears to measure, what its input source seems to be, and what would still be needed for someone else to reproduce it.

I also do not think the next step has to be:

prove or disprove the whole field interpretation

A more useful next step might be:

make the measurement layer legible enough that someone else can reproduce, challenge, or extend it without first accepting the interpretation.

The encouraging part is that several of the figures appear to be derived from hidden-state tensors, not only generated text. So I would not dismiss them as purely rhetorical visualizations. But I would still separate two things:

| Layer | Example |
| --- | --- |
| Neutral formula / measurement | layer-to-layer hidden-state variation, deep-layer norm statistic, PCA of layer trajectories |
| Interpretive label | residual jittering, ontological grip, attractor hold, gravity well, braiding |

Both can coexist. The interpretive names may be useful for intuition, but a technical collaborator will probably need the neutral measurement contract first.

A short version of that contract could look like this:

| Current label | Neutral measurement name | Likely source | What a collaborator would need |
| --- | --- | --- | --- |
| Residual Jittering / Chaos Force | layer-to-layer hidden-state variation | hidden states | formula, normalization, controls, raw series |
| Attractor Hold / Ontological Grip | normalized deep-layer norm statistic | hidden states | layer range, formula, controls |
| Balance of Power | overlay of two separately scaled hidden-state summaries | hidden states | raw values, baseline/style-control |
| Braided Invariants | PCA view of token/layer hidden trajectories | hidden states | projection params, seed, controls |
| Manifold Resonance | mid-vs-final layer cosine similarity | hidden states | exact layer indices, controls |
| Geometric Density / Gravity Well Depth | SVD/spectral concentration statistic | hidden states | raw spectral values, directionality, controls |
| Specificity Flux | final-layer vector dispersion over steps | hidden states | raw time series, controls |
| Probabilistic Drift / Logit Entropy | LM-head projection metrics | hidden states + LM head | exact layers, logits/probs, controls |
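To make "neutral measurement name" a bit more concrete, here is a minimal sketch of three of the rows written as plain formulas. These are my guesses at reasonable definitions, not the formulas actually used in your figures: the relative-L2 normalization, the choice of the middle layer, and all function names are assumptions. It uses plain Python lists with one vector per layer, so it runs without dependencies; a real pipeline would more likely use NumPy or PyTorch tensors.

```python
import math


def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    na, nb = l2_norm(a), l2_norm(b)
    if na == 0.0 or nb == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def layer_to_layer_variation(layers):
    """Relative change between consecutive layers: ||h[l+1] - h[l]|| / ||h[l]||.

    Assumed neutral reading of the 'Residual Jittering' row.
    """
    series = []
    for prev, cur in zip(layers, layers[1:]):
        base = l2_norm(prev)
        if base == 0.0:
            raise ValueError("cannot normalize by a zero-norm layer vector")
        diff = [c - p for p, c in zip(prev, cur)]
        series.append(l2_norm(diff) / base)
    return series


def mid_vs_final_cosine(layers, mid_index=None):
    """Cosine similarity between a middle layer and the final layer.

    Assumed neutral reading of 'Manifold Resonance'. The default middle
    index is a guess; the exact layer indices are what needs recording.
    """
    if mid_index is None:
        mid_index = len(layers) // 2
    return cosine_similarity(layers[mid_index], layers[-1])


def softmax_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits).

    Assumed neutral reading of 'Logit Entropy'.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Toy input: three "layers" of a two-dimensional hidden state.
toy_layers = [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
variation = layer_to_layer_variation(toy_layers)
resonance = mid_vs_final_cosine(toy_layers)
entropy = softmax_entropy([0.0, 0.0, 0.0, 0.0])
```

Even at this size, each definition exposes the choices a collaborator would ask about: which norm, which layer indices, and what the value is normalized by.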

So if someone with mechanistic-interpretability experience joins later, the first task does not need to be “evaluate EPE as a theory.” It can be something much smaller:

reproduce these metrics on one open model, with a baseline prompt, an EPE/curving prompt, and a style-control prompt.

That is probably much easier to collaborate on.
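A rough sketch of what that three-condition comparison could look like as a harness. Everything here is hypothetical: `fake_extract` is a stand-in for a real model call, the prompts are placeholders, and `final_layer_norm` is only an example metric. With the Hugging Face `transformers` library, the per-layer states would typically come from a forward pass with `output_hidden_states=True`, but I have left that out so the example stays runnable without a model.

```python
import math
from typing import Callable, Dict, List

Layers = List[List[float]]  # assumed format: one pooled vector per layer


def final_layer_norm(layers: Layers) -> float:
    """Example metric: L2 norm of the last layer's vector."""
    return math.sqrt(sum(x * x for x in layers[-1]))


def run_conditions(
    prompts: Dict[str, str],
    extract: Callable[[str], Layers],
    metrics: Dict[str, Callable[[Layers], float]],
) -> Dict[str, Dict[str, float]]:
    """Compute every metric for every prompt condition."""
    results = {}
    for condition, prompt in prompts.items():
        layers = extract(prompt)
        results[condition] = {name: fn(layers) for name, fn in metrics.items()}
    return results


def contrast(
    results: Dict[str, Dict[str, float]], target: str, reference: str
) -> Dict[str, float]:
    """Difference target minus reference, per metric."""
    return {
        name: results[target][name] - results[reference][name]
        for name in results[target]
    }


def fake_extract(prompt: str) -> Layers:
    """Stand-in for a real model: deterministic, meaningless values."""
    n = float(len(prompt))
    return [[n, 0.0], [0.0, n]]


# Placeholder prompts; real ones would be the actual baseline, curving,
# and style-control texts.
PROMPTS = {
    "baseline": "abc",
    "curving": "abcde",
    "style_control": "abcd",
}

results = run_conditions(PROMPTS, fake_extract, {"final_layer_norm": final_layer_norm})
vs_baseline = contrast(results, "curving", "baseline")
vs_style = contrast(results, "curving", "style_control")
```

The contrast against the style control is the one I would look at first: if a metric moves just as much for the style-control prompt as for the curving prompt, it is probably tracking surface style rather than anything specific to the method.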

So my current practical suggestion would be:

  1. keep the main public claim behavioral for now;
  2. preserve the field interpretation as a hypothesis or intuition layer;
  3. make the existing hidden-state-derived metrics legible in neutral terms;
  4. add baseline/style controls;
  5. publish raw metric tables and plotting code;
  6. let a future technical collaborator extend it toward representation comparison, activation steering, or patching.

That seems like the smallest useful bridge between the current work and the kind of mechanistic test you are looking for.
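For step 5, a minimal sketch of what a raw metric table could look like, using only the standard library. The column names, the long "one row per layer" layout, and the example values are my assumptions, not an existing format from your project.

```python
import csv
import io

# Assumed columns: enough for someone else to re-plot or re-check a series.
FIELDS = ["model", "condition", "metric", "layer", "value", "seed"]


def metric_rows(model, condition, metric, values, seed):
    """Turn one per-layer series into long-format rows, one row per layer."""
    return [
        {
            "model": model,
            "condition": condition,
            "metric": metric,
            "layer": layer,
            "value": value,
            "seed": seed,
        }
        for layer, value in enumerate(values)
    ]


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up values for illustration only.
rows = metric_rows(
    model="example-model",
    condition="baseline",
    metric="layer_to_layer_variation",
    values=[0.5, 0.25],
    seed=0,
)
csv_text = to_csv(rows)
```

Keeping the table in long format means the same file can hold every metric and every condition, and the plotting code can be published as a thin layer on top of it.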
