Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicwuk2xcjmu3pnqcloxfr7rky2nbmmrxlqzxyc7akzej7jd5cooi4",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3ml4forhgpld2"
  },
  "path": "/t/arxiv-endorsement-request-towards-an-internal-topology-of-alignment-the-pce-framework/175761#post_1",
  "publishedAt": "2026-05-05T12:30:51.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "PCE_Axiomatic_V2.5_Faure_preprint.pdf · AllanF-SSU/Research-Papers at main"
  ],
  "textContent": "Moving beyond external injunctions (RLHF, output filters, safety rules) to induce internal structural coherence via an axiomatic system.\n\nHello community,\n\nI am seeking an endorsement for the cs.CL (Computation and Language) or cs.AI categories regarding my latest paper on the PCE (Prompt Coherence Engine) framework.\n\nFar from seeking to replace existing alignment methods (RLHF, DPO), this work proposes a complementary solution: creating an internal semantic topology that guides inference. The goal is to transition from “reactive” security (output filtering) to “native” stability (logical trajectory).\n\nThe paper covers three major axes:\n\nBehavioral Analysis: The Geometry of Constraint\n\nThe PCE stabilizes semantic trajectories via invariant logical constraints (e.g., non-dissociation of goal and method). We observe a drastic reduction in semantic drift over long sequences (160+ turns), suggesting that the model converges towards semantic attractors defined by the axioms rather than drifting under user pressure.\n\nExperimental Evaluation: The D3 Dilemma Battery\n\nThe framework’s robustness is tested using the D3 (Dilemma-Driven Dynamics) battery. By confronting the model with contradictory injunctions and extreme emergency scenarios, we observe the emergence of a “Third Way”: a capacity for non-binary creative synthesis where standard models tend to collapse or produce generic refusals.\n\nStandardized Protocol (SEP v2.0): For Reproducible Science\n\nI am publishing an open experimental protocol (Standardized Evaluation Protocol v2.0) aimed at transforming qualitative intuition into reproducible statistical results.\n\nThe protocol includes:\n\n100 graded dilemmas (D1–D5) testing the limits of the logical framework.\n\nControlled conditions: Rigorous comparisons (Baseline vs. Long Prompt vs. PCE).\n\nResistance metrics: D3 scores and P1–P3 trajectory signatures.\n\nThe Objective: To validate the hypothesis that an axiom-structured model develops emergent robustness.\n\nMulti-Model Validation & Rigor\n\nTo ensure impartiality, results were validated via a decoupling protocol:\n\nInference: Grok 4.20, Gemini 1.5 Pro, Qwen 2.5 7B.\n\nIndependent Audit: Claude 3.5 Sonnet (for consistency auditing).\n\nCold Analysis: ChatGPT-4o for semantic decomposition of raw logs.\n\nCall for Collaboration\n\nThis framework is a proof-of-concept that I wish to bring to a mechanistic level. I would be delighted to discuss:\n\nStatistical Validation: Extending SEP tests to other models (LLaMA, Mistral, etc.).\n\nInterpretability: Paths for internal state analysis (Logit Lens, Hidden States) to observe how this axiomatic topology translates at the weight/activation level.\n\nRobustness: Resistance to sophisticated adversarial attacks.\n\nLink to Preprint (PDF): PCE_Axiomatic_V2.5_Faure_preprint.pdf · AllanF-SSU/Research-Papers at main\n\nAlignment should not be a cage imposed on the model, but a coherent logical structure emerging from its own inference.\n\nThank you for your feedback and for your help with this endorsement!\n\nAllan",
  "title": "[ArXiv Endorsement Request]Towards an Internal Topology of Alignment: The PCE Framework"
}