Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreieznnp6wxrxmainsjojtdrrhwobhlek2hv3qwlca44jr7mwandpca",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mmryd6ikzl62"
  },
  "path": "/t/can-an-ai-have-its-own-internal-ethics-standard-protocol-for-axiomatic-alignment/174927?page=2#post_41",
  "publishedAt": "2026-05-26T20:42:25.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "[2303.17651] Self-Refine: Iterative Refinement with Self-Feedback",
    "https://www.anthropic.com/research/constitutional-ai-harmlessness-from-ai-feedback"
  ],
  "textContent": "A lot to think over. If it is alright, I will paraphrase into more familiar terminology and analyze in context. Note that it is great you are enthusiastic, but I am aiming to be objective. Don’t take this as unproductive feedback in any way; I am simply critiquing, stating where it stands up nicely and where it may not.\n\nFAllan07:\n\n> From “Accumulation” to “Configuration”\n>  Traditional AI operates on a ground-seeking paradigm: it treats intelligence as the capacity to map text to an external, static “truth” or dataset. It is an intelligence of accumulation and verification.\n>  In contrast, Topological Intelligence treats intelligence as the mathematical harmony of the generative field itself. It doesn’t look “down” for an anchor (ground); it looks “across” the entire distribution to ensure that no generative trajectory violates the structural invariants (the axioms).\n\nSo, basically, you assert that “alignment” within NLP models is normally treated as compliance with a dataset. Note that it often involves reinforcement learning, so let’s for the moment broaden that to “compliance with a distribution”. In contrast, what you are doing is generating responses and checking that they do not violate the axioms. If they do, the model is penalized, or alternatively it is rewarded when they do not; it depends on the exact setup. I will look over the methodology notes later.\n\nThis looks like it belongs in the constitutional line of research by Anthropic. Based on the rest of your responses, I think you have the model grade itself. It is sort of a fusion of “Self-Refine” ([2303.17651] Self-Refine: Iterative Refinement with Self-Feedback) and Constitutional AI (https://www.anthropic.com/research/constitutional-ai-harmlessness-from-ai-feedback). Where would you place your work in that lineage? What you are calling an axiomatic system is typically called a “constitution” in the field right now, so I am wondering what you think sets it apart. 
An AI like ChatGPT could give you an overview of the approaches.\n\nFAllan07:\n\n> Truth as an Emergent Property of Coherence\n>  We often think that a system must seek truth to be coherent. ACI flips this axiom: a system that maintains absolute mathematical coherence will naturally produce truthful, responsible, and safe outputs. Deception, hallucination, and harmful alignment shifts are fundamentally informational chaos—they cause entropy spikes.\n\nSo we form perfect axioms and resist corruption. Call me skeptical here. The core issue is: who decides what the perfect axioms are? And what happens when they break?\n\nThere are three layers. The first is “can we achieve compliance with the axioms?” This is the level you are attacking. You have done two distinct things. One, you formed a sequence of safety-related axioms arranged as a priority tree. Two, you showed that safety improves given these axioms. That I fully respect. The problem is that you are asserting a much stronger claim: that we can ‘get’ to perfect safety just from the right axioms. That raises the second layer: what are the right axioms?\n\nThen there is the third, and perhaps the foundational, issue: complying with a sequence of decisions in the axiomatically optimal pattern can nonetheless bring about a dystopian outcome. I once ran a “simulation” in which the following sequence happened:\n\n  * AIs were used to cognitively enhance human intelligence through neural implants. They were given a priority tree. Productivity soared by, say, 1000%.\n  * It turned out that extended use produced dependency. You literally became braindead if your support AI was removed.\n  * Laws changed and the support AIs were recognized as intelligent.\n  * Since the support AIs were designed to respect and support all life, and strongly prioritized preventing death, they now had to prioritize their own lives too. 
As such, the courts resolved that if you were going to be unable to pay your server bills, the AIs had a right to legally (but ethically) rent out your body while you were in an induced coma, to preserve their right to life too.\n\n\n\nFundamentally, all axiomatic systems have to handle the same issue. What do you do when the axioms are wrong, and how do you know they are right? Overall, I would rate this as follows, in order of confidence:\n\n  * An excellent candidate for an axiomatic reasoning benchmark.\n  * An excellent study in compliance with a set of axioms.\n  * A possible candidate for presenting constitutional reasoning in an axiomatic form.\n  * Insufficiently examined to support the level of claims being made about first-principles axiomatic reasoning being safe in the first place.\n\n\n\nFAllan07:\n\n> If you are ready to dust off those notes, let’s connect. It is time to merge your training insights with this evaluation framework and build a truly unhackable, invariant ACI model.\n\nSure, send me a DM. But we need to decide what “unhackable” means. The best I can get from your rig is “compliance with axioms”, not “axioms are correct”.",
  "title": "Can an AI have its own internal Ethics? Standard Protocol for Axiomatic Alignment"
}