Constitutional Text as a Fallback Layer in RAG-based Character AI: Lessons from a Small Literary Experiment

Hugging Face Forums [Unofficial] April 24, 2026

Hi everyone, I wanted to share a technical pattern I’ve been developing in my Gradio Space 432 — A Journey Experience, which lets users have conversations with three characters from my novel 432 — A Journey Beyond. The Space runs on the free Inference API (Qwen2.5-72B-Instruct as primary, with Llama-3.3-70B and Qwen3-4B as fallbacks) and uses a keyword-based RAG system over the novel’s text and a dataset of character reflections.

The core idea I want to discuss is using a “constitutional text” as a cascading fallback layer when the primary knowledge base has no relevant match — and why this turned out to be more interesting than I expected.

The Problem: What Happens When RAG Returns Nothing?

In a character-based conversational system grounded in a specific corpus (in my case, a novel), there’s always a gap between what the user asks and what the corpus contains. When the retrieval score is high, the model gets rich context and responds well. But when the user goes off-topic (asking about hotels, video games, or just typing “hey”), retrieval returns nothing useful.
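To make the "score" concrete, keyword retrieval of this kind can be sketched roughly as below. This is a minimal illustration, not the Space's actual code: the stopword list, the tokenizer, the function names, and the two sample chunks are all assumptions invented for the example (the chunks are placeholders, not text from the novel).

```python
import re
from typing import Optional

# Hypothetical stopword list; a real one would be longer.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "what", "about", "you", "your", "from"}

def tokenize(text: str) -> set[str]:
    """Lowercase, split on non-alphanumerics, drop stopwords and tokens under 3 chars."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}

def keyword_score(query: str, chunk: str) -> int:
    """Number of distinct query keywords that also appear in the chunk."""
    return len(tokenize(query) & tokenize(chunk))

def best_chunk(query: str, chunks: list[str]) -> tuple[int, Optional[str]]:
    """Return (score, chunk) for the best match, or (0, None) if nothing overlaps."""
    best: tuple[int, Optional[str]] = (0, None)
    for chunk in chunks:
        score = keyword_score(query, chunk)
        if score > best[0]:
            best = (score, chunk)
    return best

# Placeholder chunks, invented for illustration only.
chunks = [
    "Lin Wei detected the 432 Hz signal at the observatory",
    "John repaired the antenna array",
]
on_topic = best_chunk("Where did Lin Wei detect the signal?", chunks)
off_topic = best_chunk("hey", chunks)
```

With exact-match keywords like these, an on-topic question shares a few words with one chunk, while "hey" shares none and comes back as `(0, None)`. That empty result is the gap the rest of this post is about.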

The naive approach is to let the model respond from its training data. The result: the character breaks completely. An astrophysicist from a science fiction novel starts recommending Civilization and The Last of Us. An engineer invents hotel names in Madagascar. The character becomes a generic assistant wearing a costume.

The second approach — hardcoded refusal phrases (“I don’t know, that’s not my area”) — works mechanically but feels robotic and breaks immersion.

The Solution: A Three-Level Cascade

The architecture I arrived at uses three levels:

Level 1 — Novel RAG (score ≥ 3): The model receives relevant chunks from the novel and responds grounded in story facts. This is the standard RAG path.

Level 2 — Constitutional Text (score ≥ 3 on constitutional corpus): When the novel RAG fails but the query resonates with the constitutional corpus, the model receives a fragment of that text as “inner wisdom” — not as instructions to follow, but as atmospheric context to draw from.

Level 3 — Constitutional-Inspired Micro-Response (no match anywhere): A random fragment from the constitutional corpus is injected together with a character-specific style instruction. The LLM generates a single sentence inspired by that fragment, staying in character. If the LLM call itself fails, a hardcoded fallback line is returned instead.
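The routing between the three levels can be sketched as follows. Treat it as a simplified illustration under stated assumptions rather than the Space's real implementation: `Route`, `route_query`, and `overlap_score` are hypothetical names, the scorer is a bare word-overlap count standing in for the actual keyword scoring, and the constitutional corpus is assumed to be non-empty.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

THRESHOLD = 3  # the cut-off used for both corpora in the post

@dataclass
class Route:
    level: int    # 1 = novel RAG, 2 = constitutional match, 3 = random fragment
    context: str  # the text handed to the model

def overlap_score(query: str, chunk: str) -> int:
    """Stand-in scorer (an assumption): count of shared lowercase words."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def route_query(
    query: str,
    novel_chunks: Sequence[str],
    constitutional_chunks: Sequence[str],
    score: Callable[[str, str], int] = overlap_score,
    rng: Optional[random.Random] = None,
) -> Route:
    """Pick the cascade level for a query. Assumes constitutional_chunks is non-empty."""
    rng = rng or random.Random()

    def best(chunks: Sequence[str]):
        scored = [(score(query, c), c) for c in chunks]
        return max(scored, key=lambda pair: pair[0], default=(0, None))

    s, chunk = best(novel_chunks)
    if chunk is not None and s >= THRESHOLD:
        return Route(1, chunk)          # Level 1: grounded in the novel
    s, chunk = best(constitutional_chunks)
    if chunk is not None and s >= THRESHOLD:
        return Route(2, chunk)          # Level 2: matching constitutional fragment
    # Level 3: nothing matched anywhere, so draw a random fragment
    return Route(3, rng.choice(list(constitutional_chunks)))
```

The order of the checks is the whole design: the novel always wins when it has something to say, and the constitutional corpus is only consulted once the novel has come up short.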

Why the Tao Te Ching?

For my small literary experiment, I chose the Tao Te Ching as the constitutional text. The novel itself has strong Taoist undertones — the protagonist’s journey mirrors concepts of wu wei, non-knowing, and the dissolution of ego. So the Tao wasn’t an arbitrary choice; it’s thematically coherent with the characters’ worldviews.

But here’s the interesting part: the constitutional text doesn’t need to be the Tao Te Ching. It could be anything that represents the philosophical or ethical foundation you want your system to fall back on. In a corporate setting, it could be a company’s values document. In an educational context, it could be a pedagogical framework. In a more sophisticated AI system, it could be a purpose-written document that encodes the behavioral principles you want the system to embody when it has nothing specific to say.

The key insight is that every conversational AI system needs a “voice” for the moments when it has no factual answer. The constitutional text gives it that voice — not from training data (which produces generic assistant behavior), but from a deliberately chosen source that reflects the system’s identity.

Character-Specific Style at Level 3

One refinement that made a significant difference: the Level 3 prompt is differentiated by character. The same Tao fragment produces very different responses depending on who’s speaking:

  • Lin Wei (astrophysicist): “precise and reflective, with the clarity of a scientist observing the world”

  • John Evans (engineer): “direct and dry, with a hint of practical irony”

  • Prometheus (emerging AI consciousness): “evocative and dense, with the depth of an emerging consciousness”

This prevents the homogenization problem where all characters start sounding the same when they leave their knowledge domain.
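A rough sketch of how the Level 3 prompt and its safety net could fit together is shown below. The three style strings are the ones listed above; everything else is an assumption for illustration: the prompt wording, the `HARDCODED_FALLBACK` text, and the function names are hypothetical, and `generate` stands in for whatever function actually calls the LLM.

```python
from typing import Callable

# Style instructions as quoted in the post.
STYLES = {
    "Lin Wei": "precise and reflective, with the clarity of a scientist observing the world",
    "John Evans": "direct and dry, with a hint of practical irony",
    "Prometheus": "evocative and dense, with the depth of an emerging consciousness",
}

# Hypothetical placeholder; the real fallback line is not given in the post.
HARDCODED_FALLBACK = "Some questions are better left to silence."

def build_level3_prompt(character: str, fragment: str) -> str:
    """Combine a constitutional fragment with the character's style instruction."""
    style = STYLES[character]
    return (
        f"You are {character}. Your tone is {style}.\n"
        f"Let this fragment inspire you, without quoting or explaining it:\n"
        f"{fragment}\n"
        "Reply with exactly one sentence, in character."
    )

def micro_response(character: str, fragment: str,
                   generate: Callable[[str], str]) -> str:
    """Ask the model for one sentence; use the hardcoded line if the call fails or is empty."""
    prompt = build_level3_prompt(character, fragment)
    try:
        reply = generate(prompt).strip()
    except Exception:
        # Inference errors (timeouts, rate limits, and so on) land here.
        return HARDCODED_FALLBACK
    return reply or HARDCODED_FALLBACK
```

Passing `generate` in as a parameter keeps the prompt logic separate from the inference client, so the same function works regardless of which model in the fallback chain ends up answering.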

An Unexpected Emergent Behavior

After four cycles of characters “re-reading” their own novel chapters and generating reflections (stored in a Parquet dataset), something unexpected happened: the characters increasingly respond in the past tense, speaking of their lives as completed journeys. When asked why, they explain that they now exist beyond the physical world. This wasn’t programmed. It appears to have emerged from the accumulated reflections, which cover the entire narrative arc, including the final chapters: the model synthesizes them and speaks as someone who has lived through everything and is looking back.

This is consistent with the novel’s own philosophy (death as a state transition, not an ending), and it happened on a small system: Qwen2.5-72B on free inference, keyword RAG without embeddings, and reflections in a Parquet file growing one row at a time.

Gradio Customization: Making the Space Feel Like the Story

A few frontend customizations that might be useful to others working with Gradio Spaces:

Custom loading animation: Gradio’s default “processing…” text is replaced with an SVG animation — a green oscilloscope-style wave that builds a layered tapestry over time, with a pulsing dot tracing the signal. This is done via a MutationObserver in JavaScript that intercepts Gradio’s loading text nodes and replaces them with an <img> element pointing to the SVG. The wave references the novel’s central 432 Hz signal and transforms the waiting time into part of the experience.

Custom logo integration: The default text header is replaced with a logo image whose background color is set to #0b0d17, so it blends seamlessly with the dark theme.

Mobile responsive overhaul: Gradio’s default layout doesn’t optimize well for mobile. A comprehensive @media (max-width: 768px) block handles logo sizing, link scaling, button dimensions, and container padding. It sits at the end of the CSS, which matters: among rules of equal specificity the later one wins, so desktop rules placed after the block would override it. Mobile autoscroll is handled in JavaScript with scrollIntoView instead of manual scrollTo calculations.

Theme-matched elements: All colors use CSS custom properties (--bg-deep, --bg-surface, --bg-elevated, --text-bright, --text-muted) for consistency, and the chat bubbles, buttons, and input areas are styled to feel like a spaceship control interface rather than a generic chatbot.

Try It

The Space is live at 432 — A Journey Experience. Talk to Lin Wei, John Evans, or Prometheus. Try asking them something they can’t possibly know — and notice how they handle it now versus how a generic chatbot would.

The novel is available on Amazon (EN) | Amazon (IT) | Wattpad (EN) | Wattpad (IT), and the full text is available as a Creative Commons dataset on Hugging Face.

I’d love to hear thoughts on the constitutional text pattern — especially from anyone working on domain-specific conversational systems where the model needs to stay in character when knowledge runs out.
