Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreihpnjkd7nhsza22qun4dhz55zhkkzoa5ujriuwtipjsijbodvpp6u",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mhno54mnntp2"
  },
  "path": "/t/how-to-make-2-ai-vtubers-talk-to-eachother-in-vtube-studio/174490#post_2",
  "publishedAt": "2026-03-22T09:46:06.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "GitHub"
  ],
  "textContent": "Hmm…\n\n* * *\n\nFor **your setup with two VTube Studio instances on different ports**, the best fix is **not** “make another mouth input parameter” first. The better fix is:\n\n  * one VTube Studio instance per AI\n  * one websocket client per instance\n  * one token file per client\n  * one stable plugin identity per client\n  * one mouth stream sent only to that instance’s socket\n\nThat matches how VTube Studio is designed. The API starts on port `8001` by default, and if that port is already taken, the next instance moves to `8002`, then `8003`, and so on. VTube Studio also exposes a unique `instanceID` and `windowTitle` for each running instance, so multi-instance control is a normal supported use case. (GitHub)\n\n## The core problem\n\nYou are dealing with **three separate layers** at once:\n\n  1. **instance routing**\nWhich VTube Studio window you are actually connected to.\n\n  2. **authentication state**\nWhich saved token belongs to which plugin identity.\n\n  3. **parameter ownership**\nWhich process is currently allowed to control `MouthOpen`.\n\nWhen those three are not kept separate, the result looks random. Mouth works for a bit, then stops. One model moves when the other should. A token seems fine, then “breaks.” That pattern matches the official API behavior very closely. (GitHub)\n\n## Why the token files feel broken\n\nVTube Studio tokens are not generic. The official API says you only need to request a token once, then reuse it on later sessions. But `pluginName` and `pluginDeveloper` in the authentication request **must match** the values used when the token was created, or authentication fails. 
(GitHub)\n\nThat means these patterns will break things:\n\n  * both AI clients writing to the same `auth-token.txt`\n  * changing the plugin name during testing\n  * changing the developer name during testing\n  * re-requesting tokens over and over instead of reusing the saved one\n\nThis is also why wrapper libraries emphasize persistent token storage. VTubeStudioJS requires `authTokenGetter` and `authTokenSetter` to persist the token, and `pyvts` explicitly reads and writes a token file for future runs. (GitHub)\n\n## Why the mouth works “only for a short period”\n\nThat part is also documented.\n\nWhen you use `InjectParameterDataRequest`, VTube Studio says you must re-send data for a parameter **at least once every second** or that parameter is considered “lost,” and control falls back to whatever was controlling it before, or to default if nothing else is controlling it. (GitHub)\n\nVTube Studio also says that only **one API plugin** can write to one parameter at a time in normal `\"set\"` mode. If another plugin is already controlling that parameter, an error is returned. Only `\"add\"` mode can be shared by multiple plugins, and that is not what you usually want for lipsync. (GitHub)\n\nSo if any of these happens:\n\n  * your client stops sending `MouthOpen` often enough\n  * the wrong client touches `MouthOpen` in the same instance\n  * some reconnect logic briefly grabs the same parameter\n\nthen the behavior will feel unstable even though VTube Studio is behaving normally. (GitHub)\n\n## Should you add another mouth input parameter\n\nFor **two separate VTube Studio instances**, usually **no**.\n\nIf you have:\n\n  * `WAFFLE LOVING GOOBER` on port `8001`\n  * `hyori` on port `8002`\n\nthen each instance already has its own copy of `MouthOpen`. The issue is almost certainly **not** that both models need separate mouth parameters. The issue is that the routing and auth are not isolated enough. 
VTube Studio also gives you `CurrentModelRequest`, which returns `modelLoaded`, `modelName`, and `modelID`, so your code can verify the loaded model before it starts sending mouth values. (GitHub)\n\nAdding a custom mouth parameter becomes useful mainly in these different cases:\n\n  * both characters are inside the **same VTube Studio instance**\n  * or you intentionally want separate rig-level blending logic\n\nThat is because injected parameters are used by the loaded model and any loaded Live2D items in that same instance. (GitHub)\n\nThere is also a downside to custom parameters. VTube Studio stores them in `custom_parameters.json`, and if a plugin’s auth token is revoked, the custom parameters created by that plugin are deleted. So if your token handling is already unstable, custom mouth parameters can make the system more fragile, not less. (GitHub)\n\n## What I would do in your case\n\nI would keep the standard `MouthOpen` and fix the architecture.\n\n### Good architecture\n\n**AI 1**\n\n  * connects only to `ws://127.0.0.1:8001`\n  * uses plugin identity like `Goober Controller` / `YourName`\n  * stores token in `tokens/goober.token`\n\n**AI 2**\n\n  * connects only to `ws://127.0.0.1:8002`\n  * uses plugin identity like `Hyori Controller` / `YourName`\n  * stores token in `tokens/hyori.token`\n\nThen each client should do this on startup:\n\n  1. connect to its assigned port\n  2. authenticate using its own saved token\n  3. if auth fails, request a new token and save it to that client’s own file\n  4. call `CurrentModelRequest`\n  5. verify the loaded `modelName` or `modelID` is the expected one\n  6. only then start sending `MouthOpen` updates\n  7. 
keep sending while that AI is speaking (GitHub)\n\nThat is the clean solution.\n\n## What not to do\n\nDo not do this:\n\n  * one shared token file for both AIs\n  * one generic plugin name reused across experiments\n  * “request a new token every run”\n  * “send mouth once and assume it sticks”\n  * “add a second mouth parameter before isolating ports and auth”\n\nThose choices are exactly the kind of thing that creates the unstable behavior you described. (GitHub)\n\n## The easier and usually more stable option\n\nFor AI VTubers, the most practical answer is often:\n\n**do not drive `MouthOpen` manually at all.**\nRoute each AI’s TTS audio into its own VTube Studio instance and let VTube Studio do the lipsync.\n\nVTube Studio officially supports microphone-based lipsync. It can derive mouth movement from audio with `VoiceVolume`, `VoiceVolumePlusMouthOpen`, and the vowel parameters `VoiceA`, `VoiceI`, `VoiceU`, `VoiceE`, and `VoiceO`. The official docs recommend **Advanced Lipsync** rather than the legacy simple mode. (GitHub)\n\nThis is also how a lot of AI VTuber projects are set up in practice. The Neuro project routes TTS output into VTube Studio through a virtual audio cable and lets VTube Studio handle lipsync. The vtuber-waifu project tells users to capture the program’s audio with Virtual Cable and use that as VTube Studio microphone input. MITSUHA gives a similar VB-Cable setup. (GitHub)\n\nFor your use case, that approach is often better because:\n\n  * speech audio is already the truth for timing\n  * you do not need to stream mouth values constantly\n  * you avoid parameter ownership fights\n  * you reduce the amount of mouth-specific code\n\nThat is the path I would choose first if your goal is simply “make two AI characters talk reliably.” (GitHub)\n\n## One more thing to check\n\nSometimes the code is fine and the model mapping is the real issue.\n\nVTube Studio’s model settings say each output Live2D parameter can only be chosen once. 
They also note that if a parameter appears to move in config but the model does not visibly move, the likely cause is an expression, animation, or physics system overwriting it. (GitHub)\n\nSo for each model, verify that:\n\n  * `MouthOpen` is mapped to the intended Live2D mouth-open parameter\n  * that output parameter is not mapped twice\n  * no expression or motion is locking the mouth\n  * no physics setup is masking the visible movement (GitHub)\n\n## My direct recommendation\n\nFor **your exact case**, do this:\n\n### Best overall\n\nUse **two VTube Studio instances** and let each one handle lipsync from its own routed TTS audio. Keep API usage for expressions, hotkeys, scene control, or model control. (GitHub)\n\n### If you want to keep API mouth control\n\nStill use **two instances**, but give each AI:\n\n  * its own port\n  * its own plugin identity\n  * its own token file\n  * a model check with `CurrentModelRequest`\n  * a continuous mouth-update loop while speaking (GitHub)\n\n### Do not start by adding another mouth parameter\n\nThat is only the right move if both characters are in the **same** VTube Studio instance or you specifically want custom rig logic. In your two-port setup, it is probably solving the wrong problem. (GitHub)",
  "title": "How to make 2 ai vtubers talk to eachother in vtube studio?"
}
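
The per-client isolation that the record's `textContent` recommends (one port, one plugin identity, one token file per AI) could be sketched roughly as below. This is only a sketch under assumptions, not the post author's code: `VtsClientConfig`, `envelope`, `load_token`, `save_token`, `auth_message`, `current_model_message`, and `mouth_message` are hypothetical names, and the envelope fields (`apiName`, `apiVersion`, `requestID`, `messageType`, `data`) follow my reading of the VTube Studio public API docs, so check them against the official reference before relying on them. No websocket transport is included; the functions only build the JSON text each client would send on its own socket.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class VtsClientConfig:
    """Hypothetical holder for what one AI needs to reach exactly one instance."""
    url: str
    plugin_name: str
    plugin_developer: str
    token_path: Path


def envelope(message_type: str, data: Optional[dict] = None) -> str:
    """Wrap a payload in the request envelope (field names assumed from the API docs)."""
    msg = {
        "apiName": "VTubeStudioPublicAPI",
        "apiVersion": "1.0",
        "requestID": message_type,  # arbitrary ID chosen for this sketch
        "messageType": message_type,
    }
    if data is not None:
        msg["data"] = data
    return json.dumps(msg)


def load_token(cfg: VtsClientConfig) -> Optional[str]:
    """Return this client's saved token, or None if nothing usable is stored."""
    if cfg.token_path.exists():
        return cfg.token_path.read_text(encoding="utf-8").strip() or None
    return None


def save_token(cfg: VtsClientConfig, token: str) -> None:
    """Persist a token to this client's own file, never to a shared one."""
    cfg.token_path.parent.mkdir(parents=True, exist_ok=True)
    cfg.token_path.write_text(token, encoding="utf-8")


def auth_message(cfg: VtsClientConfig) -> str:
    """Reuse the saved token if present; otherwise ask for a new one."""
    identity = {
        "pluginName": cfg.plugin_name,
        "pluginDeveloper": cfg.plugin_developer,
    }
    token = load_token(cfg)
    if token is None:
        return envelope("AuthenticationTokenRequest", identity)
    return envelope("AuthenticationRequest", {**identity, "authenticationToken": token})


def current_model_message() -> str:
    """Request used to check which model is loaded before sending mouth data."""
    return envelope("CurrentModelRequest")


def mouth_message(value: float) -> str:
    """Build one MouthOpen update in "set" mode, clamped to the range 0..1."""
    clamped = max(0.0, min(1.0, value))
    return envelope(
        "InjectParameterDataRequest",
        {"mode": "set", "parameterValues": [{"id": "MouthOpen", "value": clamped}]},
    )


# Two fully separate identities, mirroring the layout suggested in the post.
GOOBER = VtsClientConfig(
    "ws://127.0.0.1:8001", "Goober Controller", "YourName", Path("tokens/goober.token")
)
HYORI = VtsClientConfig(
    "ws://127.0.0.1:8002", "Hyori Controller", "YourName", Path("tokens/hyori.token")
)
```

In a real client, each config would own its own websocket connection, and `mouth_message` would be re-sent well inside the one-second window for as long as that AI is speaking.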