How to make 2 AI VTubers talk to each other in VTube Studio?

Hugging Face Forums [Unofficial] March 22, 2026
Hmm…


For your setup with two VTube Studio instances on different ports, the first fix to reach for is not “make another mouth input parameter.” The better fix is:

  • one VTube Studio instance per AI
  • one websocket client per instance
  • one token file per client
  • one stable plugin identity per client
  • one mouth stream sent only to that instance’s socket

That matches how VTube Studio is designed. The API starts on port 8001 by default, and if that port is already taken, the next instance moves to 8002, then 8003, and so on. VTube Studio also exposes a unique instanceID and windowTitle for each running instance, so multi-instance control is a normal supported use case. (GitHub)

The core problem

You are dealing with three separate layers at once:

  1. Instance routing: which VTube Studio window you are actually connected to.

  2. Authentication state: which saved token belongs to which plugin identity.

  3. Parameter ownership: which process is currently allowed to control MouthOpen.

When those three are not kept separate, the result looks random. Mouth works for a bit, then stops. One model moves when the other should. A token seems fine, then “breaks.” That pattern matches the official API behavior very closely. (GitHub)

Why the token files feel broken

VTube Studio tokens are not generic. The official API says you only need to request a token once, then reuse it on later sessions. But pluginName and pluginDeveloper in the authentication request must match the values used when the token was created, or authentication fails. (GitHub)

That means these patterns will break things:

  • both AI clients writing to the same auth-token.txt
  • changing the plugin name during testing
  • changing the developer name during testing
  • re-requesting tokens over and over instead of reusing the saved one

This is also why wrapper libraries emphasize persistent token storage. VTubeStudioJS requires authTokenGetter and authTokenSetter to persist the token, and pyvts explicitly reads and writes a token file for future runs. (GitHub)
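
If you are not using one of those wrappers, per-client token persistence can be sketched roughly like this. The helper names (load_token, save_token, auth_request) are my own, not part of any library; the message fields follow the AuthenticationRequest format as I understand it from the official API docs, so double-check them against the docs before relying on this.

```python
from pathlib import Path
from typing import Optional


def load_token(path: Path) -> Optional[str]:
    """Return this client's saved token, or None if none has been stored yet."""
    if path.exists():
        token = path.read_text(encoding="utf-8").strip()
        return token or None
    return None


def save_token(path: Path, token: str) -> None:
    """Persist the token in this client's OWN file (never a shared one)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(token.strip(), encoding="utf-8")


def auth_request(plugin_name: str, developer: str, token: str) -> dict:
    """Build an AuthenticationRequest message.

    plugin_name and developer must be the same values that were used when
    the token was first requested, otherwise authentication fails.
    """
    return {
        "apiName": "VTubeStudioPublicAPI",
        "apiVersion": "1.0",
        "messageType": "AuthenticationRequest",
        "data": {
            "pluginName": plugin_name,
            "pluginDeveloper": developer,
            "authenticationToken": token,
        },
    }
```

The point of the design is that the file path and the plugin identity travel together: if you rename the plugin while testing, treat it as a new client with a new token file.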

Why the mouth works “only for a short period”

That part is also documented.

When you use InjectParameterDataRequest, VTube Studio says you must re-send data for a parameter at least once every second or that parameter is considered “lost,” and control falls back to whatever was controlling it before, or to default if nothing else is controlling it. (GitHub)

VTube Studio also says that only one API plugin can write to one parameter at a time in normal "set" mode. If another plugin is already controlling that parameter, an error is returned. Only "add" mode can be shared by multiple plugins, and that is not what you usually want for lipsync. (GitHub)

So if either of these happens:

  • your client stops sending MouthOpen often enough
  • the wrong client touches MouthOpen in the same instance
  • some reconnect logic briefly grabs the same parameter

then the behavior will feel unstable even though VTube Studio is behaving normally. (GitHub)
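
As a rough illustration of the resend rule, here is a small sketch. MouthKeepAlive and mouth_request are hypothetical helpers of mine, the 0.5 second gap is just a safety margin I picked to stay under the documented one-second limit, and I am assuming MouthOpen uses a 0 to 1 range.

```python
from typing import Optional


def mouth_request(value: float) -> dict:
    """Build an InjectParameterDataRequest that sets MouthOpen in "set" mode."""
    clamped = max(0.0, min(1.0, value))  # assumed 0..1 range for MouthOpen
    return {
        "apiName": "VTubeStudioPublicAPI",
        "apiVersion": "1.0",
        "messageType": "InjectParameterDataRequest",
        "data": {
            "mode": "set",
            "parameterValues": [{"id": "MouthOpen", "value": clamped}],
        },
    }


class MouthKeepAlive:
    """Tracks when MouthOpen was last sent so control is never 'lost'."""

    def __init__(self, max_gap_s: float = 0.5) -> None:
        self.max_gap_s = max_gap_s
        self._last_sent: Optional[float] = None

    def due(self, now: float) -> bool:
        """True if nothing was sent yet or the last send is too old."""
        return self._last_sent is None or (now - self._last_sent) >= self.max_gap_s

    def mark_sent(self, now: float) -> None:
        self._last_sent = now
```

In a real loop you would call due(time.monotonic()) on every tick while the AI is speaking and resend the current value whenever it returns True, even if the value has not changed.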

Should you add another mouth input parameter

For two separate VTube Studio instances, usually no.

If you have:

  • WAFFLE LOVING GOOBER on port 8001
  • hyori on port 8002

then each instance already has its own copy of MouthOpen. The issue is almost certainly not that both models need separate mouth parameters. The issue is that the routing and auth are not isolated enough. VTube Studio also gives you CurrentModelRequest, which returns modelLoaded, modelName, and modelID, so your code can verify the loaded model before it starts sending mouth values. (GitHub)

Adding a custom mouth parameter becomes useful mainly in this different case:

  • both characters are inside the same VTube Studio instance
  • or you intentionally want separate rig-level blending logic

That is because injected parameters are used by the loaded model and any loaded Live2D items in that same instance. (GitHub)

There is also a downside to custom parameters. VTube Studio stores them in custom_parameters.json, and if a plugin’s auth token is revoked, the custom parameters created by that plugin are deleted. So if your token handling is already unstable, custom mouth parameters can make the system more fragile, not less. (GitHub)

What I would do in your case

I would keep the standard MouthOpen and fix the architecture.

Good architecture

AI 1

  • connects only to ws://127.0.0.1:8001
  • uses plugin identity like Goober Controller / YourName
  • stores token in tokens/goober.token

AI 2

  • connects only to ws://127.0.0.1:8002
  • uses plugin identity like Hyori Controller / YourName
  • stores token in tokens/hyori.token
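
One way to keep that separation honest is to put it in data and check it before anything connects. This is only a sketch: ClientConfig and isolation_problems are names I made up, and I am assuming the model names are the ones from your post.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class ClientConfig:
    port: int
    plugin_name: str
    developer: str
    token_path: Path
    expected_model: str  # assumed to match modelName in VTube Studio

    @property
    def url(self) -> str:
        return f"ws://127.0.0.1:{self.port}"


def isolation_problems(configs: List[ClientConfig]) -> List[str]:
    """Report any port, plugin name, or token file shared between clients."""
    problems = []
    for field in ("port", "plugin_name", "token_path"):
        values = [getattr(c, field) for c in configs]
        if len(set(values)) != len(values):
            problems.append(f"{field} is shared between clients")
    return problems


GOOBER = ClientConfig(8001, "Goober Controller", "YourName",
                      Path("tokens/goober.token"), "WAFFLE LOVING GOOBER")
HYORI = ClientConfig(8002, "Hyori Controller", "YourName",
                     Path("tokens/hyori.token"), "hyori")
```

If isolation_problems([GOOBER, HYORI]) returns anything other than an empty list, fix the config before debugging the mouth.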

Then each client should do this on startup:

  1. connect to its assigned port
  2. authenticate using its own saved token
  3. if auth fails, request a new token and save it to that client’s own file
  4. call CurrentModelRequest
  5. verify the loaded modelName or modelID is the expected one
  6. only then start sending MouthOpen updates
  7. keep sending while that AI is speaking (GitHub)
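
Steps 2 to 5 can be sketched like this. It is a minimal outline, not a full client: start_client is a hypothetical function, and send stands in for whatever you use to send one JSON message over that client's websocket and return the parsed reply. The message types and fields are the ones I understand the official API to use.

```python
from typing import Callable, Optional


def _msg(message_type: str, data: Optional[dict] = None) -> dict:
    msg = {"apiName": "VTubeStudioPublicAPI", "apiVersion": "1.0",
           "messageType": message_type}
    if data is not None:
        msg["data"] = data
    return msg


def start_client(send: Callable[[dict], dict],
                 plugin_name: str,
                 developer: str,
                 expected_model: str,
                 saved_token: Optional[str],
                 store_token: Callable[[str], None]) -> bool:
    """Return True only when it is safe to start sending MouthOpen."""
    identity = {"pluginName": plugin_name, "pluginDeveloper": developer}

    def authenticate(token: str) -> bool:
        reply = send(_msg("AuthenticationRequest",
                          {**identity, "authenticationToken": token}))
        return bool(reply.get("data", {}).get("authenticated"))

    # Step 2: try the saved token first.
    ok = saved_token is not None and authenticate(saved_token)

    # Step 3: only if that fails, request a new token and save it.
    if not ok:
        reply = send(_msg("AuthenticationTokenRequest", identity))
        token = reply.get("data", {}).get("authenticationToken")
        if not token:
            return False  # no token in the reply, e.g. the user denied access
        store_token(token)
        ok = authenticate(token)
    if not ok:
        return False

    # Steps 4 and 5: confirm the expected model is loaded in this instance.
    model = send(_msg("CurrentModelRequest")).get("data", {})
    return bool(model.get("modelLoaded")) and model.get("modelName") == expected_model
```

Passing send in as a function keeps the sequence testable without a running VTube Studio, and it means each AI gets its own send bound to its own socket.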

That is the clean solution.

What not to do

Do not do this:

  • one shared token file for both AIs
  • one generic plugin name reused across experiments
  • “request a new token every run”
  • “send mouth once and assume it sticks”
  • “add a second mouth parameter before isolating ports and auth”

Those choices are exactly the kind of thing that creates the unstable behavior you described. (GitHub)

The easier and usually more stable option

For AI VTubers, the most practical answer is often:

do not drive MouthOpen manually at all. Route each AI’s TTS audio into its own VTube Studio instance and let VTube Studio do the lipsync.

VTube Studio officially supports microphone-based lipsync. It can derive mouth movement from audio with VoiceVolume, VoiceVolumePlusMouthOpen, and the vowel parameters VoiceA, VoiceI, VoiceU, VoiceE, and VoiceO. The official docs recommend Advanced Lipsync rather than the legacy simple mode. (GitHub)

This is also how a lot of AI VTuber projects are set up in practice. The Neuro project routes TTS output into VTube Studio through a virtual audio cable and lets VTube Studio handle lipsync. The vtuber-waifu project tells users to capture the program’s audio with Virtual Cable and use that as VTube Studio microphone input. MITSUHA gives a similar VB-Cable setup. (GitHub)

For your use case, that approach is often better because:

  • speech audio is already the source of truth for timing
  • you do not need to stream mouth values constantly
  • you avoid parameter ownership fights
  • you reduce the amount of mouth-specific code

That is the path I would choose first if your goal is simply “make two AI characters talk reliably.” (GitHub)

One more thing to check

Sometimes the code is fine and the model mapping is the real issue.

VTube Studio’s model settings say each output Live2D parameter can only be chosen once. They also note that if a parameter appears to move in config but the model does not visibly move, the likely cause is an expression, animation, or physics system overwriting it. (GitHub)

So for each model, verify that:

  • MouthOpen is mapped to the intended Live2D mouth-open parameter
  • that output parameter is not mapped twice
  • no expression or motion is locking the mouth
  • no physics setup is masking the visible movement (GitHub)

My direct recommendation

For your exact case, do this:

Best overall

Use two VTube Studio instances and let each one handle lipsync from its own routed TTS audio. Keep API usage for expressions, hotkeys, scene control, or model control. (GitHub)

If you want to keep API mouth control

Still use two instances, but give each AI:

  • its own port
  • its own plugin identity
  • its own token file
  • a model check with CurrentModelRequest
  • a continuous mouth-update loop while speaking (GitHub)

Do not start by adding another mouth parameter

That is only the right move if both characters are in the same VTube Studio instance or you specifically want custom rig logic. In your two-port setup, it is probably solving the wrong problem. (GitHub)
