SYSTEMS AND METHODS FOR TRAINING AND EVALUATING MULTIMODAL NEURAL NETWORK BASED LANGUAGE MODELS
DRIVE
December 11, 2025
Embodiments described herein provide a method of building an artificial intelligence (AI) agent to respond to a task request from a user. The method includes: receiving a set of single-modal data samples of a plurality of modalities; selecting a first single-modal data sample of a first modality and a second single-modal data sample of a second modality; generating a question associated with the first single-modal data sample and the second single-modal data sample; generating an answer, with reasoning, to the question based on a second input prompt; training a second neural network based language model, using a dataset comprising the question and the answer, to generate a candidate answer in response to a training query; building an AI conversation bot through an application programming interface to the trained second neural network based language model; and generating, using the AI conversation bot, a response to the task request.
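The data-generation steps of the method (selecting two samples of different modalities, generating a question about them, then generating a reasoned answer) can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the `Sample` type, the function names, the prompt wording, and the `generate` callable standing in for a language model are all hypothetical, and the assumption that the question comes from a first prompt while the answer comes from a second prompt is inferred from the abstract's reference to a "second input prompt".

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Sample:
    """A single-modal data sample (hypothetical container for illustration)."""
    modality: str  # illustrative label such as "image", "audio", or "text"
    content: str   # stand-in for the raw data or a reference to it


def select_cross_modal_pair(
    samples: List[Sample], rng: random.Random
) -> Tuple[Sample, Sample]:
    """Pick one sample from each of two distinct modalities."""
    by_modality: Dict[str, List[Sample]] = {}
    for sample in samples:
        by_modality.setdefault(sample.modality, []).append(sample)
    if len(by_modality) < 2:
        raise ValueError("need samples from at least two modalities")
    # Sort the modality names so the choice is reproducible for a given seed.
    first_mod, second_mod = rng.sample(sorted(by_modality), 2)
    return rng.choice(by_modality[first_mod]), rng.choice(by_modality[second_mod])


def build_qa_record(
    first: Sample, second: Sample, generate: Callable[[str], str]
) -> Dict[str, str]:
    """Generate a question about both samples, then a reasoned answer.

    `generate` is an assumed text-in, text-out wrapper around a language model.
    """
    # Assumed first prompt: ask for a question that involves both samples.
    question_prompt = (
        f"Write one question that requires both the {first.modality} sample "
        f"({first.content}) and the {second.modality} sample ({second.content})."
    )
    question = generate(question_prompt)
    # Assumed second prompt: ask for an answer together with its reasoning.
    answer_prompt = (
        f"Question: {question}\n"
        "Answer the question and explain the reasoning step by step."
    )
    answer = generate(answer_prompt)
    return {
        "question": question,
        "answer": answer,
        "modalities": f"{first.modality}+{second.modality}",
    }


def placeholder_model(prompt: str) -> str:
    """Stub used only so the sketch runs without a real model."""
    return f"[model output for a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    pool = [
        Sample("image", "img_001"),
        Sample("audio", "aud_001"),
        Sample("image", "img_002"),
    ]
    first, second = select_cross_modal_pair(pool, random.Random(0))
    print(build_qa_record(first, second, placeholder_model))
```

Records produced this way would form the question-and-answer dataset used to train the second language model; the training step and the API through which the conversation bot is built are outside the scope of this sketch.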