VEHICLE GUIDANCE BY A MULTIMODAL LARGE LANGUAGE MODEL
DRIVE
June 25, 2026
Techniques for determining text for a vehicle to navigate in an environment are described herein. For example, the techniques may include a vehicle computing device of an autonomous vehicle transmitting data to a remote computing device that implements a multimodal large language model to determine text data that enables the vehicle to navigate in problematic situations. The multimodal large language model can receive image data, map data, sensor data, and/or other data from the vehicle (or a database associated therewith) and output text describing a solution for navigating relative to an event. The text output by the multimodal large language model can be transmitted to the vehicle computing device for predicting a vehicle trajectory (or another action) for the autonomous vehicle to follow at a future time.
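The round trip described above (vehicle data sent out, text returned, text used to pick an action) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and does not come from the source: the `GuidanceRequest` fields, the prompt layout, the `stub_model` stand-in for the multimodal large language model, and the keyword-based `text_to_action` mapping, which is a toy substitute for trajectory prediction conditioned on the text.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GuidanceRequest:
    # Hypothetical payload; the source only names image, map, sensor, and/or other data.
    image_refs: List[str]
    map_summary: str
    sensor_summary: str
    event: str


def build_prompt(req: GuidanceRequest) -> str:
    """Flatten the request into one text prompt (images are passed by reference here)."""
    return "\n".join([
        f"Event: {req.event}",
        f"Map: {req.map_summary}",
        f"Sensors: {req.sensor_summary}",
        f"Images: {', '.join(req.image_refs)}",
        "Describe how the vehicle should navigate relative to the event.",
    ])


def request_guidance(req: GuidanceRequest, model: Callable[[str], str]) -> str:
    """Remote side: run the (assumed) model on the prompt and return its text."""
    return model(build_prompt(req)).strip()


def text_to_action(guidance: str) -> str:
    """Vehicle side: toy keyword mapping from guidance text to a coarse action.

    A real planner would condition trajectory prediction on the text instead.
    """
    lowered = guidance.lower()
    if "stop" in lowered or "wait" in lowered:
        return "hold"
    if "left" in lowered:
        return "nudge_left"
    if "right" in lowered:
        return "nudge_right"
    return "continue"


def stub_model(prompt: str) -> str:
    # Stand-in for the multimodal large language model; returns canned text.
    return " Pass the stalled truck on the left once the oncoming lane is clear. "


request = GuidanceRequest(
    image_refs=["front_camera_0001.jpg"],
    map_summary="two-lane road, no junction within 100 m",
    sensor_summary="stationary object 20 m ahead in ego lane",
    event="stalled truck blocking lane",
)
guidance = request_guidance(request, stub_model)
action = text_to_action(guidance)
```

Passing the model in as a plain callable keeps the sketch independent of any particular model service; a deployed system would replace `stub_model` with a call to the remote computing device.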