SYSTEMS AND METHODS FOR TRAINING A CAMERA-BASED PERCEPTION MODEL USING MACHINE LEARNING
DRIVE
June 25, 2026
Systems and methods are disclosed for detecting obstacles and drivable areas around an autonomous vehicle. Image data and map data are input into a neural network that extracts feature vectors. A transformer encoder converts these feature vectors from camera space to Bird's Eye View (BEV) space. A detection head identifies objects, and a segmentation head generates a BEV segmentation map showing objects and drivable surfaces. Attributes output by the two heads are compared, and the segmentation head's weights are updated based on that comparison. The updated segmentation head then outputs an updated BEV segmentation map.
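The abstract does not say how the two heads' attributes are compared or which update rule is used, so the following is only a minimal sketch under stated assumptions. It assumes the detection head emits axis-aligned boxes in BEV grid coordinates, that the comparison is a binary cross-entropy between the rasterized boxes and the segmentation head's per-cell occupancy probabilities, and that the segmentation head is a per-cell logistic regression trained by plain gradient descent. All function names, the grid size, the feature layout, and the learning rate are hypothetical and chosen for illustration.

```python
import math

def rasterize_boxes(boxes, height, width):
    """Turn detection-head boxes (row0, col0, row1, col1), inclusive, into a BEV occupancy grid."""
    grid = [[0.0] * width for _ in range(height)]
    for r0, c0, r1, c1 in boxes:
        for r in range(max(r0, 0), min(r1, height - 1) + 1):
            for c in range(max(c0, 0), min(c1, width - 1) + 1):
                grid[r][c] = 1.0
    return grid

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def segment(features, weights, bias):
    """Hypothetical segmentation head: per-cell logistic regression over BEV features."""
    return [[sigmoid(sum(w * f for w, f in zip(weights, cell)) + bias)
             for cell in row] for row in features]

def consistency_loss(pred, target):
    """Assumed comparison: mean binary cross-entropy between the two heads' outputs."""
    eps = 1e-7
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
            count += 1
    return total / count

def update_segmentation_head(features, target, weights, bias, lr=0.5):
    """One gradient-descent step; only the segmentation head's parameters change."""
    pred = segment(features, weights, bias)
    n = len(features) * len(features[0])
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for feat_row, pred_row, target_row in zip(features, pred, target):
        for cell, p, t in zip(feat_row, pred_row, target_row):
            err = p - t  # derivative of the cross-entropy with respect to the logit
            for i, f in enumerate(cell):
                grad_w[i] += err * f / n
            grad_b += err / n
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b

# Toy 4x4 BEV grid with one detected object (all values are illustrative).
H, W = 4, 4
detections = [(1, 1, 2, 2)]
target = rasterize_boxes(detections, H, W)
# Two hypothetical BEV feature channels per cell.
features = [[[0.9 * target[r][c] + 0.1, 0.5] for c in range(W)] for r in range(H)]

weights, bias = [0.0, 0.0], 0.0
loss_before = consistency_loss(segment(features, weights, bias), target)
for _ in range(200):
    weights, bias = update_segmentation_head(features, target, weights, bias)
updated_map = segment(features, weights, bias)  # the "updated BEV segmentation map"
loss_after = consistency_loss(updated_map, target)
```

In this sketch the detection head's output is treated as fixed, so it acts as a supervisory signal for the segmentation head only. A real system would more likely use a deep-learning framework and backpropagate through learned BEV features rather than hand-built ones.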