DETECTING AT LEAST ONE EMERGENCY VEHICLE USING A PERCEPTION ALGORITHM
DRIVE
August 22, 2024
For training a perception algorithm to detect an emergency vehicle, respective audio datasets are received from two microphones and respective spectrograms are generated. At least one interaural difference map is generated based on the spectrograms. Audio source localization data, which specifies a number of audio sources in respective grid cells of a spatial grid, is generated by applying a convolutional recurrent neural network (CRNN) to first input data containing the spectrograms and the at least one interaural difference map. An image is received from a camera, and output data comprising a bounding box for the emergency vehicle is predicted by applying at least one further artificial neural network (ANN) to second input data containing the image and the spectrograms. Network parameters are adapted depending on the output data and the audio source localization data.
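The audio preprocessing described above can be illustrated with a rough sketch. The abstract does not say which interaural differences are used or how the spectrograms are computed, so everything below is an assumption for illustration: a Hann-windowed short-time Fourier transform, an interaural level difference (ILD) map in dB, and an interaural phase difference (IPD) map in radians. The function names `stft` and `interaural_maps`, the frame length, and the hop size are hypothetical and do not come from the source.

```python
import cmath
import math


def stft(signal, frame_len=64, hop=32):
    """Complex spectrogram as a list of frames, each a list of frequency bins.

    Assumption: periodic Hann window and a naive DFT (slow, but dependency-free).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(n_bins)
        ])
    return frames


def interaural_maps(spec_left, spec_right, eps=1e-12):
    """Return (ILD in dB, IPD in radians), each with the spectrogram's shape.

    Assumption: ILD and IPD stand in for the unspecified
    "interaural difference map" of the abstract.
    """
    ild, ipd = [], []
    for frame_l, frame_r in zip(spec_left, spec_right):
        # Level difference: ratio of magnitudes, expressed in decibels.
        ild.append([20 * math.log10((abs(a) + eps) / (abs(b) + eps))
                    for a, b in zip(frame_l, frame_r)])
        # Phase difference: angle of the cross-spectrum L * conj(R).
        ipd.append([cmath.phase(a * b.conjugate())
                    for a, b in zip(frame_l, frame_r)])
    return ild, ipd


if __name__ == "__main__":
    fs, f0, n = 16000, 1000.0, 256
    # Synthetic siren-like tone: the right channel is quieter and 2 samples late.
    left = [math.sin(2 * math.pi * f0 * t / fs) for t in range(n)]
    right = [0.5 * math.sin(2 * math.pi * f0 * (t - 2) / fs) for t in range(n)]
    ild, ipd = interaural_maps(stft(left), stft(right))
    k = int(f0 * 64 / fs)  # frequency bin that holds the tone
    print(f"ILD at bin {k}: {ild[0][k]:.2f} dB, IPD: {ipd[0][k]:.3f} rad")
```

In this toy setup the level difference at the tone's bin should come out near 6 dB (the right channel has half the amplitude) and the phase difference near π/4 (a two-sample delay at 1 kHz and 16 kHz sampling). In a pipeline like the one in the abstract, such maps would presumably be stacked with the spectrograms as input channels for the localization network, although the abstract does not give that level of detail.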