2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Paper Detail

Paper ID AUD-31.2
Paper Title A NEW DCASE 2017 RARE SOUND EVENT DETECTION BENCHMARK UNDER EQUAL TRAINING DATA: CRNN WITH MULTI-WIDTH KERNELS
Authors Jan Baumann, Patrick Meyer, Timo Lohrenz, Technische Universität Braunschweig, Germany; Alexander Roy, Michael Papendieck, IAV GmbH, Germany; Tim Fingscheidt, Technische Universität Braunschweig, Germany
Session AUD-31: Detection and Classification of Acoustic Scenes and Events 6: Events
Location Gather.Town
Session Time: Friday, 11 June, 13:00 - 13:45
Presentation Time: Friday, 11 June, 13:00 - 13:45
Presentation Poster
Topic Audio and Acoustic Signal Processing: [AUD-CLAS] Detection and Classification of Acoustic Scenes and Events
Abstract Rare sound event detection (rare SED) deals with obtaining valuable information from data consisting mostly of acoustic background noise. It has a long research history and was part of the DCASE 2017 Challenge. State-of-the-art performance is currently reached using a stacked combination of a CNN and an RNN, dubbed CRNN, which has also been applied successfully in other domains such as hybrid automatic speech recognition. In this work, we propose a new CRNN model for rare SED. The new model uses a set of parallel convolutions with multiple kernel widths in the CRNN and is based on an extended feature representation of the log-mel spectrogram. Furthermore, we apply and optimize different evaluation postprocessing methods and analyze the modifications in an ablation study. With the same training material used for all methods, the proposed model outperforms the previously top-scoring networks of the DCASE Challenge by 6.13% absolute in error rate and by 4.39% absolute in F1 score on the test set, and under these conditions achieves a new benchmark result on the DCASE 2017 Rare SED data set.
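
The idea of parallel convolutions with multiple kernel widths can be illustrated with a minimal sketch. This is not the authors' architecture: the abstract does not give the kernel widths, the number of branches, or the axis the kernels span, so the widths (3, 5, 7), the fixed averaging weights, the ReLU, and the function names `conv_same` and `multi_width_block` below are all assumptions made for illustration. The sketch filters one band of a spectrogram over time with several kernels of different widths and keeps every branch output as a separate feature channel.

```python
def conv_same(frames, kernel):
    """Slide `kernel` over `frames` with zero padding so the output
    has the same length as the input (odd widths are centred)."""
    width = len(kernel)
    left = (width - 1) // 2
    padded = [0.0] * left + list(frames) + [0.0] * (width - 1 - left)
    return [
        sum(k * padded[t + i] for i, k in enumerate(kernel))
        for t in range(len(frames))
    ]


def multi_width_block(frames, kernels):
    """Apply every kernel to the same input in parallel and return one
    ReLU-activated output channel per kernel (hypothetical block)."""
    return [[max(v, 0.0) for v in conv_same(frames, k)] for k in kernels]


# Hypothetical input: energies of one log-mel band over 10 frames,
# with a short event in the middle.
frames = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

# Assumed widths 3, 5 and 7; in a trained CRNN these weights are learned,
# here they are simple averaging kernels.
kernels = [[1.0 / w] * w for w in (3, 5, 7)]

features = multi_width_block(frames, kernels)
for width, channel in zip((3, 5, 7), features):
    print(width, [round(v, 2) for v in channel])
```

Narrow kernels respond sharply to the event boundaries while wide kernels integrate over a longer temporal context; stacking the branch outputs lets later layers draw on both. In a full CRNN the kernels would be two-dimensional and learned, and the stacked channels would feed the recurrent layers.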