2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

AUD-22: Detection and Classification of Acoustic Scenes and Events 3: Multimodal Scenes and Events

Session Type: Poster
Time: Thursday, 10 June, 15:30 - 16:15
Location: Gather.Town
 
AUD-22.1: LEARNING CONTEXTUAL TAG EMBEDDINGS FOR CROSS-MODAL ALIGNMENT OF AUDIO AND TAGS
         Xavier Favory; Music Technology Group, Universitat Pompeu Fabra
         Konstantinos Drossos; Audio Research Group, Tampere University
         Tuomas Virtanen; Audio Research Group, Tampere University
         Xavier Serra; Music Technology Group, Universitat Pompeu Fabra
 
AUD-22.2: EFFICIENT END-TO-END AUDIO EMBEDDINGS GENERATION FOR AUDIO CLASSIFICATION ON TARGET APPLICATIONS
         Paulo Lopez-Meyer; Intel Corporation
         Juan A. Del Hoyo Ontiveros; Intel Corporation
         Hong Lu; Intel Corporation
         Georg Stemmer; Intel Corporation
 
AUD-22.3: TEXT-TO-AUDIO GROUNDING: BUILDING CORRESPONDENCE BETWEEN CAPTIONS AND SOUND EVENTS
         Xuenan Xu; Shanghai Jiao Tong University
         Heinrich Dinkel; Shanghai Jiao Tong University
         Mengyue Wu; Shanghai Jiao Tong University
         Yu Kai; Shanghai Jiao Tong University
 
AUD-22.4: MULTI-VIEW AUDIO AND MUSIC CLASSIFICATION
         Huy Phan; Queen Mary University of London
         Huy Le Nguyen; HCM City University of Technology
         Oliver Chén; University of Oxford
         Lam Pham; University of Surrey
         Philipp Koch; University of Lübeck
         Ian McLoughlin; Singapore Institute of Technology
         Alfred Mertins; University of Lübeck
 
AUD-22.5: AUDIO-VISUAL EVENT RECOGNITION THROUGH THE LENS OF ADVERSARY
         Juncheng Li; Carnegie Mellon University
         Kaixin Ma; Carnegie Mellon University
         Shuhui Qu; Stanford University
         Po-Yao Huang; Carnegie Mellon University
         Florian Metze; Carnegie Mellon University
 
AUD-22.6: DCASENET: AN INTEGRATED PRETRAINED DEEP NEURAL NETWORK FOR DETECTING AND CLASSIFYING ACOUSTIC SCENES AND EVENTS
         Jee-weon Jung; University of Seoul
         Hye-jin Shim; University of Seoul
         Ju-ho Kim; University of Seoul
         Ha-Jin Yu; University of Seoul