2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper IDAUD-23.5
Paper Title ENHANCING AUDIO AUGMENTATION METHODS WITH CONSISTENCY LEARNING
Authors Turab Iqbal, University of Surrey, United Kingdom; Karim Helwani, Arvindh Krishnaswamy, Amazon Web Services, United States; Wenwu Wang, University of Surrey, United Kingdom
SessionAUD-23: Detection and Classification of Acoustic Scenes and Events 4: Datasets and metrics
LocationGather.Town
Session Time:Thursday, 10 June, 15:30 - 16:15
Presentation Time:Thursday, 10 June, 15:30 - 16:15
Presentation Poster
Topic Audio and Acoustic Signal Processing: [AUD-CLAS] Detection and Classification of Acoustic Scenes and Events
IEEE Xplore Open Preview  Click here to view in IEEE Xplore
Virtual Presentation  Click here to watch in the Virtual Conference
Abstract Data augmentation is an inexpensive way to increase training data diversity, and is commonly achieved via transformations of existing data. For tasks such as classification, there is a good case for learning representations of the data that are invariant to such transformations, yet this is not explicitly enforced by classification losses such as the cross-entropy loss. This paper investigates the use of training objectives that explicitly impose this consistency constraint, and how it can impact downstream audio classification tasks. In the context of deep convolutional neural networks in the supervised setting, we show empirically that certain measures of consistency are not implicitly captured by the cross-entropy loss, and that incorporating such measures into the loss function can improve the performance of tasks such as audio tagging. Put another way, we demonstrate how existing augmentation methods can further improve learning by enforcing consistency.