Paper ID | AUD-12.6 | ||
Paper Title | PROTOTYPICAL NETWORKS FOR DOMAIN ADAPTATION IN ACOUSTIC SCENE CLASSIFICATION | ||
Authors | Shubhr Singh, Helen L. Bear, Emmanouil Benetos, Queen Mary University of London, United Kingdom | ||
Session | AUD-12: Detection and Classification of Acoustic Scenes and Events 1: Few-shot learning | ||
Location | Gather.Town | ||
Session Time: | Wednesday, 09 June, 15:30 - 16:15 | ||
Presentation Time: | Wednesday, 09 June, 15:30 - 16:15 | ||
Presentation | Poster | ||
Topic | Audio and Acoustic Signal Processing: [AUD-CLAS] Detection and Classification of Acoustic Scenes and Events | ||
Abstract | Acoustic Scene Classification (ASC) refers to the task of assigning a semantic label to an audio stream that characterises the environment in which it was recorded. In recent times, Deep Neural Networks (DNNs) have emerged as the model of choice for ASC. However, in real-world scenarios, domain adaptation remains a persistent problem for ASC models. In search of a solution to this problem, we explore a metric learning approach called prototypical networks using the TUT Urban Acoustic Scenes dataset, which consists of 10 different acoustic scenes recorded across 10 cities. To replicate the domain adaptation scenario, we divide the dataset into source domain data, consisting of samples from eight randomly selected cities, and target domain data, consisting of samples from the remaining two cities. We evaluate the performance of the network against a selected baseline network under various experimental scenarios and, based on the results, conclude that metric learning is a promising approach to addressing the domain adaptation problem in ASC. |
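
The two mechanisms the abstract mentions, the city-based source/target split and classification by nearest class prototype, can be illustrated with a minimal sketch. It is not the authors' implementation. The city names are hypothetical placeholders, the embedding network is omitted (embeddings are assumed to be precomputed vectors), and squared Euclidean distance is assumed because it is the usual choice for prototypical networks; the abstract does not say which distance was used.

```python
import random

import numpy as np


def split_cities(cities, n_source=8, seed=0):
    """Randomly assign n_source cities to the source domain; the rest form the target domain."""
    rng = random.Random(seed)
    shuffled = list(cities)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_source]), sorted(shuffled[n_source:])


def class_prototypes(embeddings, labels):
    """Return the class ids and one prototype per class (the mean support embedding)."""
    classes = np.unique(labels)
    protos = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, protos


def classify(queries, classes, protos):
    """Label each query with the class of its nearest prototype (squared Euclidean, assumed)."""
    # Broadcast to shape (n_queries, n_classes, dim), then sum over the feature axis.
    dists = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[dists.argmin(axis=1)]


# Hypothetical city identifiers; the real dataset uses actual city names.
cities = [f"city_{i}" for i in range(10)]
source_cities, target_cities = split_cities(cities)

# Toy 2-D "embeddings" standing in for the output of an embedding network.
support = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]])
support_labels = np.array([0, 0, 1, 1])
classes, protos = class_prototypes(support, support_labels)

queries = np.array([[0.5, 0.2], [9.0, 11.0]])
predictions = classify(queries, classes, protos)
```

In a domain adaptation setting of the kind described, the prototypes would plausibly be computed from a small number of labelled target-city samples, while the embedding network is trained on the source cities; that division of roles is an assumption here, not a detail given in the abstract.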