Paper ID | AUD-20.3 | ||
Paper Title | INSTRUMENT CLASSIFICATION OF SOLO SHEET MUSIC IMAGES | ||
Authors | Kevin Ji, Daniel Yang, TJ Tsai, Harvey Mudd College, United States | ||
Session | AUD-20: Music Information Retrieval and Music Language Processing 3: Topics in Music Information Retrieval | ||
Location | Gather.Town | ||
Session Time | Thursday, 10 June, 14:00 - 14:45 | ||
Presentation Time | Thursday, 10 June, 14:00 - 14:45 | ||
Presentation | Poster | ||
Topic | Audio and Acoustic Signal Processing: [AUD-MSP] Music Signal Analysis, Processing and Synthesis | ||
Abstract | This paper studies instrument classification of solo sheet music. Whereas previous work has focused on instrument recognition in audio data, we instead approach the instrument classification problem using raw sheet music images. Our approach first converts the sheet music image into a sequence of musical "words" based on the bootleg score representation, and then treats the problem as a text classification task. We show that it is possible to significantly improve classifier performance by training a language model on unlabeled data, initializing a classifier with the pretrained language model weights, and then finetuning the classifier on labeled data. In this work, we train AWD-LSTM, GPT-2, and RoBERTa models on solo sheet music images from IMSLP for eight different instruments. We find that GPT-2 and RoBERTa slightly outperform AWD-LSTM, and that pretraining increases classification accuracy for RoBERTa from 34.5% to 42.9%. Furthermore, we propose two data augmentation methods that increase classification accuracy for RoBERTa by an additional 15%. |
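The first step described in the abstract, turning a bootleg score into a sequence of "words" that a text classifier can consume, can be illustrated with a minimal sketch. It assumes the bootleg score is already available as a list of binary columns (one entry per staff position, 1 where a notehead was detected); the extraction of that representation from the image is not shown. The function names, the `w<integer>` token format, and the `<unk>` token are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def column_to_word(column: Sequence[int]) -> int:
    """Pack one binary column into a single integer, one bit per staff position."""
    word = 0
    for position, bit in enumerate(column):
        if bit:
            word |= 1 << position
    return word


def score_to_words(columns: Sequence[Sequence[int]]) -> List[str]:
    """Convert a sequence of binary columns into string tokens ("words")."""
    return [f"w{column_to_word(col)}" for col in columns]


def build_vocab(sequences: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Assign an integer id to every word seen in the given sequences.

    Id 0 is reserved for unknown words (an assumption of this sketch).
    """
    vocab: Dict[str, int] = {"<unk>": 0}
    for seq in sequences:
        for word in seq:
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(seq: Sequence[str], vocab: Dict[str, int]) -> List[int]:
    """Map words to ids, falling back to the <unk> id for unseen words."""
    return [vocab.get(word, 0) for word in seq]
```

The resulting id sequences are the kind of input a language model such as AWD-LSTM, GPT-2, or RoBERTa could be pretrained on before finetuning a classifier; the actual tokenization used for each model in the paper may differ from this sketch.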