2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Paper Detail

Paper ID AUD-9.3
Paper Title SINGING MELODY EXTRACTION FROM POLYPHONIC MUSIC BASED ON SPECTRAL CORRELATION MODELING
Authors Xingjian Du, Bilei Zhu, Qiuqiang Kong, Zejun Ma, Bytedance AI Lab, China
Session AUD-9: Music Information Retrieval and Music Language Processing 1: Beat and Melody
Location Gather.Town
Session Time: Wednesday, 09 June, 14:00 - 14:45
Presentation Time: Wednesday, 09 June, 14:00 - 14:45
Presentation Poster
Topic Audio and Acoustic Signal Processing: [AUD-MIR] Music Information Retrieval and Music Language Processing
Abstract Convolutional neural network (CNN) based methods have achieved state-of-the-art performance for singing melody extraction from polyphonic music. However, most of these methods focus on learning local features, while relationships among spectral components located far apart are often neglected. In this paper, we explore the idea of modeling spectral correlation explicitly for melody extraction. Specifically, we present a spectral correlation module (SCM) that learns to model the relationships among all frequency bands in a time-frequency representation, thus allowing global spectral information to be encoded into a conventional CNN. Furthermore, we propose to integrate center frequencies with the input feature map of the SCM to improve performance. We implement a lightweight model composed of SCM blocks to verify the efficacy of our system. Our system achieves a state-of-the-art overall accuracy of 83.5% on the MedleyDB dataset.
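The abstract does not give the internals of the SCM, so the following is only an illustrative sketch of the general idea it describes: mixing information across all frequency bands of a time-frequency feature map, with center frequencies appended to the input. Everything in it is an assumption rather than the authors' implementation: the function names `center_frequencies` and `spectral_correlation_block`, the log-frequency axis with `f_min` and `bins_per_octave`, the normalization of the frequency channel, and the use of a single dense bin-to-bin matrix `weight` (which would be learned in a real model but is supplied by the caller here).

```python
import numpy as np


def center_frequencies(n_bins, f_min=32.7, bins_per_octave=12):
    """Center frequency in Hz of each bin on a log-frequency axis.

    The log-frequency layout and the default values are assumptions
    made for this sketch; they are not taken from the paper.
    """
    return f_min * 2.0 ** (np.arange(n_bins) / bins_per_octave)


def spectral_correlation_block(x, weight, freqs):
    """Mix information across all frequency bins of a feature map.

    x      : array of shape (channels, n_bins, n_frames)
    weight : array of shape (n_bins, n_bins); learned in a real model,
             passed in explicitly here (hypothetical parameterization)
    freqs  : array of shape (n_bins,), center frequency of each bin
    Returns an array of shape (channels + 1, n_bins, n_frames).
    """
    # Encode center frequencies as one extra channel, scaled to [0, 1]
    # on a log2 axis so that its range is comparable to typical features.
    log_f = np.log2(freqs / freqs[0])
    log_f = log_f / max(log_f[-1], 1e-8)
    freq_channel = np.broadcast_to(log_f[None, :, None], (1,) + x.shape[1:])
    x_aug = np.concatenate([x, freq_channel], axis=0)

    # Global mixing along frequency:
    # out[c, f, t] = sum over g of weight[f, g] * x_aug[c, g, t]
    # Every output bin can depend on every input bin, unlike a small
    # convolution kernel that only sees neighboring bins.
    return np.einsum("fg,cgt->cft", weight, x_aug)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((2, 8, 5))   # (channels, bins, frames)
    mixing = rng.standard_normal((8, 8)) * 0.1   # stand-in for learned weights
    out = spectral_correlation_block(features, mixing, center_frequencies(8))
    print(out.shape)
```

In this sketch the dense matrix lets a harmonic several octaves away influence a given bin in one step, which is the kind of long-range spectral relationship the abstract says local convolutions tend to miss. How the actual SCM parameterizes this, and how it is combined with convolutional layers, is not stated in the abstract.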