Paper ID | AUD-3.2 |
Paper Title |
Differentiable Signal Processing With Black-Box Audio Effects |
Authors |
Marco A Martínez Ramírez, Queen Mary University of London, United Kingdom; Oliver Wang, Paris Smaragdis, Adobe Inc., United States; Nicholas J. Bryan, Adobe Research, United States |
Session | AUD-3: Music Signal Analysis, Processing, and Synthesis 1: Deep Learning |
Location | Gather.Town |
Session Time: | Tuesday, 08 June, 14:00 - 14:45 |
Presentation Time: | Tuesday, 08 June, 14:00 - 14:45 |
Presentation |
Poster
|
Topic |
Audio and Acoustic Signal Processing: [AUD-MSP] Music Signal Analysis, Processing and Synthesis |
IEEE Xplore Open Preview |
Click here to view in IEEE Xplore |
Virtual Presentation |
Click here to watch in the Virtual Conference |
Abstract |
We present a data-driven approach to automate audio signal processing by incorporating stateful third-party, audio effects as layers within a deep neural network. We then train a deep encoder to analyze input audio and control effect parameters to perform the desired signal manipulation, requiring only input-target paired audio data as supervision. To train our network with non-differentiable black-box effects layers, we use a fast, parallel stochastic gradient approximation scheme within a standard auto differentiation graph, yielding efficient end-to-end backpropagation. We demonstrate the power of our approach with three separate automatic audio production applications: tube amplifier emulation, automatic removal of breaths and pops from voice recordings, and automatic music mastering. We validate our results with a subjective listening test, showing our approach not only can enable new automatic audio effects tasks, but can yield results comparable to a specialized, state-of-the-art commercial solution for music mastering. |