Paper ID | AUD-3.2
Paper Title | Differentiable Signal Processing With Black-Box Audio Effects
Authors | Marco A. Martínez Ramírez, Queen Mary University of London, United Kingdom; Oliver Wang, Paris Smaragdis, Adobe Inc., United States; Nicholas J. Bryan, Adobe Research, United States
Session | AUD-3: Music Signal Analysis, Processing, and Synthesis 1: Deep Learning
Location | Gather.Town
Session Time | Tuesday, 08 June, 14:00 - 14:45
Presentation Time | Tuesday, 08 June, 14:00 - 14:45
Presentation | Poster
Topic | Audio and Acoustic Signal Processing: [AUD-MSP] Music Signal Analysis, Processing and Synthesis
IEEE Xplore Open Preview | Available in IEEE Xplore
Abstract | We present a data-driven approach to automating audio signal processing by incorporating stateful, third-party audio effects as layers within a deep neural network. We then train a deep encoder to analyze input audio and control effect parameters to perform the desired signal manipulation, requiring only paired input-target audio data as supervision. To train our network with non-differentiable black-box effects layers, we use a fast, parallel stochastic gradient approximation scheme within a standard automatic differentiation graph, yielding efficient end-to-end backpropagation. We demonstrate the power of our approach with three separate automatic audio production applications: tube amplifier emulation, automatic removal of breaths and pops from voice recordings, and automatic music mastering. We validate our results with a subjective listening test, showing that our approach not only enables new automatic audio effects tasks but also yields results comparable to a specialized, state-of-the-art commercial solution for music mastering.
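The abstract's "stochastic gradient approximation scheme" for non-differentiable effects layers can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's implementation: it uses a simultaneous-perturbation (SPSA-style) estimator, a standard technique for black-box gradients, and the names `spsa_gradient`, `loss`, and the single-parameter gain "effect" are all hypothetical stand-ins for a real stateful audio plugin.

```python
import numpy as np

def spsa_gradient(f, params, eps=1e-3, n_iters=8, rng=None):
    """Estimate the gradient of a black-box scalar loss f at `params`
    by simultaneous perturbation: nudge ALL parameters at once along a
    random +/-1 (Rademacher) direction and average several two-point
    finite-difference estimates. Only evaluations of f are required,
    so f may wrap a non-differentiable third-party effect."""
    rng = np.random.default_rng(0) if rng is None else rng
    grad = np.zeros_like(params, dtype=float)
    for _ in range(n_iters):
        delta = rng.choice([-1.0, 1.0], size=params.shape)
        diff = f(params + eps * delta) - f(params - eps * delta)
        # For delta in {-1, +1}, 1/delta == delta element-wise.
        grad += (diff / (2.0 * eps)) * delta
    return grad / n_iters

# Toy stand-in for a black-box effect: a gain stage, with squared error
# against a target output as the training loss.
x = np.array([0.5, -0.25, 1.0])   # input audio frame (illustrative)
target = 0.8 * x                  # desired output of the "effect"

def loss(p):
    return float(np.mean((p[0] * x - target) ** 2))

g = spsa_gradient(loss, np.array([0.2]), n_iters=64)
# Analytic gradient here is 2 * (0.2 - 0.8) * mean(x**2) = -0.525,
# and the estimate lands very close to it.
```

In the paper's setting this estimator would sit inside the backward pass of a custom autodiff node, so gradients flow through the black-box effect to the upstream encoder; the perturbed evaluations are independent and can run in parallel, which is what makes the scheme fast in practice.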