Machine Learning for Spectrogram and Audio Signal Restoration

R. Huang and M. Mattheakis
Rensselaer Polytechnic Institute, New York, United States

Keywords: RF spectrogram, audio signal inpainting, recurrent neural network, reservoir computing, optical machine learning

There has been a wave of interest in applying machine learning to restore corrupted spectrograms or missing segments of audio signals. We propose a method for restoring large corrupted areas in spectrograms using observer-based reservoir computing (RC). Conventionally, deep convolutional neural networks (CNNs) are used for image inpainting, but they pose two challenges for image restoration. (1) Deep CNNs require a large set of training images, which is not always available in real-world deployment. (2) When CNNs are used to restore spectrograms or missing audio segments, any temporal correlation is lost after a few convolution operations. RC provides an alternative route to image restoration that avoids both challenges. An RC network can be trained on an individual spectrogram, so restoration does not rely on other images, and its recurrent architecture retains the temporal information stored in the data. Bayesian optimization and a grid search are used to determine the optimal set of network parameters. We show that the RC method outperforms cubic interpolation when the corrupted area is reasonably large. The proposed approach is quite general and can therefore be applied to general image restoration.
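To make the observer idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes an echo state network with a tanh reservoir and a ridge-regression readout, neither of which the abstract specifies. The reservoir is driven along the time axis by the intact frequency rows, the readout is fitted on the time columns where the damaged rows are still known, and it is then applied over the corrupted columns. All names (make_reservoir, run_reservoir, restore_block) and all hyperparameter values are hypothetical and hand-picked; they are not the result of the Bayesian optimization or grid search mentioned above.

```python
import numpy as np


def make_reservoir(n_in, n_res=100, spectral_radius=0.9, input_scale=0.5, seed=0):
    """Draw random input and recurrent weights (assumed design, fixed after creation)."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-input_scale, input_scale, size=(n_res, n_in))
    W = rng.uniform(-1.0, 1.0, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W


def run_reservoir(W_in, W, inputs):
    """Drive the reservoir with inputs of shape (T, n_in); return states of shape (T, n_res)."""
    states = np.zeros((inputs.shape[0], W.shape[0]))
    x = np.zeros(W.shape[0])
    for t, u in enumerate(inputs):
        x = np.tanh(W_in @ u + W @ x)
        states[t] = x
    return states


def restore_block(spec, rows, cols, ridge=1e-4, washout=20):
    """Fill spec[rows, cols] using only this one spectrogram.

    spec has shape (n_freq, n_time); rows and cols are integer index arrays
    marking the corrupted frequency rows and time columns.
    """
    n_freq, n_time = spec.shape
    observed = np.setdiff1d(np.arange(n_freq), rows)

    # Observer inputs: the intact frequency rows at every time step.
    U = spec[observed].T
    W_in, W = make_reservoir(U.shape[1])
    X = run_reservoir(W_in, W, U)
    X = np.hstack([X, np.ones((n_time, 1))])  # constant bias feature

    # Fit the readout where the target rows are known, skipping the washout.
    train = np.setdiff1d(np.arange(washout, n_time), cols)
    A = X[train]
    Y = spec[np.ix_(rows, train)].T
    W_out = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ Y)

    # Infer the corrupted block from the reservoir states at those columns.
    restored = spec.copy()
    restored[np.ix_(rows, cols)] = (X[cols] @ W_out).T
    return restored
```

In this sketch the values stored in the corrupted block never enter the reservoir or the fit, so whatever placeholder the damaged spectrogram holds there (zeros, noise) does not affect the result. A call such as restore_block(damaged, np.arange(6, 10), np.arange(200, 230)) returns a copy of the input with only that block overwritten.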