Revisiting Singing Voice Detection: A Quantitative Review and the Future Outlook

Abstract

Since the vocal component plays a crucial role in popular music, singing voice detection has been an active research topic in music information retrieval. Although several proposed algorithms have shown high performances, we argue that there still is a room to improve to build a more robust singing voice detection system. In order to identify the area of improvement, we first perform an error analysis on three recent singing voice detection systems. Based on the analysis, we design novel methods to test the systems on multiple sets of internally curated and generated data to further examine the pitfalls, which are not clearly revealed with the current datasets. From the experiment results, we also propose several directions towards building a more robust singing voice detector.

Related

July 2020 | IJCAI - International Joint Conference on Artificial Intelligence

Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling

Daniel Stoller, Mi Tian, Sebastian Ewert, and Simon Dixon

July 2020 | WCCI/IJCNN - IEEE World Congress on Computational Intelligence / International Joint Conference on Neural Networks

Using a Neural Network Codec Approximation Loss to Improve Source Separation Performance in Limited Capacity Networks

Ishwarya Ananthabhotla, Sebastian Ewert, Joseph A. Paradiso

November 2019 | ISMIR

mirdata: Software for Reproducible Usage of Datasets

Rachel M Bittner, Magdalena Fuentes, David Rubinstein, Andreas Jansson, Keunwoo Choi, Thor Kell