Learning a large scale vocal similarity embedding for music

Abstract

This work describes an approach for modeling singing voice at scale by learning low-dimensional vocal embeddings from large collections of recorded music. We derive embeddings for different representations of the voice with genre labels. We evaluate on both objective (ranked retrieval) and subjective (perceptual evaluation) tasks. We conclude with a summary of our ongoing effort to crowdsource vocal style tags to refine our model.
