Semi-Automated Music Catalog Curation Using Audio and Metadata

Abstract

We present a system that assists Subject Matter Experts (SMEs) in curating large online music catalogs. The system detects releases that are incorrectly attributed to an artist's discography (misattribution), detects discographies of a single artist that are incorrectly separated (duplication), and predicts suitable relocations for misattributed releases. We use historical discography corrections to train and evaluate our system’s component models. These models combine vector representations of audio with metadata-based features, and the combination outperforms models based on audio or metadata alone. We conduct three experiments with SMEs in which our system detects misattribution in artist discographies with precision greater than 77% and duplication with precision greater than 71%, and, by combining the two approaches, predicts a correct relocation for misattributed releases with precision up to 45%. These results demonstrate the potential of such proactive curation systems to save valuable human time and effort by directing attention where it is most needed.
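
To make the idea of combining audio vectors with metadata-based features concrete, the following is a minimal illustrative sketch, not the system described above. The release schema (an `audio` embedding and a `label` string), the function names, and the two features chosen here are all assumptions made for illustration; the abstract does not specify which features or models are used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def release_features(release, discography):
    """Build one combined feature vector for a release within a discography.

    Hypothetical schema: each release is a dict with an 'audio' embedding
    (list of floats) and a 'label' string (record label).
    Returns [audio similarity to the other releases, share of other
    releases with the same label].
    """
    others = [r for r in discography if r is not release]
    if not others:
        raise ValueError("need at least one other release for comparison")
    # Audio feature: how close the release is to the rest of the discography.
    audio_sim = cosine(release["audio"], centroid([r["audio"] for r in others]))
    # Metadata feature: fraction of the other releases sharing this label.
    same_label = sum(r["label"] == release["label"] for r in others) / len(others)
    return [audio_sim, same_label]
```

Under these assumptions, a release that sounds unlike the rest of a discography and shares no label with it yields low values on both features. A classifier trained on historical corrections could take such vectors as input, and its flagged releases could then be passed to SMEs for review.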
