Semi-Automated Music Catalog Curation Using Audio and Metadata


We present a system that assists Subject Matter Experts (SMEs) in curating large online music catalogs. The system detects releases that are incorrectly attributed to an artist's discography (misattribution), detects cases in which a single artist's discography is incorrectly split into separate discographies (duplication), and predicts suitable relocations for misattributed releases. We use historical discography corrections to train and evaluate the system’s component models. These models combine vector representations of audio with metadata-based features, and the combination outperforms models based on audio or metadata alone. In three experiments with SMEs, our system detects misattribution in artist discographies with precision greater than 77% and duplication with precision greater than 71%; by combining the two approaches, it predicts a correct relocation for misattributed releases with precision up to 45%. These results demonstrate the potential of such proactive curation systems to save valuable human time and effort by directing attention where it is most needed.
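
To make the idea of combining audio vectors with metadata features concrete, the following is a minimal illustrative sketch, not the system described above. Everything in it is an assumption made for illustration: the release fields (`id`, `audio`, `label`), the two features (cosine similarity to the centroid of the other releases, and the share of other releases with the same record label), and the hand-picked equal weights, which stand in for a model trained on historical corrections.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def release_features(release, others):
    """Build one audio feature and one metadata feature (both hypothetical)."""
    # Audio feature: similarity of this release to the rest of the discography.
    audio_sim = cosine(release["audio"], centroid([r["audio"] for r in others]))
    # Metadata feature: fraction of the other releases sharing this release's label.
    label_share = sum(r["label"] == release["label"] for r in others) / len(others)
    return [audio_sim, label_share]


def misattribution_score(features, weights=(0.5, 0.5)):
    """Higher means more suspicious. The weights are placeholders, not learned values."""
    return 1.0 - sum(w * f for w, f in zip(weights, features))


def rank_discography(releases):
    """Return (release id, score) pairs, most suspicious first."""
    scored = []
    for i, release in enumerate(releases):
        others = releases[:i] + releases[i + 1:]
        scored.append((release["id"], misattribution_score(release_features(release, others))))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy discography: release "c" differs from the others in both audio and label.
discography = [
    {"id": "a", "audio": [1.0, 0.0], "label": "X"},
    {"id": "b", "audio": [0.9, 0.1], "label": "X"},
    {"id": "c", "audio": [0.0, 1.0], "label": "Y"},
]
ranked = rank_discography(discography)
```

In this toy discography, release `c` ranks first, so it would be the first candidate shown to an SME for review. A ranking of this kind only directs attention; the decision to relocate a release stays with the expert.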


May 2024 | Yijun Tian, Maryam Aziz, Alice Wang, Enrico Palumbo and Hugues Bouchard

Structural Podcast Content Modeling with Generalizability

Yijun Tian, Maryam Aziz, Alice Wang, Enrico Palumbo and Hugues Bouchard

May 2024 | The Web Conference

Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks

Marco De Nadai, Francesco Fabbri, Paul Gigioli, Alice Wang, Ang Li, Fabrizio Silvestri, Laura Kim, Shawn Lin, Vladan Radosavljevic, Sandeep Ghael, David Nyhan, Hugues Bouchard, Mounia Lalmas-Roelleke, Andreas Damianou

May 2024 | The Web Conference (GFM workshop)

Towards Graph Foundation Models for Personalization

Andreas Damianou, Francesco Fabbri, Paul Gigioli, Marco De Nadai, Alice Wang, Enrico Palumbo, Mounia Lalmas