Structural Podcast Content Modeling with Generalizability

Abstract

Podcast content modeling is crucial for a variety of practical web applications, such as podcast recommendation and classification. However, previous studies on podcast content modeling rely on task-specific datasets to train a dedicated model for each downstream application. These models depend heavily on labels, and the representations they learn do not generalize across tasks. In addition, the rich and intricate structural information among users, podcasts, and topics is neglected. In this paper, we propose to model podcast content without labels and to learn general podcast representations without prior knowledge of downstream tasks. Moreover, the learned podcast representations encode crucial structural information, complementary to the independent content information of each podcast. In particular, we first collect a new, large-scale podcast graph from Spotify. Then, we propose Podcast2Vec, a novel self-supervised podcast content modeling method for learning podcast representations. Podcast2Vec captures complex structures and general knowledge that transfers across tasks via a metapath-based neighbor sampling strategy and a multi-view relational modeling framework. Thorough experiments demonstrate the superiority of our method on four real-world podcast content modeling tasks.
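
To make the idea of metapath-based neighbor sampling concrete, the following is a minimal sketch on a toy graph of users, podcasts, and topics. It is not the Podcast2Vec implementation, whose sampling details the abstract does not give. The node names, edges, and the helper functions `build_adjacency`, `metapath_neighbors`, and `sample_neighbors` are all hypothetical and chosen only for illustration. The sketch assumes a metapath is a sequence of node types (for example podcast, user, podcast) and that a podcast's neighbors under a metapath are the podcasts reachable by a walk whose node types follow that sequence.

```python
import random
from collections import defaultdict

# Toy heterogeneous graph. All node names and edges are invented for illustration.
NODE_TYPE = {
    "u1": "user", "u2": "user",
    "p1": "podcast", "p2": "podcast", "p3": "podcast",
    "t1": "topic",
}
EDGES = [
    ("u1", "p1"), ("u1", "p2"),  # user u1 listens to p1 and p2
    ("u2", "p2"), ("u2", "p3"),  # user u2 listens to p2 and p3
    ("p1", "t1"), ("p3", "t1"),  # p1 and p3 share topic t1
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (node, node) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def metapath_neighbors(adj, node_type, start, metapath):
    """Return nodes reachable from `start` by a walk whose node types follow `metapath`."""
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first metapath type")
    frontier = {start}
    for wanted in metapath[1:]:
        # Expand one hop, keeping only neighbors of the required type.
        frontier = {n for cur in frontier for n in adj[cur] if node_type[n] == wanted}
    frontier.discard(start)  # a node is not its own neighbor
    return frontier


def sample_neighbors(adj, node_type, start, metapath, k, seed=0):
    """Uniformly sample up to k metapath neighbors (seeded for reproducibility)."""
    candidates = sorted(metapath_neighbors(adj, node_type, start, metapath))
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))


ADJ = build_adjacency(EDGES)
PUP = ["podcast", "user", "podcast"]   # podcasts sharing a listener
PTP = ["podcast", "topic", "podcast"]  # podcasts sharing a topic

print(metapath_neighbors(ADJ, NODE_TYPE, "p1", PUP))  # {'p2'}
print(metapath_neighbors(ADJ, NODE_TYPE, "p1", PTP))  # {'p3'}
```

In this toy graph, each metapath yields a different neighbor set for the same podcast: `p1` is linked to `p2` through a shared listener and to `p3` through a shared topic. That is one way to read the "multi-view" framing in the abstract, although the actual views used by Podcast2Vec may be defined differently.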

Related

November 2024 | SIAM Journal on Mathematics of Data Science

Topological Fingerprints for Audio Identification

Wojciech Reise, Ximena Fernández, Maria Dominguez, Heather A. Harrington, Mariano Beguerisse-Díaz

June 2024 | ICWSM

Socially-Motivated Music Recommendation

Ben Lacker, Samuel Way

May 2024 | The Web Conference

Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks

Marco De Nadai, Francesco Fabbri, Paul Gigioli, Alice Wang, Ang Li, Fabrizio Silvestri, Laura Kim, Shawn Lin, Vladan Radosavljevic, Sandeep Ghael, David Nyhan, Hugues Bouchard, Mounia Lalmas-Roelleke, Andreas Damianou