Structural Podcast Content Modeling with Generalizability

Abstract

Podcast content modeling is crucial for a variety of practical web applications, such as podcast recommendation and classification. However, previous studies on podcast content modeling rely on task-specific datasets to train a dedicated model for each downstream application. Such models depend heavily on labels, and the representations they learn do not generalize across tasks. In addition, the rich and intricate structural information among users, podcasts, and topics is neglected. In this paper, we propose to model podcast content without labels and to learn general podcast representations without prior knowledge of downstream tasks. Moreover, the learned podcast representations encode crucial structural information that complements the independent content information of each podcast. In particular, we first collect a new, large-scale podcast graph from Spotify. We then propose Podcast2Vec, a novel self-supervised podcast content modeling method for learning podcast representations. Podcast2Vec captures general knowledge that transfers across tasks, as well as complex structures, via a metapath-based neighbor sampling strategy and a multi-view relational modeling framework. Thorough experiments demonstrate the superiority of our method on four real-world podcast content modeling tasks.
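
To make the idea of metapath-based neighbor sampling concrete, the following is a minimal sketch on a toy heterogeneous graph of users, podcasts, and topics. It is not the Podcast2Vec implementation: the graph, the node names, the `EDGES` layout, and the `sample_metapath_neighbors` helper are all assumptions made for illustration, and the abstract does not specify which metapaths or sampling scheme the method actually uses.

```python
import random
from collections import defaultdict

# Toy heterogeneous graph as typed adjacency lists, keyed by
# (source_type, target_type). All nodes and edges are hypothetical.
EDGES = {
    ("podcast", "user"): {"p1": ["u1", "u2"], "p2": ["u2"], "p3": ["u1"]},
    ("user", "podcast"): {"u1": ["p1", "p3"], "u2": ["p1", "p2"]},
    ("podcast", "topic"): {"p1": ["t1"], "p2": ["t1"], "p3": ["t2"]},
    ("topic", "podcast"): {"t1": ["p1", "p2"], "t2": ["p3"]},
}


def sample_metapath_neighbors(start, metapath, num_walks, rng):
    """Sample nodes reached from `start` by random walks that follow `metapath`.

    `metapath` is a sequence of node types, for example
    ("podcast", "user", "podcast"). Returns a dict mapping each reached
    end node to its visit count, excluding `start` itself.
    """
    counts = defaultdict(int)
    for _ in range(num_walks):
        node = start
        # Walk one typed hop at a time along the metapath.
        for src_type, dst_type in zip(metapath, metapath[1:]):
            candidates = EDGES.get((src_type, dst_type), {}).get(node, [])
            if not candidates:
                node = None  # dead end: abandon this walk
                break
            node = rng.choice(candidates)
        if node is not None and node != start:
            counts[node] += 1
    return dict(counts)


rng = random.Random(0)
# Podcasts that share listeners with p1 (podcast -> user -> podcast).
pup = sample_metapath_neighbors("p1", ("podcast", "user", "podcast"), 50, rng)
# Podcasts that share a topic with p1 (podcast -> topic -> podcast).
ptp = sample_metapath_neighbors("p1", ("podcast", "topic", "podcast"), 50, rng)
# p3 is the only podcast on topic t2, so it has no topic-based neighbors.
isolated = sample_metapath_neighbors("p3", ("podcast", "topic", "podcast"), 10, rng)
```

In this sketch, each metapath yields a different neighborhood for the same podcast, which is one plausible way to obtain the separate relational "views" that a multi-view framework could then combine.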

Related

PulseSearch: Modeling Long- and Short-term Patterns for Personalized Music Search Suggestions

Cold-Starting Podcast Ads and Promotions with Multi-Task Learning on Spotify

Calibrated Recommendations with Contextual Bandits

Evaluating Podcast Recommendations with Profile-Aware LLM-as-a-Judge