Modeling Language Usage and Listener Engagement in Podcasts

Abstract

While there is an abundance of advice to podcast creators on how to speak in ways that engage their listeners, there has been little data-driven analysis of podcasts that relates linguistic style to engagement. In this paper, we investigate how various factors – vocabulary diversity, distinctiveness, emotion, and syntax, among others – correlate with engagement, based on analysis of the creators’ written descriptions and transcripts of the audio. We build models with different textual representations, and show that the identified features are highly predictive of engagement. Our analysis tests popular wisdom about stylistic elements in high-engagement podcasts, corroborating some pieces of advice and adding new perspectives on others.

Related

April 2022 | The Web Conference (WWW)

Mostra: A Flexible Balancing Framework to Trade-off User, Artist and Platform Objectives for Music Sequencing

Emanuele Bugliarello, Rishabh Mehrotra, James Kirk, Mounia Lalmas

November 2021 | CIKM

Leveraging Semantic Information to Facilitate the Discovery of Underserved Podcasts

Maryam Aziz, Alice Wang, Aasish Pappu, Hugues Bouchard, Yu Zhao, Benjamin Carterette, Mounia Lalmas

April 2021 | EACL

Detecting Extraneous Content in Podcasts

Sravana Reddy, Yongze Yu, Aasish Pappu, Aswin Sivaraman, Rezvaneh Rezapour, Rosie Jones