Evaluation

All of Spotify’s research and engineering effort depends on metrics, experimentation, and evaluation, both online and offline. Research in online testing is essential for deploying tests and interpreting results. Research in offline testing matters at Spotify because it influences models and A/B test decisions. It draws on offline test collections, historical log data, counterfactual analysis, and metrics developed using mixed quantitative and qualitative approaches to understanding and validating user behavior. In addition, Spotify is invested in contributing to academic research by releasing open test collections via competitions such as the RecSys Challenge and the WSDM Cup.

Latest Evaluation Publications

September 2020 | RecSys

Inferring the Causal Impact of New Track Releases on Music Recommendation Platforms through Counterfactual Predictions

Rishabh Mehrotra, Prasanta Bhattacharya, Mounia Lalmas

August 2020 | KDD

Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions

Praveen Chandar, James McInerney, Brian Brost, Rishabh Mehrotra, Benjamin Carterette

July 2020 | EC

The Engagement-Diversity Connection: Evidence from a Field Experiment on Spotify

David Holtz, Benjamin Carterette, Praveen Chandar, Zahra Nazari, Henriette Cramer, Sinan Aral