All of Spotify’s research and engineering effort depends on metrics, experimentation, and evaluation, both online and offline. Research in online testing is essential for deploying tests and interpreting results. Research in offline testing is important at Spotify because it influences models and A/B test decisions. It ranges from offline test collections, historical log data, and counterfactual analysis to metrics developed with mixed quantitative and qualitative methods for understanding and validating user behavior. In addition, Spotify is invested in contributing to academic research by releasing open test collections via competitions such as the RecSys Challenge and the WSDM Cup.

Latest Evaluation Publications

December 2020 | NeurIPS

Model Selection for Production System via Automated Online Experiments

Zhenwen Dai, Praveen Chandar, Ghazal Fazelnia, Benjamin Carterette, Mounia Lalmas

September 2020 | RecSys

Inferring the Causal Impact of New Track Releases on Music Recommendation Platforms through Counterfactual Predictions

Rishabh Mehrotra, Prasanta Bhattacharya, Mounia Lalmas

August 2020 | KDD

Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions

Praveen Chandar, James McInerney, Brian Brost, Rishabh Mehrotra, Benjamin Carterette