Evaluation

All of Spotify’s research and engineering effort depends on metrics, experimentation, and evaluation, both online and offline. Research in online testing is essential for deploying tests and interpreting results. Research in offline testing matters at Spotify because it influences models and A/B test decisions. It ranges from offline test collections, historical log data, and counterfactual analysis to metrics developed with mixed quantitative and qualitative methods for understanding and validating user behavior. In addition, Spotify is invested in contributing to academic research by releasing open test collections via competitions such as the RecSys Challenge and the WSDM Cup.

Latest Evaluation Publications

April 2024 | The Web Conference

Long-term Off-Policy Evaluation and Learning

Yuta Saito, Himan Abdollahpouri, Jesse Anderton, Ben Carterette, Mounia Lalmas

April 2022 | The Web Conference (WWW)

Using Survival Models to Estimate Long-Term Engagement in Online Experiments

Praveen Chandar, Brian St. Thomas, Lucas Maystre, Vijay Pappu, Roberto Sanchis-Ojeda, Tiffany Wu, Ben Carterette, Mounia Lalmas, Tony Jebara

December 2020 | NeurIPS

Model Selection for Production System via Automated Online Experiments

Zhenwen Dai, Praveen Chandar, Ghazal Fazelnia, Benjamin Carterette, Mounia Lalmas