Generalized User Representations for Large-Scale Recommendations and Downstream Tasks

Abstract

Accurately capturing diverse user preferences at scale is a core challenge for large-scale recommender systems like Spotify’s, given the complexity and variability of user behavior. To address this, we propose a two-stage framework that combines representation learning and transfer learning to produce generalized user embeddings. In the first stage, an autoencoder compresses rich user features into a compact latent space. In the second, task-specific models consume these embeddings via transfer learning, removing the need for manual feature engineering. This approach enhances flexibility by allowing dynamic updates to input features, enabling near-real-time responsiveness. The framework has been deployed in production at Spotify with an efficient infrastructure that allows downstream models to operate independently. Extensive online experiments show significant improvements in metrics such as consumption share, content discovery, and search success. Additionally, our method achieves these gains while substantially reducing infrastructure costs.
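To make the two-stage setup concrete, the sketch below illustrates the general pattern in PyTorch: stage one trains an autoencoder on a reconstruction objective over raw user features, and stage two feeds the frozen encoder output to a task-specific head. This is a minimal illustration under assumed dimensions, layer sizes, and class names (UserAutoencoder, DownstreamRanker), not the paper's production implementation.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration only.
RAW_FEATURE_DIM = 512   # rich user input features
EMBEDDING_DIM = 64      # compact latent space shared by downstream tasks


class UserAutoencoder(nn.Module):
    """Stage 1: compress raw user features into a general-purpose embedding."""

    def __init__(self, input_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


class DownstreamRanker(nn.Module):
    """Stage 2: a task-specific head that consumes the user embedding."""

    def __init__(self, latent_dim: int):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, user_embedding: torch.Tensor) -> torch.Tensor:
        return self.head(user_embedding)


# Stage 1: train the autoencoder with a reconstruction loss.
autoencoder = UserAutoencoder(RAW_FEATURE_DIM, EMBEDDING_DIM)
raw_features = torch.randn(8, RAW_FEATURE_DIM)  # stand-in for a batch of users
reconstruction = autoencoder(raw_features)
recon_loss = nn.functional.mse_loss(reconstruction, raw_features)

# Stage 2: downstream tasks reuse the encoder output instead of raw features,
# so no manual feature engineering is needed per task.
with torch.no_grad():
    user_embeddings = autoencoder.encoder(raw_features)
scores = DownstreamRanker(EMBEDDING_DIM)(user_embeddings)
```

The key design property this pattern captures is decoupling: the embedding model can be retrained or refreshed on updated input features while downstream models continue to consume embeddings of the same shape independently.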