Using Survival Models to Estimate Long-Term Engagement in Online Experiments
Online controlled experiments, in which different variants of a product are compared based on an Overall Evaluation Criterion (OEC), have emerged as a gold standard for decision making in online services. For effective decision making, the OEC must be aligned with the overall goal of stakeholders. This is a challenge when the overall goal is not immediately observable. For instance, we might want to understand the effect of deploying a feature on long-term retention, where the outcome (retention) is not observable by the end of an A/B test. In this work, we frame long-term user engagement outcomes as a time-to-event problem and demonstrate the use of survival models for estimating long-term effects. We then discuss the practical challenges of using time-to-event metrics for decision making in online experiments. We propose a simple churn-based time-to-inactivity metric and describe a framework for developing and validating modeled metrics that use survival models to predict long-term retention. We then present a case study and provide practical guidelines on developing and evaluating a time-to-churn metric on a large-scale, real-world dataset of online experiments. Finally, we compare the proposed approach to existing alternatives in terms of sensitivity and directionality.
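To make the time-to-event framing concrete, the sketch below computes a Kaplan-Meier survival curve for time-to-churn. Users who are still active when the experiment ends are treated as right-censored. This is a generic textbook estimator chosen for illustration, not necessarily the model used in this work; the function name `kaplan_meier` and the toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct churn time.

    durations: days until churn, or until observation ended
    observed:  True if churn was seen, False if the user was still
               active when observation ended (right-censored)
    """
    churned = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # every user leaves the risk set at their duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = churned.get(t, 0)
        if d:
            # Probability of surviving past t, given survival up to t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Churned and censored users both drop out of the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data for a single variant (assumption, not from the paper).
durations = [3, 5, 5, 8, 14, 14]
observed = [True, True, False, True, False, False]
control = kaplan_meier(durations, observed)
```

Censored users still count in the risk set until their last observed day, which is what lets a short experiment say something about retention beyond its own duration.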