Learning Personalised Prices in Ad Auctions with Game Theory and Deep Learning

We introduce a new way to set personalized reservation prices in impression-based ad auctions. By combining game theory and deep neural networks, we infer advertisers’ hidden willingness to pay directly from their bidding data and use this information to optimize auction prices. Accurately estimating and acting on advertisers’ true valuations is crucial, as even small improvements in price setting can translate into substantial revenue gains and more efficient market outcomes. Tested on 100,000 real auctions, our approach increases expected auction revenue by an average of 4% across ten markets.
Why reservation prices matter
To sustain free access to large-scale digital content such as music, podcasts, and audiobooks, many streaming platforms rely on advertising as a core source of revenue. These ads are typically allocated through real-time auctions, where every time a free-tier user is served an ad, advertisers submit bids for exposure, and an algorithm determines which ad is shown, in what order, and at what price.
In these impression-based auctions, reservation prices (the minimum bids required for participation) play a critical role in balancing efficiency and profitability. If set too low, the platform leaves potential revenue unrealized; if set too high, too few ads are served, reducing the overall efficiency of the marketplace.
While the design of reservation prices has been widely studied in cost-per-click (CPC) systems such as sponsored search, the cost-per-impression (CPM) model used in streaming and other attention-based media poses distinct challenges. In this setting, users typically engage passively rather than interactively, and their intent is less explicit. At the same time, advertisers’ preferences, such as targeting specific demographics, contexts, or regions, can be highly complex and are often unobservable to the auctioneer.
This raises a fundamental question: how can we set optimal reservation prices when advertisers’ true valuations and targeting preferences are only indirectly observed through their bidding behavior?
A game inside of a learning problem
In this work, we start by deriving the symmetric Nash equilibrium for generalized second-price (GSP) auctions in the CPM setting. This equilibrium establishes a formal relationship between an advertiser’s true but unobservable value (i.e., how much they are willing to pay per impression) and their observable bid. It thus provides the game-theoretic foundation for our approach.
Building on this theoretical link, we frame the price-setting task as a learning problem. Given only observed bids and user-level features, can we infer the underlying value distributions of advertisers, and, once inferred, use them to recompute the reservation prices that maximize auction revenue in equilibrium?
To address this, we use a Mixture Density Network (MDN), a type of deep neural network that predicts the parameters of full value distributions rather than single point estimates. In our setting, the MDN captures advertisers’ heterogeneous preferences, for instance, differences between video and audio advertisers, or across user segments such as age or geography.
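To make the idea of predicting distribution parameters concrete, here is a minimal, dependency-free sketch of an MDN forward pass and a mixture likelihood. It is an illustration under assumptions, not the model from the paper: the one-hidden-layer architecture, the Gaussian components, and the names `init_params`, `mdn_forward`, and `mixture_nll` are all hypothetical, and the loss shown is a plain negative log-likelihood rather than an equilibrium-based loss.

```python
import math
import random


def softmax(zs):
    """Numerically stable softmax over a list of floats."""
    top = max(zs)
    exps = [math.exp(z - top) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """Affine layer; `weights` holds one weight vector per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weights, bias)]


def init_params(d_in, d_hidden, k, seed=0):
    """Small random weights for a one-hidden-layer MDN with k components."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        return [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]

    return {"W1": layer(d_in, d_hidden), "b1": [0.0] * d_hidden,
            "W2": layer(d_hidden, 3 * k), "b2": [0.0] * (3 * k)}


def mdn_forward(x, params, k):
    """Map a feature vector to mixture weights, means, and positive scales."""
    hidden = [math.tanh(a) for a in dense(x, params["W1"], params["b1"])]
    out = dense(hidden, params["W2"], params["b2"])
    weights = softmax(out[:k])                  # mixing proportions, sum to 1
    means = out[k:2 * k]                        # component locations
    scales = [math.exp(s) for s in out[2 * k:]]  # exp keeps scales positive
    return weights, means, scales


def mixture_nll(y, weights, means, scales):
    """Negative log-likelihood of y under a Gaussian mixture (log-sum-exp)."""
    logs = [math.log(w) - 0.5 * ((y - m) / s) ** 2
            - math.log(s) - 0.5 * math.log(2 * math.pi)
            for w, m, s in zip(weights, means, scales)]
    top = max(logs)
    return -(top + math.log(sum(math.exp(l - top) for l in logs)))
```

Calling `mdn_forward(features, init_params(4, 8, 3), 3)` on a four-dimensional feature vector returns three mixture weights, three means, and three scales. A different feature vector yields a different distribution, which is what allows the predicted value distribution to vary by user segment or ad format.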
Our main innovation is to embed the Nash equilibrium directly into the network’s loss function. This integration ensures that the model respects the strategic constraints of the auction while learning from real bidding behavior. In essence, the model learns the latent value distributions that would rationally generate the observed bids.
Optimising prices in equilibrium
Once the advertisers’ value distributions are learned, we simulate the auction under different reservation prices using a Monte Carlo approach. Since the interactions between advertisers are complex and hard to model analytically, Monte Carlo sampling offers a straightforward way to approximate expected revenues. This allows us to estimate the equilibrium revenue and use gradient-based optimization to find the reservation prices that maximize it.
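The Monte Carlo step can be illustrated with a deliberately simplified sketch. The assumptions here are ours, not the paper's: a single ad slot, bidders who bid their value (as in a standard second-price auction), log-normal values with hypothetical parameters `mu` and `sigma`, and a grid search standing in for gradient-based optimization. The function names `simulate_revenue` and `best_reserve` are illustrative.

```python
import random


def simulate_revenue(reserve, n_bidders, n_auctions, mu=0.0, sigma=0.5, seed=0):
    """Average revenue per auction in a single-slot second-price auction
    with a reserve price, estimated by Monte Carlo sampling."""
    rng = random.Random(seed)  # fixed seed: same draws for every reserve
    total = 0.0
    for _ in range(n_auctions):
        values = sorted(
            (rng.lognormvariate(mu, sigma) for _ in range(n_bidders)),
            reverse=True,
        )
        if values[0] < reserve:
            continue  # no bid clears the reserve, so the impression is unsold
        second = values[1] if n_bidders > 1 else 0.0
        total += max(reserve, second)  # winner pays the larger of the two
    return total / n_auctions


def best_reserve(candidates, **sim_kwargs):
    """Pick the candidate reserve with the highest simulated revenue."""
    return max(candidates, key=lambda r: simulate_revenue(r, **sim_kwargs))


if __name__ == "__main__":
    grid = [i / 10 for i in range(31)]
    r_star = best_reserve(grid, n_bidders=2, n_auctions=5000)
    print("best reserve on grid:", r_star)
```

Reusing the same seed for every candidate reserve means all candidates are evaluated on identical value draws, which reduces the noise in comparisons between them. In a personalized setting, the sampling distribution would be replaced by the one predicted for each user and advertiser type.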
Importantly, our framework supports personalization: reservation prices can be conditioned on both user features (such as age, location, or market) and advertiser characteristics. This enables fine-grained price differentiation, allowing the auction mechanism to adapt dynamically to the diversity of users and advertisers.
Results: Real-world data and +4% revenue lift
We evaluated our approach on 100,000 real ad auctions from Spotify, a major music streaming platform, covering ten international markets. Across all markets, our model consistently outperformed standard baselines, including:
Zero reserve prices (no minimum bid),
Myerson prices (theoretical uniform optimum),
Empirical revenue maximization, and
Sampling-based mechanisms.
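For reference, the Myerson baseline can be stated compactly. This is the standard textbook condition, not a formula taken from our paper: it assumes bidders' values are drawn independently from a common regular distribution with CDF $F$ and density $f$, and that an unsold impression is worth zero to the seller. The optimal uniform reserve $r^*$ then sets the virtual value to zero:

```latex
\[
  r^* - \frac{1 - F(r^*)}{f(r^*)} = 0 .
\]
% Example: for values uniform on [0, 1], F(r) = r and f(r) = 1,
% so r^* = 1 - r^*, which gives r^* = 1/2.
```

Under these assumptions the reserve does not depend on the number of bidders and is the same for every bidder.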
On average, personalized reservation prices led to a +4% increase in expected revenue, with improvements of up to +11.8% in some markets (see Figure 2).
Beyond the aggregate lift, our analysis revealed two key patterns:
Market competitiveness matters: the more bidders there are per auction, the smaller the relative gains from setting reservation prices, as strong competition already keeps bids high.
Preference diversity helps: the more heterogeneous advertisers’ preferences are (e.g., valuing different user segments or formats), the larger the benefit from personalization. In such markets, individualized pricing delivers the greatest uplift.
Figure 1. Left: Distribution of the number of bidders per auction. Right: Densities of the values of bidders of different types (ground truth and estimated with our model).
Figure 2. Revenue surfaces and optimal reservation prices (normalised) in two real markets for two user types (left, center). Right: uniform (homogeneous) case.
Why this matters
Ad auctions are the economic engines behind many digital ecosystems, from search and social media to streaming and retail. While the theoretical foundations of auction design are well established, applying them in today’s high-dimensional, data-rich environments presents a new challenge: it requires bridging economic theory with machine learning.
Our work demonstrates that this bridge is not only possible but powerful. By embedding the structure of equilibrium theory directly into a neural model, we combine the rigor of economics with the flexibility of deep learning. This integration allows the system to learn from real bidding behavior while maintaining interpretability and producing actionable, explainable pricing policies grounded in strategic reasoning.
Looking ahead
Personalized pricing mechanisms open new opportunities to make ad markets more efficient, adaptive, and responsive to both user and advertiser behavior. Promising directions for future work include:
Extending the approach to dynamic settings, where advertisers learn and adapt their strategies over time,
Incorporating uncertainty estimates to support more robust and risk-aware pricing decisions.
As digital advertising expands beyond clicks into richer, multi-format ecosystems such as audio, video, and interactive media, the ability to understand and learn from strategic behavior will be central to designing fair, efficient, and transparent marketplaces.
For more information, please check our paper:
Learning Optimal Personalised Reservation Prices in Impression Ad Auctions with Mixture Density Networks
Dmitrii Moor, Emma Zetterdahl, Paul van Vliet, Zhenwen Dai, and Mounia Lalmas
CIKM 2025, Seoul



