Exploration-Exploitation Problem in Recommender Systems: Leveraging Uncertainty for Better Recommendations

In the world of recommender systems, the exploration-exploitation problem is a constant challenge. How do we balance recommending items that we know perform well with exploring new, potentially better options? In a joint post with Inbar Naor, we explore how uncertainty can help us solve this problem.

At Taboola, our goal is to recommend items that users will find relevant. This relevance is measured through Click Through Rate (CTR) — the probability of a user clicking on a recommended item. However, we don’t have perfect information about the CTR of all items. This is where uncertainty comes into play.

We face a familiar scenario — like choosing an ice cream flavor out of many options. Do we stick with our favorite, known flavor, or do we explore new, unknown flavors in search of a better option? These strategies — exploitation and exploration — are essential in recommender systems as well.

One simple approach to tackling this challenge is the ϵ-greedy algorithm, where a percentage of traffic is allocated to exploring new items randomly. The rest is reserved for exploiting known, high-performing items. This serves as a baseline for more sophisticated methods.

One such method is the Upper Confidence Bound (UCB) algorithm, which uses uncertainty to determine which items to recommend. By balancing the expected CTR with the confidence bound, UCB guides us towards exploring new items with potential high performance. Thompson Sampling is another approach that incorporates the entire estimated distribution of an item’s CTR.

However, in a dynamic environment like Taboola where new items enter and leave the system daily, relying solely on empirical data may not be sufficient. We need a way to estimate the CTR of new items without showing them to users. This is where neural networks come into play, allowing us to use model estimations to make informed decisions about exploring new items with uncertainty.

To measure the success of our exploration efforts, we use an exploration throughput metric at Taboola. This metric helps us evaluate how well different models are exploring new items while maintaining good performance. We found that models using the UCB approach strike a good balance between exploring new items and recommending high-performing items.

The exploration-exploitation problem remains a fascinating challenge for many companies in the recommender systems domain. By leveraging uncertainty and advanced algorithms like UCB, we can continue to improve our recommendation systems and provide the best service to users. Stay tuned for our next post where we’ll dive deeper into the models used for estimating CTR and uncertainty.

Exclusive Content:

Haiper steps out of stealth mode, secures $13.8 million seed funding for video-generative AI

Running Your ML Notebook on Databricks: A Step-by-Step Guide

“Revealing Weak Infosec Practices that Open the Door for Cyber Criminals in Your Organization” • The Register

Discovering the Uncharted Territories through Embracing Uncertainty

Exploration-Exploitation Problem in Recommender Systems: Leveraging Uncertainty for Better Recommendations

Latest

Real-Time Voice Agents Using Stream Vision Agents and Amazon Nova 2 Sonic

Go.Compare Introduces Insurance App Powered by ChatGPT

Dstl-Backed Robotics Innovation Revolutionizes Military Manufacturing – A Case Study

Understanding Patient Sentiment in Atopic Dermatitis Management

Don't miss

Haiper steps out of stealth mode, secures $13.8 million seed funding for video-generative AI

Running Your ML Notebook on Databricks: A Step-by-Step Guide

“Revealing Weak Infosec Practices that Open the Door for Cyber Criminals in Your Organization” • The Register

Investing in digital infrastructure key to realizing generative AI’s potential for driving economic growth | articles

VOXI UK Launches First AI Chatbot to Support Customers

Real-Time Voice Agents Using Stream Vision Agents and Amazon Nova 2...

Create Financial Document Processing Solutions Using Pulse AI and Amazon Bedrock

Automating Schema Creation for Smart Document Processing

Popular categories

Most recent

Real-Time Voice Agents Using Stream Vision Agents and Amazon Nova 2 Sonic

Go.Compare Introduces Insurance App Powered by ChatGPT

Dstl-Backed Robotics Innovation Revolutionizes Military Manufacturing – A Case Study

Most popular

Haiper steps out of stealth mode, secures $13.8 million seed funding for video-generative AI

Running Your ML Notebook on Databricks: A Step-by-Step Guide

“Revealing Weak Infosec Practices that Open the Door for Cyber Criminals in Your Organization” • The Register

Subscribe