Exploration-Exploitation Problem in Recommender Systems: Leveraging Uncertainty for Better Recommendations

In the world of recommender systems, the exploration-exploitation problem is a constant challenge. How do we balance recommending items that we know perform well with exploring new, potentially better options? In a joint post with Inbar Naor, we explore how uncertainty can help us solve this problem.

At Taboola, our goal is to recommend items that users will find relevant. This relevance is measured through Click Through Rate (CTR) — the probability of a user clicking on a recommended item. However, we don’t have perfect information about the CTR of all items. This is where uncertainty comes into play.

We face a familiar scenario — like choosing an ice cream flavor out of many options. Do we stick with our favorite, known flavor, or do we explore new, unknown flavors in search of a better option? These strategies — exploitation and exploration — are essential in recommender systems as well.

One simple approach to tackling this challenge is the ϵ-greedy algorithm, where a percentage of traffic is allocated to exploring new items randomly. The rest is reserved for exploiting known, high-performing items. This serves as a baseline for more sophisticated methods.

One such method is the Upper Confidence Bound (UCB) algorithm, which uses uncertainty to determine which items to recommend. By balancing the expected CTR with the confidence bound, UCB guides us towards exploring new items with potential high performance. Thompson Sampling is another approach that incorporates the entire estimated distribution of an item’s CTR.

However, in a dynamic environment like Taboola where new items enter and leave the system daily, relying solely on empirical data may not be sufficient. We need a way to estimate the CTR of new items without showing them to users. This is where neural networks come into play, allowing us to use model estimations to make informed decisions about exploring new items with uncertainty.

To measure the success of our exploration efforts, we use an exploration throughput metric at Taboola. This metric helps us evaluate how well different models are exploring new items while maintaining good performance. We found that models using the UCB approach strike a good balance between exploring new items and recommending high-performing items.

The exploration-exploitation problem remains a fascinating challenge for many companies in the recommender systems domain. By leveraging uncertainty and advanced algorithms like UCB, we can continue to improve our recommendation systems and provide the best service to users. Stay tuned for our next post where we’ll dive deeper into the models used for estimating CTR and uncertainty.

Exclusive Content:

Haiper steps out of stealth mode, secures $13.8 million seed funding for video-generative AI

“Revealing Weak Infosec Practices that Open the Door for Cyber Criminals in Your Organization” • The Register

VOXI UK Launches First AI Chatbot to Support Customers

Discovering the Uncharted Territories through Embracing Uncertainty

Exploration-Exploitation Problem in Recommender Systems: Leveraging Uncertainty for Better Recommendations

Latest

Dashboard for Analyzing Medical Reports with Amazon Bedrock, LangChain, and Streamlit

Broadcom and OpenAI Collaborating on a Custom Chip for ChatGPT

Xborg Robotics Introduces Advanced Whole-Body Collaborative Industrial Solutions at the Hong Kong Electronics Fair (Autumn Edition)

How AI is Revolutionizing Data, Decision-Making, and Risk Management

Don't miss

Haiper steps out of stealth mode, secures $13.8 million seed funding for video-generative AI

“Revealing Weak Infosec Practices that Open the Door for Cyber Criminals in Your Organization” • The Register

VOXI UK Launches First AI Chatbot to Support Customers

Investing in digital infrastructure key to realizing generative AI’s potential for driving economic growth | articles

Microsoft launches new AI tool to assist finance teams with generative tasks

Dashboard for Analyzing Medical Reports with Amazon Bedrock, LangChain, and Streamlit

Tailoring Text Content Moderation Using Amazon Nova

Building a Secure MLOps Platform Using Terraform and GitHub

Popular categories

Most recent

Dashboard for Analyzing Medical Reports with Amazon Bedrock, LangChain, and Streamlit

Broadcom and OpenAI Collaborating on a Custom Chip for ChatGPT

Xborg Robotics Introduces Advanced Whole-Body Collaborative Industrial Solutions at the Hong Kong Electronics Fair (Autumn Edition)

Most popular

Haiper steps out of stealth mode, secures $13.8 million seed funding for video-generative AI

“Revealing Weak Infosec Practices that Open the Door for Cyber Criminals in Your Organization” • The Register

VOXI UK Launches First AI Chatbot to Support Customers

Subscribe