UCL School of Management

Research seminar

Ramesh Johari, Stanford

Date

Friday, 14 September 2018
15:00 – 16:30
Location
Description

UCL School of Management is delighted to welcome Ramesh Johari, Stanford, to present a seminar on “Bandit Learning with Positive Externalities”.

Abstract

On many platforms, user arrivals exhibit self-reinforcing behavior: future arrivals are likely to have preferences similar to those of users who were satisfied in the past. In other words, arrivals exhibit positive externalities. We study multi-armed bandit (MAB) problems with positive externalities. We show that self-reinforcing preferences may lead standard benchmark algorithms such as UCB to exhibit linear regret. We develop a new algorithm, Balanced Exploration (BE), which explores arms carefully to avoid suboptimal convergence of arrivals before sufficient evidence is gathered. We also introduce an adaptive variant of BE which successively eliminates suboptimal arms. We analyze the asymptotic regret of both algorithms, and establish their optimality by showing that no algorithm can perform better. Joint work with Virag Shah and Jose Blanchet.

Open to
PhD Programme
Staff
Cost
Free
Last updated Wednesday, 5 September 2018