TY - GEN
T1 - Parallel and synchronized UCB2 for online recommendation systems
AU - Rahman, Mahmuda
AU - Oh, Jae C.
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2016/2/2
Y1 - 2016/2/2
N2 - As users' preferences shift continuously, recommendation system has to learn quickly from them. It is an interesting online learning problem as recommender does not have any prior knowledge about the distribution of items over the users. In this work, we generate a small recommendation set from a large number of items, with an intention that at least one of recommended items would satisfy the user and thus minimize user abandonment. We used multiarmed bandit algorithm for this purpose and avail multiple instances of Upper Confidence Bound2 (UCB2). Although UCB2 is theoretically proved to have a better regret bound than UCB1, unlike UCB1, it has not been used for parallel execution. We designed an efficient algorithm which runs multiple instances of UCB2 in parallel. Our algorithm suitably handles parameter synchronization, reward update and exploration decisions across multiple instances of UCB2 and ensures that they are capable of covering different types of users. While applied to real data, our method shows comparable performance over a recommendation system that runs multiple instances of UCB1 in parallel. We compared our results with Ranked Bandit Algorithm and Independent Bandit Algorithm.
AB - As users' preferences shift continuously, recommendation system has to learn quickly from them. It is an interesting online learning problem as recommender does not have any prior knowledge about the distribution of items over the users. In this work, we generate a small recommendation set from a large number of items, with an intention that at least one of recommended items would satisfy the user and thus minimize user abandonment. We used multiarmed bandit algorithm for this purpose and avail multiple instances of Upper Confidence Bound2 (UCB2). Although UCB2 is theoretically proved to have a better regret bound than UCB1, unlike UCB1, it has not been used for parallel execution. We designed an efficient algorithm which runs multiple instances of UCB2 in parallel. Our algorithm suitably handles parameter synchronization, reward update and exploration decisions across multiple instances of UCB2 and ensures that they are capable of covering different types of users. While applied to real data, our method shows comparable performance over a recommendation system that runs multiple instances of UCB1 in parallel. We compared our results with Ranked Bandit Algorithm and Independent Bandit Algorithm.
UR - http://www.scopus.com/inward/record.url?scp=85028308657&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85028308657&partnerID=8YFLogxK
U2 - 10.1109/WI-IAT.2015.164
DO - 10.1109/WI-IAT.2015.164
M3 - Conference contribution
AN - SCOPUS:85028308657
T3 - Proceedings - 2015 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2015
SP - 413
EP - 416
BT - Proceedings - 2015 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2015 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology Workshops, WI-IAT Workshops 2015
Y2 - 6 December 2015 through 9 December 2015
ER -