## Abstract

We study a recommendation system problem, in which the system must be able to cover as many users’ preferences as possible while these preferences change over time. This problem can be formulated as a variation of the maximum coverage problem; specifically we introduced a novel problem of Online k-Hitting Set, where the number of sets and elements within the sets can change dynamically. When the number of distinctive elements is large, an exhaustive search for even a fixed number of elements is known to be computationally expensive. Even the static problem is known to be NP-hard (Hochba, ACM SIGACT News 28(2):40–52, 1997) and many known algorithms tend to have exponential growth in complexity. We propose a novel graph based UCB1 algorithm that effectively minimizes the number of elements to consider, thereby reducing the search space greatly. The algorithm utilizes a new rewarding scheme to choose items that satisfy more users by balancing coverage and diversity as it construct a relational graph between items to recommend. Experiments show that the new graph based algorithm performs better than existing techniques such as Ranked Bandit (Radlinski et al. 2008) and Independent Bandits (Kohli et al. 2013) in terms of satisfying diverse types of users while minimizing computational complexity.

Original language | English (US) |
---|---|

Pages (from-to) | 1-17 |

Number of pages | 17 |

Journal | Applied Intelligence |

DOIs | |

State | Accepted/In press - Jun 16 2017 |

## Keywords

- Directed graph
- Diversity
- Multi armed bandit
- Online learning
- Recommendation system
- Upper confidence bound

## ASJC Scopus subject areas

- Artificial Intelligence