Abstract
We characterize the asymptotic performance of nonparametric one- and two-sample testing. The exponential decay rate, or error exponent, of the type-II error probability is used as the asymptotic performance metric, and an optimal test achieves the maximum rate subject to a constant level constraint on the type-I error probability. Using Sanov's theorem, we derive a sufficient condition for one-sample tests to achieve the optimal error exponent in the universal setting, i.e., for any distribution defining the alternative hypothesis. We then show that two classes of Maximum Mean Discrepancy (MMD) based tests attain the optimal type-II error exponent on R^d, while the quadratic-time Kernel Stein Discrepancy (KSD) based tests achieve this optimality with an asymptotic level constraint. For general two-sample testing, however, Sanov's theorem is insufficient to obtain a similar sufficient condition. We proceed to establish an extended version of Sanov's theorem and derive an exact error exponent for the quadratic-time MMD based two-sample tests. The obtained error exponent is further shown to be optimal among all two-sample tests satisfying a given level constraint. Our work hence provides an achievability result for optimal nonparametric one- and two-sample testing in the universal setting. Applications to off-line change detection and related issues are also discussed.
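The abstract states its results at the level of error exponents rather than algorithms, but the quadratic-time MMD statistic it analyzes is straightforward to compute. Below is a minimal illustrative sketch (not code from the paper) of the biased quadratic-time MMD^2 estimate with a Gaussian kernel; the kernel choice, bandwidth, and toy data are assumptions for illustration only.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    diff = x[:, None, :] - y[None, :, :]          # pairwise differences, shape (n, m, d)
    sq_dists = np.sum(diff ** 2, axis=-1)         # squared Euclidean distances
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd2_biased(X, Y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples X and Y.

    MMD^2(P, Q) = E[k(x, x')] - 2 E[k(x, y)] + E[k(y, y')],
    estimated over all pairs, giving the quadratic-time statistic.
    """
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    return Kxx.mean() - 2.0 * Kxy.mean() + Kyy.mean()

# Toy usage: samples from two Gaussians with shifted means.
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
Y = rng.normal(loc=0.5, scale=1.0, size=(200, 3))
print(mmd2_biased(X, Y))  # larger values suggest P != Q
```

In an actual test, one would reject the null hypothesis P = Q when this statistic exceeds a threshold calibrated to the desired type-I error level, for instance via permutation of the pooled sample; such calibration is what the paper's level constraint formalizes.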
| Original language | English (US) |
| --- | --- |
| Article number | 9354188 |
| Pages (from-to) | 2074-2092 |
| Number of pages | 19 |
| Journal | IEEE Transactions on Information Theory |
| Volume | 67 |
| Issue number | 4 |
| DOIs | |
| State | Published - Apr 2021 |
Keywords
- Universal hypothesis testing
- error exponent
- kernel Stein discrepancy (KSD)
- large deviations
- maximum mean discrepancy (MMD)
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Library and Information Sciences