2.1 Hypothesis Testing and Bayesian Inference: New Applications of Kernel Methods (Arthur Gretton)



StatLearn 2012 - Workshop on "Challenging problems in Statistical Learning"


In the early days of kernel machines research, the "kernel trick" was considered a useful way of constructing nonlinear learning algorithms from linear ones, by applying the linear algorithms to feature space mappings of the original data. Recently, it has become clear that a potentially more far-reaching use of kernels is as a linear way of dealing with higher-order statistics, by mapping probabilities to a suitable reproducing kernel Hilbert space (i.e., the feature space is an RKHS). I will describe how probabilities can be mapped to reproducing kernel Hilbert spaces, and how to compute distances between these mappings. A measure of the strength of dependence between two random variables follows naturally from this distance.

Applications that make use of kernel probability embeddings include:

- Nonparametric two-sample testing and independence testing in complex (high-dimensional) domains. As an application, we determine whether English text is a translation from French, as opposed to being random extracts on the same topic.
- Bayesian inference, in which the prior and likelihood are represented as feature space mappings, and a posterior feature space mapping is obtained. In this case, Bayesian inference can be undertaken even in the absence of a model, by learning the prior and likelihood mappings from samples.
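
To make the distance between embedded probabilities concrete, here is a minimal sketch of a two-sample test based on it. The abstract does not specify a kernel, an estimator, or a way of calibrating the test, so everything below is an assumption for illustration: a Gaussian kernel with a hand-picked `bandwidth`, the biased empirical estimate of the squared embedding distance (often called the maximum mean discrepancy), and a permutation test for the p-value. The function names are hypothetical and this is not necessarily the procedure used in the talk.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of x and the rows of y.

    The kernel choice and the bandwidth are assumptions of this sketch.
    """
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of the squared RKHS distance between the
    empirical mean embeddings of samples x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


def permutation_test(x, y, bandwidth=1.0, n_permutations=200, seed=0):
    """Two-sample test: is the observed distance larger than expected
    when the pooled samples are randomly reassigned to two groups?"""
    rng = np.random.default_rng(seed)
    observed = mmd2_biased(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        stat = mmd2_biased(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
        exceed += int(stat >= observed)
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value
```

Calling `permutation_test(x, y)` on two arrays of shape `(n_samples, n_features)` returns the estimated squared distance and a p-value; a small p-value suggests the two samples come from different distributions. In practice the bandwidth matters a great deal, and a common heuristic (again an assumption here, not something stated in the abstract) is to set it to the median pairwise distance of the pooled sample.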