Modern technology generates massive amounts of data with complex patterns. Gaussian process models, which can readily fit nonlinear structure in data, have become increasingly popular. Often only a few features in a data set are important, or active. Unlike in classical linear models, however, identifying the active variables in a Gaussian process model is challenging. One of the most commonly used methods for variable selection in Gaussian process models is automatic relevance determination, which is known to be open-ended: there is no rule of thumb for choosing the threshold below which features are dropped, so the resulting variable selection is ambiguous. In this work, we propose two variable selection algorithms for Gaussian process models that use artificial nuisance columns as a baseline for identifying the active features. The proposed methods work for both regression and classification problems. The algorithms are demonstrated using comprehensive simulation experiments and an application to multi-subject electroencephalography data from a study of the alcoholic levels of experimental subjects.
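The idea of using nuisance columns as a baseline can be illustrated with a minimal sketch. This is not one of the two algorithms proposed in this work; it only shows one plausible way such a baseline could replace an arbitrary threshold in automatic relevance determination. The function name `select_with_nuisance`, the choice of independent standard-normal nuisance columns, the use of scikit-learn, and the rule "keep a feature if its relevance exceeds that of every nuisance column" are all assumptions made for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel


def select_with_nuisance(X, y, n_nuisance=5, seed=0):
    """Flag features whose ARD relevance exceeds every nuisance column's.

    Hypothetical helper for illustration; assumes the columns of X are
    roughly standardized so that length scales are comparable.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Assumed nuisance construction: independent standard-normal columns,
    # which are unrelated to y by design.
    Z = rng.standard_normal((n, n_nuisance))
    X_aug = np.hstack([X, Z])

    # Anisotropic RBF kernel: one length scale per column (ARD).
    kernel = (
        ConstantKernel(1.0) * RBF(length_scale=np.ones(p + n_nuisance))
        + WhiteKernel(noise_level=0.1)
    )
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=seed)
    gp.fit(X_aug, y)

    # Fitted kernel is (Constant * RBF) + White, so the RBF part is k1.k2.
    length_scales = np.asarray(gp.kernel_.k1.k2.length_scale)
    relevance = 1.0 / length_scales  # short length scale = high relevance

    # Baseline: the largest relevance achieved by any nuisance column.
    threshold = relevance[p:].max()
    return relevance[:p] > threshold, relevance


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((60, 4))
    # Only the first feature is active in this synthetic example.
    y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(60)
    active, relevance = select_with_nuisance(X, y, n_nuisance=3)
    print("selected features:", np.flatnonzero(active))
    print("relevance (real, then nuisance):", np.round(relevance, 3))
```

In this sketch the nuisance columns play the role that a hand-picked cutoff would otherwise play: a real feature is retained only if the fitted model treats it as more relevant than columns known to carry no signal. A single fit can be noisy, so a practical procedure would likely repeat the comparison over several nuisance draws.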