This paper introduces a general argumentation framework for regression in the errors-in-variables regime. The framework is fully flexible with respect to the dimensionality of the data, the type of error probability density, and the model type (linear or nonlinear), and it avoids the explicit definition of loss functions. Within this framework we further introduce model fitting for partially unpaired data, i.e. data in which, for given groups, the pairing information between input and output is lost (a semi-supervised setting). This is achieved by constructing mixture model densities that directly model the loss of pairing information and thereby allow inference. A numerical simulation study illustrates linear and nonlinear model fits, and a real-data study based on life expectancy data from the World Bank is presented using a multiple linear regression model. The results allow the conclusion that high-quality model fitting is possible with partially unpaired data, which opens the possibility of new applications in which pairing information is lost, whether accidentally or deliberately.
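To make the mixture-density idea concrete, the following is a minimal sketch and not the construction used in the paper. It assumes a scalar linear model y = a*x + b, Gaussian noise on the output only (input errors are ignored for brevity), and equal mixture weights within each group. The function name `unpaired_group_nll` and all parameter values are hypothetical and chosen for illustration only.

```python
import numpy as np

def unpaired_group_nll(theta, groups, sigma):
    """Negative log-likelihood of a linear model under lost pairing.

    Assumption (illustrative only): within each group (xs, ys) the pairing
    is unknown, so every y is modelled by an equal-weight Gaussian mixture
    centred at the model predictions for all xs of that group.
    """
    a, b = theta
    nll = 0.0
    for xs, ys in groups:
        pred = a * xs + b                        # predictions, shape (n,)
        resid = ys[:, None] - pred[None, :]      # all (y, x) combinations, shape (m, n)
        log_comp = -0.5 * (resid / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
        # Numerically stable log of the mean over mixture components.
        mx = log_comp.max(axis=1, keepdims=True)
        log_mix = mx[:, 0] + np.log(np.exp(log_comp - mx).mean(axis=1))
        nll -= log_mix.sum()
    return nll

# Synthetic example: 50 groups of 4 points, outputs shuffled within each group.
rng = np.random.default_rng(0)
true_theta = (2.0, 1.0)   # hypothetical slope and intercept
sigma = 0.1               # hypothetical output noise level
xs_all = rng.uniform(0.0, 10.0, size=(50, 4))
ys_all = true_theta[0] * xs_all + true_theta[1] + rng.normal(0.0, sigma, size=xs_all.shape)
groups = [(x, rng.permutation(y)) for x, y in zip(xs_all, ys_all)]

nll_true = unpaired_group_nll(true_theta, groups, sigma)
nll_wrong = unpaired_group_nll((-2.0, 1.0), groups, sigma)
```

Because the mixture sums over every input in a group, the likelihood does not depend on the order of the outputs within that group, which is what allows fitting without pairing information. In this sketch `nll_true` is smaller than `nll_wrong`, so a standard optimizer applied to `unpaired_group_nll` could be used to estimate the parameters.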