We propose a new iterative method that uses machine learning algorithms to fit an imprecise regression model to data consisting of intervals rather than point values. The method is based on a single-layer interval neural network that can be trained to produce an interval prediction. It seeks the parameters of the optimal model by minimizing the mean squared error between the actual and predicted interval values of the dependent variable, using first-order gradient-based optimization together with interval analysis computations to model the measurement imprecision of the data. The method captures the relationship between the explanatory variables and a dependent variable by fitting an imprecise regression model that is linear with respect to its unknown interval parameters, even if the regression model itself is nonlinear. We take the explanatory variables to be precise point values, whereas the measured dependent values are characterized by interval bounds without any probabilistic information. Thus, the imprecision is modeled non-probabilistically, while the scatter of the dependent values is modeled probabilistically by homoscedastic Gaussian distributions. The proposed iterative method estimates the lower and upper bounds of the expectation region, which is the envelope of all possible precise regression lines obtained by ordinary regression analysis from any configuration of real-valued points drawn from the respective intervals and their x-values.
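The core idea (precise inputs, interval-valued parameters, and a squared-error loss on both interval endpoints minimized by a first-order method) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `predict_interval` and `fit_interval_regression`, the learning rate, the epoch count, the projection step that keeps each lower parameter bound below its upper bound, and the feature matrix `Phi` are all assumptions made for illustration. Plain batch gradient descent stands in for whatever first-order optimizer the paper actually uses.

```python
import numpy as np


def predict_interval(w_lo, w_hi, Phi):
    """Interval-arithmetic product of point features with interval weights.

    For a point feature phi and an interval weight [w_lo, w_hi], the product
    is [phi*w_lo, phi*w_hi] when phi >= 0 and [phi*w_hi, phi*w_lo] otherwise.
    Splitting Phi into its positive and negative parts handles both cases.
    """
    Phi_pos = np.maximum(Phi, 0.0)
    Phi_neg = np.minimum(Phi, 0.0)
    lo = Phi_pos @ w_lo + Phi_neg @ w_hi
    hi = Phi_pos @ w_hi + Phi_neg @ w_lo
    return lo, hi


def fit_interval_regression(Phi, y_lo, y_hi, lr=0.1, epochs=5000):
    """Hypothetical trainer: batch gradient descent on the endpoint MSE.

    Loss = mean((lo_pred - y_lo)^2 + (hi_pred - y_hi)^2).
    """
    n, p = Phi.shape
    Phi_pos = np.maximum(Phi, 0.0)
    Phi_neg = np.minimum(Phi, 0.0)
    w_lo = np.zeros(p)
    w_hi = np.zeros(p)
    for _ in range(epochs):
        lo, hi = predict_interval(w_lo, w_hi, Phi)
        r_lo = lo - y_lo  # residual of the lower endpoint
        r_hi = hi - y_hi  # residual of the upper endpoint
        # Gradients of the loss with respect to each parameter bound.
        g_lo = (2.0 / n) * (Phi_pos.T @ r_lo + Phi_neg.T @ r_hi)
        g_hi = (2.0 / n) * (Phi_pos.T @ r_hi + Phi_neg.T @ r_lo)
        w_lo = w_lo - lr * g_lo
        w_hi = w_hi - lr * g_hi
        # Assumed projection step: where the bounds cross, collapse both
        # to their midpoint so that w_lo <= w_hi always holds.
        crossed = w_lo > w_hi
        mid = 0.5 * (w_lo + w_hi)
        w_lo = np.where(crossed, mid, w_lo)
        w_hi = np.where(crossed, mid, w_hi)
    return w_lo, w_hi


# Synthetic, noise-free illustration (made-up data, not from the paper):
# precise x-values with interval-valued responses [1 + 2x, 2 + 3x].
x = np.linspace(0.0, 1.0, 20)
Phi = np.column_stack([np.ones_like(x), x])  # features: intercept and x
y_lo = 1.0 + 2.0 * x
y_hi = 2.0 + 3.0 * x

w_lo, w_hi = fit_interval_regression(Phi, y_lo, y_hi)
print("lower parameter bounds:", w_lo)
print("upper parameter bounds:", w_hi)
```

Because the model is linear in the interval parameters, a nonlinear regression can be handled in the same way by placing nonlinear basis functions of x (for example, powers of x) in the columns of `Phi`. The sketch covers only the interval-fitting step and does not attempt to reproduce the paper's probabilistic treatment of scatter or its expectation-region estimate.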