Simultaneous variable selection and robust data fitting are important aspects of many mathematical modelling projects and a wide array of optimisation tools and techniques exist to support them. When the intention is to embed this capability in run-time interactive decision support tools running hundreds of such modelling tasks simultaneously on a GPU, the choices of implementation approach are more limited. Recently, simple and fast Coordinate Descent algorithms have been pro- posed which can implement the LASSO approach to variable selection in conjunction with ordinary least squares (OLS) data fitting. However extending this to use the more robust Least Absolute Deviation (LAD) data fitting has been hampered by the multiple axis wise local minima that occur in the objective function for this LAD-LASSO approach. This paper suggests that these multiple axis wise local minima form a locus which is monotonic in all the axes and that this locus has a convex objective function. Hence allowing the locus to be searched using a ternary chop algorithm that uses Coordinate Descent to identify multiple local minima (points on this locus) as required to find the global minimum. The resulting algorithm is very simple making it practical to implement it as a single thread on a GPU. This opens up the possibility of running many hundreds of such threads in parallel using coarse parallelisation [2]. These are early results in a wider project to explore the use of combi- natorial sub sets of data in interactive mathematical modelling support frameworks.
翻译:在许多数学建模项目中,同时进行变量选择与鲁棒数据拟合是至关重要的环节,已有大量优化工具与技术为此提供支持。然而,当需要将此类功能嵌入运行时交互式决策支持工具,并在GPU上同时运行数百个此类建模任务时,可选的实现方案则较为有限。近期提出的简易快速坐标下降算法,能够结合普通最小二乘数据拟合实现LASSO变量选择方法。但将其扩展至使用更鲁棒的最小绝对偏差数据拟合时,却因LAD-LASSO目标函数中存在的多轴局部极小值而受阻。本文指出,这些多轴局部极小值构成一个在所有轴上均单调的轨迹,且该轨迹具有凸目标函数。因此可通过三元分割算法搜索该轨迹,并利用坐标下降识别多个局部极小值(轨迹上的点),从而找到全局极小值。所得算法极为简洁,适合在GPU上以单线程形式实现。这为通过粗粒度并行化同时运行数百个此类并行线程提供了可能。本研究是探索在交互式数学建模支持框架中使用数据组合子集的更广泛项目中的初步成果。