The work in ICML'09 showed that the derivatives of the classical multi-class logistic regression loss function can be rewritten in terms of a pre-chosen "base class", and applied the new derivatives in the popular boosting framework. In order to make use of the new derivatives, one must have a strategy to identify/choose the base class at each boosting iteration. The idea of "adaptive base class boost" (ABC-Boost) in ICML'09 adopted a computationally expensive "exhaustive search" strategy for the base class at each iteration. It has been well demonstrated that ABC-Boost, when integrated with trees, can achieve substantial improvements in many multi-class classification tasks. Furthermore, the work in UAI'10 derived the explicit second-order tree split gain formula, which typically improved the classification accuracy considerably compared with using only the first-order information for tree splitting, for both multi-class and binary-class classification tasks. In this paper, we develop a unified framework for effectively selecting the base class by introducing a series of ideas to improve the computational efficiency of ABC-Boost. Our framework has parameters $(s,g,w)$. At each boosting iteration, we only search among the "$s$-worst classes" (instead of all classes) to determine the base class. We also allow a "gap" $g$ when conducting the search; that is, we only search for the base class at every $g+1$ iterations. We furthermore allow a "warm-up" stage by only starting the search after $w$ boosting iterations. The parameters $s$, $g$, $w$ can be viewed as tunable parameters, and certain combinations of $(s,g,w)$ may even lead to better test accuracy than the "exhaustive search" strategy. Overall, our proposed framework provides a robust and reliable scheme for implementing ABC-Boost in practice.
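The $(s,g,w)$ selection schedule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `select_base_class` and the callable `trial_loss` (which would evaluate the training loss obtained when a candidate class serves as the base class) are hypothetical stand-ins.

```python
def select_base_class(iteration, prev_base, losses, trial_loss, s, g, w):
    """Sketch of the (s, g, w) base-class search schedule.

    iteration  -- current boosting iteration (0-indexed)
    prev_base  -- base class chosen at the last search
    losses     -- per-class training loss contributions (list of floats)
    trial_loss -- hypothetical callable: loss when class k is the base class
    s, g, w    -- search only the s-worst classes, every g+1 iterations,
                  starting after a warm-up of w iterations
    """
    if iteration < w:
        # Warm-up stage: no search yet, keep the current base class.
        return prev_base
    if (iteration - w) % (g + 1) != 0:
        # Gap: reuse the previously chosen base class for g iterations.
        return prev_base
    # Restrict the search to the s classes with the largest loss
    # ("s-worst"), instead of the exhaustive search over all classes.
    candidates = sorted(range(len(losses)), key=lambda k: -losses[k])[:s]
    return min(candidates, key=trial_loss)
```

Setting $s$ to the number of classes, $g = 0$, and $w = 0$ recovers the original exhaustive-search ABC-Boost as a special case.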