CatBoost is a popular machine learning library. Its models are built from oblivious decision trees, which makes both training and evaluation fast. CatBoost has many applications, and some of them require low-latency, high-throughput evaluation. This paper investigates how far CatBoost's evaluation performance can be improved in single-core CPU computations. We explore the features provided by the AVX instruction sets to optimize evaluation. With AVX2 instructions we increase evaluation performance by 20-40% with no impact on model quality. We also introduce a new trade-off between speed and quality: storing leaf values as float16 and using AVX-512 instructions yields a 50-70% speed-up.
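To make the two ideas in the abstract concrete, here is a minimal scalar sketch in Python; it is not CatBoost's implementation. It assumes that every level of an oblivious tree shares one (feature, threshold) split, so each level contributes one bit to the leaf index, and that a feature value greater than the threshold sets that bit. The function names, the toy model, and the sample are hypothetical. The same sketch rounds leaf values to float16 to show the kind of small numerical error that the speed/quality trade-off accepts.

```python
import struct


def leaf_index(features, splits):
    """Return the leaf index of one oblivious tree for one object.

    splits holds one (feature_id, threshold) pair per tree level. All nodes
    on a level share the same split, so each level adds exactly one bit.
    The "greater than" convention is an assumption of this sketch.
    """
    index = 0
    for depth, (feature_id, threshold) in enumerate(splits):
        bit = 1 if features[feature_id] > threshold else 0
        index |= bit << depth
    return index


def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def predict(features, trees, quantize=False):
    """Sum the selected leaf value of every tree, optionally in float16."""
    total = 0.0
    for splits, leaf_values in trees:
        value = leaf_values[leaf_index(features, splits)]
        total += to_float16(value) if quantize else value
    return total


# Hypothetical toy model: two trees of depth 2, so 4 leaves each.
TREES = [
    ([(0, 0.5), (1, 1.5)], [0.1, 0.2, 0.3, 0.4]),
    ([(1, 0.0), (2, 2.0)], [-0.05, 0.15, 0.25, 0.35]),
]

sample = [0.7, 1.0, 3.0]
exact = predict(sample, TREES)
approx = predict(sample, TREES, quantize=True)
print(exact, approx, abs(exact - approx))
```

Because the leaf index is built from independent comparisons with no branching on earlier results, the per-level work maps naturally onto SIMD compare and mask operations, which is the property that vectorized evaluation relies on. The float16 variant changes the prediction only slightly in this toy case, but whether that error is acceptable depends on the model and the task.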