In machine learning, fewer features reduce model complexity. Carefully assessing the influence of each input feature on model quality is therefore a crucial preprocessing step. We propose a novel feature selection algorithm based on a quadratic unconstrained binary optimization (QUBO) problem, which allows a specified number of features to be selected based on their importance and redundancy. In contrast to iterative or greedy methods, our direct approach yields higher-quality solutions. QUBO problems are particularly interesting because they can be solved on quantum hardware. To evaluate our proposed algorithm, we conduct a series of numerical experiments using a classical computer, a gate-based quantum computer, and a quantum annealer. Our evaluation compares our method to a range of standard methods on various benchmark datasets. We observe competitive performance.
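To make the general idea concrete, the following is a minimal sketch of a QUBO-style feature selection problem, not the paper's exact construction: it assumes per-feature importance scores and a pairwise redundancy matrix are already available (e.g., from mutual information estimates), and the function names, the trade-off weight `alpha`, and the soft cardinality penalty `penalty` are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def build_qubo(importance, redundancy, k, alpha=0.5, penalty=10.0):
    """Assemble a QUBO matrix Q over binary selection variables x.

    Diagonal entries reward feature importance, off-diagonal entries
    penalize pairwise redundancy, and a quadratic penalty term
    penalty * (sum_i x_i - k)^2 softly enforces selecting exactly k features.
    """
    n = len(importance)
    Q = np.zeros((n, n))
    for i in range(n):
        Q[i, i] = -alpha * importance[i]              # reward important features
        for j in range(i + 1, n):
            Q[i, j] = (1 - alpha) * redundancy[i, j]  # penalize redundant pairs
    # Expand penalty * (sum_i x_i - k)^2 into linear and quadratic terms
    # (x_i^2 = x_i for binary variables; the constant k^2 is dropped).
    for i in range(n):
        Q[i, i] += penalty * (1 - 2 * k)
        for j in range(i + 1, n):
            Q[i, j] += 2 * penalty
    return Q

def brute_force_solve(Q, k):
    """Exhaustively score all k-subsets; feasible only for small n."""
    n = Q.shape[0]
    best, best_val = None, np.inf
    for subset in combinations(range(n), k):
        x = np.zeros(n)
        x[list(subset)] = 1
        val = x @ Q @ x
        if val < best_val:
            best, best_val = subset, val
    return best, best_val

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k = 8, 3
    importance = rng.random(n)                        # stand-in importance scores
    redundancy = rng.random((n, n))
    redundancy = (redundancy + redundancy.T) / 2      # symmetric stand-in redundancy
    Q = build_qubo(importance, redundancy, k)
    print(brute_force_solve(Q, k))
```

At this toy scale the brute-force search merely stands in for the classical, gate-based, or annealing solvers mentioned above; in practice only the assembled QUBO matrix would be handed to such hardware.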