Feature selection is of great importance in machine learning, where it can be used to reduce the dimensionality of classification, ranking and prediction problems. Removing redundant and noisy features can improve both the accuracy and the scalability of the trained models. However, feature selection is a computationally expensive task whose solution space grows combinatorially with the number of features. In this work, we consider a quadratic feature selection problem that can be tackled with the Quantum Approximate Optimization Algorithm (QAOA), which is already employed in combinatorial optimization. First, we formulate the feature selection problem as a Quadratic Unconstrained Binary Optimization (QUBO) problem, which is then mapped to an Ising spin Hamiltonian. Then we apply QAOA with the goal of finding the ground state of this Hamiltonian, which corresponds to the optimal selection of features. In our experiments, we consider seven different real-world datasets with dimensionality up to 21 and run QAOA on both a quantum simulator and, for small datasets, the 7-qubit IBM (ibm-perth) quantum computer. We use the set of selected features to train a classification model and evaluate its accuracy. Our analysis shows that it is possible to tackle the feature selection problem with QAOA and that currently available quantum devices can be used effectively. Future studies could test a wider range of classification models and improve the effectiveness of QAOA by exploring better-performing optimizers for its classical step.
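As an illustration of the QUBO-to-Ising mapping described above, the following minimal Python sketch converts a small QUBO matrix into Ising coefficients through the standard substitution x_i = (1 - z_i) / 2 and finds the ground state by brute force. The 3-feature matrix `Q` is a hypothetical toy instance (negative relevance on the diagonal, pairwise redundancy off the diagonal) and is not the formulation or data used in this work. The function names are illustrative, and the exhaustive search merely stands in for QAOA on an instance small enough to enumerate.

```python
import itertools
import numpy as np


def qubo_to_ising(Q):
    """Map min x^T Q x (x in {0,1}^n) to an Ising energy over z in {-1,+1}^n.

    Uses x_i = (1 - z_i) / 2. Returns (h, J, offset) such that
    x^T Q x = sum_i h_i z_i + sum_{i<j} J_ij z_i z_j + offset.
    """
    S = (Q + Q.T) / 2.0                      # symmetrize; x^T Q x is unchanged
    h = -S.sum(axis=1) / 2.0                 # linear (local field) terms
    J = np.triu(S, k=1) / 2.0                # couplings, stored for i < j only
    offset = (S.sum() + np.trace(S)) / 4.0   # constant energy shift
    return h, J, offset


def qubo_energy(Q, x):
    return float(x @ Q @ x)


def ising_energy(h, J, offset, z):
    return float(h @ z + z @ J @ z + offset)


# Hypothetical toy instance: features 0 and 1 are relevant but highly redundant.
Q = np.array([
    [-0.80, 0.90, 0.05],
    [0.90, -0.70, 0.10],
    [0.05, 0.10, -0.20],
])

h, J, offset = qubo_to_ising(Q)

# Exhaustive search over all 2^n selections (feasible only for tiny n).
best_x, best_energy = None, float("inf")
for bits in itertools.product([0, 1], repeat=Q.shape[0]):
    x = np.array(bits, dtype=float)
    z = 1.0 - 2.0 * x                        # x = 0 -> z = +1, x = 1 -> z = -1
    energy = ising_energy(h, J, offset, z)
    if energy < best_energy:
        best_x, best_energy = bits, energy

print("selected features (1 = keep):", best_x)
print("ground-state energy:", round(best_energy, 6))
```

In this toy instance the lowest-energy selection keeps features 0 and 2 and drops feature 1, because the redundancy penalty between features 0 and 1 outweighs the relevance gained by keeping both.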