In recent years, research on near-term quantum machine learning has explored how classical machine learning algorithms endowed with access to quantum kernels (similarity measures) can outperform their purely classical counterparts. Although theoretical work has shown provable advantage on synthetic data sets, no work to date has studied empirically whether quantum advantage is attainable and, if so, for what kinds of data sets. In this paper, we report the first systematic investigation of empirical quantum advantage (EQA) in healthcare and life sciences and propose an end-to-end framework to study EQA. We selected subsets of electronic health record (EHR) data and created a configuration space spanning 5 to 20 features and 200 to 300 training samples. For each configuration coordinate, we trained classical support vector machine (SVM) models based on radial basis function (RBF) kernels and quantum models with custom kernels using an IBM quantum computer, making this one of the largest quantum machine learning experiments to date. We empirically identified regimes where quantum kernels could provide an advantage on a particular data set, and we introduced a terrain ruggedness index, a metric that helps estimate quantitatively how the accuracy of a given model varies as a function of the number of features and the sample size. The generalizable framework introduced here represents a key step towards a priori identification of data sets where quantum advantage could exist.
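To make the idea of a ruggedness metric over the configuration space concrete, the following is a minimal sketch, not the paper's definition. The abstract does not state how the terrain ruggedness index is computed; this example assumes a common geospatial formulation, in which each grid cell is scored by the square root of the summed squared differences between it and its neighbors. The function name `terrain_ruggedness_index`, the grid layout (rows as feature counts, columns as training-set sizes), and the accuracy values are all hypothetical placeholders.

```python
import math


def terrain_ruggedness_index(grid):
    """Per-cell ruggedness of a 2-D grid given as a list of equal-length rows.

    Assumed formulation (not taken from the paper): for each cell, the square
    root of the summed squared differences between the cell and its existing
    neighbours (up to 8). Edge and corner cells use fewer neighbours.
    """
    rows, cols = len(grid), len(grid[0])
    tri = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the cell itself
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        total += (grid[nr][nc] - grid[r][c]) ** 2
            tri[r][c] = math.sqrt(total)
    return tri


# Hypothetical accuracy surface: rows index the number of features, columns
# index the training-set size. Illustrative values only, not paper results.
accuracy = [
    [0.70, 0.70, 0.70],
    [0.70, 0.80, 0.70],
    [0.70, 0.70, 0.70],
]
tri = terrain_ruggedness_index(accuracy)
```

Under this assumed formulation, a flat accuracy surface yields zero everywhere, while an isolated peak or dip produces a high value at that cell. A high value would flag configurations where accuracy is sensitive to small changes in feature count or sample size.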