Search engine queries have been shown to be a useful signal for screening people for several types of cancer. Past work either focused on a biased population, namely people who indicated that they were suffering from the condition, or inferred from queries which people had the condition of interest. Here we used a combination of an online advertising campaign and a clinically verified questionnaire to identify at-risk people, and correlated their past queries with their risk scores. People who suspected they were suffering from lung, breast, or colon cancer were recruited through ads shown on the Bing search engine to complete a clinically verified risk questionnaire. Of those, 201 people agreed to participate in the research and had past queries that could be obtained. An automated classifier predicting their risk score from past queries reached an area under the ROC curve (AUC) of 0.64 for all cancers, and 0.76 for colon cancer. These results demonstrate the utility of search engine queries for cancer screening and represent the first step in utilizing advertising systems to screen for cancer and detect it earlier than has previously been possible.
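To make the reported AUC figures easier to interpret, the following is a minimal sketch of how an area under the ROC curve can be computed from binary labels and classifier scores. It uses the rank-based interpretation of AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc`, the binarization of risk into 0/1 labels, and the toy values are assumptions made for illustration; they are not the study's data or its classifier.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; tied pairs count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy values (1 = high questionnaire risk, 0 = low risk).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values such as 0.64 and 0.76 indicate a ranking that is better than chance but far from perfect.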