Predicting Zip Code-Level Vaccine Hesitancy in US Metropolitan Areas Using Machine Learning Models on Public Tweets

Although the recent rise in uptake of COVID-19 vaccines in the United States has been encouraging, significant vaccine hesitancy persists in various geographic and demographic clusters of the adult population. Surveys, such as the one conducted by Gallup over the past year, can be useful in determining vaccine hesitancy, but they are expensive to conduct and do not provide real-time data. At the same time, the advent of social media suggests that it may be possible to obtain vaccine hesitancy signals at an aggregate level (such as the level of zip codes) by using machine learning models and socioeconomic (and other) features from publicly available sources. It is currently an open question whether such an endeavor is feasible, and how it compares to baselines that only use constant priors. To our knowledge, a proper methodology and evaluation results using real data have also not been presented. In this article, we present such a methodology and experimental study, using publicly available Twitter data collected over the last year. Our goal is not to devise novel machine learning algorithms, but to evaluate existing and established models in a comparative framework. We show that the best models significantly outperform constant priors, and can be set up using open-source tools.
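To make the comparison against a constant prior concrete, here is a minimal sketch of what such an evaluation could look like. It is not the authors' pipeline: the data is synthetic, the single feature stands in for whatever tweet-derived or socioeconomic features a real study would use, and the function names (`make_synthetic_zip_data`, `fit_constant_prior`, `fit_linear`) are hypothetical. A plain least-squares line is used only as a placeholder for an "established model".

```python
import random
from statistics import mean


def make_synthetic_zip_data(n=200, seed=0):
    """Return (feature, hesitancy) pairs for hypothetical zip codes.

    Assumption: the data is entirely synthetic and only illustrates the
    evaluation setup; it is not drawn from Twitter or any survey.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)                 # hypothetical normalized feature
        y = 0.1 + 0.3 * x + rng.gauss(0.0, 0.03)  # synthetic hesitancy rate
        rows.append((x, min(max(y, 0.0), 1.0)))   # clip to a valid proportion
    return rows


def fit_constant_prior(train):
    """Baseline: always predict the mean hesitancy seen in training."""
    prior = mean(y for _, y in train)
    return lambda x: prior


def fit_linear(train):
    """Placeholder model: univariate ordinary least squares."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in train)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_absolute_error(predict, rows):
    """Average absolute gap between predicted and observed hesitancy."""
    return mean(abs(predict(x) - y) for x, y in rows)


rows = make_synthetic_zip_data()
train, test = rows[:150], rows[150:]  # simple hold-out split

baseline_mae = mean_absolute_error(fit_constant_prior(train), test)
model_mae = mean_absolute_error(fit_linear(train), test)

print(f"constant prior MAE: {baseline_mae:.4f}")
print(f"linear model MAE:   {model_mae:.4f}")
```

Both predictors are scored on the same held-out zip codes, so any gap in error reflects what the features add beyond the prior. In a real study the hold-out split, the error metric, and the model family would all be choices to justify; the ones here are assumptions made for brevity.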

