Voter eligibility in United States elections is determined by a patchwork of state databases recording which citizens are eligible to vote. Administrators at the state and local level face the exceedingly difficult task of ensuring that each of their jurisdictions is properly managed while also monitoring for improper modifications to the database. Monitoring changes to Voter Registration Files (VRFs) is crucial, because a malicious actor wishing to disrupt the democratic process in the US could manipulate the contents of these files to achieve their goals. In 2020, election officials performed admirably when administering one of the most contentious elections in US history, but much work remains to secure and monitor the election systems Americans rely on. Using data created by comparing snapshots of VRFs taken over time, we present a set of machine learning methods that ease the burden on analysts and administrators in protecting voter rolls. We first evaluate the effectiveness of multiple unsupervised anomaly detection methods in detecting VRF modifications by modeling anomalous changes as sparse additive noise. In this setting, we determine that two approaches are most effective for surfacing anomalous events for review: statistical models that compare administrative districts within a short time span, and non-negative matrix factorization. These methods were deployed during 2019-2020 in our organization's monitoring system and were used in collaboration with the office of the Iowa Secretary of State. Additionally, we propose a newly deployed model that uses historical and demographic metadata to label the likely root cause of database modifications. We hope to use this model to predict which modifications have known causes and thereby better identify potentially anomalous modifications.
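To make the idea of modeling anomalous changes as sparse additive noise concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical matrix of change counts (rows are administrative districts, columns are snapshot dates), fits a low-rank non-negative factorization with the standard multiplicative updates, and treats large residuals as candidate anomalies. The synthetic data, the rank, the injected spike, and all function names are illustrative assumptions.

```python
import numpy as np


def nmf(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate X ~= W @ H with W, H >= 0 (multiplicative updates, Frobenius loss)."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def anomaly_scores(X, rank=2):
    """Score each cell by its residual from the low-rank fit, scaled robustly."""
    W, H = nmf(X, rank)
    residual = X - W @ H
    med = np.median(residual)
    # Median absolute deviation: a scale estimate that a few large spikes barely move.
    mad = np.median(np.abs(residual - med)) + 1e-9
    return np.abs(residual - med) / mad


def make_synthetic_counts(seed=1):
    """Hypothetical data: low-rank 'routine' activity plus one sparse spike."""
    rng = np.random.default_rng(seed)
    n_districts, n_snapshots = 30, 40
    W_true = rng.random((n_districts, 2)) * 10
    H_true = rng.random((2, n_snapshots)) * 10
    X = W_true @ H_true + rng.random((n_districts, n_snapshots)) * 0.5
    X[7, 25] += 200.0  # injected anomaly: a burst of changes in one district
    return X


X = make_synthetic_counts()
scores = anomaly_scores(X, rank=2)
# Location (district index, snapshot index) of the highest-scoring cell.
top = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
print("highest-scoring (district, snapshot):", top)
```

The low-rank part stands in for routine, correlated activity across districts, and the residual plays the role of the sparse noise term. In practice the highest-scoring cells would be queued for analyst review rather than flagged automatically.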