Bugs, misconfiguration, and malware can cause ballot-marking devices (BMDs) to print incorrect votes. Several approaches to testing BMDs have been proposed. In logic and accuracy testing (LAT) and parallel or live testing, auditors input known test patterns into the BMD and check whether the printout matches. Passive testing monitors the rate at which voters ``spoil'' BMD printout, on the theory that if BMDs malfunction, the rate will increase. We provide theoretical lower bounds that show that in practice, these approaches cannot reliably detect outcome-altering problems. The bounds are large because: (i) The number of possible voter interactions with BMDs is enormous, so testing interactions uniformly at random is hopeless. (ii) To probe the space of interactions intelligently requires an accurate model of voter behavior, but because the space of interactions is so large, building that model requires observing an enormous number of voters in every jurisdiction in every election -- more voters than there are in most U.S. jurisdictions. (iii) Even with a perfect model of voter behavior, the required number of tests exceeds the number of voters in most U.S. jurisdictions. (iv) The distribution of spoiled ballots, whether BMDs misbehave or not, is unknown and varies by election and presumably by ballot style: historical data are of limited use. Hence, there is no way to calibrate a threshold for passive testing, e.g., to guarantee at least a 95% chance of noticing that 5% of the votes were altered, with at most a 5% false alarm rate. (v) Even if the distribution of spoiled ballots were known to be Poisson, the vast majority of jurisdictions do not have enough voters for passive testing to have a large chance of detecting problems while maintaining a small chance of false alarms.
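Point (v) can be illustrated with a small calculation. The sketch below is not the paper's derivation: it assumes the count of spoiled ballots is Poisson with a known baseline rate, which is more than point (iv) allows. The baseline spoil rate `p0` and the fraction `notice` of affected voters who spoil an altered printout are hypothetical parameters, as are all function names. Under those assumptions it searches for the smallest number of voters at which a one-sided threshold on the spoil count gives at most a 5% false alarm rate and at least a 95% chance of detecting that 5% of votes were altered.

```python
import math


def poisson_pmf(i, lam):
    """P(X = i) for X ~ Poisson(lam), evaluated in log space to avoid underflow."""
    return math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))


def alarm_threshold(lam0, alpha):
    """Smallest t with P(X >= t) <= alpha when X ~ Poisson(lam0)."""
    t, cdf = 0, 0.0  # cdf holds P(X < t)
    while 1.0 - cdf > alpha:
        cdf += poisson_pmf(t, lam0)
        t += 1
    return t


def detection_power(t, lam1):
    """P(X >= t) when X ~ Poisson(lam1): the chance the alarm fires."""
    return 1.0 - sum(poisson_pmf(i, lam1) for i in range(t))


def min_voters(p0, altered, notice, alpha=0.05, power=0.95):
    """Smallest n at which the threshold test meets both error targets.

    Hypothetical model (an assumption, not from the source):
      no problem:  spoiled ballots ~ Poisson(n * p0)
      malfunction: spoiled ballots ~ Poisson(n * (p0 + altered * notice))
    """
    n = 1
    while True:
        t = alarm_threshold(n * p0, alpha)
        if detection_power(t, n * (p0 + altered * notice)) >= power:
            return n
        n += 1


if __name__ == "__main__":
    # Illustrative inputs only: 1% baseline spoilage, 5% of votes altered.
    for notice in (0.5, 0.25, 0.1):
        print(notice, min_voters(p0=0.01, altered=0.05, notice=notice))
```

Because the required number of voters grows rapidly as `notice` shrinks, even this optimistic setting, in which the baseline rate is known exactly, shows how passive testing can demand more voters than a small jurisdiction has.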