They may look and look, yet not see: BMDs cannot be tested adequately

Bugs, misconfiguration, and malware can cause ballot-marking devices (BMDs) to print incorrect votes. Several approaches to testing BMDs have been proposed. In logic and accuracy testing (LAT) and parallel or live testing, auditors input known test votes into the BMD and check the printout. Passive testing monitors the rate of "spoiled" BMD printout, on the theory that if BMDs malfunction, the rate will increase noticeably. We show that these approaches cannot reliably detect outcome-altering problems, because: (i) The number of possible interactions with BMDs is enormous, so testing interactions uniformly at random is hopeless. (ii) To probe the space of interactions intelligently requires an accurate model of voter behavior, but because the space of interactions is so large, building an accurate model requires observing a huge number of voters in every jurisdiction in every election, which is more voters than there are in most jurisdictions. (iii) Even with a perfect model of voter behavior, the number of tests needed exceeds the number of voters in most jurisdictions. (iv) An attacker can target interactions that are expensive to test, e.g., because they involve voting slowly; or interactions for which tampering is less likely to be noticed, e.g., because the voter uses the audio interface. (v) Whether BMDs misbehave or not, the distribution of spoiled ballots is unknown and varies by election and possibly by ballot style: historical data do not help much. Hence, there is no way to calibrate a threshold for passive testing, e.g., to guarantee at least a 95% chance of noticing that 5% of the votes were altered, with at most a 5% false alarm rate. (vi) Even if the distribution of spoiled ballots were known to be Poisson, the vast majority of jurisdictions do not have enough voters for passive testing to have a large chance of detecting problems but only a small chance of false alarms.
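
Point (vi) can be made concrete with a small calculation. The following is a minimal sketch, not the authors' analysis: it assumes the count of spoiled ballots is Poisson, and it uses purely illustrative values for the baseline spoilage rate (1%) and for the fraction of affected voters who notice the alteration and spoil their ballot (10%). Neither number comes from the abstract, and all function names are hypothetical. Under those assumptions, it finds the smallest alarm threshold that keeps the false alarm rate at or below 5%, then scans for a jurisdiction size at which that threshold detects a 5% alteration rate with at least 95% probability.

```python
import math


def poisson_pmf(i, lam):
    """P(X = i) for X ~ Poisson(lam), lam > 0, computed in log space."""
    return math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), lam > 0."""
    return max(0.0, 1.0 - sum(poisson_pmf(i, lam) for i in range(k)))


def smallest_alarm_threshold(lam0, max_false_alarm):
    """Smallest k such that P(Poisson(lam0) >= k) <= max_false_alarm."""
    k, cdf = 0, 0.0  # invariant: cdf == P(X <= k - 1)
    while 1.0 - cdf > max_false_alarm:
        cdf += poisson_pmf(k, lam0)
        k += 1
    return k


def detection_power(n_voters, base_spoil_rate, altered_fraction,
                    notice_rate, max_false_alarm):
    """Chance the alarm fires when altered_fraction of votes are changed.

    ASSUMPTION (illustrative model): each affected voter independently
    notices and spoils with probability notice_rate, on top of a known
    baseline spoilage rate.
    """
    lam0 = n_voters * base_spoil_rate
    lam1 = n_voters * (base_spoil_rate + altered_fraction * notice_rate)
    k = smallest_alarm_threshold(lam0, max_false_alarm)
    return poisson_tail(k, lam1)


def min_voters(base_spoil_rate, altered_fraction, notice_rate,
               max_false_alarm=0.05, min_power=0.95,
               step=100, limit=1_000_000):
    """First jurisdiction size on a coarse grid that reaches min_power."""
    for n in range(step, limit + 1, step):
        power = detection_power(n, base_spoil_rate, altered_fraction,
                                notice_rate, max_false_alarm)
        if power >= min_power:
            return n
    return None  # not reached within the scanned range


if __name__ == "__main__":
    # Illustrative inputs only: 1% baseline spoilage, 5% of votes altered,
    # 10% of affected voters notice and spoil their ballot.
    print(min_voters(0.01, 0.05, 0.10))
```

The sketch treats the baseline spoilage rate as known exactly, which point (v) says is unrealistic, so the jurisdiction size it reports is optimistic. Lowering the assumed notice rate or raising the baseline rate makes the required number of voters grow quickly.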
