Citizen science has become a popular tool for preliminary data processing tasks, such as identifying and counting lunar impact craters in modern high-resolution imagery. However, using such data requires that citizen science products be understandable and reliable. Contamination and missing data can reduce the usefulness of datasets, so it is important that these effects are quantified. This paper presents a method, based on a newly developed quantitative pattern recognition system (Linear Poisson Models), for estimating levels of contamination within MoonZoo citizen science crater data. The results show that it is possible to remove the effects of contamination, with reference to some agreed-upon ground truth, resulting in estimated crater counts that are highly repeatable. They also show, however, that correcting for missing data is currently more difficult to achieve. The techniques are tested on MoonZoo citizen science crater annotations from the Apollo 17 site, and on undergraduate and expert results from the same region.