Compiling large datasets from published resources, such as archaeological find catalogues presents fundamental challenges: identifying relevant content and manually recording it is a time-consuming, repetitive and error-prone task. For the data to be useful, it must be of comparable quality and adhere to the same recording standards, which is hardly ever the case in archaeology. Here, we present a new data collection method exploiting recent advances in Artificial Intelligence. Our software uses an object detection neural network combined with further classification networks to speed up, automate, and standardise data collection from legacy resources, such as archaeological drawings and photographs in large unsorted PDF files. The AI-assisted workflow detects common objects found in archaeological catalogues, such as graves, skeletons, ceramics, ornaments, stone tools and maps, and spatially relates and analyses these objects on the page to extract real-life attributes, such as the size and orientation of a grave based on the north arrow and the scale. A graphical interface allows for and assists with manual validation. We demonstrate the benefits of this approach by collecting a range of shapes and numerical attributes from richly-illustrated archaeological catalogues, and benchmark it in a real-world experiment with ten users. Moreover, we record geometric whole-outlines through contour detection, an alternative to landmark-based geometric morphometrics not achievable by hand.
翻译:暂无翻译