This paper presents ClapperText, a benchmark dataset for handwritten and printed text recognition in visually degraded and low-resource settings. The dataset is derived from 127 World War II-era archival video segments containing clapperboards that record structured production metadata such as date, location, and camera-operator identity. ClapperText includes 9,813 annotated frames and 94,573 word-level text instances, of which 67% are handwritten and 1,566 are partially occluded. Each instance includes a transcription, semantic category, text type, and occlusion status. Annotations are provided as rotated bounding boxes, represented as 4-point polygons, to support spatially precise OCR applications. Recognizing clapperboard text poses significant challenges, including motion blur, handwriting variation, exposure fluctuations, and cluttered backgrounds. These mirror broader challenges in historical document analysis, where structured content appears in degraded, non-standard forms. We provide both full-frame annotations and cropped word images to support downstream tasks. Using a consistent per-video evaluation protocol, we benchmark six representative recognition models and seven detection models under zero-shot and fine-tuned conditions. Despite the small training set (18 videos), fine-tuning leads to substantial performance gains, highlighting ClapperText's suitability for few-shot learning scenarios. The dataset offers a realistic and culturally grounded resource for advancing robust OCR and document understanding in low-resource archival contexts. The dataset and evaluation code are available at https://github.com/linty5/ClapperText.
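To illustrate what a per-video evaluation protocol could look like for the recognition task, the following is a minimal sketch, not the released evaluation code. It assumes that each word instance is scored by case-insensitive exact match and that per-video accuracies are averaged with equal weight per video; both choices, along with the function name `per_video_word_accuracy` and the tuple layout, are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def per_video_word_accuracy(instances):
    """Score recognition per video, then average across videos.

    instances: iterable of (video_id, ground_truth, prediction) tuples.
    The matching rule (case-insensitive exact match) is an assumption
    made for illustration only.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for video_id, truth, pred in instances:
        total[video_id] += 1
        if truth.strip().lower() == pred.strip().lower():
            correct[video_id] += 1

    # Accuracy within each video, so long videos do not dominate the score.
    per_video = {v: correct[v] / total[v] for v in total}

    # Unweighted mean over videos (assumed aggregation).
    macro = sum(per_video.values()) / len(per_video) if per_video else 0.0
    return per_video, macro


if __name__ == "__main__":
    sample = [
        ("video_01", "ROLL", "roll"),
        ("video_01", "12", "17"),
        ("video_02", "CAMERA", "CAMERA"),
    ]
    print(per_video_word_accuracy(sample))
```

Averaging per video rather than over all instances keeps videos with many frames from dominating the result, which is one plausible reason to evaluate at the video level.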