Pulmonary hemorrhage (P-Hem) occurs in multiple species and can have various causes. Cytology of bronchoalveolar lavage fluid (BALF), using a 5-tier scoring system that grades alveolar macrophages by their hemosiderin content, is considered the most sensitive diagnostic method. We introduce a novel, fully annotated multi-species P-Hem dataset consisting of 74 cytology whole slide images (WSIs) with equine, feline and human samples. To create this large, high-quality dataset, we developed an annotation pipeline that combines human expertise with deep learning and data visualisation techniques. We applied a deep learning-based object detection approach, trained on 17 expertly annotated equine WSIs, to the remaining 39 equine, 12 human and 7 feline WSIs. The resulting annotations were semi-automatically screened for errors on multiple types of specialised annotation maps and finally reviewed by a trained pathologist. Our dataset contains a total of 297,383 hemosiderophages classified into five grades. It is one of the largest publicly available WSI datasets with respect to the number of annotations, the scanned area and the number of species covered.