Skin cancer is the most common type of cancer. Patients with suspected skin cancer are usually assessed by physicians through unaided visual inspection. Dermoscopy has become a suitable tool to support physicians in their decision-making. However, clinicians need years of experience to correctly classify potentially malignant skin lesions. Research has therefore applied image processing and analysis tools to improve the treatment process. Performing image analysis and training a model on dermoscopic images conventionally requires the data to be centralized. However, centralizing data often does not comply with local data protection regulations, both because the data are sensitive and because data providers lose sovereignty over their data once they grant unrestricted access to it. Distributed Analytics (DA) approaches avoid the privacy-related challenges of data centralization by bringing the analysis to the data instead of vice versa. This paradigm shift enables data analyses (in our case, image analysis) while the data remain inside institutional borders, i.e., at their origin. In this documentation, we describe a straightforward use case that includes training a model for skin lesion classification on decentralized data.
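To make the idea of bringing the analysis to the data concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes a sequential scheme in which only model weights travel from one institution ("station") to the next, and it substitutes small synthetic feature vectors and a logistic-regression model for dermoscopic images and an image classifier. All names (`make_station`, `local_update`, `train_across_stations`, `predict`) are hypothetical.

```python
import math
import random


def make_station(seed, n=50):
    """Synthetic stand-in for one institution's private records (assumption)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label else -1.0
        features = [rng.gauss(center, 0.5), rng.gauss(center, 0.5)]
        records.append((features, label))
    return records


def local_update(weights, records, lr=0.1, epochs=20):
    """Logistic-regression SGD on one station's data; returns new weights only."""
    w = list(weights)  # w[0] is the bias, w[1:] are the feature weights
    for _ in range(epochs):
        for features, label in records:
            z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], features))
            pred = 1.0 / (1.0 + math.exp(-z))
            err = pred - label
            w[0] -= lr * err
            for i, xi in enumerate(features):
                w[i + 1] -= lr * err * xi
    return w


def train_across_stations(stations, n_features, rounds=3):
    """Send the weights from station to station; records never leave a station."""
    weights = [0.0] * (n_features + 1)
    for _ in range(rounds):
        for records in stations:
            weights = local_update(weights, records)
    return weights


def predict(weights, features):
    """Return 1 if the linear score is positive, otherwise 0."""
    z = weights[0] + sum(wi * xi for wi, xi in zip(weights[1:], features))
    return 1 if z > 0 else 0


# Three simulated institutions, each holding its own data.
stations = [make_station(seed) for seed in (1, 2, 3)]
weights = train_across_stations(stations, n_features=2)
```

In this sketch the only object that crosses institutional borders is the `weights` list; each station's `records` are read solely inside `local_update`. A real deployment would additionally need to consider what the transferred weights themselves may reveal about the local data.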