Infrastructure managers must maintain a high standard of service to ensure user satisfaction during daily operations. Surveillance cameras and drone inspections have advanced the automation of inspecting damaged features and assessing the degree of deterioration. Given pairs of raw images and damage class labels, we can train a supervised model to predict a predefined damage grade or displacement. However, damage representations do not always match the predefined damage grades: there may be finer clusters from an unseen damage space, or more complex clusters in the space where two damage grades overlap. Because damage representations are fundamentally complex, not all damage classes can be perfectly predefined. Our proposed MN-pair contrastive learning method makes it possible to explore the embedded damage representation beyond the predefined classes, including more detailed clusters. It maximizes the similarity of M-1 positive images to the anchor and simultaneously maximizes the dissimilarity of N-1 negative images, using weighted loss functions for both. It learns faster than the N-pair algorithm, which uses only one positive image. We propose a pipeline that learns the damage representation and applies density-based clustering in a 2-D reduced space to automate finer cluster discrimination. We also visualize explanations of the damage features using Grad-CAM for MN-pair damage metric learning. We demonstrate our method in three experimental studies (steel product defects, concrete cracks in decks and pavement, and sewer pipe defects), report its effectiveness, and discuss potential future work.
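The idea of pulling M-1 positives toward an anchor while pushing N-1 negatives away, with a weight on each term, can be illustrated with a minimal sketch. The abstract does not give the loss formula, so the form below (a weighted sum of two averaged softplus terms over inner-product similarities) is an assumption chosen for illustration, not the paper's definition. The names `mn_pair_loss`, `w_pos`, and `w_neg` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mn_pair_loss(anchor, positives, negatives, w_pos=1.0, w_neg=1.0):
    """Illustrative MN-pair style loss (assumed form, not the paper's).

    anchor    : embedding of the anchor image
    positives : M-1 embeddings from the same class as the anchor
    negatives : N-1 embeddings from other classes
    w_pos, w_neg : hypothetical weights on the two terms
    """
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    # Positive term shrinks as anchor-positive similarity grows.
    pos_term = sum(softplus(-dot(anchor, p)) for p in positives) / len(positives)
    # Negative term shrinks as anchor-negative similarity falls.
    neg_term = sum(softplus(dot(anchor, n)) for n in negatives) / len(negatives)
    return w_pos * pos_term + w_neg * neg_term


anchor = [1.0, 0.0]
near = [[1.0, 0.0], [0.9, 0.1]]   # embeddings close to the anchor
far = [[-1.0, 0.0], [0.0, 1.0]]   # embeddings away from the anchor

well_separated = mn_pair_loss(anchor, positives=near, negatives=far)
poorly_separated = mn_pair_loss(anchor, positives=far, negatives=near)
print(well_separated < poorly_separated)
```

In this sketch, using several positives per anchor gives each anchor more than one pull signal per step, which is one plausible reading of why more positives could speed up learning relative to a single-positive N-pair setup. The abstract itself does not state the mechanism.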