Classification models are a fundamental component of physical-asset management technologies such as structural health monitoring (SHM) systems and digital twins. Previous work introduced risk-based active learning, an online approach for the development of statistical classifiers that takes into account the decision-support context in which they are applied. Decision-making is considered by preferentially querying data labels according to the expected value of perfect information (EVPI). Although several benefits are gained by adopting a risk-based active learning approach, including improved decision-making performance, the algorithms suffer from sampling bias as a result of the guided querying process. This sampling bias ultimately manifests as a decline in decision-making performance during the later stages of active learning, which in turn corresponds to lost resource or utility. The current paper proposes two novel approaches to counteract the effects of sampling bias: semi-supervised learning, and discriminative classification models. These approaches are first visualised using a synthetic dataset and subsequently applied to an experimental case study, the Z24 Bridge dataset. The semi-supervised learning approach is shown to have variable performance, with robustness to sampling bias dependent on the suitability of the generative distributions selected for the model with respect to each dataset. In contrast, the discriminative classifiers are shown to have excellent robustness to the effects of sampling bias. Moreover, it was found that the number of inspections made during a monitoring campaign, and therefore resource expenditure, could be reduced with the careful selection of the statistical classifiers used within a decision-supporting monitoring system.
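The EVPI-guided querying mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard decision-theoretic definition of EVPI (the expected utility of acting once the true state is known, minus the best expected utility of acting on the current prediction). The function names, the two-state utility matrix, and the rule of querying when EVPI exceeds an inspection cost are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def evpi(class_probs, utility):
    """Expected value of perfect information for a single observation.

    class_probs: shape (K,), predicted probabilities over K health states.
    utility:     shape (A, K), utility[a, k] of action a when the state is k.
    """
    class_probs = np.asarray(class_probs, dtype=float)
    utility = np.asarray(utility, dtype=float)
    # Best expected utility when acting on the uncertain prediction.
    meu_current = np.max(utility @ class_probs)
    # Expected utility if the true state were revealed before acting.
    meu_perfect = np.sum(class_probs * np.max(utility, axis=0))
    return meu_perfect - meu_current


def should_query(class_probs, utility, inspection_cost):
    """Assumed query rule: inspect when EVPI exceeds the inspection cost."""
    return bool(evpi(class_probs, utility) > inspection_cost)


# Hypothetical utilities: columns are (undamaged, damaged) states.
utility = np.array([
    [0.0, -100.0],   # action 0: do nothing
    [-10.0, -10.0],  # action 1: repair
])

for probs in ([0.99, 0.01], [0.90, 0.10], [0.50, 0.50]):
    print(probs, evpi(probs, utility))
```

In this toy problem the EVPI is largest where the preferred action changes (near a 10% damage probability), not where the classes are equally likely, so labels would be requested mostly from that region. Concentrating queries in this way is one route by which a guided querying process can produce a biased training sample.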