Despite major advances in knowledge discovery and data mining techniques, the X-ray diffraction (XRD) analysis process has remained largely unchanged and still involves manual investigation, comparison, and verification. Because high-throughput XRD experiments produce a large volume of samples, it has become impossible for domain scientists to process them manually. Recently, they have started using standard clustering techniques to reduce the number of XRD pattern representations that require manual labeling and verification. However, these standard clustering techniques do not handle problem-specific aspects such as peak shifting, adjacent peaks, background noise, and mixed phases, and therefore produce incorrect composition-phase diagrams that complicate subsequent steps. Here, we combine data mining techniques with domain expertise to handle these issues. In this paper, we introduce an incremental phase mapping approach based on binary peak representations, using a new threshold-based fuzzy dissimilarity measure. The proposed approach first applies an incremental phase computation algorithm to a discrete binary peak representation of the XRD samples, followed by hierarchical clustering or manual merging of similar pure phases to obtain the final composition-phase diagram. We evaluate our method on the composition space of two ternary alloy systems, Co-Ni-Ta and Co-Ti-Ta. Our results are verified by domain scientists and closely resemble the manually computed ground-truth composition-phase diagrams. The proposed approach takes us closer to the goal of complete, end-to-end automated XRD analysis.
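To make the pipeline described above more concrete, the following is a minimal Python sketch of its first stage. It is an illustration under stated assumptions, not the method from the paper: the abstract does not define the dissimilarity measure or the incremental algorithm, so the function names (`binarize_peaks`, `fuzzy_dissimilarity`, `incremental_phases`), the parameters (`min_height`, `shift_tol`, `max_dissim`), and the matching rule (two peaks match if their positions differ by at most `shift_tol` bins) are all hypothetical choices.

```python
from typing import List, Sequence, Tuple


def binarize_peaks(intensities: Sequence[float], min_height: float) -> List[int]:
    """Return indices of local maxima whose intensity is at least min_height.

    Assumption: a 'binary peak representation' records only where peaks
    occur, not how tall they are.
    """
    peaks = []
    for i in range(1, len(intensities) - 1):
        here = intensities[i]
        if here >= min_height and here > intensities[i - 1] and here >= intensities[i + 1]:
            peaks.append(i)
    return peaks


def fuzzy_dissimilarity(peaks_a: Sequence[int], peaks_b: Sequence[int], shift_tol: int) -> float:
    """Fraction of peaks with no counterpart within shift_tol bins in the other pattern.

    Assumption: this tolerance-based matching stands in for the paper's
    threshold-based fuzzy measure, which the abstract does not specify.
    """
    def unmatched(src: Sequence[int], dst: Sequence[int]) -> int:
        return sum(1 for p in src if not any(abs(p - q) <= shift_tol for q in dst))

    total = len(peaks_a) + len(peaks_b)
    if total == 0:
        return 0.0
    return (unmatched(peaks_a, peaks_b) + unmatched(peaks_b, peaks_a)) / total


def incremental_phases(
    samples: Sequence[Sequence[int]], shift_tol: int, max_dissim: float
) -> Tuple[List[int], List[Sequence[int]]]:
    """Assign each sample to the first phase representative within max_dissim.

    A sample that matches no existing representative starts a new phase.
    Assumption: a simple one-pass scheme chosen for illustration only.
    """
    representatives: List[Sequence[int]] = []
    labels: List[int] = []
    for peaks in samples:
        for k, rep in enumerate(representatives):
            if fuzzy_dissimilarity(peaks, rep, shift_tol) <= max_dissim:
                labels.append(k)
                break
        else:
            representatives.append(peaks)
            labels.append(len(representatives) - 1)
    return labels, representatives
```

In this sketch, slightly shifted peaks still count as matches, which is one simple way to tolerate the peak shifting mentioned above. The resulting phase representatives could then be merged manually or passed to a hierarchical clustering routine, as the abstract describes for the final step.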