Small-angle X-ray scattering (SAXS) is widely used in materials science to examine nanostructures. Analyzing experimental SAXS data means mapping a rather simple data format onto a large number of candidate structural models. Although various scientific computing tools assist with model selection, the task still relies heavily on the SAXS analyst's experience, which the community recognizes as an efficiency bottleneck. To address this decision-making problem, we develop and evaluate SCAN (SCattering Ai aNalysis), an open-source, machine-learning-based tool that provides recommendations on model selection. SCAN exploits multiple machine learning algorithms and uses the models and a simulation tool implemented in the SasView package to generate a well-defined set of datasets. Our evaluation shows that SCAN delivers an overall accuracy of 95% to 97%. The XGBoost classifier was identified as the most accurate method, with a good balance between accuracy and training time. With eleven predefined structural models for common nanostructures and an easy drag-and-drop function for expanding the number and types of training models, SCAN can accelerate the SAXS data analysis workflow.
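The workflow summarized in the abstract (simulate labeled scattering curves from known structural models, then train a classifier that recommends a model for a new curve) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration and is not SCAN's code: the two analytic form factors (a homogeneous sphere and a Debye Gaussian chain) stand in for the eleven SasView models, the q range, size range, background, and noise level are arbitrary, and a 1-nearest-neighbor rule stands in for the XGBoost classifier that the evaluation identified as most accurate.

```python
import math
import random

# Scattering-vector grid in 1/Angstrom (illustrative range, not from the paper).
Q_GRID = [0.01 + 0.005 * i for i in range(60)]
BACKGROUND = 1e-4  # flat background level, assumed for illustration


def sphere_form_factor(q, radius):
    """Normalized form factor P(q) of a homogeneous sphere."""
    x = q * radius
    amplitude = 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3
    return amplitude ** 2


def gaussian_chain_form_factor(q, rg):
    """Debye form factor of a Gaussian chain with radius of gyration rg."""
    u = (q * rg) ** 2
    return 2.0 * (math.exp(-u) + u - 1.0) / u ** 2


# Hypothetical stand-ins for the structural models taken from SasView.
MODELS = {
    "sphere": sphere_form_factor,
    "gaussian_chain": gaussian_chain_form_factor,
}


def simulate_curve(model_name, size, rng, noise=0.02):
    """Return log10 intensities on Q_GRID with multiplicative Gaussian noise."""
    model = MODELS[model_name]
    curve = []
    for q in Q_GRID:
        intensity = (model(q, size) + BACKGROUND) * (1.0 + noise * rng.gauss(0.0, 1.0))
        curve.append(math.log10(max(intensity, 1e-12)))
    return curve


def make_dataset(n_per_model, rng):
    """Simulate labeled curves with sizes drawn uniformly from 20 to 60 Angstrom."""
    data = []
    for name in MODELS:
        for _ in range(n_per_model):
            size = rng.uniform(20.0, 60.0)
            data.append((simulate_curve(name, size, rng), name))
    return data


def recommend_model(curve, training_set):
    """Recommend the label of the closest training curve (1-nearest neighbor)."""
    def squared_distance(other):
        return sum((a - b) ** 2 for a, b in zip(curve, other))
    return min(training_set, key=lambda item: squared_distance(item[0]))[1]


def evaluate(seed=0, n_train=100, n_test=25):
    """Fraction of held-out simulated curves whose model is recommended correctly."""
    rng = random.Random(seed)
    train = make_dataset(n_train, rng)
    test = make_dataset(n_test, rng)
    hits = sum(recommend_model(curve, train) == label for curve, label in test)
    return hits / len(test)


if __name__ == "__main__":
    print(f"held-out accuracy on simulated curves: {evaluate():.2f}")
```

Working on log-intensities keeps the high-q region, where the two models decay at different rates, from being swamped by the low-q plateau. In a closer reproduction one would presumably replace the analytic functions with curves simulated through SasView and the nearest-neighbor rule with a gradient-boosted classifier; the accuracy this toy prints says nothing about the 95% to 97% figure reported for SCAN.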