In water quality management, identifying and interpreting relationships between features, such as tuples of location and weather variables, and water quality variables, such as bacteria levels, is key to gaining insight and to identifying areas where interventions should be made. Two needs follow: a search process that identifies the locations and types of phenomena influencing water quality, and an explanation of why quality is being affected and which factors are most relevant. This paper addresses both through a process that collects data for features representing a variety of variables over a spatial region, uses those features for training and inference, and analyses their performance using the trained model and Shapley values. Shapley values originated in cooperative game theory and can aid in the interpretation of machine learning results. Evaluations are performed using several machine learning algorithms and water quality data from the Dublin Grand Canal basin.
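To make the game-theoretic idea concrete, the following is a minimal sketch of an exact Shapley value computation for a two-feature toy case. It is not the paper's pipeline: the feature names (`rainfall`, `location`) and the scores in `TOY_SCORE` are hypothetical placeholders standing in for "model performance when only this subset of features is available". Exact enumeration is only practical for a handful of features, so real evaluations typically rely on approximations.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a set function `value` over `features`.

    `value` maps a frozenset of feature names to a number (for example,
    a model score obtained using only those features).
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        # Average the marginal contribution of f over all coalitions of the
        # other features, weighted by how often each coalition size occurs
        # across orderings of the features.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Hypothetical scores for each feature subset (assumed values, not results
# from the paper).
TOY_SCORE = {
    frozenset(): 0.0,
    frozenset({"rainfall"}): 0.4,
    frozenset({"location"}): 0.2,
    frozenset({"rainfall", "location"}): 0.7,
}


def toy_value(subset):
    return TOY_SCORE[frozenset(subset)]


phi = shapley_values(["rainfall", "location"], toy_value)
print(phi)
```

In this toy case the attributions sum to the difference between the score with all features and the score with none, which is the efficiency property that makes Shapley values convenient for apportioning a model's output among its inputs.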