In water quality management, identifying and interpreting relationships between features (such as location and weather variable tuples) and water quality variables (such as bacteria levels) is key to gaining insight and to identifying areas where interventions should be made. Two needs follow: a search process that identifies the locations and types of phenomena influencing water quality, and an explanation of how quality is being affected and which factors are most relevant. This paper addresses both issues. A process is developed for collecting data for features that represent a variety of variables over a spatial region; these features are used for model training and inference. An analysis of feature performance is undertaken using the models and Shapley values. Shapley values originated in cooperative game theory and can be used to aid the interpretation of machine learning results. Evaluations are performed using several machine learning algorithms and water quality data from the Dublin Grand Canal basin.
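To make the game-theoretic origin of Shapley values concrete, the following is a minimal sketch that computes them exactly for a toy cooperative game. It is not the paper's pipeline: the feature names (`rainfall`, `temperature`, `location`) and the payoff function `toy_value` are hypothetical, chosen only to illustrate how a coalition's payoff is divided among its members. In a machine learning setting, the payoff of a coalition would typically be derived from a model's output when only those features are known.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable player (feature) names.
    value:   function mapping a frozenset of players to a payoff.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p:
            # |S|! * (n - |S| - 1)! / n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical payoff: two additive effects plus one interaction."""
    score = 0.0
    if "rainfall" in coalition:
        score += 4.0
    if "temperature" in coalition:
        score += 1.0
    # Assumed interaction: location only matters together with rainfall.
    if "rainfall" in coalition and "location" in coalition:
        score += 2.0
    return score


features = ["rainfall", "temperature", "location"]
phi = shapley_values(features, toy_value)
for name in features:
    print(f"{name}: {phi[name]:.3f}")
```

In this toy game the interaction payoff is split equally between the two interacting features, and the values sum to the payoff of the full coalition (the efficiency property). Exact enumeration scales exponentially with the number of features, so practical tools rely on sampling or model-specific approximations.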