Large-scale comparative research into municipal governance is often prohibitively difficult due to a lack of high-quality data. However, recent advances in speech-to-text algorithms and natural language processing have made it easier to collect and analyze data about municipal governments. In this paper, we introduce an open-source platform, the Council Data Project (CDP), to curate novel datasets for research into municipal governance. The contribution of this work is two-fold: (1) we demonstrate that CDP, as an infrastructure, can be used to assemble reliable comparative data on municipal governance; and (2) we provide an exploratory analysis of three municipalities to show how CDP data can be used to gain insight into how municipal governments perform over time. We conclude by describing future directions for research on and with CDP, such as the development of machine learning models for speaker annotation, outline generation, and named entity recognition for improved linked data.