Drawing causal conclusions from observational real-world data is a highly desirable but challenging task. In this paper we present mixed-method analyses to investigate the causal influence of publication trends and behavior on the adoption, persistence, and retirement of certain research foci: methodologies, materials, and tasks that are of interest to the computational linguistics (CL) community. Our key findings highlight evidence of the transition to rapidly emerging methodologies in the research community (e.g., adoption of bidirectional LSTMs influencing the retirement of LSTMs), the persistent engagement with trending tasks and techniques (e.g., deep learning, embeddings, generative, and language models), the effect of scientist location outside the US (e.g., China) on the propensity to research languages beyond English, and the potential impact of funding for large-scale research programs. We anticipate that this work will provide useful insights about publication trends and behavior and raise awareness of the potential for causal inference in the computational linguistics community and the broader scientific community.