Many scientific fields -- including biology, health, education, and the social sciences -- use machine learning (ML) to help them analyze data at an unprecedented scale. However, ML researchers who develop advanced methods rarely provide detailed tutorials showing how to apply these methods. Existing tutorials are often costly to participants, presume extensive programming knowledge, and are not tailored to specific application fields. In an attempt to democratize ML methods, we organized a year-long, free, online tutorial series targeted at teaching advanced natural language processing (NLP) methods to computational social science (CSS) scholars. Two organizers worked with fifteen subject matter experts to develop one-hour presentations with hands-on Python code for a range of ML methods and use cases, from data pre-processing to analyzing temporal variation of language change. Although live participation was more limited than expected, a comparison of pre- and post-tutorial surveys showed an increase in participants' perceived knowledge of almost one point on a 7-point Likert scale. Furthermore, participants asked thoughtful questions during tutorials and engaged readily with tutorial content afterwards, as demonstrated by 10K~total views of posted tutorial recordings. In this report, we summarize our organizational efforts and distill five principles for democratizing ML+X tutorials. We hope future organizers improve upon these principles and continue to lower barriers to developing ML skills for researchers of all fields.
翻译:许多科学领域 -- -- 包括生物学、卫生、教育和社会科学 -- -- 利用机器学习(ML)来帮助他们以前所未有的规模分析数据。然而,开发先进方法的ML研究人员很少提供详细的辅导,说明如何应用这些方法。现有的辅导对参与者来说往往费用高昂,假设有广泛的编程知识,而且没有专门针对特定应用领域。为了使ML方法民主化,我们组织了一个为期一年的免费在线辅导系列,目的是向计算社会科学的学者教授教授先进的自然语言处理(NLP)方法。两个组织者与15个专题专家合作,为多种ML方法和使用案例制作了一小时的讲义手式Python代码,从数据预处理到分析语言变化的时间变异。尽管现场参与比预期的要少,但是对学前和学后调查的比较表明,学员们对7点为 " Lirrt " 级的高级自然语言处理(NLP)方法的认识几乎增加了一个百分点。此外,两个组织者在辅导期间询问了深思熟虑的问题,并随后与辅导内容进行了密切接触,这体现在10K-Gnatimputal-hilling Orimal roduction rodustrucredudududududududustrational rodudududududududududududucles,我们关于发展了5-hicredudududustrs