To extract essential information from complex data, computer scientists have been developing machine learning models that learn low-dimensional representation mode. From such advances in machine learning research, not only computer scientists but also social scientists have benefited and advanced their research because human behavior or social phenomena lies in complex data. To document this emerging trend, we survey the recent studies that apply word embedding techniques to human behavior mining, building a taxonomy to illustrate the methods and procedures used in the surveyed papers and highlight the recent emerging trends applying word embedding models to non-textual human behavior data. This survey conducts a simple experiment to warn that common similarity measurements used in the literature could yield different results even if they return consistent results at an aggregate level.
翻译:为了从复杂的数据中提取基本信息,计算机科学家一直在开发学习低维代表模式的机器学习模型,从机器学习研究的这些进步中,不仅计算机科学家,而且社会科学家,由于人类行为或社会现象存在于复杂的数据中,他们已经受益并推进了他们的研究。为了记录这一新出现的趋势,我们调查了最近应用文字将技术嵌入到人类行为采矿中的研究,建立了一个分类学,以说明被调查的论文中使用的方法和程序,并突出将文字嵌入非文字人类行为数据的最新趋势。这项调查进行了简单的实验,以警告文献中使用的共同相似性测量即使总的结果一致,也可能产生不同的结果。