Code-switching (CS) is a common linguistic phenomenon in which multilingual individuals alternate between languages within a single conversation. CS is complex: beyond the linguistic challenges it poses, its behaviour varies considerably across speakers. Because the factors giving rise to CS vary from one country to another and from one person to another, CS is a speaker-dependent behaviour, and the frequency with which the foreign language is embedded differs across speakers. While several researchers have looked into predicting CS behaviour from a linguistic point of view, research on predicting user CS behaviour from sociological and psychological perspectives is still lacking. We present an empirical user study in which we investigate the correlations between users' CS levels and their character traits. We conduct interviews with bilinguals and gather information on their profiles, including their demographics, personality traits, and travel experiences. We then use machine learning (ML) to predict users' CS levels from their profiles, and we identify the most influential factors in the modeling process. We experiment with both classification and regression tasks. Our results show that CS behaviour is affected by the relationship between speakers, by travel experiences, and by the Neuroticism and Extraversion personality traits.