Social media has become an essential part of the daily routines of children and adolescents, and enormous efforts have been made to ensure the psychological and emotional well-being of young users as well as their safety when interacting with various social media platforms. In this paper, we investigate the exposure of those users to inappropriate comments posted on YouTube videos targeting this demographic. We collected a large-scale dataset of approximately four million records and studied the presence of five age-inappropriate categories and the amount of exposure to each category. Using natural language processing and machine learning techniques, we constructed ensemble classifiers that achieved high accuracy in detecting inappropriate comments. Our results show that a large percentage of comments contain inappropriate content: we found 11% of the comments on children's videos to be toxic, highlighting the importance of monitoring comments, particularly on children's platforms.