State-of-the-art image and text classification models, such as Convolutional Neural Networks and Transformers, have long performed their respective unimodal classification tasks satisfactorily, with accuracy close to or exceeding human performance. However, images embedded with text, such as hateful memes, are hard to classify using unimodal reasoning when difficult examples, such as benign confounders, are incorporated into the data set. We attempt to generate additional labeled memes beyond the Hateful Memes data set from Facebook AI, building on the framework of a winning team from the Hateful Memes Challenge. To increase the number of labeled memes, we explore semi-supervised learning using pseudo-labels for newly introduced, unlabeled memes gathered from the Memotion Dataset 7K. We find that the semi-supervised learning task on unlabeled data requires human intervention and filtering, and that adding a limited amount of new data yields no additional classification performance.
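For concreteness, the following is a minimal sketch of the pseudo-labeling step described above, not the exact pipeline used in this work: it assumes a pretrained binary hateful/non-hateful classifier (`model`) that outputs one logit per meme and an iterator over unlabeled memes (`unlabeled_loader`); both names and the confidence threshold are illustrative.

```python
import torch

def pseudo_label(model, unlabeled_loader, threshold=0.9):
    """Keep only unlabeled memes that the classifier labels with high
    confidence; low-confidence memes are dropped and, as noted in the
    abstract, the retained ones may still need human filtering before
    being added to the labeled training set."""
    model.eval()
    kept_inputs, kept_labels = [], []
    with torch.no_grad():
        for inputs in unlabeled_loader:
            # Predicted probability that each meme is hateful.
            probs = torch.sigmoid(model(inputs)).squeeze(-1)
            # Confident if the model is strongly sure either way.
            confident = (probs >= threshold) | (probs <= 1.0 - threshold)
            kept_inputs.append(inputs[confident])
            kept_labels.append((probs[confident] >= 0.5).long())
    return torch.cat(kept_inputs), torch.cat(kept_labels)
```

The pseudo-labeled pairs returned here would then be mixed with the original labeled memes for further training; the threshold controls the trade-off between how many new examples are added and how noisy their labels are.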