Crowdsourcing has been used to collect data at scale in numerous fields. Triplet similarity comparison is a type of crowdsourcing task in which crowd workers are asked ``among three given objects, which two are more similar?'', a question that is relatively easy for humans to answer. However, the comparison may sometimes be based on multiple views, i.e., different independent attributes such as color and shape, and each view can lead to a different answer for the same three objects. Although an algorithm was proposed in prior work to produce multiview embeddings, it has at least two problems: (1) it cannot independently predict multiview embeddings for a new sample, and (2) different people may prefer different views. In this study, we propose an end-to-end inductive deep learning framework for multiview representation learning. The results show that our proposed method can obtain multiview embeddings of any object, in which each view corresponds to an independent attribute of the object. We collected two datasets from a crowdsourcing platform to experimentally compare the performance of our proposed approach with conventional baseline methods.
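To make the setting concrete, the sketch below illustrates one plausible way an inductive multiview embedding model could be trained from triplet answers; it is not the architecture used in this work, and the class name, layer sizes, and the assumption that each crowd answer is associated with a known view are all illustrative.

\begin{verbatim}
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiviewTripletEncoder(nn.Module):
    """Maps an input feature vector to V independent view-specific embeddings."""
    def __init__(self, in_dim, num_views, embed_dim):
        super().__init__()
        # One small head per view; each head embeds one attribute (e.g., color, shape).
        self.heads = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, embed_dim))
             for _ in range(num_views)]
        )

    def forward(self, x):
        # Output shape: (batch, num_views, embed_dim).
        return torch.stack([head(x) for head in self.heads], dim=1)

def triplet_loss_per_view(emb_a, emb_p, emb_n, view, margin=1.0):
    """Triplet margin loss computed in a single view's embedding space.

    A crowd answer "i and j are more similar than i and k" is treated as
    (anchor=i, positive=j, negative=k) in the view the answer refers to.
    """
    a, p, n = emb_a[:, view], emb_p[:, view], emb_n[:, view]
    d_pos = F.pairwise_distance(a, p)
    d_neg = F.pairwise_distance(a, n)
    return F.relu(d_pos - d_neg + margin).mean()
\end{verbatim}

Because the encoder is a parametric network rather than a per-object lookup table, it can produce view-specific embeddings for objects unseen during training, which is the inductive property the abstract refers to.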