A Deep Metric Learning Approach to Account Linking

We consider the task of automatically linking social media accounts that belong to the same author, based on the content and metadata of their corresponding document streams. We focus on learning an embedding that maps variable-sized samples of user activity (ranging from single posts to entire months of activity) to a vector space in which samples by the same author map to nearby points. The approach does not require human-annotated training data, which allows us to leverage large amounts of social media content. The proposed model outperforms several competitive baselines under a novel evaluation framework modeled after established recognition benchmarks in other domains. Our method achieves high linking accuracy even with small samples from accounts not seen at training time, a prerequisite for practical applications of the proposed linking framework.
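Once such an embedding exists, linking reduces to a nearest-neighbor search in the vector space. The sketch below is illustrative only and is not the authors' model: it assumes the embeddings have already been computed by some encoder, and the function name `link_accounts`, the use of cosine similarity, and the toy vectors are all assumptions made for this example.

```python
import numpy as np

def link_accounts(query_emb, gallery_emb):
    """For each query embedding, return the index of the most
    cosine-similar gallery embedding, plus the full similarity matrix.
    (Hypothetical helper; the embeddings are assumed to be given.)"""
    # Normalize rows to unit length so the dot product is cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    sims = q @ g.T
    return sims.argmax(axis=1), sims

# Toy 2-D embeddings: three known accounts and two unlabeled activity samples.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
queries = np.array([[0.9, 0.1], [-0.8, 0.2]])

best, sims = link_accounts(queries, gallery)
print(best)  # index of the closest known account for each query sample
```

In this toy setup the first query lands nearest account 0 and the second nearest account 2. A practical system would likely also apply a similarity threshold so that a sample can be left unlinked.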

Metric learning

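The goal of metric learning is to measure how similar samples are, which is also one of the core problems in pattern recognition. The performance of many machine learning methods depends largely on the choice of similarity measure between samples. These include classifiers such as k-nearest neighbors, support vector machines, and radial basis function networks, clustering methods such as K-means, and some graph-based methods. Metric learning typically aims to make distances between samples of the same class as small as possible and distances between samples of different classes as large as possible.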
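One common way to express this objective is a triplet loss, sketched below. The text above does not say which loss the paper uses, so this is a stand-in for illustration: the function name `triplet_loss`, the Euclidean distance, and the margin value are assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Mean hinge loss over triplets: penalize cases where the same-class
    (positive) sample is not closer to the anchor than the different-class
    (negative) sample by at least `margin`. Illustrative sketch only."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)  # same-class distance
    d_neg = np.linalg.norm(anchor - negative, axis=1)  # different-class distance
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

anchor = np.array([[0.0, 0.0]])
near = np.array([[0.0, 1.0]])   # distance 1 from the anchor
far = np.array([[3.0, 4.0]])    # distance 5 from the anchor

good = triplet_loss(anchor, near, far)  # positive is already much closer
bad = triplet_loss(anchor, far, near)   # positive is farther than negative
print(good, bad)
```

The loss is zero when same-class samples are already closer than different-class ones by the margin, and it grows as that ordering is violated, which is what pushes a learned embedding toward the behavior described above.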
