We explore the understudied area of social payments to evaluate whether the gender and political affiliation of Venmo users can be predicted from the content of their Venmo transactions. Latent attribute detection has been applied successfully to social media data. However, little prior work uses data from sources other than Twitter, and mobile payment platforms such as Venmo remain understudied because data access is limited. We hypothesize that, using methods similar to those of latent attribute analysis on Twitter data, machine learning algorithms can predict the gender and political affiliation of Venmo users with a moderate degree of accuracy. Through the paid Prolific service, we collected crowdsourced training data that links participants' political views to their public Venmo transaction history. Additionally, we collected 21 million public Venmo transactions from recently active users to use for gender classification. We then ran the collected data through a TF-IDF vectorizer and used the resulting features to train a support vector machine (SVM). After hyperparameter tuning and additional feature engineering, we were able to predict users' gender with a high level of accuracy (0.91) and had modest success predicting users' political orientation (0.63).
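The TF-IDF and SVM steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name a library, kernel, or search grid, so the use of scikit-learn, a linear SVM, the grid over `C`, and the function name `train_memo_classifier` are all assumptions. The memos and the placeholder labels "A" and "B" are invented solely to make the sketch runnable.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Invented transaction memos with placeholder class labels "A" and "B".
# These are NOT data from the study; they only make the sketch runnable.
MEMOS = [
    "pizza night", "pizza and beer", "beer run", "pizza beer tab",
    "rent and utilities", "utilities bill", "rent for march",
    "rent utilities split",
]
LABELS = ["A", "A", "A", "A", "B", "B", "B", "B"]


def train_memo_classifier(memos, labels):
    """Fit a TF-IDF + linear SVM pipeline, tuning C by cross-validation.

    Hypothetical helper: the kernel choice and the grid are assumptions.
    """
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True)),  # text -> TF-IDF vectors
        ("svm", LinearSVC()),                        # linear-kernel SVM
    ])
    # Small grid over the regularization strength C, scored by 2-fold CV.
    search = GridSearchCV(pipeline, {"svm__C": [0.1, 1.0, 10.0]}, cv=2)
    search.fit(memos, labels)
    # GridSearchCV refits the best configuration on all of the data.
    return search.best_estimator_


model = train_memo_classifier(MEMOS, LABELS)
print(model.predict(["pizza beer", "rent utilities"]))
```

Wrapping both steps in a single `Pipeline` means the vectorizer is refit inside each cross-validation fold, so vocabulary statistics from held-out memos do not leak into training. On real data, each user's memos would presumably be aggregated into one document per user before vectorizing, although the abstract does not specify this.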