With the growing volume of information generated on the web, most web service providers try to personalize their services. Users interact with web-based systems in many ways and express their interests and preferences by rating the items offered. This paper proposes a framework for predicting users' demographics from the ratings they register in a system. To the best of our knowledge, this is the first time that item ratings have been used for demographic prediction, a problem that has been studied extensively in recommendation systems and service personalization. We apply the framework to the ratings in the MovieLens dataset and predict users' age and gender. The experimental results show that using all ratings registered by users improves prediction accuracy by at least 16% compared with previously studied models. Moreover, by classifying items as popular or unpopular, we can discard the ratings that belong to 95% of the items and still reach an acceptable level of accuracy. This significantly reduces update costs in a time-varying environment. Beyond this classification, we propose other methods that reduce data volume while keeping the predictions accurate.
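As an illustration of the popular/unpopular item split described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that popularity is measured by the number of ratings an item has received, which the abstract does not specify, and the function names `keep_popular_ratings` and `user_feature_vectors` are hypothetical. It keeps only the ratings of the most-rated 5% of items and turns them into per-user feature vectors.

```python
from collections import Counter


def keep_popular_ratings(ratings, keep_fraction=0.05):
    """Keep only the ratings whose item is among the most-rated items.

    ratings: iterable of (user_id, item_id, rating) tuples.
    keep_fraction: share of distinct items to keep (0.05 keeps the top 5%).
    Assumption: popularity is the rating count of an item.
    """
    ratings = list(ratings)
    counts = Counter(item for _, item, _ in ratings)
    # Always keep at least one item so tiny datasets do not become empty.
    n_keep = max(1, int(len(counts) * keep_fraction))
    popular = {item for item, _ in counts.most_common(n_keep)}
    return [r for r in ratings if r[1] in popular]


def user_feature_vectors(ratings):
    """Build one sparse {item_id: rating} dictionary per user."""
    features = {}
    for user, item, rating in ratings:
        features.setdefault(user, {})[item] = rating
    return features


# Toy data: item "m1" has three ratings, the 19 other items have one each.
toy = [("u1", "m1", 5), ("u2", "m1", 4), ("u3", "m1", 3)]
toy += [("u1", f"x{i}", 1) for i in range(19)]

kept = keep_popular_ratings(toy, keep_fraction=0.05)
vectors = user_feature_vectors(kept)
```

The resulting per-user vectors could then be passed to any classifier to predict age group or gender. The choice of classifier is left open here because the abstract does not name one.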