Many internet platforms that collect behavioral big data use it to predict user behavior for internal purposes and for their business customers (e.g., advertisers, insurers, security forces, governments, political consulting firms) who utilize the predictions for personalization, targeting, and other decision-making. Improving predictive accuracy is therefore extremely valuable. Data science researchers design algorithms, models, and approaches to improve prediction. Prediction is also improved with larger and richer data. Beyond improving algorithms and data, platforms can stealthily achieve better prediction accuracy by "pushing" users' behaviors towards their predicted values, using behavior modification techniques, thereby demonstrating more certain predictions. Such apparently "improved" prediction can also result unintentionally from employing reinforcement learning algorithms that combine prediction and behavior modification. This strategy is absent from the machine learning and statistics literature. Investigating its properties requires integrating causal with predictive notation. To this end, we incorporate Pearl's causal do(.) operator into the predictive vocabulary. We then decompose the expected prediction error given behavior modification, and identify the components impacting predictive power. Our derivation elucidates the implications of such behavior modification for data scientists, platforms, their customers, and the humans whose behavior is manipulated. Behavior modification can make users' behavior more predictable and even more homogeneous; yet this apparent predictability might not generalize when customers use predictions in practice. Outcomes pushed towards their predictions can be at odds with customers' intentions, and harmful to manipulated users.
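The mechanism described above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not in the original: a hypothetical linear prediction model, Gaussian noise, and a hypothetical `push_strength` parameter that moves each user's natural behavior a fixed fraction of the way towards its predicted value (a stand-in for the outcome under a do(.) intervention). It is not the paper's derivation; it only shows that the measured prediction error shrinks when behavior is pushed, even though the model itself is unchanged.

```python
import random


def simulate(n_users=10_000, push_strength=0.5, seed=0):
    """Compare mean squared prediction error with and without 'pushing'.

    All modelling choices here are illustrative assumptions:
    - prediction = 2 * x is a hypothetical platform model,
    - natural behavior = 2 * x + Gaussian noise,
    - pushed behavior moves a fraction `push_strength` of the way
      from the natural behavior towards the prediction.
    Returns (mse_natural, mse_pushed).
    """
    rng = random.Random(seed)
    sq_err_natural = 0.0
    sq_err_pushed = 0.0
    for _ in range(n_users):
        x = rng.gauss(0.0, 1.0)                 # user feature
        prediction = 2.0 * x                    # hypothetical model output
        natural = 2.0 * x + rng.gauss(0.0, 1.0)  # behavior without intervention
        # Behavior after modification: shifted towards the predicted value.
        pushed = natural + push_strength * (prediction - natural)
        sq_err_natural += (natural - prediction) ** 2
        sq_err_pushed += (pushed - prediction) ** 2
    return sq_err_natural / n_users, sq_err_pushed / n_users


if __name__ == "__main__":
    mse_natural, mse_pushed = simulate()
    print(f"MSE on natural behavior: {mse_natural:.3f}")
    print(f"MSE on pushed behavior:  {mse_pushed:.3f}")
```

In this toy setup the residual of each pushed outcome is (1 - `push_strength`) times the natural residual, so the measured error falls by a factor of (1 - `push_strength`)^2 without any improvement to the model. A customer who applies the same predictions to unmodified behavior would face the larger, natural error.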