A tool designed to improve the reading experience on the web. readability.com/

Latest Content

Terms and conditions (T&Cs) are pervasive on the web and often contain important information for consumers, but are rarely read. Previous research has explored methods to surface alarming privacy policies using manual labelers, natural language processing, and deep learning techniques. However, this prior work used pre-determined categories for annotations and did not investigate what consumers themselves deem important. In this paper, we instead combine crowdsourcing with an open definition of "what is important" in T&Cs. We present a workflow consisting of pairwise comparisons, agreement validation, and Bradley-Terry rank modeling to effectively establish rankings of T&C statements from non-expert crowdworkers under this open definition, and to analyze consumers' preferences. We applied this workflow to 1,551 T&C statements from 27 e-commerce websites, contributed by 3,462 unique crowdworkers performing 203,068 pairwise comparisons, and conducted thematic and readability analysis on the statements considered important or unimportant. We found that consumers especially cared about policies related to after-sales service and money, and tended to regard harder-to-understand statements as more important. We also present machine learning models to identify T&C clauses that consumers considered important, achieving at best 92.7% balanced accuracy, 91.6% recall, and 89.2% precision. We foresee using our workflow and model to efficiently and reliably highlight important T&Cs on websites at a large scale, improving consumers' awareness.
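The Bradley-Terry model mentioned in the abstract turns raw pairwise "which statement is more important?" votes into a global ranking by estimating a latent strength for each item. As a rough illustration (not the authors' implementation), the sketch below uses Hunter's minorization-maximization update, p_i ← W_i / Σ_j n_ij / (p_i + p_j), where W_i is item i's win count and n_ij is the number of comparisons between i and j; the function name and data layout are hypothetical.

```python
from collections import defaultdict

def bradley_terry(comparisons, n_iter=100):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Returns a dict mapping each item to a strength score
    normalized to sum to 1; higher means ranked more important.
    """
    wins = defaultdict(int)   # W_i: total wins per item
    counts = defaultdict(int) # n_ij: comparisons per unordered pair
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        counts[(min(winner, loser), max(winner, loser))] += 1
        items.update((winner, loser))

    p = {i: 1.0 for i in items}  # uniform initial strengths
    for _ in range(n_iter):
        new_p = {}
        for i in items:
            # Denominator of the MM update: sum over pairs involving i
            denom = sum(c / (p[a] + p[b])
                        for (a, b), c in counts.items() if i in (a, b))
            new_p[i] = wins[i] / denom if denom > 0 else p[i]
        total = sum(new_p.values())
        p = {i: v / total for i, v in new_p.items()}
    return p
```

With votes such as `[("A", "B")] * 3 + [("B", "A")] + [("A", "C")] * 4 + [("C", "B")] * 2 + [("B", "C")] * 2`, the estimated strengths order the items A > B > C, matching the intuition that A wins most of its comparisons.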

