Smart speakers collect voice input that can be used to infer sensitive information about users. Given a number of egregious privacy breaches, there is a clear unmet need for greater transparency and control over data collection, sharing, and use by smart speaker platforms and by the third-party skills they support. To bridge this gap, we build an auditing framework that leverages online advertising to measure data collection, usage, and sharing by smart speaker platforms. We evaluate our framework on the Amazon smart speaker ecosystem. Our results show that Amazon and third parties (including advertising and tracking services) collect smart speaker interaction data. We find that Amazon processes voice data to infer user interests and uses these inferred interests to serve targeted ads both on-platform (Echo devices) and off-platform (web). Smart speaker interaction leads to as much as 30× higher ad bids from advertisers. Finally, we find that Amazon's and skills' operational practices are often not clearly disclosed in their privacy policies.