RF fingerprinting is emerging as a physical-layer security scheme for identifying illegitimate or unauthorized emitters that share the RF spectrum. However, because publicly accessible real-world datasets are scarce, most research relies on synthetic waveforms generated with software-defined radios (SDRs), which are not suited for practical deployment settings. On the other hand, the few datasets that are available cover only chipsets that generate a single kind of waveform. Commercial off-the-shelf (COTS) combo chipsets that support two wireless standards (for example, WiFi and Bluetooth) over a shared dual-band antenna, such as those found in laptops, adapters, wireless chargers, and Raspberry Pis, are becoming ubiquitous in the IoT realm. Hence, to keep up with the modern IoT environment, there is a pressing need for open real-world datasets that capture emissions from these combo chipsets transmitting heterogeneous communication protocols. To this end, we capture the first known emissions from COTS IoT chipsets transmitting WiFi and Bluetooth under two different time frames. The two time frames are essential for rigorously evaluating the generalization capability of the models. To ensure widespread use, each capture within the comprehensive 72 GB dataset is long enough (40 MSamples) to support diverse input tensor lengths and formats. Finally, the dataset also includes emissions at varying signal powers to cover the range from feeble to strong signals encountered in real-world settings.
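To illustrate how a long capture can support different input tensor lengths, the following is a minimal sketch, not the authors' pipeline. It assumes a capture has already been loaded as a sequence of complex I/Q samples (the on-disk format is not specified here). The names `segment_iq`, `window_len`, and `hop` are hypothetical and chosen only for this example.

```python
from typing import List, Optional, Sequence


def segment_iq(samples: Sequence[complex], window_len: int,
               hop: Optional[int] = None) -> List[List[List[float]]]:
    """Split a capture into windows shaped [2][window_len] (row 0 = I, row 1 = Q)."""
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    # Default to non-overlapping windows when no hop is given.
    hop = window_len if hop is None else hop
    if hop <= 0:
        raise ValueError("hop must be positive")
    windows = []
    # Stop before the tail so that every window is complete.
    for start in range(0, len(samples) - window_len + 1, hop):
        chunk = samples[start:start + window_len]
        windows.append([[s.real for s in chunk], [s.imag for s in chunk]])
    return windows


# Toy stand-in for a capture: 10 complex samples instead of 40 MSamples.
capture = [complex(i, -i) for i in range(10)]
non_overlapping = segment_iq(capture, window_len=4)      # hop defaults to window_len
overlapping = segment_iq(capture, window_len=4, hop=2)   # 50% overlap between windows
```

Changing `window_len` yields inputs of a different length from the same capture, and a `hop` smaller than `window_len` produces overlapping windows. For full-size captures, an array library would be a more practical choice than Python lists.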