The IoT market is diverse and characterized by a multitude of vendors that support different device functions (e.g., speaker, camera, vacuum cleaner, etc.). Within this market, IoT security and observability systems use real-time identification techniques to manage these devices effectively. Most existing IoT identification solutions employ machine learning techniques that assume the IoT device, labeled by both its vendor and function, was observed during their training phase. We tackle a key challenge in IoT labeling: how can an AI solution label an IoT device that has never been seen before and whose label is unknown? Our solution extracts textual features such as domain names and hostnames from network traffic, and then enriches these features using Google search data alongside catalog of vendors and device functions. The solution also integrates an auto-update mechanism that uses Large Language Models (LLMs) to update these catalogs with emerging device types. Based on the information gathered, the device's vendor is identified through string matching with the enriched features. The function is then deduced by LLMs and zero-shot classification from a predefined catalog of IoT functions. In an evaluation of our solution on 97 unique IoT devices, our function labeling approach achieved HIT1 and HIT2 scores of 0.7 and 0.77, respectively. As far as we know, this is the first research to tackle AI-automated IoT labeling.
翻译:暂无翻译