Internet-of-Things (IoT) devices are a known source of many security problems and would therefore benefit greatly from automated management. Such management requires robustly identifying devices so that appropriate network security policies can be applied. We address this challenge by exploring how accurately IoT devices can be identified from their network behavior, building on approaches previously proposed by other researchers. We compare the accuracy of four previously proposed machine learning models (tree-based and neural network-based) for identifying IoT devices, using packet trace data collected over a period of six months from a large IoT test-bed. We show that, while all models achieve high accuracy when evaluated on the same dataset they were trained on, their accuracy degrades over time when they are evaluated on data collected outside the training set. After a couple of weeks, the models' accuracy degrades by up to 40 percentage points (on average between 12 and 21 percentage points). We argue that, to keep their accuracy at a high level, the models need to be continuously updated.
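The degradation figures above are differences in accuracy, expressed in percentage points, between the training period and later evaluation periods. A minimal sketch of that bookkeeping follows. It assumes each evaluation sample carries a week index relative to the end of training; the names `weekly_accuracy` and `degradation_pp`, the record layout, and the toy data are hypothetical illustrations, not the evaluation code used in this work.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true device labels."""
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weekly_accuracy(records, predict):
    """Group samples by week and score an already trained model on each week.

    records: iterable of (week_index, features, true_label) tuples
             (hypothetical layout, assumed for this sketch).
    predict: callable mapping features to a predicted device label.
    """
    by_week = defaultdict(lambda: ([], []))
    for week, features, label in records:
        truths, preds = by_week[week]
        truths.append(label)
        preds.append(predict(features))
    return {week: accuracy(t, p) for week, (t, p) in sorted(by_week.items())}


def degradation_pp(baseline_acc, acc_by_week):
    """Accuracy drop relative to the baseline, in percentage points."""
    return {week: (baseline_acc - acc) * 100.0
            for week, acc in acc_by_week.items()}


# Toy data, invented for illustration only: the stand-in "model" simply
# returns a guess stored with each sample.
toy_records = [
    (0, {"guess": "camera"}, "camera"),
    (0, {"guess": "plug"}, "plug"),
    (2, {"guess": "camera"}, "camera"),
    (2, {"guess": "camera"}, "plug"),
]
toy_acc = weekly_accuracy(toy_records, lambda f: f["guess"])
toy_drop = degradation_pp(toy_acc[0], toy_acc)
```

Reporting the drop in percentage points rather than as a relative percentage keeps the numbers comparable across models with different baseline accuracies.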