Falling Network Camera costs and rising connectivity have enabled entities around the world to deploy large camera networks. Network Cameras provide real-time visual data that can be used for studying traffic patterns, emergency response, security, and other applications. Although many sources of Network Camera data are available, collecting the data remains difficult because programming interfaces and website structures vary from source to source. Previous solutions rely on manually parsing the target website, which takes many hours to complete. We present a general, automated solution for aggregating Network Camera data spread across thousands of uniquely structured web pages. We analyze heterogeneous web page structures and identify common characteristics among 73 sample Network Camera websites, each of which contains multiple web pages. These characteristics are then used to build an automated camera discovery module that crawls websites and aggregates Network Camera data. Our system successfully extracts 57,364 Network Cameras from 237,257 unique web pages.