Collecting data on underground criminal communities is highly valuable for both security research and security operations. Unfortunately, these communities are spread across a constellation of diverse online forums that are difficult to infiltrate, may adopt countermeasures against crawling and monitoring, and require an ad-hoc scraper for each community, making the endeavour increasingly challenging technically and potentially expensive. To address this problem we propose THREAT/crawl, a method and prototype tool for a highly reusable crawler that can learn a wide range of (arbitrary) forum structures, can remain under the radar during crawling, and can be extended and configured at the user's will. We showcase THREAT/crawl's capabilities and provide a first evaluation of our prototype against a range of active, live underground communities.