Web traffic is a valuable data source, typically used in marketing to track brand awareness and advertising effectiveness. However, web traffic is also a rich source of information for cybersecurity monitoring. To better understand the threat posed by malicious cyber actors, this study develops a methodology to monitor and evaluate web activity using data archived from Google Analytics. Google Analytics collects and aggregates web traffic, including information about web visitors' location, date and time of visit, visited webpages, and searched keywords. This study seeks to streamline analysis of this data and uses rule-based anomaly detection and predictive modeling to identify web traffic that deviates from normal patterns. Rather than evaluating pieces of web traffic individually, the methodology seeks to emulate real user behavior by creating a new unit of analysis: the user session. A user session groups individual pieces of traffic from the same location and date, which transforms the available information from single point-in-time snapshots into dynamic sessions that show a user's trajectory and intent. The result is faster, clearer insight into large volumes of noisy web traffic.
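The session-building idea can be illustrated with a minimal sketch. It is not the study's implementation: the `Hit` and `Session` record layouts, the `build_sessions` and `flag_session` helpers, and the two example rules (a hit-count threshold and a list of sensitive path prefixes) are all hypothetical and chosen only to show how grouping by location and date turns isolated hits into an ordered trajectory that simple rules can inspect.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Hit:
    """One piece of web traffic (hypothetical field names)."""
    location: str
    timestamp: datetime
    page: str
    keyword: str = ""


@dataclass
class Session:
    """All hits sharing one location and one calendar date."""
    location: str
    date: str
    hits: list = field(default_factory=list)

    @property
    def pages(self):
        # Pages in visit order, i.e. the user's trajectory.
        ordered = sorted(self.hits, key=lambda h: h.timestamp)
        return [h.page for h in ordered]


def build_sessions(hits):
    """Group hits into sessions keyed by (location, ISO date)."""
    sessions = {}
    for hit in hits:
        key = (hit.location, hit.timestamp.date().isoformat())
        if key not in sessions:
            sessions[key] = Session(location=key[0], date=key[1])
        sessions[key].hits.append(hit)
    return sessions


def flag_session(session, max_hits=3, sensitive_prefixes=("/admin",)):
    """Apply two illustrative rules and return the reasons that fired.

    Both the threshold and the prefixes are assumptions for this sketch,
    not rules taken from the study.
    """
    reasons = []
    if len(session.hits) > max_hits:
        reasons.append("high_volume")
    if any(page.startswith(sensitive_prefixes) for page in session.pages):
        reasons.append("sensitive_page")
    return reasons
```

In this sketch, calling `build_sessions` on a list of `Hit` records returns a dictionary of sessions, and `flag_session` can then be applied to each one. A predictive model, as mentioned in the abstract, would take session-level features (hit count, pages visited, keywords) in place of these hand-written rules.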