Modern HTTPS mechanisms such as Encrypted Client Hello (ECH) and encrypted DNS improve privacy, yet the traffic they protect remains vulnerable to website fingerprinting (WF) attacks, in which an adversary infers the visited site from encrypted traffic patterns. Existing WF methods rely on supervised learning over site-specific labeled traces, which limits scalability and leaves them unable to handle previously unseen websites. We address these limitations by reformulating WF as a zero-shot cross-modal retrieval problem and introducing STAR. STAR uses a dual-encoder architecture to learn a joint embedding space for encrypted traffic traces and crawl-time logic profiles. Trained on 150K automatically collected traffic-logic pairs with contrastive and consistency objectives and structure-aware augmentation, STAR retrieves the most semantically aligned profile for a trace without requiring target-side traffic during training. Experiments on 1,600 unseen websites show that STAR achieves 87.9% top-1 accuracy and 0.963 AUC in open-world detection, outperforming supervised and few-shot baselines. Adding an adapter that uses only four labeled traces per site further boosts top-5 accuracy to 98.8%. Our analysis reveals an intrinsic semantic-traffic alignment in modern web protocols and identifies semantic leakage as the dominant privacy risk in encrypted HTTPS traffic. We release STAR's datasets and code to support reproducibility and future research.
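To make the retrieval formulation concrete, the following is a minimal sketch of how a dual-encoder setup of this kind is commonly trained and queried. It is not STAR's released implementation: the encoders are omitted and embeddings are plain lists of floats, and the function names, the symmetric InfoNCE form of the contrastive objective, the temperature of 0.07, and the max-similarity threshold for open-world detection are all assumptions chosen for illustration. The consistency objective and structure-aware augmentation are not shown.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def similarity_matrix(trace_embs, profile_embs):
    """Cosine similarity between every trace embedding and every profile embedding."""
    t = [l2_normalize(v) for v in trace_embs]
    p = [l2_normalize(v) for v in profile_embs]
    return [[sum(a * b for a, b in zip(ti, pj)) for pj in p] for ti in t]

def _cross_entropy_diag(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def symmetric_info_nce(trace_embs, profile_embs, temperature=0.07):
    """Assumed contrastive objective: pair i of traces and profiles is the positive,
    all other pairs in the batch are negatives, averaged over both directions."""
    sims = similarity_matrix(trace_embs, profile_embs)
    logits = [[s / temperature for s in row] for row in sims]
    logits_t = [list(col) for col in zip(*logits)]  # profile-to-trace direction
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(logits_t))

def retrieve_top_k(trace_emb, profile_embs, k=5):
    """Zero-shot inference: rank candidate profiles by similarity to one trace."""
    sims = similarity_matrix([trace_emb], profile_embs)[0]
    ranked = sorted(range(len(sims)), key=lambda j: sims[j], reverse=True)
    return ranked[:k]

def is_monitored(trace_emb, profile_embs, threshold=0.5):
    """Assumed open-world rule: accept a trace only if its best match is strong enough."""
    return max(similarity_matrix([trace_emb], profile_embs)[0]) >= threshold
```

For example, with toy profile embeddings `[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]`, calling `retrieve_top_k([0.9, 0.1], profiles, k=2)` ranks the first profile highest. Because inference only compares a trace against profile embeddings, a previously unseen site can be added to the candidate set by embedding its profile, without collecting labeled traffic for it.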