Open-source software (OSS) is a fundamental part of the internet and the cyber supply chain, but it is being exploited with increasing frequency. While vulnerability detection in OSS has advanced, previous work focuses mainly on static code analysis and neglects runtime indicators. To address this gap, we created a dataset spanning multiple ecosystems that captures features generated when packages and libraries are executed in isolated environments. The dataset includes 9,461 package reports (1,962 malicious), with static and dynamic features such as files, sockets, commands, and DNS records. Labeled with verified information and detailed sub-labels for attack types, the dataset helps identify malicious indicators, especially when source code access is limited, and supports efficient runtime detection methods.