Linking Publications to Funding at Project Level: A curated dataset of publications reported by FP7 projects

Datasets that explicitly link publications to funding at project level are the basis of evaluative bibliometric analysis of funding programmes. Analysis of the impact of the EU funding programmes has often been frustrated by the lack of data on the publications to which the funding contributed. Here we present a dataset of scholarly publications reported by the projects funded by the European Union under the 7th Framework Programme (FP7). The dataset was created by first consolidating data from different reporting channels, then validating the records by systematically matching them to external authoritative sources and assigning them external identifiers. The initial dataset had 305k records linked to one or more projects, of which 69% had a digital object identifier (DOI). Through data quality assurance, we validate 93% of the initial records (283k) and assign a DOI to 90% of them (245k). The resulting dataset has 245k unique DOIs, each linked to one or more projects. It is, to our knowledge, the first comprehensive and curated dataset of scholarly outputs of the Framework Programme as reported by the grant holders. The dataset could only be created thanks to significant improvements and investments made in the reporting systems used by EU-funded projects. The dataset is available on the EU Open Data Portal: https://data.europa.eu/data/datasets/cordisfp7projects
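
The consolidation step described above (merging records from several reporting channels and keying them on a DOI linked to one or more projects) can be illustrated with a minimal sketch. This is not the authors' pipeline: the record layout, the sample values, and the helper names `normalize_doi`, `link_dois_to_projects`, and `doi_coverage` are all assumptions made for illustration, and matching against external authoritative sources is left out entirely.

```python
import re
from collections import defaultdict

# Hypothetical reported records: (project_id, raw_doi_or_None, title).
# These values are invented for illustration; they are not from the dataset.
RECORDS = [
    ("P1", "https://doi.org/10.1000/ABC123", "Paper A"),
    ("P2", "doi:10.1000/abc123", "Paper A"),
    ("P1", None, "Paper B"),
    ("P3", " 10.2000/xyz.9 ", "Paper C"),
]

# Common prefixes that people put in front of a bare DOI.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)", re.IGNORECASE)
# Loose shape check for a bare DOI (assumption: good enough for a sketch).
_DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def normalize_doi(raw):
    """Return a lowercase bare DOI, or None if the value does not look like one."""
    if not raw:
        return None
    doi = _DOI_PREFIX.sub("", raw.strip()).strip().lower()
    return doi if _DOI_SHAPE.match(doi) else None


def link_dois_to_projects(records):
    """Map each unique normalized DOI to the set of projects that reported it."""
    links = defaultdict(set)
    for project_id, raw_doi, _title in records:
        doi = normalize_doi(raw_doi)
        if doi is not None:
            links[doi].add(project_id)
    return dict(links)


def doi_coverage(records):
    """Share of records that carry a usable DOI."""
    if not records:
        return 0.0
    with_doi = sum(1 for _p, raw, _t in records if normalize_doi(raw) is not None)
    return with_doi / len(records)


if __name__ == "__main__":
    for doi, projects in sorted(link_dois_to_projects(RECORDS).items()):
        print(doi, sorted(projects))
    print(f"DOI coverage: {doi_coverage(RECORDS):.0%}")
```

Lowercasing is safe here because DOIs are case-insensitive, so two projects that report the same publication with different capitalisation or URL prefixes collapse onto one key.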

