Air pollution has altered the Earth's radiation balance, disturbed ecosystems and increased human morbidity and mortality. Accordingly, a full-coverage, high-resolution air pollutant dataset with timely updates and long-term historical records is essential to support both research and environmental management. Here, for the first time, we develop a near real-time air pollutant database known as Tracking Air Pollution in China (TAP, tapdata.org) that combines information from multiple data sources, including ground measurements, satellite retrievals, dynamically updated emission inventories, operational chemical transport model simulations and other ancillary data. Daily full-coverage PM2.5 data at a spatial resolution of 10 km are our first near real-time product. The TAP PM2.5 is estimated with a two-stage machine learning model coupled with the synthetic minority oversampling technique (SMOTE) and a tree-based gap-filling method. Our model has an average out-of-bag cross-validation R2 of 0.83 across years, which is comparable to the values reported in other studies, while improving performance at high pollution levels and filling the gaps caused by missing aerosol optical depth (AOD) retrievals at the daily scale. The full coverage and near real-time updates of the daily PM2.5 data allow us to track day-to-day variations in PM2.5 concentrations over China in a timely manner. The long-term PM2.5 records since 2000 will also support policy assessments and health impact studies. The TAP PM2.5 data are publicly available through our website for sharing with the research and policy communities.
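The synthetic minority oversampling step mentioned above can be illustrated with a minimal sketch. The abstract does not say how oversampling is configured in TAP, so everything here is an assumption for illustration: the function name `smote`, the neighbour count `k`, the choice of high-pollution days as the minority class, and the feature rows are all hypothetical. The sketch shows only the core idea of the technique, namely creating new samples by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic rows by interpolating between minority-class neighbours.

    Simplified illustration of the oversampling idea; not the TAP implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base row.
        others = [minority[j] for j in range(len(minority)) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Pick one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Random position along the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical feature rows for high-pollution days:
# (AOD, simulated PM2.5 in micrograms per cubic metre). Values are made up.
high_pollution = [(1.2, 160.0), (1.5, 180.0), (1.1, 150.0), (1.8, 210.0)]
new_rows = smote(high_pollution, n_synthetic=6)
```

Because each synthetic row lies on a segment between two observed minority rows, the oversampled set stays inside the range of the observed high-pollution samples. In practice a maintained library implementation would normally be preferable to a hand-written one.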