In this article we describe a study aimed at estimating job vacancy statistics, in particular the number of entities with at least one vacancy. To this end, we propose an alternative to the survey-based methodology: an approach that relies solely on data from administrative registers and online sources and uses dual system estimation (DSE). Because these sources do not cover the whole reference population and the number of units appearing in all datasets is small, we have developed a DSE approach for negatively dependent sources based on recent work by Chatterjee and Bhuyan (2020). To support the estimation, we conducted a thorough data cleaning procedure to remove out-of-scope units, identify entities from the target population, and link them by identifiers to minimize linkage errors. We verified the effectiveness and sensitivity of the proposed estimator in simulation studies. From a practical point of view, our results show that the current vacancy survey in Poland underestimates the number of entities with at least one vacancy by about 10-15%. The main reasons for this discrepancy are non-sampling errors due to non-response and under-reporting, which we identified by comparing survey data with administrative data.
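For readers unfamiliar with dual system estimation, the following minimal Python sketch shows the classical two-source estimator (Lincoln-Petersen) and Chapman's small-sample variant. It is not the estimator developed in this article: the classical form assumes the two sources are independent, and when the sources are negatively dependent the overlap is smaller than independence would imply, so the classical estimate tends to overstate the population size. The function names and the counts in the usage note are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Classical dual system estimate of the population size.

    n1: units found in source 1 (e.g. an administrative register)
    n2: units found in source 2 (e.g. online job advertisements)
    m:  units found in both sources after linkage
    Assumes the two sources capture units independently.
    """
    if m <= 0:
        raise ValueError("the overlap m must be positive")
    if m > min(n1, n2):
        raise ValueError("the overlap m cannot exceed either source count")
    return n1 * n2 / m


def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's variant, which stays finite and less biased for small m."""
    if m < 0 or m > min(n1, n2):
        raise ValueError("the overlap m must lie between 0 and min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1
```

With hypothetical counts of 200 units in the first source, 150 in the second, and 30 in both, `lincoln_petersen(200, 150, 30)` gives 1000.0, while `chapman(200, 150, 30)` gives roughly 978.06.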