The Hayashi-Yoshida (HY) estimator exhibits an intrinsic telescoping property that leads to an often overlooked computational bias, which we call formulaic or intrinsic bias. This formulaic bias results in data loss by cancelling out potentially relevant data points, which we call nonextant data points. This paper formalizes and quantifies the data loss arising from this bias. In particular, we demonstrate the existence of nonextant data points via a concrete example, and prove necessary and sufficient conditions for the telescoping property to induce this type of formulaic bias. Since this type of bias does not arise when the inputs, i.e., the observation times $\Pi^{(1)} := (t_i^{(1)})_{i=0,1,\ldots}$ and $\Pi^{(2)} := (t_j^{(2)})_{j=0,1,\ldots}$, are synchronous, we introduce the $(a,b)$-asynchronous adversary. This adversary generates the inputs $\Pi^{(1)}$ and $\Pi^{(2)}$ according to two independent homogeneous Poisson processes with rates $a>0$ and $b>0$, respectively. We address the foundational question of the minimal cumulative average data point loss and determine the values of $a$ and $b$ that attain it. We prove that the minimal average cumulative data loss over both inputs is attained for equal rates $a=b$ and amounts to 25\%. We present an algorithm, based on our theorem, for computing the exact number of nonextant data points given inputs $\Pi^{(1)}$ and $\Pi^{(2)}$, and suggest alternative methods. Finally, we use simulated data to empirically compare the (cumulative) average data loss of the HY estimator.
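The telescoping effect described above can be illustrated with a small simulation. The sketch below is not the paper's algorithm or theorem: it assumes the standard HY sum over overlapping intervals $(t_{i-1}^{(1)}, t_i^{(1)}]$ and $(t_{j-1}^{(2)}, t_j^{(2)}]$, and it treats an observation as cancelled when it never survives as an endpoint of a telescoped inner sum, which is only one plausible reading of "nonextant". All function names (`poisson_times`, `random_walk`, `hy_naive`, `hy_telescoped`, `cancelled_fraction`) and the toy random-walk values are hypothetical.

```python
import bisect
import math
import random


def poisson_times(rate, horizon, rng):
    """Arrival times of a homogeneous Poisson process on [0, horizon], with t_0 = 0."""
    times = [0.0]
    t = rng.expovariate(rate)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def random_walk(times, rng):
    """Toy observed values: Gaussian increments with variance equal to the time step."""
    values = [0.0]
    for prev, cur in zip(times, times[1:]):
        values.append(values[-1] + rng.gauss(0.0, math.sqrt(cur - prev)))
    return values


def hy_naive(t1, x, t2, y):
    """HY sum: products of increments over every pair of overlapping intervals."""
    total = 0.0
    for i in range(1, len(t1)):
        for j in range(1, len(t2)):
            if t1[i - 1] < t2[j] and t2[j - 1] < t1[i]:
                total += (x[i] - x[i - 1]) * (y[j] - y[j - 1])
    return total


def hy_telescoped(t1, x, t2, y):
    """Same sum with the inner sum over j collapsed to y[j_hi] - y[j_lo - 1].

    Returns the estimate and the indices of series 2 that survive as endpoints.
    Assumes both time grids start at 0, so j_lo is always at least 1.
    """
    total, used = 0.0, set()
    last = len(t2) - 1
    for i in range(1, len(t1)):
        j_lo = bisect.bisect_right(t2, t1[i - 1])        # first j with t2[j] > t1[i-1]
        j_hi = min(bisect.bisect_left(t2, t1[i]), last)  # last j with t2[j-1] < t1[i]
        if j_lo > j_hi:  # no interval of series 2 overlaps this one
            continue
        total += (x[i] - x[i - 1]) * (y[j_hi] - y[j_lo - 1])
        used.update((j_lo - 1, j_hi))
    return total, used


def cancelled_fraction(t1, x, t2, y):
    """Share of all observations (both series) that never survive as an endpoint."""
    _, used2 = hy_telescoped(t1, x, t2, y)
    _, used1 = hy_telescoped(t2, y, t1, x)  # the HY sum is symmetric in the two series
    cancelled = (len(t1) - len(used1)) + (len(t2) - len(used2))
    return cancelled / (len(t1) + len(t2))


if __name__ == "__main__":
    rng = random.Random(0)
    t1 = poisson_times(1.0, 5000.0, rng)  # rate a = 1
    t2 = poisson_times(1.0, 5000.0, rng)  # rate b = 1
    x, y = random_walk(t1, rng), random_walk(t2, rng)
    print(f"cancelled fraction: {cancelled_fraction(t1, x, t2, y):.3f}")
```

Under this reading of cancellation, an observation is dropped when neither of its two neighbouring gaps contains an observation of the other series. For equal rates the simulated fraction should therefore land close to the 25\% figure stated above, up to sampling noise and boundary effects.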