To understand the behavior of large dynamical systems like transportation networks, one must often rely on measurements transmitted by a set of sensors, for instance individual vehicles. Such measurements are likely to be incomplete and imprecise, which makes it hard to recover the underlying signal of interest. To quantify this phenomenon, we study the properties of a partially observed state-space model. In our setting, the latent state $X$ follows a high-dimensional vector autoregressive (VAR) process: $X_t = \theta X_{t-1} + \varepsilon_t$. Meanwhile, the observations $Y$ are given by a noise-corrupted random sample from the state: $Y_t = \Pi_t X_t + \eta_t$. Several random sampling mechanisms are studied, allowing us to investigate the effect of spatial and temporal correlations in the distribution of the sampling matrices $\Pi_t$. We first prove a lower bound on the minimax estimation error for the transition matrix $\theta$. We then describe a sparse estimator based on the Dantzig selector and upper bound its non-asymptotic error, showing that it achieves the optimal convergence rate for most of our sampling mechanisms. Numerical experiments on simulated time series validate our theoretical findings, while an application to open railway data highlights the relevance of this model for public transport traffic analysis.
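To make the two model equations concrete, here is a minimal simulation sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' code: the function name `simulate_partially_observed_var`, the Gaussian noise, the zero initial state, and the choice of $\Pi_t$ as a diagonal 0/1 matrix with independent Bernoulli entries (probability `p_obs`) are all hypothetical choices made for this example. The abstract only says that several sampling mechanisms are studied.

```python
import numpy as np


def simulate_partially_observed_var(theta, T, p_obs=0.5,
                                    sigma_eps=1.0, sigma_eta=0.5, seed=0):
    """Simulate X_t = theta X_{t-1} + eps_t and Y_t = Pi_t X_t + eta_t.

    Assumptions (not taken from the source): Gaussian noises, X_0 = 0,
    and Pi_t diagonal with independent Bernoulli(p_obs) entries.
    """
    rng = np.random.default_rng(seed)
    D = theta.shape[0]
    X = np.zeros((T, D))               # latent states
    Y = np.zeros((T, D))               # observations
    mask = np.zeros((T, D), dtype=bool)  # diagonal of Pi_t at each step
    x_prev = np.zeros(D)
    for t in range(T):
        # Latent VAR(1) transition.
        x = theta @ x_prev + sigma_eps * rng.standard_normal(D)
        # Diagonal of the sampling matrix Pi_t: True where a coordinate is sampled.
        pi = rng.random(D) < p_obs
        # Observation equation, applied literally: Pi_t X_t + eta_t.
        y = pi * x + sigma_eta * rng.standard_normal(D)
        X[t], Y[t], mask[t] = x, y, pi
        x_prev = x
    return X, Y, mask


if __name__ == "__main__":
    # A sparse, stable transition matrix chosen only for illustration.
    theta = 0.5 * np.eye(5)
    X, Y, mask = simulate_partially_observed_var(theta, T=200, p_obs=0.3)
    print("fraction of sampled coordinates:", mask.mean())
```

Replacing the independent Bernoulli draw with a mechanism that correlates `pi` across coordinates or across time steps would be one way to mimic the spatial and temporal sampling correlations mentioned above.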