Dynamic Discrete Choice Estimation with Partially Observable States and Hidden Dynamics

Dynamic discrete choice models are used to estimate the intertemporal preferences of an agent, described by a reward function, from observable histories of states and implemented actions. However, in many applications, such as reliability and healthcare, the system state is only partially observable or hidden (e.g., the level of deterioration of an engine, the condition of a disease), and the decision maker only has access to information imperfectly correlated with the true value of the hidden state. In this paper, we consider the estimation of a dynamic discrete choice model with state variables and system dynamics hidden to both the agent and the modeler, thus generalizing the model in Rust (1987) to partially observable cases. We examine the structural properties of the model and prove that it is still identifiable if the cardinality of the state space, the discount factor, the distribution of random shocks, and the rewards for a given (reference) action are given. We analyze, both theoretically and numerically, the potential misspecification errors that may be incurred when Rust's model is improperly used in partially observable settings. We further apply the model to a subset of the dataset in Rust (1987) on bus engine mileage and replacement decisions. The results show that our model improves model fit, as measured by the $\log$-likelihood function, by $17.73\%$, and the $\log$-likelihood ratio test shows that our model statistically outperforms Rust's model. Interestingly, our hidden state model also reveals an economically meaningful route assignment behavior in the dataset that was hitherto ignored, i.e., routes with lower mileage are assigned to buses believed to be in worse condition.
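To illustrate what it means for the decision maker to act on information imperfectly correlated with a hidden state, the following is a minimal sketch of a standard discrete Bayes filter that updates a belief over hidden states after one noisy observation. It is not the paper's estimator: the abstract does not specify the model's transition or observation structure, so the function `belief_update`, the matrices `P_keep` and `Q`, and the two-state engine example are all hypothetical and chosen only for illustration.

```python
import numpy as np


def belief_update(belief, P, Q, obs):
    """One step of a discrete Bayes filter (hypothetical helper).

    belief : shape (S,), prior probabilities over hidden states
    P      : shape (S, S), P[s, s2] = Pr(next state s2 | state s) under the chosen action
    Q      : shape (S, O), Q[s2, o] = Pr(observation o | next state s2)
    obs    : index of the observed signal
    """
    predicted = belief @ P                  # propagate the belief through the dynamics
    unnormalized = predicted * Q[:, obs]    # weight by the likelihood of the observation
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return unnormalized / total             # renormalize to a probability vector


# Assumed two-state engine: state 0 = good, state 1 = deteriorated.
# All numbers below are made up for illustration.
P_keep = np.array([[0.9, 0.1],
                   [0.0, 1.0]])   # dynamics when the engine is not replaced
Q = np.array([[0.8, 0.2],
              [0.3, 0.7]])        # signal 0 = "looks fine", signal 1 = "looks worn"

belief = np.array([1.0, 0.0])     # start certain that the engine is good
posterior = belief_update(belief, P_keep, Q, obs=1)
print(posterior)
```

In a partially observable setting, a belief vector of this kind takes the place of the fully observed state as the quantity on which the agent's choice depends, which is why treating the state as observed can lead to the misspecification errors mentioned above.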
