Asymptotic Analysis for Data-Driven Inventory Policies

Source: arXiv. The authors plan to include the updated version in a research proposal. To avoid possible inconvenience, the authors have decided to remove the updated version for now.

We study periodic-review stochastic inventory control in the data-driven setting, where the retailer makes ordering decisions based only on historical demand observations, without any knowledge of the probability distribution of the demand. Since an (s, S)-policy is optimal when the demand distribution is known, we investigate the statistical properties of the data-driven (s, S)-policy obtained by recursively computing the empirical cost-to-go functions. This policy is inherently challenging to analyze because the recursion propagates the estimation error backwards in time. In this work, we establish the asymptotic properties of this data-driven policy by fully accounting for the error propagation. First, we rigorously show the consistency of the estimated parameters, filling in some gaps (due to unaccounted error propagation) in the existing studies. In this setting, empirical process theory (EPT) cannot be directly applied to show asymptotic normality: because of the error propagation, the empirical cost-to-go functions for the estimated parameters are not i.i.d. sums. Our main methodological innovation is an asymptotic representation for multi-sample U-processes in terms of i.i.d. sums. This representation enables us to apply EPT to derive the influence functions of the estimated parameters and to establish their joint asymptotic normality. Based on these results, we also propose an entirely data-driven estimator of the optimal expected cost and derive its asymptotic distribution. We demonstrate some useful applications of our asymptotic results, including sample size determination and interval estimation. The results of our numerical simulations conform to our theoretical analysis.
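To make the recursion on empirical cost-to-go functions concrete, here is a minimal sketch; it is not the authors' implementation. It assumes a finite horizon, integer demands with a separate sample for each period, full backlogging, a fixed ordering cost `K`, a unit cost `c`, linear holding and backlog costs `h` and `b`, a zero terminal cost, and a bounded integer inventory grid. The function name `empirical_sS_policy` and all parameter names are hypothetical. Each expectation in the dynamic program is replaced by a sample average, so the estimate of the period-t parameters depends on the estimated cost-to-go of every later period, which is the error propagation the abstract refers to.

```python
import numpy as np

def empirical_sS_policy(demand, K, c, h, b, x_min, x_max):
    """Estimate (s_t, S_t) by backward recursion on empirical cost-to-go functions.

    demand: list of T integer arrays; demand[t] holds the observed demands
            for period t (assumed setup, one sample per period).
    Returns the list of (s_t, S_t) pairs and the estimated cost-to-go at t = 0,
    tabulated on the integer grid x_min..x_max.
    """
    T = len(demand)
    grid = np.arange(x_min, x_max + 1)
    V_next = np.zeros(grid.size)  # terminal cost V_T = 0 (assumption)
    policy = [None] * T
    for t in reversed(range(T)):
        d = np.asarray(demand[t], dtype=int)
        # Post-demand inventory for each order-up-to level (rows) and sample (columns).
        nxt = grid[:, None] - d[None, :]
        stage = h * np.maximum(nxt, 0) + b * np.maximum(-nxt, 0)
        # Clip to the grid so V_next can be looked up (a boundary approximation).
        idx = np.clip(nxt, x_min, x_max) - x_min
        # Empirical version of G_t(y) = c*y + E[stage cost + V_{t+1}(y - D)].
        G = c * grid + (stage + V_next[idx]).mean(axis=1)
        i_S = int(np.argmin(G))  # order-up-to level S_t
        # Reorder point s_t: smallest level at or below S_t where not ordering
        # costs no more than paying K and ordering up to S_t.
        i_s = int(np.flatnonzero(G[: i_S + 1] <= K + G[i_S])[0])
        policy[t] = (int(grid[i_s]), int(grid[i_S]))
        # Cost-to-go under the estimated (s_t, S_t) policy; it feeds period t - 1.
        V_next = -c * grid + np.where(grid < grid[i_s], K + G[i_S], G)
    return policy, V_next

# One period, no fixed cost: this reduces to the empirical newsvendor problem.
policy, V0 = empirical_sS_policy([np.arange(10)], K=0.0, c=0.0, h=1.0, b=3.0,
                                 x_min=-10, x_max=20)
print(policy)  # [(7, 7)]
```

In the single-period check, the critical ratio is b / (b + h) = 0.75, and the smallest level whose empirical distribution function reaches 0.75 is 7, so both parameters equal 7. With more periods, `V_next` carries the sampling error of later periods into the estimates for earlier ones.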
