We consider a class of queries called durability prediction queries, which arise commonly in predictive analytics, where a given predictive model is used to answer questions about possible futures and so inform decisions. Examples of durability prediction queries include "what is the probability that this financial product will keep losing money over the next 12 quarters before turning a profit?" and "what is the chance that our proposed server cluster will fail the required service-level agreement before its term ends?" We devise a general method called Multi-Level Splitting Sampling (MLSS) that can efficiently handle complex queries and complex models, including those involving black-box functions, as long as the models allow us to simulate possible futures step by step. Our method addresses the inefficiency of standard Monte Carlo (MC) methods by applying the idea of importance splitting: a "promising" sample path prefix is allowed to generate multiple "offspring" paths, which directs simulation effort toward the paths most likely to matter. We propose practical techniques for designing splitting strategies, freeing users from manual tuning. Experiments show that our approach achieves unbiased estimates and the same error guarantees as standard MC while reducing cost by an order of magnitude.
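To make the splitting idea concrete, the following is a minimal sketch, not the MLSS algorithm itself. It assumes a hypothetical one-step simulator `step` (a random walk with downward drift), hand-picked level thresholds, and a fixed splitting factor; the automatic design of splitting strategies mentioned above is not shown. Each time a path first crosses a new level it is split into `split` copies that share its weight equally, which keeps the estimator unbiased.

```python
import random


def step(value, rng):
    """Hypothetical one-step simulator: random walk with downward drift."""
    return value + rng.gauss(-0.2, 1.0)


def mc_estimate(n_paths, horizon, target, rng):
    """Standard Monte Carlo: fraction of paths reaching `target` within `horizon` steps."""
    hits = 0
    for _ in range(n_paths):
        value = 0.0
        for _ in range(horizon):
            value = step(value, rng)
            if value >= target:
                hits += 1
                break
    return hits / n_paths


def splitting_estimate(n_roots, horizon, levels, split, rng):
    """Fixed-factor importance splitting (illustrative, assumed design).

    `levels` is an increasing list of thresholds whose last entry is the target.
    A path that crosses at least one new intermediate level in a step is split
    into `split` copies, each carrying 1/split of the parent's weight.
    """
    target = levels[-1]
    total = 0.0
    for _ in range(n_roots):
        # Each stack entry: (value, time, index of next level to cross, weight).
        stack = [(0.0, 0, 0, 1.0)]
        while stack:
            value, t, lvl, weight = stack.pop()
            while t < horizon:
                value = step(value, rng)
                t += 1
                if value >= target:
                    total += weight  # this path answers the query positively
                    break
                crossed = False
                while lvl < len(levels) - 1 and value >= levels[lvl]:
                    lvl += 1
                    crossed = True
                if crossed:
                    weight /= split
                    # The current path continues as one copy; push the others.
                    for _ in range(split - 1):
                        stack.append((value, t, lvl, weight))
    return total / n_roots


if __name__ == "__main__":
    rng = random.Random(0)
    print("MC:       ", mc_estimate(20000, 30, 3.0, rng))
    print("Splitting:", splitting_estimate(10000, 30, [1.0, 2.0, 3.0], 3, rng))
```

Because weights are divided at every split, the expected total weight reaching the target equals the hit probability of a single unsplit path, so both functions estimate the same quantity. The benefit of splitting shows up when the target is rarely reached: more simulation steps are spent on prefixes that have already made progress toward it.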