Actuaries use predictive modeling techniques to assess the loss cost on a contract as a function of observable risk characteristics. State-of-the-art statistical and machine learning methods are not well equipped to handle hierarchically structured risk factors with a large number of levels. In this paper, we demonstrate how to construct a data-driven insurance pricing model when hierarchically structured risk factors are available alongside contract-specific and externally collected risk factors. We examine the pricing of a workers' compensation insurance product with a hierarchical credibility model (Jewell, 1975), Ohlsson's combination of a generalized linear model and a hierarchical credibility model (Ohlsson, 2008), and mixed models. We compare the predictive performance of these models and evaluate the effect of the distributional assumption on the target variable by comparing linear mixed models with Tweedie generalized linear mixed models. For our case study, the Tweedie distribution is well suited to model and predict the loss cost on a contract. Moreover, incorporating contract-specific risk factors in the predictive model improves its performance and allows for improved risk differentiation in our workers' compensation insurance portfolio.
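To make the hierarchical credibility idea concrete, the following is a minimal Python sketch of a two-level credibility premium in the spirit of Jewell (1975). It is an illustration under stated assumptions, not the implementation used in this paper: the variance components `s2`, `a` and `b` are assumed to be known inputs (in practice they are estimated from the data), the names `Contract` and `hierarchical_credibility` are hypothetical, and the example figures are made up.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    sector: str        # upper level of the hierarchy (hypothetical label)
    name: str          # lower level of the hierarchy (hypothetical label)
    weight: float      # exposure measure, assumed given
    loss_ratio: float  # observed loss per unit of exposure


def hierarchical_credibility(contracts, s2, a, b):
    """Two-level credibility premiums with known variance components.

    s2: within-contract variance (assumed known)
    a:  variance between contracts within a sector (assumed known)
    b:  variance between sectors (assumed known)
    Returns a dict mapping (sector, name) to the credibility premium.
    """
    # Contract-level credibility factors: more exposure, more credibility.
    z = {(c.sector, c.name): c.weight / (c.weight + s2 / a) for c in contracts}

    sectors = sorted({c.sector for c in contracts})
    z_sum, y_bar = {}, {}
    for j in sectors:
        members = [c for c in contracts if c.sector == j]
        z_sum[j] = sum(z[(c.sector, c.name)] for c in members)
        # Credibility-weighted average of the contract observations in sector j.
        y_bar[j] = sum(z[(c.sector, c.name)] * c.loss_ratio for c in members) / z_sum[j]

    # Sector-level credibility factors.
    q = {j: z_sum[j] / (z_sum[j] + a / b) for j in sectors}

    # Collective mean: credibility-weighted average over the sectors.
    m = sum(q[j] * y_bar[j] for j in sectors) / sum(q.values())

    # Sector premium shrinks the sector average towards the collective mean.
    sector_premium = {j: q[j] * y_bar[j] + (1 - q[j]) * m for j in sectors}

    # Contract premium shrinks the contract's experience towards its sector premium.
    return {
        (c.sector, c.name): z[(c.sector, c.name)] * c.loss_ratio
        + (1 - z[(c.sector, c.name)]) * sector_premium[c.sector]
        for c in contracts
    }


# Made-up portfolio purely for illustration.
example = [
    Contract("A", "a1", 100.0, 1.0),
    Contract("A", "a2", 10.0, 2.0),
    Contract("B", "b1", 50.0, 3.0),
]
premiums = hierarchical_credibility(example, s2=10.0, a=0.5, b=1.0)
```

In this sketch a contract with a large exposure keeps a premium close to its own experience, while a small contract is pulled towards its sector premium, which in turn is pulled towards the portfolio mean. This is the mechanism that lets a hierarchical risk factor with many sparsely populated levels still be priced.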