Valuing residential property is inherently complex, requiring consideration of numerous environmental, economic, and property-specific factors. These complexities present significant challenges for automated valuation models (AVMs), which are increasingly used to provide objective assessments for property taxation and mortgage financing. Obtaining accurate and objective valuations at a national level, and not just within major cities, is further complicated by the presence of multiple localised submarkets, spanning urban, suburban, and rural contexts, in which property features contribute differently to value. Existing AVMs often struggle in such settings: traditional hedonic regression models lack the flexibility to capture spatial variation, while advanced machine learning approaches demand extensive datasets that are rarely available. In this article, we address these limitations by developing a robust statistical framework for property valuation in the Irish housing market. We segment the country into six submarkets encompassing cities, large towns, and rural areas, and employ a generalized additive model that captures non-linear effects of property characteristics while allowing feature contributions to vary across submarkets. Our approach outperforms both machine learning-based and traditional hedonic regression models, particularly in data-sparse regions. In out-of-sample validation, our model achieves R-squared values of 0.70, 0.84, and 0.83 for rural areas, towns, and Dublin, respectively, compared to 0.52, 0.71, and 0.82 from a random forest benchmark. Furthermore, the temporal dynamics of our model align closely with reported inflation figures for the study period, providing additional validation of its accuracy.
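As an illustration only, a generalized additive model whose feature effects vary by submarket can be written in the generic form below. The notation is an assumption made for this sketch and is not taken from the article: $y_i$ stands for the response for property $i$ (for example, a log sale price), $s(i) \in \{1, \dots, 6\}$ for its submarket, $x_{ij}$ for its $j$-th characteristic, and $f_{j,s}$ for a smooth function estimated separately for each submarket. The second line gives the standard out-of-sample R-squared, computed over a held-out set $\mathcal{T}$ with $\bar{y}_{\mathcal{T}}$ the mean response in that set.

```latex
% Generic submarket-varying additive model (illustrative notation, not the authors' exact specification)
\[
  y_i \;=\; \beta_{0,s(i)} \;+\; \sum_{j=1}^{J} f_{j,s(i)}\!\left(x_{ij}\right) \;+\; \varepsilon_i ,
  \qquad s(i) \in \{1,\dots,6\}
\]

% Out-of-sample R-squared on a held-out set T
\[
  R^2_{\mathcal{T}} \;=\; 1 \;-\;
  \frac{\sum_{i \in \mathcal{T}} \left(y_i - \hat{y}_i\right)^2}
       {\sum_{i \in \mathcal{T}} \left(y_i - \bar{y}_{\mathcal{T}}\right)^2}
\]
```

In this notation, a traditional hedonic regression corresponds to the special case $f_{j,s}(x) = \beta_j x$, with linear effects shared across all submarkets, which is the restriction the more flexible form relaxes.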