Because search engines are vital for finding digital information, considerable research has examined how users interact with search engines and how that behavior can be modeled. Many models of user-search engine interaction, known in the literature as click models, take the form of Dynamic Bayesian Networks. Although many authors have exploited the resemblance between different click models to derive estimation procedures for them, in particular expectation maximization (EM), doing so still commonly requires considerable work, especially when deriving the E-step. In this paper we argue that this derivation is often unnecessary: under certain assumptions, many existing click models can be optimized as if they were Input-Output Hidden Markov Models (IO-HMMs), for which the forward-backward equations immediately provide the E-step. To arrive at that conclusion, we present the Generalized Cascade Model (GCM), show how it can be estimated using the IO-HMM EM framework, and give two examples of how existing click models can be mapped to the GCM. Our GCM approach to estimating click models has also been implemented in the gecasmo Python package.
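To illustrate the claim that the forward-backward equations supply the E-step, the following is a minimal sketch of a scaled forward-backward pass for an HMM whose transition and emission probabilities may differ per time step, as they would when they depend on inputs. It is not the gecasmo API: the function name `forward_backward`, the argument layout, the two-state reading (0 = still examining, 1 = stopped), and all toy probabilities are assumptions made for illustration only.

```python
import math


def forward_backward(init, trans, emit):
    """Scaled forward-backward pass with time-varying parameters (0-based t).

    init[k]        : P(z_0 = k)
    trans[t][i][j] : P(z_{t+1} = j | z_t = i, input at t+1)
    emit[t][k]     : P(observed y_t | z_t = k, input at t)
    Returns posterior state marginals, pairwise marginals, and log-likelihood.
    """
    T, K = len(emit), len(init)

    # Forward pass: normalize at each step and keep the scaling constants.
    alpha = [[0.0] * K for _ in range(T)]
    scale = [0.0] * T
    for t in range(T):
        for j in range(K):
            if t == 0:
                prior = init[j]
            else:
                prior = sum(alpha[t - 1][i] * trans[t - 1][i][j] for i in range(K))
            alpha[t][j] = prior * emit[t][j]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]

    # Backward pass, rescaled with the same constants.
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(K):
            beta[t][i] = sum(
                trans[t][i][j] * emit[t + 1][j] * beta[t + 1][j] for j in range(K)
            ) / scale[t + 1]

    # E-step quantities: gamma[t][k] = P(z_t = k | y), xi[t][i][j] = P(z_t = i, z_{t+1} = j | y).
    gamma = [[alpha[t][k] * beta[t][k] for k in range(K)] for t in range(T)]
    xi = [
        [
            [
                alpha[t][i] * trans[t][i][j] * emit[t + 1][j] * beta[t + 1][j] / scale[t + 1]
                for j in range(K)
            ]
            for i in range(K)
        ]
        for t in range(T - 1)
    ]
    loglik = sum(math.log(c) for c in scale)
    return gamma, xi, loglik


# Hypothetical toy session with three result positions and two latent states
# (0 = still examining, 1 = stopped; the stopped state is absorbing).
init = [0.9, 0.1]
trans = [
    [[0.7, 0.3], [0.0, 1.0]],
    [[0.5, 0.5], [0.0, 1.0]],
]
# Probability of the observed click / no-click at each position, per state.
emit = [[0.6, 0.0], [0.7, 1.0], [0.8, 1.0]]

gamma, xi, loglik = forward_backward(init, trans, emit)
```

In an EM loop, `gamma` and `xi` would act as the expected sufficient statistics that weight the M-step updates of the transition and emission parameters. The input-dependent parameterization itself is left out of this sketch.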