Motivation: Branching process models yield unevenly distributed stochastic data consisting of sparse and dense regions. This work addresses the problem of precisely estimating the parameters of this type of model. Applying branching process models to cancer cell evolution poses many difficulties, such as high dimensionality and the rare occurrence of the outcome of interest. Moreover, we aim to solve the ambitious task of obtaining the model coefficients that reflect the relationship between driver gene mutations and cancer hallmarks on the basis of personal variant allele frequency data. Results: An Approximate Bayesian Computation method based on the Isolation Kernel is designed. The method transforms raw data into a Hilbert space (mapping) and measures the similarity between simulation points and the maxima weighted Isolation Kernel mapping related to the observation point. We also designed a heuristic algorithm that finds the parameter estimate without gradient calculation and is dimension-independent. The advantage of the proposed machine learning method is shown for multidimensional test data as well as for an example of cancer cell evolution.
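To make the idea of kernel-based Approximate Bayesian Computation concrete, the following is a minimal sketch and not the method of this work. It assumes a Voronoi-cell (nearest-neighbour) construction of the Isolation Kernel, a hypothetical `toy_simulator` standing in for the branching process model, a uniform prior on the unit cube, and a plain similarity-weighted posterior mean in place of the maxima weighted mapping and the heuristic search described above. All function names and parameter values are illustrative assumptions.

```python
import numpy as np


def isolation_kernel_cells(data, points, n_partitions=100, psi=16, seed=0):
    """Assign each row of `points` to a Voronoi cell in every random partition.

    Each partition is defined by `psi` centers sampled from `data`.
    The cell indices act as a sparse one-hot feature map (the "mapping").
    """
    rng = np.random.default_rng(seed)
    cells = np.empty((points.shape[0], n_partitions), dtype=np.int64)
    for t in range(n_partitions):
        centers = data[rng.choice(data.shape[0], size=psi, replace=False)]
        # Squared Euclidean distance from every point to every center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        cells[:, t] = d2.argmin(axis=1)
    return cells


def isolation_similarity(cells_a, cell_b):
    """Fraction of partitions in which each row of `cells_a` shares a cell with `cell_b`."""
    return (cells_a == cell_b).mean(axis=1)


def toy_simulator(thetas, rng):
    """Hypothetical simulator: summary statistics are the parameters plus noise."""
    return thetas + 0.1 * rng.standard_normal(thetas.shape)


def kernel_abc_estimate(observed, n_sim=2000, seed=0):
    """Similarity-weighted mean of prior draws (an assumed, simplified estimator)."""
    rng = np.random.default_rng(seed)
    thetas = rng.uniform(0.0, 1.0, size=(n_sim, observed.shape[0]))  # uniform prior
    sims = toy_simulator(thetas, rng)
    all_points = np.vstack([sims, observed[None, :]])
    cells = isolation_kernel_cells(sims, all_points, seed=seed)
    weights = isolation_similarity(cells[:-1], cells[-1])
    return (weights[:, None] * thetas).sum(axis=0) / weights.sum()


if __name__ == "__main__":
    print(kernel_abc_estimate(np.array([0.2, 0.8])))
```

Because the similarity depends only on which cell a point falls into, it adapts to the local density of the simulated data: cells are small in dense regions and large in sparse ones, which is the property that makes this kernel attractive for unevenly distributed simulation output.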