We train deep learning models on thousands of galaxy catalogues from the state-of-the-art hydrodynamic simulations of the CAMELS project to perform regression and inference. We employ Graph Neural Networks (GNNs), architectures designed to work with irregular and sparse data, such as the distribution of galaxies in the Universe. We first show that GNNs can learn to compute the power spectrum of galaxy catalogues with an accuracy of a few percent. We then train GNNs to perform likelihood-free inference at the galaxy-field level. Our models are able to infer the value of $\Omega_{\rm m}$ with $\sim12\%-13\%$ accuracy just from the positions of $\sim1000$ galaxies in a volume of $(25~h^{-1}{\rm Mpc})^3$ at $z=0$, while accounting for astrophysical uncertainties as modelled in CAMELS. Incorporating information from galaxy properties, such as stellar mass, stellar metallicity, and stellar radius, improves the accuracy to $4\%-8\%$. Our models are built to be translationally and rotationally invariant, and they can extract information from any scale larger than the minimum distance between two galaxies. However, our models are not completely robust: testing on simulations run with subgrid physics different from that of the training simulations yields less accurate results.
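To make the invariance claim concrete, the following is a minimal sketch of how a galaxy catalogue might be turned into a graph whose edge features are unchanged by translations and rotations. It is not the authors' implementation: the linking radius `r_link`, the use of periodic boundaries, and the function name `build_graph` are assumptions made for illustration. Only the box size is taken from the abstract.

```python
import numpy as np

BOX = 25.0  # box side in h^-1 Mpc, as quoted in the abstract


def build_graph(pos, r_link, box=BOX):
    """Link every pair of galaxies closer than r_link (hypothetical parameter).

    pos: (N, 3) array of galaxy positions inside the box.
    Returns (edges, dists): a (2, E) array of galaxy index pairs and the
    (E,) array of pair separations, which serve as invariant edge features.
    """
    # Pairwise separation vectors, shape (N, N, 3).
    diff = pos[:, None, :] - pos[None, :, :]
    # Assumption: the box is periodic, so use the minimum-image convention.
    diff -= box * np.round(diff / box)
    dist = np.linalg.norm(diff, axis=-1)

    mask = dist < r_link
    np.fill_diagonal(mask, False)  # no self-loops
    i, j = np.nonzero(mask)
    return np.stack([i, j]), dist[i, j]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.uniform(0.0, BOX, size=(1000, 3))  # mock catalogue, not CAMELS data
    edges, dists = build_graph(pos, r_link=2.0)

    # Translating every galaxy by the same vector leaves the separations unchanged.
    shifted = (pos + np.array([3.0, -8.5, 12.0])) % BOX
    _, dists_shifted = build_graph(shifted, r_link=2.0)
    print(np.allclose(np.sort(dists), np.sort(dists_shifted)))
```

Because the edge features depend only on pair separations, a network built on them cannot see the absolute position or orientation of the box. The same construction also shows why no information is available below the minimum distance between two galaxies: no edge is shorter than that.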