A Mechanism for Producing Aligned Latent Spaces with Autoencoders

Aligned latent spaces, where meaningful semantic shifts in the input space correspond to a translation in the embedding space, play an important role in the success of downstream tasks such as unsupervised clustering and data imputation. In this work, we prove that linear and nonlinear autoencoders produce aligned latent spaces by stretching along the left singular vectors of the data. We fully characterize the amount of stretching in linear autoencoders and provide an initialization scheme to arbitrarily stretch along the top directions using these networks. We also quantify the amount of stretching in nonlinear autoencoders in a simplified setting. We use our theoretical results to align drug signatures across cell types in gene expression space and semantic shifts in word embedding spaces.
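As background for the role of left singular vectors, the sketch below illustrates a classical fact rather than the paper's own result: the best rank-k linear autoencoder (in squared reconstruction error) can be realized by projecting onto the top-k left singular vectors of the data matrix. The data, dimensions, and variable names here are illustrative assumptions, and the encoder is built directly from the SVD rather than trained. The final check shows a shift along the top singular direction appearing as a translation in the latent space, which holds automatically for any linear encoder.

```python
import numpy as np

# Synthetic data (an assumption for illustration): d features, n samples,
# stored as columns so that left singular vectors live in feature space.
rng = np.random.default_rng(0)
d, n, k = 5, 200, 2
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.1])
X = rng.normal(size=(d, n)) * scales[:, None]

# Thin SVD: X = U @ diag(S) @ Vt, with the columns of U the left singular vectors.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
U_k = U[:, :k]

# A linear autoencoder built (not trained) from the top-k left singular vectors.
encoder = U_k.T      # maps R^d -> R^k
decoder = U_k        # maps R^k -> R^d

Z = encoder @ X      # latent codes, shape (k, n)
X_hat = decoder @ Z  # reconstructions, shape (d, n)

# The reconstruction coincides with the rank-k truncated SVD of X.
X_rank_k = (U[:, :k] * S[:k]) @ Vt[:k]
reconstruction_matches = np.allclose(X_hat, X_rank_k)

# Shifting every input by 2 units along the top left singular vector
# translates every latent code by the same vector, here (2, 0).
shift = 2.0 * U[:, 0]
latent_shift = encoder @ (X + shift[:, None]) - Z
```

The stretching behavior that the abstract attributes to trained linear and nonlinear autoencoders is not reproduced here; this only fixes the notation for the singular directions along which that stretching is said to occur.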


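Related content

Autoencoders

An autoencoder is a type of artificial neural network used to learn efficient data encodings in an unsupervised manner. Its aim is to learn a representation (encoding) of a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise". Alongside the reduction side, a reconstruction side is learned, in which the autoencoder tries to generate, from the reduced encoding, a representation as close as possible to its original input, hence the name. Several variants of the basic model exist, each aiming to force the learned representation of the input to have useful properties. Autoencoders are used effectively in many applications, from facial recognition to capturing the semantic meaning of words.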
