Context binning, model clustering and adaptivity for data compression of genetic data - 专知论文


May 3, 2022

Context binning, model clustering and adaptivity for data compression of genetic data


from arxiv, 7 pages, 7 figures

The rapid growth of genetic databases means that improvements in their compression translate into large savings, which requires better yet inexpensive statistical models. This article proposes automated optimizations of, for example, Markov-like models, in particular context binning and model clustering. A popular way to shrink a context is simply to drop its low bits; the proposed context binning instead optimizes the reduction automatically and stores it as a table, state = bin[context], where the state determines the probability distribution. In this way nearly all useful information can be extracted, even from very large contexts, into a relatively small number of states. The second proposed approach, model clustering, applies k-means clustering in the space of general statistical models, so that a few models (the cluster centroids) can be optimized and then chosen, for example, separately for each read. Some adaptivity techniques for handling data non-stationarity are also briefly discussed.
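
To make the table lookup state = bin[context] from the abstract concrete, here is a minimal Python sketch. The abstract does not say how the paper optimizes the table, so the k-means-style assignment below is only an assumed stand-in, borrowed from the model clustering idea, and not the author's procedure. All function names (`context_counts`, `build_bins`, `code_length_bits`), the 4-letter alphabet encoding, and the smoothing constant are hypothetical choices made for illustration.

```python
import math
import random
from collections import defaultdict

ALPHABET = "ACGT"
SYM = {c: i for i, c in enumerate(ALPHABET)}


def context_counts(seq, order):
    """Count next-symbol occurrences for each context of `order` previous symbols."""
    counts = defaultdict(lambda: [0] * 4)
    ctx, mask = 0, 4 ** order
    for i, ch in enumerate(seq):
        s = SYM[ch]
        if i >= order:               # a full context is available
            counts[ctx][s] += 1
        ctx = (ctx * 4 + s) % mask   # slide the context window by one symbol
    return counts


def normalize(c, alpha=1.0):
    """Turn counts into probabilities with additive smoothing (assumed alpha)."""
    total = sum(c) + alpha * len(c)
    return [(x + alpha) / total for x in c]


def cost_bits(c, q):
    """Bits needed to code the counts `c` with the distribution `q`."""
    return -sum(n * math.log2(p) for n, p in zip(c, q))


def build_bins(counts, order, n_states, iters=20, seed=0):
    """Assumed heuristic: k-means-like grouping of contexts by coding cost.

    Requires n_states <= number of observed contexts.
    Returns (bin_table, per-state probability distributions).
    """
    rng = random.Random(seed)
    contexts = list(counts)
    probs = [normalize(counts[c]) for c in rng.sample(contexts, n_states)]
    bin_table = [0] * (4 ** order)   # unseen contexts default to state 0
    for _ in range(iters):
        sums = [[0] * 4 for _ in range(n_states)]
        for c in contexts:
            best = min(range(n_states), key=lambda k: cost_bits(counts[c], probs[k]))
            bin_table[c] = best
            for j in range(4):
                sums[best][j] += counts[c][j]
        probs = [normalize(s) for s in sums]  # re-estimate each state's distribution
    return bin_table, probs


def code_length_bits(seq, order, bin_table, probs):
    """Ideal code length of `seq` when each symbol is coded via state = bin[context]."""
    ctx, mask, bits = 0, 4 ** order, 0.0
    for i, ch in enumerate(seq):
        s = SYM[ch]
        if i >= order:
            bits -= math.log2(probs[bin_table[ctx]][s])
        ctx = (ctx * 4 + s) % mask
    return bits


if __name__ == "__main__":
    rng = random.Random(42)
    seq = "".join(rng.choice(ALPHABET) for _ in range(5000))  # synthetic data
    order = 4
    counts = context_counts(seq, order)
    table, probs = build_bins(counts, order, n_states=8)
    print("states used:", len(set(table)))
    print("bits per symbol:", code_length_bits(seq, order, table, probs) / (len(seq) - order))
```

In this sketch, dropping low bits of the context would correspond to fixing `bin_table[c] = c >> shift` in advance, whereas the table here is fitted to the data. The demo uses random synthetic data only to show the calls; real reads would be needed to see any compression gain.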

