Linear-time Minimization of Wheeler DFAs - 专知 (Zhuanzhi) Papers

November 3, 2021

Linear-time Minimization of Wheeler DFAs

Jarno Alanko, Nicola Cotumaccio, Nicola Prezza

Wheeler DFAs (WDFAs) are a sub-class of finite-state automata which is playing an important role in the emerging field of compressed data structures: as opposed to general automata, WDFAs can be stored in just $\log\sigma + O(1)$ bits per edge, $\sigma$ being the alphabet's size, and support optimal-time pattern matching queries on the substring closure of the language they recognize. An important step to achieve further compression is minimization. When the input $\mathcal A$ is a general deterministic finite-state automaton (DFA), the state-of-the-art is represented by the classic Hopcroft's algorithm, which runs in $O(|\mathcal A|\log |\mathcal A|)$ time. This algorithm stands at the core of the only existing minimization algorithm for Wheeler DFAs, which inherits its complexity. In this work, we show that the minimum WDFA equivalent to a given input WDFA can be computed in linear $O(|\mathcal A|)$ time. When run on de Bruijn WDFAs built from real DNA datasets, an implementation of our algorithm reduces the number of nodes from 14% to 51% at a speed of more than 1 million nodes per second.
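For readers unfamiliar with automaton minimization, the sketch below illustrates the general idea of merging equivalent states in an ordinary DFA using Moore-style partition refinement. It is neither Hopcroft's algorithm nor the linear-time WDFA algorithm described in the abstract, and it does not preserve or check the Wheeler property. The function name `minimize_dfa`, the dictionary-based transition encoding, and the toy automaton are assumptions made purely for illustration; all states are assumed reachable from the initial state.

```python
from typing import Dict, Hashable, List, Set, Tuple


def minimize_dfa(
    states: List[Hashable],
    alphabet: List[str],
    delta: Dict[Tuple[Hashable, str], Hashable],
    accepting: Set[Hashable],
) -> Dict[Hashable, int]:
    """Return a map state -> block id; states sharing an id are equivalent.

    Illustrative Moore-style refinement (quadratic in the worst case),
    not the linear-time method from the paper. `delta` may be partial.
    """
    # Start from the coarsest partition: accepting vs. non-accepting states.
    block = {q: int(q in accepting) for q in states}
    while True:
        ids: Dict[tuple, int] = {}
        new_block: Dict[Hashable, int] = {}
        for q in states:
            # Signature: current block plus the block of each successor
            # (None when the transition is undefined).
            sig = (block[q],) + tuple(
                block.get(delta.get((q, a))) for a in alphabet
            )
            new_block[q] = ids.setdefault(sig, len(ids))
        # Stop once a round no longer splits any block.
        if len(ids) == len(set(block.values())):
            return new_block
        block = new_block


# Hypothetical toy DFA: states 1 and 2 behave identically.
toy_delta = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 3, (2, "a"): 3}
blocks = minimize_dfa([0, 1, 2, 3], ["a", "b"], toy_delta, {3})
num_blocks = len(set(blocks.values()))
```

In this toy automaton, states 1 and 2 end up in the same block, so the quotient automaton has three states instead of four. Note that the minimum WDFA for a language need not coincide with the minimum DFA, so this sketch conveys only the notion of state equivalence, not the paper's result.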

