计算原始eBWT 更快、简单、内存较少 (Computing the original eBWT faster, simpler, and with less memory) - 专知论文

会员服务 ·

0

原点 · Less · TCS · SimPLe · 相互独立的 ·

2021 年 6 月 21 日

Computing the original eBWT faster, simpler, and with less memory

翻译：计算原始eBWT 更快、简单、内存较少

Christina Boucher,Davide Cenzato,Zsuzsanna Lipták,Massimiliano Rossi,Marinella Sciortino

from arxiv, 20 pages, 5 figures, 1 table

Mantaci et al. [TCS 2007] defined the eBWT to extend the definition of the BWT to a collection of strings, however, since this introduction, it has been used more generally to describe any BWT of a collection of strings and the fundamental property of the original definition (i.e., the independence from the input order) is frequently disregarded. In this paper, we propose a simple linear-time algorithm for the construction of the original eBWT, which does not require the preprocessing of Bannai et al. [CPM 2021]. As a byproduct, we obtain the first linear-time algorithm for computing the BWT of a single string that uses neither an end-of-string symbol nor Lyndon rotations. We combine our new eBWT construction with a variation of prefix-free parsing to allow for scalable construction of the eBWT. We evaluate our algorithm (pfpebwt) on sets of human chromosomes 19, Salmonella, and SARS-CoV2 genomes, and demonstrate that it is the fastest method for all collections, with a maximum speedup of 7.6x on the second best method. The peak memory is at most 2x larger than the second best method. Comparing with methods that are also, as our algorithm, able to report suffix array samples, we obtain a 57.1x improvement in peak memory. The source code is publicly available at https://github.com/davidecenzato/PFP-eBWT.

翻译：Mantaci et al. [TCS 2007] 定义了eBWT, 将BWT的定义扩大到字符串的集合,然而,自本导言以来,它被更普遍地用来描述任何BWT的字符串集合和原始定义的基本属性(即不受输入顺序的限制)经常被忽略。在本文件中,我们为最初eBWT的构建提出了一个简单的线性时间算法,而最初eBBWT不需要Bannai et al.[CPPM 2021]。作为副产品,我们获得了计算BWT的单个字符串的首次线性时间算法,而BWT既不使用断符号,也不使用Lyndon的旋转。我们的新eBWT的构造与不使用前缀的拼法的变异异,以便可以缩放 eBWT。我们评估了关于人类染色体19号、Salmoniella和SAS-CV2基因组的算法, 显示这是所有收藏的最快方法, 最高级的SBWT/CVx 样本是我们最先进的S 7.6的第二可获取的存储式。

0

相关内容

【NeurIPS 2020】生成对抗性模仿学习的f-Divergence

【NeurIPS 2020】生成对抗性模仿学习的f-Divergence

专知会员服务

26+阅读 · 2020年10月9日

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

【Manning新书】现代Java实战，592页pdf

【Manning新书】现代Java实战，592页pdf

专知会员服务

101+阅读 · 2020年5月22日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

111+阅读 · 2020年5月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

196+阅读 · 2019年10月10日

知识图谱本体结构构建论文合集

知识图谱本体结构构建论文合集

专知会员服务

108+阅读 · 2019年10月9日

一文读懂Faster RCNN

一文读懂Faster RCNN

极市平台

5+阅读 · 2020年1月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：基于参数共享的CNN-RNN混合模型

LibRec 精选：基于参数共享的CNN-RNN混合模型

LibRec智能推荐

6+阅读 · 2019年3月7日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【推荐】TensorFlow手把手CNN实践指南

【推荐】TensorFlow手把手CNN实践指南

机器学习研究会

5+阅读 · 2017年8月17日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Does Preprocessing help in Fast Sequence Comparisons?

Arxiv

0+阅读 · 2021年8月20日

Improved Linear-Time Algorithm for Computing the $4$-Edge-Connected Components of a Graph

Arxiv

0+阅读 · 2021年8月19日

Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization

Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization

Arxiv

1+阅读 · 2021年8月18日

Schrödinger PCA: On the Duality between Principal Component Analysis and Schrödinger Equation

Arxiv

0+阅读 · 2021年8月18日

Neuromorphic Computing for Content-based Image Retrieval

Arxiv

0+阅读 · 2021年8月17日

Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

Arxiv

0+阅读 · 2021年8月17日

You May Not Need Attention

Arxiv

4+阅读 · 2018年10月31日

Few Shot Learning with Simplex

Few Shot Learning with Simplex

Arxiv

5+阅读 · 2018年7月27日

Face Detection Using Improved Faster RCNN

Arxiv

6+阅读 · 2018年2月6日

Large-Scale Image Retrieval with Attentive Deep Local Features

Arxiv

3+阅读 · 2018年2月3日

VIP会员

文章信息

相关主题

相互独立的

相关VIP内容

【NeurIPS 2020】生成对抗性模仿学习的f-Divergence

【NeurIPS 2020】生成对抗性模仿学习的f-Divergence

专知会员服务

26+阅读 · 2020年10月9日

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

【Manning新书】现代Java实战，592页pdf

【Manning新书】现代Java实战，592页pdf

专知会员服务

101+阅读 · 2020年5月22日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

111+阅读 · 2020年5月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

196+阅读 · 2019年10月10日

知识图谱本体结构构建论文合集

知识图谱本体结构构建论文合集

专知会员服务

108+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《美陆军特种作战条令》最新102页

《洛克希德SR-71“黑鸟”侦察机动力系统》21页slides

美空军作战实验室通过人工智能和指挥控制技术创新推进杀伤链

《指挥控制能力分析方法论》最新报告

相关资讯

一文读懂Faster RCNN

一文读懂Faster RCNN

极市平台

5+阅读 · 2020年1月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：基于参数共享的CNN-RNN混合模型

LibRec 精选：基于参数共享的CNN-RNN混合模型

LibRec智能推荐

6+阅读 · 2019年3月7日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【推荐】TensorFlow手把手CNN实践指南

【推荐】TensorFlow手把手CNN实践指南

机器学习研究会

5+阅读 · 2017年8月17日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Does Preprocessing help in Fast Sequence Comparisons?

Arxiv

0+阅读 · 2021年8月20日

Improved Linear-Time Algorithm for Computing the $4$-Edge-Connected Components of a Graph

Arxiv

0+阅读 · 2021年8月19日

Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization

Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization

Arxiv

1+阅读 · 2021年8月18日

Schrödinger PCA: On the Duality between Principal Component Analysis and Schrödinger Equation

Arxiv

0+阅读 · 2021年8月18日

Neuromorphic Computing for Content-based Image Retrieval

Arxiv

0+阅读 · 2021年8月17日

Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

Arxiv

0+阅读 · 2021年8月17日

You May Not Need Attention

Arxiv

4+阅读 · 2018年10月31日

Few Shot Learning with Simplex

Few Shot Learning with Simplex

Arxiv

5+阅读 · 2018年7月27日

Face Detection Using Improved Faster RCNN

Arxiv

6+阅读 · 2018年2月6日

Large-Scale Image Retrieval with Attentive Deep Local Features

Arxiv

3+阅读 · 2018年2月3日

微信扫码咨询专知VIP会员