Motivation: RNA design aims to find at least one sequence that folds with the highest probability into a designated target structure, but some structures are undesignable in the sense that no sequence folds into them. Identifying undesignable structures is useful in delineating and understanding the limit of RNA designability, but has received little attention until recently. In addition, existing methods on undesignability are not scalable and not interpretable. Results: We introduce a novel graph representation and a new general algorithmic framework to efficiently identify undesignable motifs in a secondary structure. The proposed algorithm enumerates minimal motifs based on the loop-pair graph representation of a structure and establishes the undesignability of a motif by proposing rival substructure(s). Our work can also identify unique minimum undesignable motifs across different structures. Our implemented algorithms successfully identify 26 unique minimum undesignable motifs among 18 undesignable puzzles from the benchmark Eterna100. Additionally, our algorithm is so efficient that it scales to natural structures of 16S and 23S Ribosomal RNAs (about 1,500 and 3,000 nucleotides, resp.), and finds all of those structures in the widely used ArchiveII database to be undesignable, with 73 unique minimum undesignable motifs, under the standard Turner energy model in ViennaRNA.
翻译:暂无翻译