pith. sign in

arxiv: 1512.01801 · v2 · pith:TP7MYUJQnew · submitted 2015-12-06 · 🧬 q-bio.GN

Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences

classification 🧬 q-bio.GN
keywords readsassembleassemblymappingminiasmminimapdataerror
0
0 comments X
read the original abstract

Motivation: Single Molecule Real-Time (SMRT) sequencing technology and Oxford Nanopore technologies (ONT) produce reads over 10kbp in length, which have enabled high-quality genome assembly at an affordable cost. However, at present, long reads have an error rate as high as 10-15%. Complex and computationally intensive pipelines are required to assemble such reads. Results: We present a new mapper, minimap, and a de novo assembler, miniasm, for efficiently mapping and assembling SMRT and ONT reads without an error correction stage. They can often assemble a sequencing run of bacterial data into a single contig in a few minutes, and assemble 45-fold C. elegans data in 9 minutes, orders of magnitude faster than the existing pipelines. We also introduce a pairwise read mapping format (PAF) and a graphical fragment assembly format (GFA), and demonstrate the interoperability between ours and current tools. Availability and implementation: https://github.com/lh3/minimap and https://github.com/lh3/miniasm Contact: hengli@broadinstitute.org

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.