arxiv: 1608.01031 · v2 · pith:EXKVQZ25new · submitted 2016-08-02 · 💻 cs.DS · q-bio.GN

Meraculous2: fast accurate short-read assembly of large polymorphic genomes

keywords: assemblies, meraculous2, assembly, improved, accuracy, data, genome, human

We present Meraculous2, an update to the Meraculous short-read assembler that includes (1) handling of allelic variation using "bubble" structures within the de Bruijn graph, (2) improved gap closing, and (3) an improved scaffolding algorithm that produces more complete assemblies without compromising scaffolding accuracy. The speed and bandwidth efficiency of the new parallel implementation have also been substantially improved, allowing the assembly of a human genome to be accomplished in 24 hours on the JGI/NERSC Genepool system. To highlight the features of Meraculous2 we present here the assembly of the diploid human genome NA12878, and compare it with previously published assemblies of the same data using other algorithms. The Meraculous2 assemblies are shown to have better completeness, contiguity, and accuracy than other published assemblies for these data. Practical considerations including pre-assembly analyses of polymorphism and repetitiveness are described.
