MegaHit

An ultra-fast single-node solution for large and complex metagenomics assembly


MegaHit is an NGS de novo assembler for assembling large and complex metagenomics data in a time- and cost-efficient manner. It assembled a soil metagenomics dataset of 252 Gbp on a single computing node in 44.1 h with a graphics processing unit and in 99.6 h without one. MegaHit assembles the data as a whole, i.e. no pre-processing such as partitioning or normalization is needed. Compared with previous methods on the soil data, MegaHit generated a threefold larger assembly with a longer contig N50 and average contig length; furthermore, 55.8% of the reads aligned to the assembly, a fourfold improvement.
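
The contig N50 and average contig length quoted above are standard assembly summary statistics. The following is a minimal sketch of how they are commonly computed, assuming the contig lengths have already been extracted from an assembly; the function name `contig_stats` is hypothetical and is not part of MegaHit.

```python
def contig_stats(lengths):
    """Return (n50, mean_length) for a list of contig lengths in bases.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly size.
    Hypothetical helper for illustration; not a MegaHit API.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig that brings the running sum to half the total.
        if 2 * running >= total:
            return length, total / len(ordered)


# Toy example: total = 54, and 10 + 9 + 8 = 27 reaches half of it.
n50, mean_len = contig_stats([2, 3, 4, 5, 6, 7, 8, 9, 10])
print(n50, mean_len)  # 8 6.0
```

A larger N50 indicates that more of the assembly sits in long contigs, which is why it is reported alongside total assembly size and the fraction of reads that align back.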



Publications:

  • Dinghua Li, Ruibang Luo, Chi-Man Liu, Chi-Ming Leung, Hing-Fung Ting, Kunihiko Sadakane, Hiroshi Yamashita, Tak-Wah Lam. MEGAHIT v1.0: A fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods 2016, 102: 3-11. doi: 10.1016/j.ymeth.2016.02.020
  • Dinghua Li, Chi-Man Liu, Ruibang Luo, Kunihiko Sadakane, Tak-Wah Lam. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 2015, 31 (10): 1674-1676. doi: 10.1093/bioinformatics/btv033