BAUM: improving genome assembly by adaptive unique mapping and local overlap-layout-consensus approach

