Complete Genome of the Chitin-Degrading Bacterium, Paenibacillus xylanilyticus W4

Abstract Chitinases possess an extraordinary ability to directly hydrolyze highly insoluble chitin polymers to low-molecular-weight chito-oligomers, which possess particular biological functions, such as elicitor action and antitumor activity. A novel strain, Paenibacillus xylanilyticus W4, which was isolated from soil, showed strong chitin degradation activity. Here, we report the first complete genome sequence of P. xylanilyticus. Paenibacillus xylanilyticus W4 contains a single circular chromosome of 5,532,141 bp with 47.33% GC content. The genome contains 5,996 genes, including 39 rRNA- and 109 tRNA-coding genes. Phylogenetic analysis and Genome-to-Genome Distance calculation revealed its taxonomic placement in a separate family. Six glycoside hydrolase family 18 (GH18) and two GH23 enzymes involved in chitin degradation were identified. Although many of the chitinases were conserved in Paenibacillus, several GH18 chitinases share high similarity with those of Bacillus circulans. The genome information provided here could aid in understanding the chitin-degrading properties of P. xylanilyticus as well as its potential applications in the biotechnological and pharmaceutical fields.


Introduction
Chitin is a major component of the cell walls of filamentous fungi and basidiomycetes, and is widely found in the exoskeletons of arthropods, including crustaceans (Aam et al. 2010). The annual amount of chitin produced in nature exceeds 100 billion tons, making it the second most abundant organic compound produced after cellulose (Das et al. 2010).
Chitinase, which catalyzes the hydrolysis of chitin, is widely found in higher animals and plants, microorganisms, and certain viruses (Dahiya et al. 2006). Chitinase produced by microorganisms can degrade fungal cell walls and insect chitin coats and is applied in plant pest control (Kishore et al. 2005). Chitosan oligosaccharides formed by the degradation of chitin have antibacterial, anticancer, and immunoregulatory properties, and thus have broad application prospects in health products and the pharmaceutical industry (Bhattacharya et al. 2007).
Many chitinase-producing species have been isolated from various environments, and their chitinases have been identified (Krithika and Chellaram 2016; Gasmi et al. 2019). Most known chitinases belong to the GH18 and GH19 families. GH18 chitinases exist widely in many organisms, from microbes to humans. Their catalytic domain has an absolutely conserved (β/α)8-barrel structure (Synstad et al. 2004). Unlike the GH18 family, GH19 chitinases are found only in plants and some bacteria (Ueda et al. 2003). Their steric structures reveal a high α-helical content and have an active topological structure (Kezuka et al. 2006; Huet et al. 2008). In addition, a few chitinases belonging to the GH23 and GH48 families have also been discovered in recent years.
However, the chitin degradation activity of Paenibacillus xylanilyticus has not been investigated. Moreover, the genome information of P. xylanilyticus remains obscure. To provide a genome resource for investigations of the chitin degradation ability of P. xylanilyticus, and for the utilization of this bacterium, we present here the complete genome assembly and annotation of P. xylanilyticus W4, a novel chitin-degrading strain isolated from soil.

Growth Conditions and Genomic DNA Isolation
Paenibacillus xylanilyticus W4, a novel chitin-degrading strain, was isolated from soil. Paenibacillus xylanilyticus W4 was grown on an M9 agar plate at 37 °C for 3 days (Dvorak and Lorenzo 2018). A single colony was then cultivated in 50 mL of M9 medium overnight. Cells were collected and genomic DNA was extracted using a Qiagen QIAamp DNA Mini Kit (Qiagen, Germany). The quantity and quality of the extracted DNA were checked by agarose gel electrophoresis and the Nanodrop method, followed by Qubit quantification.

Genome Sequencing, Assembly, and Annotation
A PacBio 10-kb sequencing library was constructed with a SMRTbell template prep kit 1.0. The final library was processed for sequencing in a single-molecule real-time (SMRT) cell using P6 polymerase and C4 chemistry on a PacBio instrument. The PacBio hierarchical genome assembly process (HGAP) version 2.0 was used for the de novo assembly of the sequence reads. Given that the BluePippin Size Selection (>20,000) protocol filters out small DNA fragments, such as plasmids, Solexa paired-end sequencing data were also obtained by using the Illumina HiSeq2000. To trace the presence of any plasmid, we mapped the filtered Illumina reads by using CLCbio wb8.0 (www.clcbio.com) to the bacterial plasmid database (http://www.ebi.ac.uk/genomes/plasmid.html).

Phylogenetic Analysis and Genome-to-Genome Distance
The 16S rRNA genes of various Paenibacillus strains and neighboring families were downloaded from the National Center for Biotechnology Information (NCBI) and were aligned by using the ClustalW module of the BIOEDIT sequence alignment tool (version 7.1.3.0). A phylogenetic tree based on the resulting alignment was then constructed by the Neighbor-Joining and maximum likelihood methods with the MEGA X version 10.1 program package (Kumar et al. 2018). In silico DNA-DNA hybridization (DDH) values among Paenibacillus members and other Paenibacillaceae species were calculated by using the Genome-To-Genome Distance Calculator (GGDC) server (Auch et al. 2010).
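Distance-based methods such as Neighbor-Joining start from a matrix of pairwise distances between aligned sequences. The text does not state which distance model was used in MEGA X, so the sketch below uses the simplest option, the p-distance (the fraction of compared sites that differ), purely as an illustration. The taxon names and sequences are hypothetical toy data, and skipping gapped sites pairwise is an assumption of this sketch.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped sites shared by the two sequences")
    return differing / compared


# Hypothetical toy alignment; real input would be the aligned 16S rRNA genes.
alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTTC",
    "taxon_C": "ACG-ACGTTA",
}

# Pairwise distance matrix, stored as a dict keyed by taxon pairs.
distances = {
    (x, y): p_distance(alignment[x], alignment[y])
    for x, y in combinations(alignment, 2)
}

for pair, dist in distances.items():
    print(pair, round(dist, 3))
```

A tree-building program would then join the closest pairs iteratively; that step is left to dedicated software here.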

P. xylanilyticus W4 Chitinases
Putative CAZymes in the P. xylanilyticus W4 proteome were identified using the CAZy annotation pipeline (Lombard et al. 2014). The functional domains involved in chitin degradation were retrieved. The proteins of P. xylanilyticus W4 were then compared with the sequences from the CAZy protein database by using the BLASTp tool (E-value cutoff of 1e-5). Then, chitinases from P. xylanilyticus W4 were searched with BLASTp against chitinase sequences from the CAZy database, and the sequences with >70% similarity were retrieved. The chitinases from P. xylanilyticus W4 were split into their domains and aligned by using MUSCLE incorporated in MEGA X version 10.1 (Kumar et al. 2018). Furthermore, an evolutionary tree was generated by the maximum likelihood method (model: Jones-Taylor-Thornton; bootstrap: 100).
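The two BLASTp filtering steps described above can be sketched as follows. This is a minimal illustration, assuming the hits are available in BLAST tabular format (-outfmt 6 with its default 12 columns) and that percent identity stands in for the ">70% similarity" criterion; the text does not specify either point. The query and subject names and all values in the example are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Optional


class Hit(NamedTuple):
    query: str
    subject: str
    percent_identity: float
    evalue: float


def parse_outfmt6(lines: Iterable[str]) -> List[Hit]:
    """Parse BLAST tabular output (default 12 columns of -outfmt 6)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Column 3 is percent identity, column 11 is the E-value.
        hits.append(Hit(fields[0], fields[1], float(fields[2]), float(fields[10])))
    return hits


def filter_hits(hits: Iterable[Hit], max_evalue: float = 1e-5,
                min_identity: Optional[float] = None) -> List[Hit]:
    """Keep hits at or below the E-value cutoff and, if given, above min_identity."""
    kept = []
    for hit in hits:
        if hit.evalue > max_evalue:
            continue
        if min_identity is not None and hit.percent_identity <= min_identity:
            continue
        kept.append(hit)
    return kept


# Hypothetical example rows (not real results).
example = [
    "query_1\tsubject_A\t85.0\t300\t45\t0\t1\t300\t1\t300\t1e-50\t400",
    "query_1\tsubject_B\t45.0\t280\t154\t3\t5\t284\t10\t289\t1e-8\t60",
    "query_2\tsubject_C\t92.0\t40\t3\t0\t1\t40\t1\t40\t0.5\t25",
]

hits = parse_outfmt6(example)
homologs = filter_hits(hits)                     # E-value cutoff only
close = filter_hits(hits, min_identity=70.0)     # cutoff plus >70% identity
print([h.subject for h in homologs])
print([h.subject for h in close])
```

Applying the E-value cutoff first and the identity threshold second mirrors the order of the two searches in the text.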

P. xylanilyticus W4 Genome Features
The genome sequence of P. xylanilyticus W4 was obtained using a combination of PacBio and Illumina HiSeq2000 technologies. Whole-genome sequencing on the PacBio system yielded a total of 62,631 reads with a mean read length of 8.46 kb and 95.78-fold genome coverage. DNA sequencing on the Illumina HiSeq2000 generated a total of 9,653,276 reads, with an average read length of 126.32 bp and 220.42-fold genome coverage. After assembly, P. xylanilyticus W4 showed a single circular chromosome (supplementary fig. S1, Supplementary Material online). The overall genome features of P. xylanilyticus W4 are shown in table 1. The coding proteins identified were classified into 19 functional categories according to the clusters of orthologous groups (COG) of proteins (supplementary fig. S1, Supplementary Material online). Among all the categories, the carbohydrate transport and metabolism category (G, 15.29%) was the largest.
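The reported fold coverage values follow from the read counts, mean read lengths, and genome size given above, since average coverage is the total number of sequenced bases divided by the genome length. The short check below is only a worked example of that arithmetic; the function name is an illustrative choice, and the mean PacBio read length is taken as 8,460 bp from the rounded 8.46 kb figure.

```python
def fold_coverage(num_reads: int, mean_read_length_bp: float,
                  genome_size_bp: int) -> float:
    """Average sequencing depth: total sequenced bases / genome length."""
    return num_reads * mean_read_length_bp / genome_size_bp


GENOME_SIZE_BP = 5_532_141  # chromosome length reported for strain W4

# Read counts and mean read lengths as reported in the text.
pacbio = fold_coverage(62_631, 8_460, GENOME_SIZE_BP)
illumina = fold_coverage(9_653_276, 126.32, GENOME_SIZE_BP)

print(f"PacBio:   {pacbio:.2f}-fold")    # 95.78-fold
print(f"Illumina: {illumina:.2f}-fold")  # 220.42-fold
```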

Phylogenetic Position of P. xylanilyticus W4
The phylogenetic tree based on 16S rRNA sequences of Paenibacillus and related Paenibacillaceae members showed that P. xylanilyticus 16S rRNA forms a separate branch distinct
To understand the provenance of these chitinases, we subjected their sequences to BLASTp searches against the Nr database. The GH18 and GH23 domain sequences from the CAZy-Nr database were retrieved and clustered with BLASTClust at 70% identity. All of these GH18 and GH23 domains, along with the P. xylanilyticus chitinase domains, were aligned using MUSCLE incorporated in MEGA X version 10.1, and a maximum likelihood phylogeny with 100 bootstrap replicates was constructed. The tree obtained was divided into two major clades, where the clade at the root was composed of all GH18 sequences and the other was composed of all GH23 domains (fig. 1).
The representative chitinases from different branches are closely related to chitinases from different bacterial species, suggesting the gain and loss of domains. In the GH23 clade, the GH23 domains orf1959 and orf4326 of P. xylanilyticus share the clade with other Paenibacillus species. The closest identified homologs in the GH18 clade come from diverse taxa such as Paenibacillus, Bacilli, Ralstonia, Kurthia, and Virgibacillus members. Among the GH18 domains of P. xylanilyticus W4, orf2775, orf4159, orf5084, and orf5826 share sister branches with other Paenibacillus species such as P. xylanexedens, Paenibacillus sp. BIHB4019, and Paenibacillus sp. FPU-7. Interestingly, orf2435 and orf5825 group with Bacillus circulans, suggesting possible horizontal transfer.

Supplementary Material
Supplementary data are available at Genome Biology and Evolution online.