AntiTbPdb: a knowledgebase of anti-tubercular peptides

Abstract
Tuberculosis, caused by Mycobacterium tuberculosis, is a global menace responsible for millions of premature deaths every year. In the era of drug-resistant tuberculosis, peptide-based therapeutics may provide an alternative to small-molecule drugs. To create the knowledgebase AntiTbPdb (http://webs.iiitd.edu.in/raghava/antitbpdb/), experimentally validated anti-tubercular and anti-mycobacterial peptides were compiled from the literature. We screened 10 652 research articles and 35 patents to extract anti-tubercular peptides and annotated these peptides manually. The knowledgebase has 1010 entries; each entry provides extensive information about an anti-tubercular peptide, such as its sequence, chemical modification, chirality, nature and source of origin. The tertiary structures of these peptides, which contain natural as well as chemically modified residues, were predicted using PEPstrMOD and I-TASSER. In addition to structural information, the database maintains other properties of the peptides, such as physicochemical properties. Numerous web-based tools have been integrated for data retrieval, browsing, sequence similarity search and peptide mapping. To assist a wide range of users, we developed a responsive website suitable for smartphones, tablets and desktops.
Database URL: http://webs.iiitd.edu.in/raghava/antitbpdb/


Introduction
Tuberculosis (TB), caused by Mycobacterium tuberculosis (M. tuberculosis), is one of the deadliest diseases of mankind. According to the WHO report 2016, TB infected 10.4 million people worldwide and caused the death of 1.8 million people in 2015. This makes TB the world's second deadliest disease after HIV/AIDS. The introduction of the first anti-TB drugs, isoniazid and streptomycin, some 50 years ago led to the optimism that TB could be eradicated. Owing to this perception, the pharmaceutical industry has developed only a few drugs for the treatment of TB in the past few years. Since the 1980s, TB has undergone a resurgence driven by a number of factors, such as the increase in immunosuppressed patients (1). The current TB treatment strategy is far from satisfactory: the overall treatment duration is long, approximately 12 months (2), and it requires daily administration of drugs that are toxic and less effective against multi-drug or extensively drug-resistant TB, as reflected by its high mortality rate (15%). There is an urgent need to discover or design effective drugs against tuberculosis, particularly against multi-drug-resistant (MDR-TB), extensively drug-resistant (XDR-TB) and emerging extremely drug-resistant (XXDR-TB) strains (3). Owing to the continuing failure of antibiotics against drug-resistant bacteria, the pharmaceutical industry is looking for alternative strategies. One possible alternative to antibiotics (traditional drugs based mainly on small molecules) is peptide-based therapeutics. A large number of peptides from plant, bacterial and fungal sources have been shown to have anti-microbial action and can be used as a supplement or alternative to conventional antibiotics (4). In the last few decades, researchers have screened several peptides effective against Mycobacterium, which have been used for TB treatment in different therapeutic strategies: as single anti-TB agents (5), in combination with conventional drugs (6) and for a synergistic effect with traditional antibiotic therapy (7).
© The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
The emergence of anti-tubercular peptides as promising therapeutic candidates is due to their specific molecular action, such as selective affinity towards the cell envelope, low toxicity, low immunogenicity and a targeted immune response against invading pathogens (8,9). Several anti-tubercular peptides have shown promising results in various preclinical studies, but the information about them is scattered in the literature and difficult to access. The huge therapeutic importance of anti-TB peptides motivated us to build a database fully dedicated to anti-tubercular or anti-mycobacterial peptides. Several mutant protein or protein complex databases (10,11), peptide databases of therapeutic importance (12-18) and anti-microbial databases (19-23) that cover some anti-mycobacterial peptides exist but, to the best of our knowledge, no single repository is fully dedicated to anti-tubercular or anti-mycobacterial peptides.

Data acquisition
Anti-tubercular peptides were manually collected from research articles and patents. A PubMed search combining keywords such as 'anti-tubercular peptide', 'anti-tuberculosis peptide', 'anti-mycobacterial peptides' and 'anti-microbial peptides against M. tuberculosis' returned 10 652 articles. All the articles were manually screened for relevant experimental information, and 900 articles were retained. Reviews and articles lacking relevant information were then excluded, and the final data were curated from 96 research articles. Similarly, a search of the USPTO with the same keywords returned 35 patents; after careful reading, data were curated from the 5 patents that were openly available in English.
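The keyword combination and the screening funnel described above can be sketched as follows. This is only an illustration: the text does not state how the keywords were combined, so joining them with OR is an assumption, and `build_query` is a hypothetical helper rather than part of any PubMed client.

```python
# Keywords quoted from the text; combining them with OR is an assumption.
KEYWORDS = [
    "anti-tubercular peptide",
    "anti-tuberculosis peptide",
    "anti-mycobacterial peptides",
    "anti-microbial peptides against M. tuberculosis",
]


def build_query(keywords):
    """Quote each phrase and join the phrases with OR (hypothetical helper)."""
    return " OR ".join(f'"{kw}"' for kw in keywords)


# Article counts reported in the text at each curation stage.
FUNNEL = [
    ("PubMed hits", 10652),
    ("articles retained after manual screening", 900),
    ("research articles used for curation", 96),
]

query = build_query(KEYWORDS)
print(query)
for stage, count in FUNNEL:
    share = 100 * count / FUNNEL[0][1]
    print(f"{stage}: {count} ({share:.2f}% of initial hits)")
```

The funnel loop simply expresses each reported count as a share of the initial hits, which makes the strictness of the manual screening explicit.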

Database architecture and web interface
AntiTbPdb is built on the Apache HTTP server (version 2.2.17), installed on a machine running the Ubuntu operating system. The responsive front-end, which is suitable for mobile, tablet and desktop, was developed using HTML5, CSS3, PHP5 and JavaScript. MySQL (a relational database management system, version 5.0.51b) was used at the back-end to manage the data. The architecture of AntiTbPdb is shown in Figure 1.

Database organization
Broadly, the data in this repository can be divided into two categories, primary and secondary. Primary data comprise manually curated information from research articles and patents, such as the peptide name, its sequence, terminal and chemical modifications, its nature and its source of origin. In addition, the Mycobacterium species and corresponding strain, the cell lines and animal models used to test the efficacy of the peptide, the inhibition concentration and the cytotoxicity are maintained under this category. We also carefully looked for immune responses directed against anti-TB peptides and included this information in the database wherever it was available.
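To make the primary data fields more concrete, the following is a minimal sketch of how such an entry table might be laid out. It uses Python's built-in sqlite3 module as a stand-in for the MySQL back-end so that it runs without a server; the table name, column names and the inserted record are all hypothetical placeholders, not the actual AntiTbPdb schema or data.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL back-end (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table whose columns mirror the primary fields listed in the text.
conn.execute(
    """
    CREATE TABLE peptide_entry (
        entry_id                 INTEGER PRIMARY KEY,
        name                     TEXT,
        sequence                 TEXT,
        n_terminal_mod           TEXT,
        c_terminal_mod           TEXT,
        chemical_mod             TEXT,
        nature                   TEXT,
        source                   TEXT,
        species                  TEXT,
        strain                   TEXT,
        cell_line                TEXT,
        animal_model             TEXT,
        inhibition_concentration TEXT,
        cytotoxicity             TEXT,
        reference                TEXT
    )
    """
)

# Placeholder record for illustration only; it is not a real database entry.
conn.execute(
    "INSERT INTO peptide_entry (name, sequence, species, reference) "
    "VALUES (?, ?, ?, ?)",
    ("example-peptide", "KKLLKKLL", "M. tuberculosis", "placeholder"),
)

# Parameterized query: retrieve the entries tested against a given species.
rows = conn.execute(
    "SELECT name, sequence FROM peptide_entry WHERE species = ?",
    ("M. tuberculosis",),
).fetchall()
print(rows)
```

A single flat table keeps the sketch short; a production schema would more plausibly split species, strains and references into separate tables linked by keys.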
Secondary data comprise information derived from the primary data, such as the physicochemical properties, amino acid composition and amino acid frequency of anti-tubercular peptides. In-house PERL scripts were used to calculate frequency, composition and physicochemical properties. Since structure plays an important role in elucidating peptide function, we have also annotated the structure of most of the peptides stored in AntiTbPdb. The following strategies were adopted for extracting peptide structures from existing resources and for predicting them. First, existing repositories of experimentally determined structures, such as PDB and PubChem, were examined to assign structures to anti-tubercular peptides. Second, the peptide structure prediction method PEPstrMOD (24) was used for the remaining peptides with a sequence length of up to 25 amino acids. PEPstrMOD handles natural as well as modified amino acids, such as ornithine and N-methyl alanine. Finally, we used I-TASSER (25) to predict the structure of the remaining peptides.
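The three-tier structure annotation strategy can be summarized as a simple decision rule. The sketch below is an illustration under stated assumptions: `choose_structure_method` is a hypothetical function name, and the rule ignores special cases such as modifications that fall outside the scope of PEPstrMOD.

```python
def choose_structure_method(sequence_length, has_experimental_structure):
    """Pick the structure source for a peptide, following the order in the text.

    1. Use an experimentally determined structure when one exists.
    2. Otherwise use PEPstrMOD for peptides of up to 25 residues.
    3. Otherwise fall back to I-TASSER.
    """
    if has_experimental_structure:
        return "experimental (PDB or PubChem)"
    if sequence_length <= 25:
        return "PEPstrMOD"
    return "I-TASSER"


# Example calls with illustrative lengths.
print(choose_structure_method(12, has_experimental_structure=True))
print(choose_structure_method(25, has_experimental_structure=False))
print(choose_structure_method(70, has_experimental_structure=False))
```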

Data retrieval tools
Searching
This module of AntiTbPdb has been designed to facilitate easy searching of the data through simple and advanced search options. In the simple search module, the user can query any field of the database, such as name, sequence, inhibition concentration, Mycobacterium species, N- or C-terminal modification, chemical modification, mechanism of action, target and PMID or patent number. The output can be customized according to the search query. In the advanced search module, the user can submit multiple queries simultaneously, combined with Boolean operators (AND, OR and NOT).
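The Boolean combination of field queries can be illustrated with a small in-memory filter. This is a sketch rather than the server's implementation: the function names, the field names and the toy records are hypothetical, and matching is assumed to be a case-insensitive substring test.

```python
def field_contains(record, field, term):
    """Case-insensitive substring match on one field of a record."""
    return term.lower() in str(record.get(field, "")).lower()


def advanced_search(records, must=(), any_of=(), must_not=()):
    """Combine (field, term) conditions with AND, OR and NOT semantics."""
    hits = []
    for record in records:
        ok_and = all(field_contains(record, f, t) for f, t in must)
        ok_or = not any_of or any(field_contains(record, f, t) for f, t in any_of)
        ok_not = not any(field_contains(record, f, t) for f, t in must_not)
        if ok_and and ok_or and ok_not:
            hits.append(record)
    return hits


# Toy records for illustration only; they are not real database entries.
RECORDS = [
    {"name": "toy-peptide-1", "species": "M. tuberculosis", "c_terminal_mod": "amidation"},
    {"name": "toy-peptide-2", "species": "M. smegmatis", "c_terminal_mod": "free"},
    {"name": "toy-peptide-3", "species": "M. tuberculosis", "c_terminal_mod": "free"},
]

# species contains "tuberculosis" AND NOT c_terminal_mod contains "amidation"
hits = advanced_search(
    RECORDS,
    must=[("species", "tuberculosis")],
    must_not=[("c_terminal_mod", "amidation")],
)
print([record["name"] for record in hits])
```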

Browsing
A user-friendly browsing interface has been developed to facilitate easy retrieval of information. We have computed the physicochemical properties of each anti-tubercular peptide, such as hydrophobicity, aliphaticity, aromaticity, polar or neutral charge, and negative or positive charge.
Users can browse peptide entries by the desired physicochemical properties. Various cross-linked browsable tables are provided for easy access to the data. Users can browse on five major fields: (i) peptides (e.g. peptide type, source, chirality and nature), (ii) modification of peptides (e.g. N- or C-terminal modification, chemical modification), (iii) length of peptide, (iv) Mycobacterium species (classified as pathogenic, non-pathogenic and opportunistic) and (v) moonlight: many peptides have shown bactericidal activity against more than one Mycobacterium species and, for the convenience of users, these can be browsed under the moonlight category.
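The moonlight category amounts to grouping entries by peptide and keeping those tested against more than one species. A minimal sketch follows, assuming each entry is reduced to a (peptide, species) pair; the function name and the toy entries are hypothetical.

```python
from collections import defaultdict


def moonlighting_peptides(entries):
    """Return peptides with reported activity against more than one species.

    `entries` is an iterable of (peptide_name, species) pairs.
    """
    species_by_peptide = defaultdict(set)
    for peptide, species in entries:
        species_by_peptide[peptide].add(species)
    return {
        peptide: sorted(species)
        for peptide, species in species_by_peptide.items()
        if len(species) > 1
    }


# Toy entries for illustration only; they are not real database records.
ENTRIES = [
    ("toy-peptide-1", "M. tuberculosis"),
    ("toy-peptide-1", "M. marinum"),
    ("toy-peptide-2", "M. smegmatis"),
    ("toy-peptide-2", "M. smegmatis"),
]

moonlight = moonlighting_peptides(ENTRIES)
print(moonlight)
```

Using a set per peptide means that repeated entries against the same species, as for the second toy peptide, do not count as moonlighting.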

Composition
In addition to the sequences of anti-tubercular peptides, their most relevant physicochemical properties, such as charge, hydrophobicity, amphipathicity and aromaticity, are calculated. This analysis tool helps users to analyse and retrieve peptides with a desired amino acid composition or frequency and desired physicochemical properties. The tool has five modules: (i) amino acid (AA) composition, (ii) AA frequency, (iii) physicochemical property (PP) composition, (iv) PP frequency and (v) secondary structure (SS) composition. The SS composition module lets users explore anti-tubercular peptides on the basis of their secondary structure composition. For example, with this tool a user can retrieve all anti-tubercular peptides that have a high proportion of turn or helical states.
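The AA and PP modules can be illustrated with a short calculation. The sketch below is written in Python rather than the in-house PERL scripts, takes frequency to mean a residue count and composition to mean a percentage of sequence length, and uses illustrative residue groupings; all of these are assumptions, since the text does not give the exact definitions used by the database.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Illustrative residue groupings (assumption); the database may define them differently.
PROPERTY_GROUPS = {
    "positive": set("KRH"),
    "negative": set("DE"),
    "aromatic": set("FWY"),
    "aliphatic": set("ILV"),
}


def aa_frequency(sequence):
    """Count occurrences of each of the 20 standard residues."""
    counts = Counter(sequence.upper())
    return {aa: counts.get(aa, 0) for aa in AMINO_ACIDS}


def aa_composition(sequence):
    """Percentage of each standard residue relative to sequence length."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    length = len(sequence)
    return {aa: 100.0 * n / length for aa, n in aa_frequency(sequence).items()}


def property_composition(sequence):
    """Percentage of residues that fall into each property group."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    return {
        name: 100.0 * sum(1 for residue in seq if residue in members) / len(seq)
        for name, members in PROPERTY_GROUPS.items()
    }


# Toy sequence for illustration only.
print(aa_composition("KKLL")["K"])
print(property_composition("KKLL"))
```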

Sequence alignment
To support sequence-similarity-based search, we have integrated several alignment-based web tools, namely BLAST and the Smith-Waterman algorithm. Users submit their protein or peptide in FASTA format with the desired or default BLAST parameters (26). The server runs a BLAST search of the query sequence against the primary structure (sequence) of all the peptides stored in the database. The Smith-Waterman algorithm (27) has also been integrated. In addition to sequence-similarity-based search, sequence mapping based on identical residues is implemented in the web server as sub-search and super-search. The sub-search module can be used to map a query peptide against all the peptides deposited in AntiTbPdb, whereas super-search allows peptide mapping and identification of segments of the query that are identical to peptides stored in AntiTbPdb.
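The two mapping modes can be sketched with exact substring tests. The interpretation used here is an assumption: sub-search is taken to return stored peptides that contain the query as a contiguous segment, and super-search to return stored peptides that occur as a contiguous segment within the query. Function names and toy sequences are hypothetical.

```python
def sub_search(query, database):
    """Stored peptides that contain the query as a contiguous segment."""
    query = query.upper()
    return [name for name, seq in database.items() if query in seq.upper()]


def super_search(query, database):
    """Stored peptides that occur as a contiguous segment of the query.

    Returns (name, start) pairs, where start is the 0-based position of the
    first identical match in the query.
    """
    query = query.upper()
    hits = []
    for name, seq in database.items():
        start = query.find(seq.upper())
        if start != -1:
            hits.append((name, start))
    return hits


# Toy sequences for illustration only; they are not real database entries.
DATABASE = {
    "toy-peptide-1": "KKLLKKLL",
    "toy-peptide-2": "GLFD",
    "toy-peptide-3": "RRWW",
}

print(sub_search("LLKK", DATABASE))
print(super_search("AAGLFDAARRWW", DATABASE))
```

Both functions require identical residues, in line with the text; tolerance for mismatches is what the BLAST and Smith-Waterman options provide.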

Results
AntiTbPdb is a unique repository of experimentally verified anti-tubercular or anti-mycobacterial peptides. It contains 1010 entries for 542 unique peptides. It covers a diverse range of anti-mycobacterial or anti-tubercular peptide entries, including linear (668) and cyclic (279) peptides, and peptides with L-amino acids (771), D-amino acids (124) or both L- and D-amino acids (44). Most of the entries are peptides screened from natural sources (434) or derived from specific proteins (112). Analysis revealed that most entries (672) are short, ranging from 6 to 20 amino acids in length, but some peptide fragments derived from proteins (e.g. RNase) have as many as 70 amino acids (Figure 2). The peptides are also classified by their nature, such as cationic, amphipathic or basic. In AntiTbPdb, around 619 entries are cationic and 51 are amphipathic. In addition, many are hydrophobic (67) or basic (19) in nature.
Researchers have made several modifications to enhance the efficacy of peptides against Mycobacterium species. Both terminal (N- and C-terminal) and chemical modifications, such as the use of non-natural amino acids, disulphide linkages and the incorporation of other chemical groups (didehydroaminobutyric acid, 2-aminobutyric acid, tetrahydroisoquinolinecarboxylic acid etc.), have been explored to assess the efficacy of peptides against several Mycobacterium species. Most of the entries (550) were studied against M. tuberculosis, but the high cost and the requirement for a well-equipped biosafety level 3 facility have led researchers to conduct their research using Mycobacterium smegmatis. Therefore, a large number of studies have been conducted against M. smegmatis (202). In addition, several other Mycobacterium species have been used to assess the efficacy of peptides (Figure 3). We have incorporated a moonlighting browse facility, which helps users to find all the peptides that are active against more than one Mycobacterium species. For example, a user can search for peptides that have been studied against both M. tuberculosis and M. marinum.

Comparison with other anti-microbial peptide databases
Several anti-microbial peptide databases exist. We examined three heavily used and recently updated databases of anti-microbial peptides, APD3, DBAASP v2 and CAMPR3, and observed that a large number of anti-tubercular and anti-mycobacterial peptides are not covered in them. Of the 542 unique peptides in AntiTbPdb, around 78, 82 and 310 were covered in APD3, CAMPR3 and DBAASP v2, respectively. Thus APD3 and CAMPR3 each cover only around 15% of the peptides maintained in AntiTbPdb. In addition, AntiTbPdb provides more information about each anti-tubercular and anti-mycobacterial peptide than the other resources, although all three databases cover a large number of anti-microbial peptides in general. In summary, AntiTbPdb will complement existing databases by providing tuberculosis-specific information.
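The coverage figures quoted above follow from simple arithmetic on the reported counts, which can be reproduced as below. The counts are taken from the text; the variable names are arbitrary.

```python
# Unique peptides in AntiTbPdb and the approximate number covered elsewhere.
TOTAL_UNIQUE = 542
COVERED = {"APD3": 78, "CAMPR3": 82, "DBAASP v2": 310}

# Percentage of AntiTbPdb peptides covered by each database, to one decimal.
coverage = {db: round(100 * n / TOTAL_UNIQUE, 1) for db, n in COVERED.items()}

for db, pct in coverage.items():
    print(f"{db}: {pct}% of AntiTbPdb unique peptides")
```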

Limitations and update of AntiTbPdb
AntiTbPdb also provides structural information for peptides, but some peptide sequences carry modifications and other moieties that are beyond the scope of the prediction techniques used in PEPstrMOD. In addition, there are a few peptides and peptide antibiotics whose sequences are not available in the literature; these have been stored without sequence information. Each entry will be updated as soon as its sequence becomes known.
The web server allows users to submit new anti-mycobacterial peptide entries online by filling in an HTML form. To maintain the quality of the resource, we will verify each new entry before it is added. Our team is also searching the published literature and adding new anti-tubercular peptide entries to the database. We will attempt to update the database regularly, twice a year.

Discussion
Despite extensive research on Mycobacterium, it remains a major concern for mankind. The emergence of drug-resistant strains and the limitations of conventional therapy have compelled researchers to work further towards fully eradicating this disease. Anti-mycobacterial peptides have emerged as potential therapeutic candidates. Most current anti-mycobacterial peptides are derived from natural sources, such as bacterial extracts, mycobacteriophages or host immune cells. Peptides derived from proteins of the innate immune system, such as HNPs and defensins, have already been studied for their effectiveness against a broad range of viruses, fungi and bacteria, including M. tuberculosis (39). The major mechanism behind the bactericidal activity of peptides is membrane disruption through pore formation. In addition, many anti-tubercular peptides act on a secondary non-membrane target, which makes them effective against M. tuberculosis; for example, HNP-1 kills M. tuberculosis by disrupting the cell envelope as well as by targeting DNA (40). One of the major reasons for drug resistance is thought to be the inability of drugs to penetrate the cell wall. It is also well known that Mycobacterium persists in host cells and modulates the host immune response in its own favor; effective anti-TB agents must therefore be able to penetrate macrophages and kill the bacteria intracellularly without harming other cells. In this context, anti-mycobacterial peptides have proven very effective by virtue of their affinity towards the cell envelope, their low immunogenicity and their diverse modes of action through interaction with secondary intracellular targets such as nucleic acids, enzymes and even organelles (37,41,42). Poor in vivo stability, low oral bioavailability and short half-life are the major obstacles to their use as therapeutics. These have been addressed by introducing structural constraints and non-natural amino acids, as well as by several chemical modifications.
Understanding the structure-function relationship of peptides, together with their molecular mechanism of action and their immune-modulating properties in infectious disease, will be very useful in future drug discovery. We therefore anticipate that AntiTbPdb will be a very useful resource for researchers designing novel anti-tubercular peptides and further characterizing existing effective ones.