Abstract

Motivation

Identification of enhancers and their strength is important because they play a critical role in controlling gene expression. Although some bioinformatics tools have been developed for this purpose, most of them can only discriminate enhancers from non-enhancers. Recently, a two-layer predictor called ‘iEnhancer-2L’ was developed that can also predict an enhancer’s strength. However, its prediction quality needs further improvement to enhance its practical application value.

Results

A new predictor called ‘iEnhancer-EL’ is proposed that contains two layers of predictors: the first (for identifying enhancers) is formed by fusing an array of six key individual classifiers, and the second (for determining their strength) by fusing an array of ten key individual classifiers. All these key classifiers were selected from 171 elementary classifiers, each an SVM (Support Vector Machine) based on kmer, subsequence profile or PseKNC (Pseudo K-tuple Nucleotide Composition) features. Rigorous cross-validations have indicated that the proposed predictor is remarkably superior to the existing state-of-the-art predictor in this area.

Availability and implementation

A web server for the iEnhancer-EL has been established at http://bioinformatics.hitsz.edu.cn/iEnhancer-EL/, by which users can easily get their desired results without the need to go through the mathematical details.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

Enhancers are noncoding DNA fragments, but they play a key role in controlling gene expression for the production of RNA and proteins (Omar et al., 2017). Enhancers can be located up to 20 kb away from a gene, or even on a different chromosome (Liu et al., 2016a), whereas promoters (a class of gene-proximal elements) are located near the transcription start sites of genes. This locational difference makes the identification of enhancers much more challenging than that of promoters.

In the early days, identification of enhancers was carried out purely by experimental techniques, such as the pioneering works reported in Heintzman and Ren (2009) and Boyle et al. (2011). The former detected enhancers via their association with TFs (transcription factors) such as P300 (Heintzman et al., 2007; Visel et al., 2009), and hence would miss or under-detect the targets concerned because not all enhancers are occupied by TFs, resulting in a high false negative rate (Chen et al., 2007). The latter identified enhancers via DNase I hypersensitivity, and hence some other DNA segments or non-enhancers might be incorrectly detected as enhancers (Liu et al., 2016a, 2018b), leading to a high false positive rate (Chen et al., 2007). Although the follow-up techniques of genome-wide mapping of histone modifications (Ernst et al., 2011; Erwin et al., 2014; Fernández and Miranda-Saavedra, 2012; Firpi et al., 2010; Kleftogiannis et al., 2015; Rajagopal et al., 2013) can alleviate the aforementioned shortcomings in detecting enhancers and promoters and improve the detection rate, they are expensive and time-consuming.

To rapidly identify enhancers in genomes, several computational prediction methods have been developed, including CSI-ANN (Firpi et al., 2010), EnhancerFinder (Erwin et al., 2014), RFECS (Rajagopal et al., 2013), EnhancerDBN (Bu et al., 2017) and BiRen (Yang et al., 2017). These bioinformatics tools differ from each other in the sample formulation and/or operational algorithm used during the 2nd and/or 3rd steps of the 5-step rule (Chou, 2011). For instance, CSI-ANN (Firpi et al., 2010) is featured by ‘efficient data transformation’ for formulating the samples and the Artificial Neural Network (ANN) algorithm; EnhancerFinder (Erwin et al., 2014) by incorporating evolutionary conservation information into the sample formulation and the combined multiple kernel learning algorithm; RFECS (Rajagopal et al., 2013) by the random forest algorithm; EnhancerDBN (Bu et al., 2017) is based on the deep belief network; and BiRen (Yang et al., 2017) improved the predictive performance by using deep learning techniques. Using these bioinformatics tools, users can easily obtain their desired data. However, enhancers are a large group of functional elements formed by many different subgroups (Shlyueva et al., 2014), such as strong enhancers, weak enhancers, poised enhancers, inactive enhancers, etc. iEnhancer-2L (Liu et al., 2016a) is the first predictor ever developed that is able to identify both enhancers and their strength based on sequence information alone, and hence has been increasingly used in genomics analysis. iEnhancer-2L is featured by the pseudo K-tuple nucleotide composition (PseKNC) (Chen et al., 2014, 2015a).
Later, this approach was further improved by incorporating other sequence-based features. For example, EnhancerPred (Jia and He, 2016) and EnhancerPred2.0 (He and Jia, 2017) employed bi-profile Bayes (Shao et al., 2009), pseudo-nucleotide composition (Chen et al., 2014) and electron–ion interaction pseudopotentials of nucleotides (Nair and Sreenadhan, 2006).

However, the success rates of these predictors need to be further improved, particularly in discriminating the strong enhancers from the weak ones. This study was initiated in an attempt to deal with this problem.

According to Chou's 5-step rule (Chou, 2011), which has been followed by a series of recent studies (see e.g. Cheng et al., 2018a; Feng et al., 2017; Liu et al., 2017a,b,c, 2018b; Song et al., 2018b; Xiao et al., 2017; Xu et al., 2017), to develop a really useful predictor for a biological system, one should make the following five steps logically very clear: (i) benchmark dataset construction or selection, (ii) sample formulation, (iii) operation engine or algorithm, (iv) cross-validation and (v) web-server.

Below, let us elaborate the five steps one by one.

2 Materials and methods

2.1 Benchmark dataset

For facilitating comparison, the benchmark dataset S used in this study was taken from Liu et al. (2016a); it can be formulated as
S = S^+ ∪ S^-,  S^+ = S_strong^+ ∪ S_weak^+
(1)
where the subset S^+ contains 1484 enhancer samples, S^- contains 1484 non-enhancer samples, S_strong^+ contains 742 strong enhancer samples, S_weak^+ contains 742 weak enhancer samples, and ∪ is the symbol for union in set theory. For readers’ convenience, the detailed sequences for the aforementioned samples are given in Supplementary Information S1.

2.2 Sample formulation

One of the prerequisites in developing an effective bioinformatics predictor is how to formulate a biological sequence as a discrete model or vector while still largely retaining its sequence-order information or key pattern characteristics. This is because all the existing machine-learning algorithms can only handle vectors, not sequences, as elucidated in a comprehensive review (Chou, 2015). However, a vector defined in a discrete model may completely lose the sequence-pattern information (Chou, 2001a). To avoid this, here the DNA sequence samples were converted into vectors via the BioSeq-Analysis tool (Liu, 2018) to incorporate the information of kmer (Liu et al., 2016b), subsequence profile (Lodhi et al., 2002; Luo et al., 2016; Yasser et al., 2008) and pseudo k-tuple nucleotide composition (PseKNC) (Chen et al., 2014, 2015b), as detailed below.

2.2.1 Kmer

Kmer (Liu et al., 2016b) is the simplest approach to represent DNA sequences, in which a DNA sequence is represented by the occurrence frequencies of its k neighbouring nucleic acids. According to the sequential model, a DNA sample with L nucleotides is generally expressed by
D = N_1 N_2 ⋯ N_i ⋯ N_L
(2)
where N_1 denotes the 1st nucleotide at sequence position 1, N_2 the 2nd nucleotide at position 2, and so forth. Each of them can be any of the four nucleotides; i.e.
N_i ∈ {A (adenine), C (cytosine), G (guanine), T (thymine)}
(3)
where ∈ is a symbol in set theory meaning ‘member of’. If using kmer to represent the DNA sequence of Eq. 2, we have (Chen et al., 2014; Liu et al., 2015)
D = [f_1^kmer, f_2^kmer, …, f_i^kmer, …, f_{4^k}^kmer]^T
(4)
where f_i^kmer (i = 1, 2, …, 4^k) is the occurrence frequency of the i-th k-tuple of neighbouring nucleotides in the DNA sequence D, and T is the transpose operator. For example, when k = 3, Eq. 4 becomes the 3mer vector
D = [f_AAA, f_AAC, f_AAT, …, f_TTT]^T = [f_1^3mer, f_2^3mer, f_3^3mer, …, f_64^3mer]^T
(5)

There is one parameter (k) in the kmer approach.
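The kmer vector of Eqs. 4 and 5 can be sketched in a few lines; this is a minimal illustration of the feature, not the BioSeq-Analysis implementation:

```python
from itertools import product

def kmer_features(seq, k=3):
    """Occurrence frequencies of all 4^k k-tuples (Eq. 4), in lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {km: i for i, km in enumerate(kmers)}
    counts = [0] * len(kmers)
    for i in range(len(seq) - k + 1):
        km = seq[i:i + k]
        if km in index:              # skip windows containing ambiguous bases
            counts[index[km]] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts

vec = kmer_features("ACGTACGTAC", k=3)   # 64 components for k = 3, as in Eq. 5
```

For the toy sequence above, the 3-mer 'ACG' occurs in 2 of the 8 sliding windows, so its component equals 0.25.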

2.2.2 Subsequence profile

The subsequence profile (Lodhi et al., 2002; Luo et al., 2016; Yasser et al., 2008) allows non-continuous mismatching, which may improve on the kmer approach in dealing with residue mutation, deletion and replacement during the biological sequence evolutionary process. Its detailed formulation has been clearly elaborated in Luo et al. (2016), and hence there is no need to repeat it here.

The subsequence profile contains two parameters k and δ; the latter is used to reflect the mismatch’s extent (Luo et al., 2016).

2.2.3 Pseudo k-tuple nucleotide composition

According to the pseudo k-tuple nucleotide composition or PseKNC (Chen et al., 2014), the DNA sequence of Eq. 2 can be formulated as
D = [f_1^PseKNC, f_2^PseKNC, …, f_{4^k}^PseKNC, f_{4^k+1}^PseKNC, …, f_{4^k+λ}^PseKNC]^T
(6)
where each of the components as well as the parameters k and λ have been very clearly defined in an original paper (Chen et al., 2014) and a comprehensive review (Chen et al., 2015a) via a series of sophisticated equations, and there is no need to repeat here. The essence is: it is through PseKNC that we are able to incorporate into Eq. 6 both the short-range or local sequence order information (via kmer) and the long-range or global sequence pattern information [via the concept of pseudo components (Chou, 2001a) and the six physicochemical properties of the dinucleotide in DNA (Chen et al., 2014) as given in Supplementary Information S2]. In this study, these properties were normalized following the method reported in Chen et al. (2014).

There are three parameters in PseKNC (Chen et al., 2014): k, w (the weight factor) and λ [the number of sequence correlations considered (Chou, 2005)].
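A minimal sketch of the PseKNC vector of Eq. 6 follows. Note the assumptions: it uses a single made-up, already-normalized dinucleotide property table (the real predictor uses the six physicochemical properties of Supplementary Information S2), and the standard PseKNC correlation-factor form from Chen et al. (2014):

```python
from itertools import product

DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]
# Placeholder property values for illustration only (NOT the real properties).
PROP = {d: (i - 7.5) / 7.5 for i, d in enumerate(DINUCS)}

def theta(seq, lam):
    """Sequence-order correlation factors theta_1..theta_lam."""
    L = len(seq)
    thetas = []
    for j in range(1, lam + 1):
        corr = [(PROP[seq[i:i + 2]] - PROP[seq[i + j:i + j + 2]]) ** 2
                for i in range(L - j - 1)]
        thetas.append(sum(corr) / len(corr))
    return thetas

def pseknc(seq, k=2, lam=3, w=0.1):
    """PseKNC vector of dimension 4^k + lam (Eq. 6)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {km: 0 for km in kmers}
    for i in range(len(seq) - k + 1):
        counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    freqs = [counts[km] / total for km in kmers]
    th = theta(seq, lam)
    denom = sum(freqs) + w * sum(th)
    # First 4^k components carry the local kmer information, the last lam
    # components the global (long-range) sequence-order information.
    return [f / denom for f in freqs] + [w * t / denom for t in th]

vec = pseknc("ACGTACGTACGTACGT", k=2, lam=3, w=0.1)   # dimension 4^2 + 3 = 19
```

By construction the components sum to 1, and the dimension 4^k + λ matches the feature-vector dimensions listed in Tables 1 and 2.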

2.3 Operation engine

In this study we chose the SVM (Support Vector Machine) as the operation engine. SVM is a machine-learning algorithm that has been widely used in the realm of bioinformatics (see e.g. Chen et al., 2013, 2016; Ehsan et al., 2018; Khan et al., 2017; Liu et al., 2014; Meher et al., 2017; Rahimi et al., 2017; Tahir et al., 2017). For a brief formulation of SVM and how it works, see Cai et al. (2003) and Chou and Cai (2002); for more details, see the monograph by Cristianini and Shawe-Taylor (2000).

The LIBSVM package (Chang and Lin, 2011) with the radial basis function (RBF) kernel was used to implement the learning machine, in which there are two parameters C (for the regularization) and γ (for the kernel width), which will be given later via an optimization approach.

Accordingly, when using SVM on kmer, subsequence profile, or PseKNC, we have a total of (2 + 1) = 3, (2 + 2) = 4 or (2 + 3) = 5 uncertain parameters, respectively. The values for the two SVM-related parameters C and γ are determined by the final optimization as will be given later.
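The RBF-kernel SVM with a power-of-two grid over C and γ can be sketched with scikit-learn (whose SVC wraps LIBSVM). The data below are random stand-ins for the feature vectors of Section 2.2, and the grid is coarser than the one implied by Tables 1 and 2:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Toy stand-in data: 100 random 64-dimensional feature vectors, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))
y = rng.integers(0, 2, size=100)

# Power-of-two grid mirroring the C = 2^c, gamma = 2^g values of Tables 1-2.
grid = {"C": [2.0 ** c for c in (-5, -1, 3, 7, 10)],
        "gamma": [2.0 ** g for g in (-9, -5, -1, 3)]}
search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

The selected (C, γ) pair then plays the role of the "final optimization" values reported in the table footnotes.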

For the kmer approach with
k = 1, 2, 3, 4, 5, 6
(7)
we can form six elementary classifiers, denoted by
C_0(i), (i = 1, 2, …, 6)
(8)
For the subsequence profile approach with
1 ≤ k ≤ 3 with step 1;  0.1 ≤ δ ≤ 1 with step 0.2
(9)
we can form 15 elementary classifiers, denoted by
C_0(i), (i = 7, 8, …, 21)
(10)
For the PseKNC approach with
1 ≤ k ≤ 6 with step 1;  0.1 ≤ w ≤ 1 with step 0.2;  1 ≤ λ ≤ 17 with step 4
(11)
we can form 150 elementary classifiers, denoted by
C_0(i), (i = 22, 23, …, 171)
(12)

Therefore, we have a total of (6 + 15 + 150) = 171 different elementary classifiers.
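The parameter grids of Eqs. 7, 9 and 11 and the resulting total of 171 elementary classifiers can be verified with a short enumeration (grid boundaries taken from the equations above):

```python
from itertools import product

def frange(start, stop, step):
    """Inclusive float range used for the w and delta grids."""
    vals, x = [], start
    while x <= stop + 1e-9:
        vals.append(round(x, 10))
        x += step
    return vals

kmer_grid = [{"k": k} for k in range(1, 7)]                 # Eq. 7 -> 6
subseq_grid = [{"k": k, "delta": d}                         # Eq. 9 -> 15
               for k, d in product(range(1, 4), frange(0.1, 1.0, 0.2))]
pseknc_grid = [{"k": k, "w": w, "lam": l}                   # Eq. 11 -> 150
               for k, w, l in product(range(1, 7),
                                      frange(0.1, 1.0, 0.2),
                                      range(1, 18, 4))]
total = len(kmer_grid) + len(subseq_grid) + len(pseknc_grid)
print(len(kmer_grid), len(subseq_grid), len(pseknc_grid), total)
```

Note that the grids reproduce the parameter values appearing in the footnotes of Tables 1 and 2 (e.g. λ ∈ {1, 5, 9, 13, 17} and w ∈ {0.1, 0.3, 0.5, 0.7, 0.9}).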

2.4 Ensemble learning

As demonstrated by a series of previous studies (Chou and Shen, 2006a; Jia et al., 2015, 2016a; Liu et al., 2016b, 2017a; Qiu et al., 2017), an ensemble predictor formed by fusing an array of individual predictors via a voting system can yield much better prediction quality.

There are two fundamental issues in developing an ensemble-learning predictor: one is how to select the key individual classifiers from the elementary ones so as to reduce the noise, and the other is how to fuse the selected key classifiers into one final classifier. Inspired by previous works (Lin et al., 2014a; Liu et al., 2016b, 2017a), the treatment of these issues is as follows: the ‘affinity propagation clustering algorithm’ (Frey and Dueck, 2007) is used to cluster the elementary classifiers into a set of groups (Fig. 1a), from which the key classifiers are then selected (Fig. 1b). For those who are interested in the detailed process, see Supplementary Information S3.

Fig. 1.

An illustration to show (a) how the elementary classifiers were clustered into a set of groups, and (b) how to select the key classifiers from these groups
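The grouping step of Fig. 1a can be sketched with scikit-learn's AffinityPropagation. The prediction matrix below is synthetic (three noisy prototypes standing in for groups of redundant classifiers), not the actual outputs of the 171 classifiers:

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Each row holds one "classifier's" predictions on the same set of samples;
# rows built from the same prototype mimic a group of redundant classifiers.
rng = np.random.default_rng(1)
proto = rng.integers(0, 2, size=(3, 50)).astype(float)
preds = np.vstack([p + rng.normal(0, 0.05, size=50)
                   for p in proto for _ in range(7)])

ap = AffinityPropagation(random_state=0).fit(preds)
exemplars = ap.cluster_centers_indices_   # one exemplar per group (Fig. 1b)
print(len(exemplars), "groups; exemplar rows:", exemplars)
```

Affinity propagation picks one exemplar per cluster, which corresponds to nominating one candidate key classifier per group.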

By doing so, six key individual classifiers were obtained (Table 1) for the 1st-layer prediction to identify enhancers from non-enhancers, as formulated by
C_1(i), (i = 1, 2, …, 6)
(13)
Table 1.

List of the six key individual classifiers selected from the 171 elementary classifiers in Eqs. 8, 10 and 12 by using the affinity propagation clustering algorithm (Frey and Dueck, 2007), as done in Liu et al. (2016a), for the 1st-layer prediction

Key individual classifier    Feature vector            Dimension
C_1(1)                       PseKNC^a                  77
C_1(2)                       PseKNC^b                  81
C_1(3)                       PseKNC^c                  4113
C_1(4)                       Subsequence profile^d     64
C_1(5)                       Kmer^e                    64
C_1(6)                       Kmer^f                    4096

^a The parameters used: k = 3, λ = 13, w = 0.1, C = 2^6, γ = 2^4.
^b The parameters used: k = 3, λ = 17, w = 0.1, C = 2^10, γ = 2^4.
^c The parameters used: k = 6, λ = 17, w = 0.1, C = 2^4, γ = 2^5.
^d The parameters used: k = 3, δ = 0.5, C = 2^-4, γ = 2^-9.
^e The parameters used: k = 3, C = 2^4, γ = 2^3.
^f The parameters used: k = 6, C = 2^1, γ = 2^5.


For the 2nd-layer prediction, ten key individual classifiers (Table 2) were obtained, as formulated by
C_2(i), (i = 1, 2, …, 10)
(14)
Table 2.

List of the ten key individual classifiers selected from the 171 elementary classifiers in Eqs. 8, 10 and 12 by using the affinity propagation clustering algorithm (Frey and Dueck, 2007), as done in Liu et al. (2016a), for the 2nd-layer prediction

Key individual classifier    Feature vector    Dimension
C_2(1)                       PseKNC^a          9
C_2(2)                       PseKNC^b          9
C_2(3)                       PseKNC^c          9
C_2(4)                       PseKNC^d          13
C_2(5)                       PseKNC^e          29
C_2(6)                       PseKNC^f          77
C_2(7)                       PseKNC^g          81
C_2(8)                       PseKNC^h          265
C_2(9)                       Kmer^i            64
C_2(10)                      Kmer^j            4096

^a The parameters used: k = 1, λ = 5, w = 0.1, C = 2^5, γ = 2^2.
^b The parameters used: k = 1, λ = 5, w = 0.7, C = 2^3, γ = 2^5.
^c The parameters used: k = 1, λ = 5, w = 0.9, C = 2^4, γ = 2^5.
^d The parameters used: k = 1, λ = 9, w = 0.9, C = 2^3, γ = 2^4.
^e The parameters used: k = 2, λ = 13, w = 0.1, C = 2^5, γ = 2^5.
^f The parameters used: k = 3, λ = 13, w = 0.3, C = 2^4, γ = 2^5.
^g The parameters used: k = 3, λ = 17, w = 0.7, C = 2^5, γ = 2^5.
^h The parameters used: k = 5, λ = 9, w = 0.7, C = 2^4, γ = 2^5.
^i The parameters used: k = 3, C = 2^3, γ = 2^2.
^j The parameters used: k = 6, C = 2^1, γ = 2^3.


By fusing the six key individual classifiers of Eq. 13, as done in Chou and Shen (2006b) and Shen and Chou (2009), we obtained the 1st-layer ensemble classifier, given by
C_E(1) = C_1(1) ∀ C_1(2) ∀ ⋯ ∀ C_1(6) = ∀_{i=1}^{6} C_1(i)
(15)
Likewise, by fusing the ten key individual classifiers of Eq. 14, we obtained the 2nd-layer ensemble classifier, given by
C_E(2) = C_2(1) ∀ C_2(2) ∀ ⋯ ∀ C_2(10) = ∀_{i=1}^{10} C_2(i)
(16)
where the symbol ∀ in Eqs. 15 and 16 denotes the fusing operator. For more details about the process of fusing individual classifiers into an ensemble classifier, see the comprehensive review (Chou and Shen, 2007), where a clear description with a set of equations is given; there is no need to repeat it here. Meanwhile, the genetic algorithm (Mitchell, 1998) was used to optimize the weight factors on the benchmark datasets, with the population size and the number of evolutionary generations set to 200 and 2000, respectively, for both the 1st and 2nd layers.
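The fusion step can be sketched as a weighted vote over the key classifiers' scores. The uniform weights and the score values below are made-up placeholders; in iEnhancer-EL the weights are the quantities tuned by the genetic algorithm:

```python
def fuse(classifier_scores, weights):
    """Weighted fusion of individual classifier scores (cf. Eqs. 15-16).

    classifier_scores: per-classifier positive-class scores in [0, 1] for one
    query sample; weights: non-negative fusion weights summing to 1.
    """
    score = sum(w * s for w, s in zip(weights, classifier_scores))
    return ("positive" if score >= 0.5 else "negative"), score

# Hypothetical scores from the six 1st-layer key classifiers for one sequence.
scores = [0.9, 0.7, 0.4, 0.8, 0.6, 0.55]
label, s = fuse(scores, [1 / 6] * 6)   # equal weights as the starting point
```

Replacing the uniform weights with GA-optimized ones changes only the `weights` argument, not the fusion rule.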

The proposed predictor for identifying enhancers and their strength is called iEnhancer-EL, where ‘i’ stands for ‘identify’ and ‘EL’ for ‘ensemble learning’. Figure 2 is a flowchart illustrating how the predictor works.

Fig. 2.

A flowchart to illustrate how iEnhancer-EL works

2.5 Cross-validation

To objectively evaluate the performance of a new predictor, we need to consider the following two issues: (i) what metrics should be used to reflect its performance in a quantitative way? (ii) what method should be adopted to derive the metrics?

In the literature, the following four metrics are usually adopted to evaluate a predictor’s quality (Chen et al., 2007): (i) overall accuracy (Acc); (ii) stability (Matthews correlation coefficient, MCC); (iii) sensitivity (Sn); and (iv) specificity (Sp). However, their conventional formulations, taken directly from mathematics textbooks, are not intuitive and hence are difficult for most biological scientists to grasp. By means of the symbols introduced by Chou in studying signal peptides (Chou, 2001b), the four metrics can be converted into a set of intuitive ones (Chen et al., 2013; Xu et al., 2013a), as given below:
Sn = 1 - N_-^+ / N^+,  (0 ≤ Sn ≤ 1)
Sp = 1 - N_+^- / N^-,  (0 ≤ Sp ≤ 1)
Acc = 1 - (N_-^+ + N_+^-) / (N^+ + N^-),  (0 ≤ Acc ≤ 1)
MCC = [1 - (N_-^+/N^+ + N_+^-/N^-)] / sqrt{[1 + (N_+^- - N_-^+)/N^+] [1 + (N_-^+ - N_+^-)/N^-]},  (-1 ≤ MCC ≤ 1)
(17)
where N^+ represents the total number of positive samples investigated and N_-^+ the number of positive samples incorrectly predicted to be negative, while N^- represents the total number of negative samples investigated and N_+^- the number of negative samples incorrectly predicted to be positive.

Based on the definition of Eq. 17, the meanings of Sn, Sp, Acc and MCC have become much more intuitive and easier to understand, as discussed and used in a series of recent studies in various biological areas (see e.g. Chen et al., 2018a; Ehsan et al., 2018; Feng et al., 2017, 2018; Khan et al., 2018; Liu et al., 2017a,b,c, 2018a,b; Song et al., 2018c; Xu et al., 2014, 2017; Yang et al., 2018). In addition, the Area Under ROC Curve (AUC) (Fawcett, 2006) was also used to measure quality of the predictor.
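Eq. 17 can be checked numerically. In the sketch below, the error counts 361 and 291 are back-calculated from the first-layer iEnhancer-EL row of Table 3, so the function reproduces that row's Sn, Sp, Acc and MCC:

```python
import math

def metrics(n_pos, n_neg, fn, fp):
    """Sn, Sp, Acc and MCC in the intuitive form of Eq. 17.

    n_pos, n_neg: total positive/negative samples (N^+, N^-);
    fn: positives predicted negative (N_-^+);
    fp: negatives predicted positive (N_+^-).
    """
    sn = 1 - fn / n_pos
    sp = 1 - fp / n_neg
    acc = 1 - (fn + fp) / (n_pos + n_neg)
    mcc = (1 - (fn / n_pos + fp / n_neg)) / math.sqrt(
        (1 + (fp - fn) / n_pos) * (1 + (fn - fp) / n_neg))
    return sn, sp, acc, mcc

# 1484 enhancers and 1484 non-enhancers with 361 false negatives and
# 291 false positives (counts inferred from Table 3, first layer).
sn, sp, acc, mcc = metrics(1484, 1484, 361, 291)
```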

With a set of quantitative metrics clearly defined, the next question is how to derive their values. As is well known, the independent dataset test, the subsampling (or K-fold cross-validation) test and the jackknife test are the three cross-validation methods widely used for testing a prediction method (Chou and Zhang, 1995). To reduce the computational cost, in this study we adopted the 5-fold cross-validation (namely K = 5) to optimize the parameters in our method, as done by many investigators using SVM as the prediction engine (see e.g. Khan et al., 2017; Meher et al., 2017; Rahimi et al., 2017; Tahir et al., 2017). The concrete process is as follows. The benchmark dataset was randomly divided into five subsets of approximately equal size. Each predictor was run five times with five different partitions: in each run, three subsets were used to train the predictor, one subset served as the validation set to optimize the parameters, and the remaining one was used as the test set to produce the predictive results. In addition, the jackknife test was also used to evaluate the performance of the different methods.
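The 3/1/1 train/validation/test rotation described above can be sketched as follows; which fold plays the validation role in each run is not specified in the text, so the choice of the fold after the test fold is an assumption made for illustration:

```python
import random

def five_fold_splits(n_samples, seed=0):
    """Index splits for the 5-fold protocol: in each of five runs, three
    folds train, one fold validates (parameter tuning), one fold tests."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::5] for i in range(5)]
    for r in range(5):
        test = folds[r]
        val = folds[(r + 1) % 5]          # assumed validation-fold choice
        train = [i for j in range(5)
                 if j not in (r, (r + 1) % 5) for i in folds[j]]
        yield train, val, test

# 2968 = 1484 enhancers + 1484 non-enhancers in the benchmark dataset.
for train, val, test in five_fold_splits(2968):
    assert not (set(test) & set(train)) and not (set(val) & set(train))
```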

3 Results and discussion

3.1 Comparison with the existing methods

Listed in Table 3 are the metric rates (Eq. 17) achieved by iEnhancer-EL via the jackknife test on the benchmark dataset (cf. Supplementary Information S1). For facilitating comparison, the corresponding rates obtained by iEnhancer-2L and EnhancerPred using exactly the same cross-validation method and benchmark dataset are also listed.

Table 3.

A comparison of the proposed predictor with the state-of-the-art predictors in identifying enhancers (the 1st layer) and their strength (the 2nd layer) via the jackknife test on the same benchmark dataset (Supplementary Information S1)

Method              Acc (%)   MCC      Sn (%)   Sp (%)   AUC (%)
First layer
  iEnhancer-EL^a    78.03     0.5613   75.67    80.39    85.47
  iEnhancer-2L^b    76.89     0.5400   78.09    75.88    85.00
  EnhancerPred^c    73.18     0.4636   72.57    73.79    80.82
Second layer
  iEnhancer-EL^a    65.03     0.3149   69.00    61.05    69.57
  iEnhancer-2L^b    61.93     0.2400   62.21    61.82    66.00
  EnhancerPred^c    62.06     0.2413   62.67    61.46    66.01

^a The predictor proposed in this paper.
^b The predictor reported in Liu et al. (2016a).
^c The predictor reported in Jia and He (2016).


From Table 3 we can see the following. (i) For the 1st-layer prediction, namely discriminating enhancers from non-enhancers, except for Sn the success rates achieved by the proposed predictor are all higher than those of the existing state-of-the-art predictors. (ii) For the 2nd-layer prediction, namely identifying the strength of enhancers, except for Sp all the other metric rates as well as the AUC value obtained by the proposed predictor are higher than those of the existing state-of-the-art predictors. It is instructive to point out that, of the four metrics in Eq. 17, the most important are Acc and MCC: the former measures a predictor’s overall accuracy, and the latter its stability. By these two metrics, iEnhancer-EL outperformed both iEnhancer-2L and EnhancerPred.

3.2 Independent dataset test

An independent dataset, constructed according to the same protocol as the benchmark dataset, was used to further evaluate the performance of the various methods. It contains 100 strong enhancers, 100 weak enhancers and 200 non-enhancers (Supplementary Information S4). None of the samples in the independent dataset occurs in the training dataset. The CD-HIT software (Li and Godzik, 2006) was used to remove those samples in the independent dataset having more than 80% sequence identity to any other sample in the same subset. The results obtained by the proposed predictor in the independent dataset test are given in Table 4, where, for facilitating comparison, the corresponding results of the other two methods are also listed. It can be clearly seen from the table that the iEnhancer-EL predictor is superior to its counterparts in nearly all four metrics. Although the new predictor’s Sp at the 2nd layer is lower than that of iEnhancer-2L, its Sn is considerably higher, and its Acc, MCC and AUC remain the best.

Table 4.

A comparison of the proposed predictor with the state-of-the-art predictors in identifying enhancers (the 1st layer) and their strength (the 2nd layer) on the independent dataset (Supplementary Information S4)

Method              Acc (%)   MCC      Sn (%)   Sp (%)   AUC (%)
First layer
  iEnhancer-EL^a    74.75     0.4964   71.00    78.50    81.73
  iEnhancer-2L^b    73.00     0.4604   71.00    75.00    80.62
  EnhancerPred^c    74.00     0.4800   73.50    74.50    80.13
Second layer
  iEnhancer-EL^a    61.00     0.2222   54.00    68.00    68.01
  iEnhancer-2L^b    60.50     0.2181   47.00    74.00    66.78
  EnhancerPred^c    55.00     0.1021   45.00    65.00    57.90

^a The predictor proposed in this paper.
^b The predictor reported in Liu et al. (2016a).
^c The predictor reported in Jia and He (2016).


Note that, of the four metrics in Eq. 17, the most important are Acc and MCC: the former reflects the overall accuracy of a predictor, while the latter reflects its stability in practical applications. The metrics Sn and Sp measure a predictor from two different angles, and only when both the Sn and Sp of predictor A are higher than those of predictor B can we say that A is better than B. In other words, Sn and Sp are mutually constrained (Chou, 1993). Therefore, it is meaningless to use only one of the two for comparing the quality of two predictors. A meaningful comparison should count both Sn and Sp, or better yet their combination, which is exactly what MCC provides; for this metric the proposed predictor achieved the highest rate, as shown in Table 4.

3.3 Web-server and its user guide

As pointed out in Chou and Shen (2009) and supported by a series of follow-up publications (see e.g. Chen et al., 2018b; Cheng et al., 2017, 2018a,b; Jia et al., 2015, 2016b; Lin et al., 2014b; Liu et al., 2018b; Song et al., 2018a,b,c; Wang et al., 2017, 2018; Xiao et al., 2013; Xu et al., 2013b), user-friendly and publicly accessible web-servers represent the future direction for developing practically more useful predictors. Indeed, the availability of a user-friendly web-server for a new prediction method significantly enhances its impact (Chou, 2015), driving medicinal chemistry into an unprecedented revolution (Chou, 2017). In view of this, the web-server for iEnhancer-EL has been established. Furthermore, to maximize the convenience of most experimental scientists, step-by-step instructions are given below.

Step 1. Open the web-server at http://bioinformatics.hitsz.edu.cn/iEnhancer-EL/ and you will see its top page as shown in Figure 3. Click on the Read Me button to see a brief introduction about the server.

Fig. 3.

A semi-screenshot to show the top page of iEnhancer-EL web server. Its web-site address is at http://bioinformatics.hitsz.edu.cn/iEnhancer-EL/

Step 2. You can either type or copy/paste the query DNA sequence into the input box at the center of Figure 3, or directly upload your input data by the Browse button. The input sequence should be in the FASTA format. Not familiar with it? Click the Example button right above the input box.

Step 3. Click on the Submit button to see the predicted result. For example, if using the example sequence to run the web server, you will see the following outcome: (i) the first query sequence contains nine strong enhancers: sub-sequences 1-200, 2-201, 3-202, 4-203, 5-204, 6-205, 7-206, 8-207 and 9-208; (ii) the second query sequence contains one strong enhancer at sub-sequence 1-200; (iii) both the third and fourth query sequences contain one weak enhancer at sub-sequence 1-200; (iv) the fifth and sixth query sequences contain no enhancer. All these predicted results are fully consistent with experimental observations.

Step 4. You can download the predicted results into a file by clicking the Download button on the results page.

Acknowledgement

The authors are very much indebted to the four anonymous reviewers, whose constructive comments are very helpful for strengthening the presentation of this article.

Funding

This work was supported by the National Natural Science Foundation of China (No. 61672184, 61732012, 61520106006), Guangdong Natural Science Funds for Distinguished Young Scholars (2016A030306008), Scientific Research Foundation in Shenzhen (Grant No. JCYJ20170307152201596), Guangdong Special Support Program of Technology Young talents (2016TQ03X618), Fok Ying-Tung Education Foundation for Young Teachers in the Higher Education Institutions of China (161063) and Shenzhen Overseas High Level Talents Innovation Foundation (Grant No. KQJSCX20170327161949608).

Conflict of Interest: none declared.

References

Boyle, A.P. et al. (2011) High-resolution genome-wide in vivo footprinting of diverse transcription factors in human cells. Genome Res., 21, 456–464.

Bu, H. et al. (2017) A new method for enhancer prediction based on deep belief network. BMC Bioinformatics, 18, 418.

Cai, Y.D. et al. (2003) Support vector machines for predicting membrane protein types by using functional domain composition. Biophys. J., 84, 3257–3263.

Chang, C.C. and Lin, C.J. (2011) LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol., 2, 1–27.

Chen, J. et al. (2007) Prediction of linear B-cell epitopes using amino acid pair antigenicity scale. Amino Acids, 33, 423–428.

Chen, J. et al. (2016) dRHP-PseRA: detecting remote homology proteins using profile-based pseudo protein sequence and rank aggregation. Sci. Rep., 6, 32333.

Chen, W. et al. (2013) iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition. Nucleic Acids Res., 41, e68.

Chen, W. et al. (2014) PseKNC: a flexible web-server for generating pseudo K-tuple nucleotide composition. Anal. Biochem., 456, 53–60.

Chen, W. et al. (2015a) Pseudo nucleotide composition or PseKNC: an effective formulation for analyzing genomic sequences. Mol. BioSyst., 11, 2620–2634.

Chen, W. et al. (2015b) PseKNC-General: a cross-platform package for generating various modes of pseudo nucleotide compositions. Bioinformatics, 31, 119–120.

Chen, W. et al. (2018a) iRNA-3typeA: identifying 3-types of modification at RNA's adenosine sites. Mol. Ther. Nucleic Acids, 11, 468–474.

Chen, Z. et al. (2018b) iFeature: a python package and web server for features extraction and selection from protein and peptide sequences. Bioinformatics, doi: 10.1093/bioinformatics/bty140.

Cheng, X. et al. (2018a) pLoc-mEuk: predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC. Genomics, 110, 50–58.

Cheng, X. et al. (2018b) pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information. Bioinformatics, 34, 1448–1456.

Cheng, X. et al. (2017) pLoc-mAnimal: predict subcellular localization of animal proteins with both single and multiple sites. Bioinformatics, 33, 3524–3531.

Chou, K.C. (1993) A vectorized sequence-coupling model for predicting HIV protease cleavage sites in proteins. J. Biol. Chem., 268, 16938–16948.

Chou, K.C. (2001a) Prediction of protein cellular attributes using pseudo amino acid composition. Proteins Struct. Funct. Genet. (Erratum: ibid., 2001, Vol. 44, 60), 43, 246–255.

Chou, K.C. (2001b) Prediction of protein signal sequences and their cleavage sites. Proteins Struct. Funct. Genet., 42, 136–139.

Chou, K.C. (2005) Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics, 21, 10–19.

Chou, K.C. (2011) Some remarks on protein attribute prediction and pseudo amino acid composition (50th Anniversary Year Review). J. Theor. Biol., 273, 236–247.

Chou, K.C. (2015) Impacts of bioinformatics to medicinal chemistry. Med. Chem., 11, 218–234.

Chou, K.C. (2017) An unprecedented revolution in medicinal chemistry driven by the progress of biological science. Curr. Top. Med. Chem., 17, 2337–2358.

Chou, K.C. and Cai, Y.D. (2002) Using functional domain composition and support vector machines for prediction of protein subcellular location. J. Biol. Chem., 277, 45765–45769.

Chou, K.C. and Shen, H.B. (2006a) Hum-PLoc: a novel ensemble classifier for predicting human protein subcellular localization. Biochem. Biophys. Res. Commun., 347, 150–157.

Chou, K.C. and Shen, H.B. (2006b) Predicting protein subcellular location by fusing multiple classifiers. J. Cell. Biochem., 99, 517–527.

Chou, K.C. and Shen, H.B. (2007) Review: recent progresses in protein subcellular location prediction. Anal. Biochem., 370, 1–16.

Chou, K.C. and Shen, H.B. (2009) Recent advances in developing web-servers for predicting protein attributes. Nat. Sci., 1, 63–92.

Chou, K.C. and Zhang, C.T. (1995) Review: prediction of protein structural classes. Crit. Rev. Biochem. Mol. Biol., 30, 275–349.

Cristianini, N. and Shawe-Taylor, J. (2000) An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Chapter 3. Cambridge University Press, Cambridge, England.

Ehsan, A. et al. (2018) A novel modeling in mathematical biology for classification of signal peptides. Sci. Rep., 8, 1039.

Ernst, J. et al. (2011) Mapping and analysis of chromatin state dynamics in nine human cell types. Nature, 473, 43–49.

Erwin, G.D. et al. (2014) Integrating diverse datasets improves developmental enhancer prediction. PLoS Comput. Biol., 10, e1003677.

Fawcett, J.A. (2006) An introduction to ROC analysis. Pattern Recogn. Lett., 27, 861–874.

Feng, P. et al. (2017) iRNA-PseColl: identifying the occurrence sites of different RNA modifications by incorporating collective effects of nucleotides into PseKNC. Mol. Ther. Nucleic Acids, 7, 155–163.

Feng, P. et al. (2018) iDNA6mA-PseKNC: identifying DNA N6-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC. Genomics, doi: 10.1016/j.ygeno.2018.01.005.

Fernández, M. and Miranda-Saavedra, D. (2012) Genome-wide enhancer prediction from epigenetic signatures using genetic algorithm-optimized support vector machines. Nucleic Acids Res., 40, e77.

Firpi, H.A. et al. (2010) Discover regulatory DNA elements using chromatin signatures and artificial neural network. Bioinformatics, 26, 1579–1586.

Frey, B.J. and Dueck, D. (2007) Clustering by passing messages between data points. Science, 315, 972–976.

He, W. and Jia, C. (2017) EnhancerPred2.0: predicting enhancers and their strength based on position-specific trinucleotide propensity and electron–ion interaction potential feature selection. Mol. Biosyst., 13, 767–774.

Heintzman, N.D. and Ren, B. (2009) Finding distal regulatory elements in the human genome. Curr. Opin. Genet. Dev., 19, 541–549.

Heintzman, N.D. et al. (2007) Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet., 39, 311–318.

Jia, C. and He, W. (2016) EnhancerPred: a predictor for discovering enhancers based on the combination and selection of multiple features. Sci. Rep., 6, 38741.

Jia, J. et al. (2015) iPPI-Esml: an ensemble classifier for identifying the interactions of proteins by incorporating their physicochemical properties and wavelet transforms into PseAAC. J. Theor. Biol., 377, 47–56.

Jia, J. et al. (2016a) pSuc-Lys: predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach. J. Theor. Biol., 394, 223–230.

Jia, J. et al. (2016b) pSumo-CD: predicting sumoylation sites in proteins with covariance discriminant algorithm by incorporating sequence-coupled effects into general PseAAC. Bioinformatics, 32, 3133–3141.

Khan, M. et al. (2017) Unb-DPC: identify mycobacterial membrane protein types by incorporating un-biased dipeptide composition into Chou's general PseAAC. J. Theor. Biol., 415, 13–19.

Khan, Y.D. et al. (2018) iPhosT-PseAAC: identify phosphothreonine sites by incorporating sequence statistical moments into PseAAC. Anal. Biochem., 550, 109–116.

Kleftogiannis, D. et al. (2015) DEEP: a general computational framework for predicting enhancers. Nucleic Acids Res., 43, e6.

Li, W. and Godzik, A. (2006) Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics, 22, 1658–1659.

Lin, C. et al. (2014a) LibD3C: ensemble classifiers with a clustering and dynamic selection strategy. Neurocomputing, 123, 424–435.

Lin, H. et al. (2014b) iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition. Nucleic Acids Res., 42, 12961–12972.

Liu, B. et al. (2014) Combining evolutionary information extracted from frequency profiles with sequence-based kernels for protein remote homology detection. Bioinformatics, 30, 472–479.

Liu, B. (2018) BioSeq-Analysis: a platform for DNA, RNA, and protein sequence analysis based on machine learning approaches. Brief. Bioinf., doi: 10.1093/bib/bbx165.

Liu, B. et al. (2015) repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects. Bioinformatics, 31, 1307–1309.

Liu, B. et al. (2016a) iEnhancer-2L: a two-layer predictor for identifying enhancers and their strength by pseudo k-tuple nucleotide composition. Bioinformatics, 32, 362–369.

Liu, B. et al. (2016b) iDHS-EL: identifying DNase I hypersensitive sites by fusing three different modes of pseudo nucleotide composition into an ensemble learning framework. Bioinformatics, 32, 2411–2418.

Liu, B. et al. (2017a) iRSpot-EL: identify recombination spots with an ensemble learning approach. Bioinformatics, 33, 35–41.

Liu, B. et al. (2017b) 2L-piRNA: a two-layer ensemble classifier for identifying piwi-interacting RNAs and their function. Mol. Ther. Nucleic Acids, 7, 267–277.

Liu, L.M. et al. (2017c) iPGK-PseAAC: identify lysine phosphoglycerylation sites in proteins by incorporating four different tiers of amino acid pairwise coupling information into the general PseAAC. Med. Chem., 13, 552–559.

Liu, B. et al. (2018a) iRO-3wPseKNC: identify DNA replication origins by three-window-based PseKNC. Bioinformatics, doi: 10.1093/bioinformatics/bty312.

Liu, B. et al. (2018b) iPromoter-2L: a two-layer predictor for identifying promoters and their types by multi-window-based PseKNC. Bioinformatics, 34, 33–40.

Lodhi, H. et al. (2002) Text classification using string kernels. J. Mach. Learn. Res., 2, 419–444.

Luo, L. et al. (2016) Accurate prediction of transposon-derived piRNAs by integrating various sequential and physicochemical features. PLoS ONE, 11, e0153268.

Meher, P.K. et al. (2017) Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou's general PseAAC. Sci. Rep., 7, 42362.

Mitchell, M. (1998) An Introduction to Genetic Algorithms. MIT Press.

Nair, A.S. and Sreenadhan, S.P. (2006) A coding measure scheme employing electron–ion interaction pseudopotential (EIIP). Bioinformation, 1, 197–202.

Omar, N. et al. (2017) Enhancer prediction in proboscis monkey genome: a comparative study. J. Telecommun. Electron. Comput. Eng. (JTEC), 9, 175–179.

Qiu, W.R. et al. (2017) iKcr-PseEns: identify lysine crotonylation sites in histone proteins with pseudo components and ensemble classifier. Genomics, doi: 10.1016/j.ygeno.2017.10.008.

Rahimi, M. et al. (2017) OOgenesis_Pred: a sequence-based method for predicting oogenesis proteins by six different modes of Chou's pseudo amino acid composition. J. Theor. Biol., 414, 128–136.

Rajagopal, N. et al. (2013) RFECS: a random-forest based algorithm for enhancer identification from chromatin state. PLoS Comput. Biol., 9, e1002968.

Shao, J. et al. (2009) Computational identification of protein methylation sites through bi-profile Bayes feature extraction. PLoS ONE, 4, e4920.

Shen, H.B. and Chou, K.C. (2009) QuatIdent: a web server for identifying protein quaternary structural attribute by fusing functional domain and sequential evolution information. J. Proteome Res., 8, 1577–1584.

Shlyueva, D. et al. (2014) Transcriptional enhancers: from properties to genome-wide predictions. Nat. Rev. Genet., 15, 272–286.

Song, J. et al. (2018a) PROSPERous: high-throughput prediction of substrate cleavage sites for 90 proteases with improved accuracy. Bioinformatics, 34, 684–687.

Song, J. et al. (2018b) PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural and network features in a machine learning framework. J. Theor. Biol., 443, 125–137.

Song, J. et al. (2018c) iProt-Sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Brief. Bioinf., doi: 10.1093/bib/bby028.

Tahir, M. et al. (2017) Sequence based predictor for discrimination of enhancer and their types by applying general form of Chou's trinucleotide composition. Comput. Methods Programs Biomed., 146, 69–75.

Visel, A. et al. (2009) ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature, 457, 854–858.

Wang, J. et al. (2017) POSSUM: a bioinformatics toolkit for generating numerical sequence feature descriptors based on PSSM profiles. Bioinformatics, 33, 2756–2758.

Wang, J. et al. (2018) Bastion6: a bioinformatics approach for accurate prediction of type VI secreted effectors. Bioinformatics, doi: 10.1093/bioinformatics/bty155.

Xiao, X. et al. (2013) iAMP-2L: a two-level multi-label classifier for identifying antimicrobial peptides and their functional types. Anal. Biochem., 436, 168–177.

Xiao, X. et al. (2017) pLoc-mGpos: incorporate key gene ontology information into general PseAAC for predicting subcellular localization of Gram-positive bacterial proteins. Nat. Sci., 9, 331–349.

Xu, Y. et al. (2013a) iSNO-PseAAC: predict cysteine S-nitrosylation sites in proteins by incorporating position specific amino acid propensity into pseudo amino acid composition. PLoS ONE, 8, e55844.

Xu, Y. et al. (2013b) iSNO-AAPair: incorporating amino acid pairwise coupling into PseAAC for predicting cysteine S-nitrosylation sites in proteins. PeerJ, 1, e171.

Xu, Y. et al. (2014) iNitro-Tyr: prediction of nitrotyrosine sites in proteins with general pseudo amino acid composition. PLoS ONE, 9, e105018.

Xu, Y. et al. (2017) iPreny-PseAAC: identify C-terminal cysteine prenylation sites in proteins by incorporating two tiers of sequence couplings into PseAAC. Med. Chem., 13, 544–551.

Yang, B. et al. (2017) BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone. Bioinformatics, 33, 1930–1936.

Yang, H. et al. (2018) iRSpot-Pse6NC: identifying recombination spots in Saccharomyces cerevisiae by incorporating hexamer composition into general PseKNC. Int. J. Biol. Sci., 14, 883–891.

Yasser, E.M. et al. (2008) Predicting flexible length linear B-cell epitopes. Computational Systems Bioinformatics, 7, 121–132.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Associate Editor: John Hancock