Abstract

Motivation

Gene Ontology (GO) has been widely used to annotate the functions of proteins and to understand their biological roles. Currently, fewer than 1% of the >70 million proteins in UniProtKB have experimental GO annotations, underscoring the strong need for automated function prediction (AFP). AFP is a hard multilabel classification problem, since each protein can carry a widely varying number of GO terms. Most proteins have only sequences as input information, which makes sequence-based AFP (SAFP: sequences are the only input) especially important. Furthermore, homology-based SAFP tools are competitive in AFP competitions, but they do not necessarily work well for so-called difficult proteins, which share <60% sequence identity with already-annotated proteins. The vital and challenging problem is thus to develop a SAFP method that works well, particularly for difficult proteins.

Methods

The key idea is to extract not only homology information but also diverse, deep-rooted information/evidence from sequence inputs, and to integrate it into a predictor both effectively and efficiently. We propose GOLabeler, which integrates five component classifiers trained on different features, including GO term frequency, sequence alignment, amino acid trigrams, domains and motifs, and biophysical properties, within the framework of learning to rank (LTR), a machine learning paradigm that is especially powerful for multilabel classification.

Results

Extensive and thorough examination of GOLabeler on large-scale datasets revealed numerous favorable aspects, including a significant performance advantage over state-of-the-art AFP methods.

Availability and implementation

http://datamining-iip.fudan.edu.cn/golabeler.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

The Gene Ontology (GO) was originally launched in 1998 for consistent descriptions of genes and gene products (such as proteins and RNAs) across all species (Ashburner et al., 2000). GO currently has >40 000 biological concepts over three domains: Molecular Function Ontology (MFO), Biological Process Ontology (BPO) and Cellular Component Ontology (CCO). Annotating protein function with GO is crucial for understanding biology. With the development of next-generation sequencing technology, the number of protein sequences has increased explosively, while the number of proteins with experimental GO annotations remains limited, owing to the high time and financial costs of biochemical experiments. In fact, fewer than 1% of the 70 million protein sequences in UniProtKB (The UniProt Consortium, 2015) have experimental GO annotations. To reduce this huge gap, efficient automated function prediction (AFP) is imperative (Jiang et al., 2016; Radivojac et al., 2013).

Automated function prediction is a large-scale multilabel classification problem (Zhang and Zhou, 2014), in which each GO term is a class label and each protein is an instance with multiple labels (multiple GO terms). AFP is very challenging for three reasons: (i) structured ontology: GO terms are node labels of a directed acyclic graph (DAG), so a gene annotated at one node must also be labeled with the GO terms of all ancestor nodes in the DAG; (ii) many labels per protein: checking the GO terms of the 66 841 proteins in SwissProt (Boutet et al., 2016) (October 2016), we found that a human protein is labeled with around 71 GO terms on average; (iii) large variation in the number of GO terms per protein: we also found that only 634 of the 10 236 GO terms in MFO (GO of June 2016) are associated with >50 proteins, meaning that most GO terms are associated with only a small number of proteins.

To advance the performance of AFP, the first and second Critical Assessment of Functional Annotation challenges (competitions), CAFA1 and CAFA2, were held in 2010–2011 and 2013–2014, respectively (Jiang et al., 2016; Radivojac et al., 2013); at the time of writing (2016–2017), CAFA3 is ongoing. CAFA uses target proteins that have no experimental annotations in a given domain (i.e. MFO, BPO or CCO) before the prediction submission deadline. A target protein is called a limited-knowledge protein if it has experimental annotations in another domain; otherwise, it is called a no-knowledge protein (Jiang et al., 2016). Importantly, this means that no-knowledge proteins have only sequences, with no experimental annotations in any domain before the deadline. In practice, the overwhelming majority (around 99% or more) of proteins have only sequences, so AFP for no-knowledge proteins is the more important problem. Experimental information, such as protein–protein interactions, is more costly to obtain than sequences, resulting in literally limited knowledge (vastly missing information) in the large-scale data for AFP. Compared with such limited experimental data, sequences are the primary data available across species. These are also the reasons why sequence-based AFP is important.

The CAFA results on the no-knowledge benchmark have shown that simple homology-based methods using BLAST and PSI-BLAST are very competitive (Altschul et al., 1997; Gillis and Pavlidis, 2013; Hamp et al., 2013). For example, Hamp et al. (2013) found that their implementation of simple homology-based inference was only slightly worse than the best method, from the Jones group, in CAFA1. This indicates that sequence identity is, even now, a key to achieving high performance in large-scale sequence-based AFP, and it also implies that prediction is hard for sequences with low identity to all other sequences. The no-knowledge benchmark can accordingly be divided into two types, based on the largest global sequence identity of the corresponding sequence to any sequence in the training data (Jiang et al., 2016): proteins with a largest sequence identity of <60% are called difficult, and the others easy. An urgent issue for AFP is therefore to develop a sequence-based approach that does not rely on sequence homology alone and can predict the function of difficult proteins. A central requirement of such an approach is to collect not only homology-related information but also diverse and informative types of sequence information, and to integrate all of this information both effectively and efficiently.

We propose a new method, which we call GOLabeler, for predicting the functions of no-knowledge proteins, particularly difficult ones. The basic idea of GOLabeler is to integrate different types of sequence-based evidence in the framework of ‘learning to rank’ (LTR; see the Supplement for more introduction to LTR; Li, 2011). In LTR, for example, positive examples that are ranked lower are penalized more, whereas regular classification treats them rather equally. LTR was originally developed for ranking web pages consistently with the relevance between web pages and user queries. If we focus on binary relevance, the ranking problem becomes the problem of predicting the relevant web pages for a given query. This is exactly multilabel classification, with web pages as labels and queries as examples; LTR solves it by ranking the labels and choosing the top ones. LTR can thus be applied to AFP by regarding GO terms as labels and proteins as examples. Another noteworthy advantage of LTR is that it lets GOLabeler effectively integrate multiple types of sequence-based evidence, generated by different classifiers (components), where all information is derived from sequences only.

We examined the performance of GOLabeler extensively on large-scale datasets generated following the time-delayed evaluation procedure of CAFA. In particular, we compared GOLabeler with all component methods, three ensemble approaches and two sequence-based methods. The computational experiments indicate a significant performance advantage of GOLabeler over all competing methods in all experimental settings; the advantage was especially clear in the prediction for difficult proteins. We also present a typical example in which GOLabeler correctly predicted the largest number of GO terms among all competing methods. Moreover, according to the initial evaluation results of CAFA3 (http://biofunctionprediction.org/meetings/), GOLabeler achieved first place, out of nearly 200 submissions from around 50 labs all over the world, in terms of Fmax in all three GO domains.

2 Related work

A lot of biological information, such as protein structures, protein–protein interactions (Ma et al., 2014) and gene expression (Walker et al., 1999), is useful for AFP, but the majority of proteins have no information other than sequences (Shehu et al., 2016). We thus focus on sequence-based approaches, in which sequences or their parts are used in various ways, mainly (i) sequence alignment, (ii) domains and motifs and (iii) features: (i) sequence alignment: BLAST and/or PSI-BLAST are used to find homologous sequences and transfer their functional annotations to the query protein. For example, GoFDR (Gong et al., 2016), a top method in CAFA2, uses a multiple sequence alignment (MSA) to generate a position-specific scoring matrix (PSSM) for each GO term and scores the query against the corresponding GO term. (ii) Domains and motifs: these are usually functional sites of a query protein; all of them in the query sequence are detected using protein domain/motif resources, such as CATH (Das et al., 2015), SCOP (Murzin et al., 1995) and Pfam (Sonnhammer et al., 1997), to understand the function of the query protein. (iii) Features: the query is an amino acid sequence, from which biophysical and biochemical attributes can be generated; these can be closely related to domains, motifs and protein families, but are not necessarily the same. ProFET (Protein Feature Engineering Toolkit) (Ofer and Linial, 2015) is a typical tool for extracting hundreds of such sequence-derived features, including elementary biophysical ones. These various sequence-based approaches play different roles, complementing each other in AFP.

Integrating different types of information, or classifiers trained on them, is thus a key to improving the performance of AFP, and several integrative approaches have already been proposed. MS-kNN, a top method in CAFA1 and CAFA2, predicts function by averaging the prediction scores from three data sources: sequences, expression and protein–protein interactions (Lan et al., 2013) (note that MS-kNN is not a sequence-based method). Jones-UCL, the top team of CAFA1, integrates prediction scores from multiple methods using the ‘consensus’ function [given in Equation (1)] (Cozzetto et al., 2013). More recently, different data integration methods have been examined, mainly ‘one vote’, ‘weighted voting’ and ‘consensus’, where ‘one vote’ relies only on the classifier with the maximum confidence, while ‘weighted voting’ takes a weighted combination of the input classifiers (Khan et al., 2015; Lan et al., 2013).

All of these are rather simple integration techniques, which might limit the performance achievable by integration. Our proposed approach, GOLabeler, is instead based on learning to rank (LTR), a powerful machine learning paradigm for integrating multiple classifiers trained on different sequence-derived data. LTR has recently been used effectively in bioinformatics, for example in annotating biomedical documents (Liu et al., 2015; Peng et al., 2016) and predicting drug-target interactions (Yuan et al., 2016). LTR integrates the prediction results of the component classifiers so that GO terms more relevant to the query protein are ranked higher. One advantage of LTR is that the component predictions can simply be encoded as input features of the model, over which any cutting-edge classification/regression algorithm can be run. GOLabeler thus provides a convenient framework for integrating different sequence-based information for AFP, and is promising for improving the current AFP of no-knowledge proteins.

3 Materials and methods

3.1 Notation

Let D be the given training data with N_D proteins, i.e. |D| = N_D. Let G_i be the i-th GO term, and let N_{G_i} be the number of proteins annotated with G_i in D (note that this number takes the structure of GO into account: if G_i is assigned to a protein, the protein carries all GO terms of the ancestors of G_i in GO). Let T be the given test data (with N_T = |T| proteins), and let P_j be its j-th protein. Let I(G_i, p) be a binary indicator of whether protein p has the ground-truth (true) annotation G_i: I(G_i, p) is one if it does and zero otherwise. Let S(G_i, P_j) be the score (obtained by a method) that P_j has G_i. In particular, in ensemble methods, S_k(G_i, P_j) is the score predicted for G_i and P_j by the k-th method (component).

3.2 Overview

Figure 1 shows the overall scheme of GOLabeler for AFP. In testing, given the sequence of a query protein, candidate GO terms are generated by five components, each already trained on a different type of information. Each candidate GO term receives a prediction score from each of the five components, resulting in a feature vector of length five. The candidate GO terms, i.e. the feature vectors, are then fed into the learning to rank (LTR) model, also trained in advance on training data, and finally a ranked list of GO terms is returned as the output of GOLabeler.

Fig. 1.

Entire scheme of GOLabeler with three steps for AFP

3.3 Component methods

We selected five typical and distinct types of sequence-based information for generating the components, called Naive (GO term frequency), BLAST-KNN (B-K; k-nearest neighbors using BLAST results), LR-3mer [logistic regression (LR) over amino acid trigram frequencies], LR-InterPro (LR over InterPro features) and LR-ProFET (LR over ProFET features); each is explained below. The Naive method reflects the prior probability of GO terms, and BLAST-KNN makes use of homology-based inference for function prediction. Amino acid trigrams were used as one component by the Jones-UCL group, which achieved first place in CAFA1 (Cozzetto et al., 2013). ProFET has been used in various function prediction tasks, and LR-InterPro makes use of rich domain, family and motif information. These components, built from different types of information, should be informative and complementary to each other.

3.3.1 Naive: GO term frequency

For a given P_j, the score that P_j is associated with G_i is computed simply from the frequency of G_i in D, as follows (note that this method gives the same score for every P_j):

S(G_i, P_j) = \frac{N_{G_i}}{N_D}.
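To make this concrete, here is a minimal Python sketch of the Naive component (Python being the language of our implementation); the annotations dictionary is a hypothetical stand-in for the training data D, not our exact code:

```python
from collections import Counter

def train_naive(annotations):
    """annotations: dict mapping protein ID -> set of GO terms,
    already propagated to all ancestors in the GO DAG."""
    n_d = len(annotations)
    counts = Counter(go for terms in annotations.values() for go in terms)
    # S(G_i, P_j) = N_{G_i} / N_D, identical for every query protein
    return {go: n / n_d for go, n in counts.items()}

# Example: two training proteins give S(GO:0005488, .) = 1.0
scores = train_naive({"P1": {"GO:0005488"},
                      "P2": {"GO:0005488", "GO:0003677"}})
```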

3.3.2 BLAST-KNN (B-K): sequence alignment

It has been reported that using the similarity score (bit-score) between similar proteins and the query slightly improves on using sequence identity alone (Radivojac et al., 2013). For a given P_j, the BLAST-KNN score S(G_i, P_j) is computed by first running BLAST to identify the set H_j of proteins in D similar to P_j under a cut-off (an e-value of 0.001 in our experiments). The score is then obtained as

S(G_i, P_j) = \frac{\sum_{p \in H_j} I(G_i, p) \cdot B(P_j, p)}{\sum_{p \in H_j} B(P_j, p)},

where p ranges over the similar proteins in H_j and B(P_j, p) is the bit-score between P_j and p.
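A minimal sketch of this scoring follows, assuming the BLAST hits have already been parsed into (protein ID, bit-score) pairs after the e-value cut-off; the data structures are illustrative stand-ins:

```python
def blast_knn_score(hits, annotations):
    """Score all GO terms for one query protein.

    hits: list of (training protein ID, bit-score) pairs for the query,
    e.g. parsed from BLAST tabular output after the e-value cut-off of
    0.001; annotations: dict protein ID -> set of (propagated) GO terms.
    """
    if not hits:
        return {}
    total = sum(bit for _, bit in hits)
    scores = {}
    for p, bit in hits:
        for go in annotations.get(p, ()):
            # numerator of S(G_i, P_j): bit-scores of hits carrying G_i
            scores[go] = scores.get(go, 0.0) + bit
    return {go: s / total for go, s in scores.items()}
```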

3.3.3 LR-3mer: amino acid trigram

LR-3mer, LR-InterPro and LR-ProFET are three logistic regression-based methods. Each protein P is represented as a feature vector X of size n, with X_i the value of the i-th feature. To estimate the probability that P has GO term G_k, we use a logistic regression classifier,

P(G_k = 1 \mid X) = \frac{1}{1 + \exp[-(\beta_0 + \sum_j \beta_j X_j)]},

where β is the weight vector estimated from the training data; this probability is used as the score S(G_k, P). Specifically, in LR-3mer we use the frequencies of the types of three consecutive amino acids (amino acid trigrams or 3mers) in D, which yields a vector of 8000 (= 20^3) features; this vector is the input of the logistic regression classifier for each GO term.
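The sketch below illustrates this pipeline with scikit-learn, which we used for logistic regression (see Section 4.3); the trigram feature extraction is a straightforward reading of the description above, not necessarily our exact code:

```python
import numpy as np
from itertools import product
from sklearn.linear_model import LogisticRegression

AAS = "ACDEFGHIKLMNPQRSTVWY"
TRIGRAM_INDEX = {"".join(t): i for i, t in enumerate(product(AAS, repeat=3))}

def trigram_features(seq):
    """Frequencies of the 20^3 = 8000 amino acid trigrams in a sequence."""
    x = np.zeros(len(TRIGRAM_INDEX))
    for i in range(len(seq) - 2):
        k = TRIGRAM_INDEX.get(seq[i:i + 3])
        if k is not None:          # skip trigrams with non-standard residues
            x[k] += 1.0
    return x / max(len(seq) - 2, 1)

# One binary logistic regression classifier per GO term; the predicted
# probability P(G_k = 1 | X) is used directly as the score S(G_k, P).
# X_train: (n_proteins, 8000) matrix; y_train: 0/1 labels for one GO term.
clf = LogisticRegression(max_iter=1000)
# clf.fit(X_train, y_train); scores = clf.predict_proba(X_test)[:, 1]
```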

3.3.4 LR-InterPro: protein families, domains and motifs

InterPro (Mitchell et al., 2015) combines 14 different protein and domain family databases, including Pfam (Sonnhammer et al., 1997), CATH-Gene3D (Sillitoe et al., 2015), CDD (Marchler-Bauer et al., 2015) and SUPERFAMILY (de Lima et al., 2010), thus covering a large number of protein families, domains and sequence motifs. We ran InterProScan (http://www.ebi.ac.uk/interpro/interproscan.html) over each sequence in D, resulting in a binary vector with 33 879 features, and used this vector as the input for training the logistic regression classifier of each GO term.
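A sketch of how such a binary feature vector might be built from InterProScan output; the assumption that the InterPro entry accession sits in column 12 of the TSV output should be checked against the installed InterProScan version:

```python
import csv

def interpro_features(tsv_path, ipr_index):
    """Binary feature vector for one protein from InterProScan TSV output.

    ipr_index: dict mapping an InterPro accession (e.g. 'IPR013087') to a
    feature position; over our training data this gave 33 879 features.
    """
    x = [0] * len(ipr_index)
    with open(tsv_path) as f:
        for row in csv.reader(f, delimiter="\t"):
            # Column 12 of InterProScan TSV output typically holds the
            # InterPro entry accession (an assumption; verify locally).
            if len(row) >= 12 and row[11] in ipr_index:
                x[ipr_index[row[11]]] = 1
    return x
```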

3.3.5 LR-ProFET: sequence features including biophysical properties

ProFET (https://github.com/ddofer/ProFET) (Ofer and Linial, 2015) is a software toolkit that extracts features from protein sequences for function prediction. The features fall into six categories: (i) biophysical quantitative properties; (ii) letter-based features; (iii) local potential features; (iv) information-based statistics; (v) AA scale-based features; and (vi) transformed CTD features. We ran ProFET over each sequence in D, resulting in a vector of 1170 features, which is used as the input for training the logistic regression classifier of each GO term.

3.4 GOLabeler (with three steps)

3.4.1 Step 1: Generate candidate GO terms

For a query protein, we run the five component methods to obtain predicted GO terms and, after choosing the top-k predictions from each component, merge them into the set of candidate GO terms (we used k = 30 in our experiments; see Section 4.3). Restricting k focuses the model on the GO terms most relevant to the query protein and also reduces the computational burden.
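A minimal sketch of this step (the per-component score dictionaries are illustrative stand-ins):

```python
def candidate_go_terms(component_scores, k=30):
    """Merge the top-k GO terms of each component into one candidate set.

    component_scores: list of five dicts {GO term: score}, one dict per
    component method, all for a single query protein.
    """
    candidates = set()
    for scores in component_scores:
        top_k = sorted(scores, key=scores.get, reverse=True)[:k]
        candidates.update(top_k)
    return candidates
```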

3.4.2 Step 2: Generate features for ranking GO terms

We then generate the features of the query protein from the scores of each candidate GO term predicted by the five component methods, resulting in a 5-dimensional feature vector for each pair of a GO term and a query protein. Note that all score values are between 0 and 1.

3.4.3 Step 3: Rank GO terms by learning to rank (LTR)

Finally, we use LTR to rank all candidate GO terms of each query protein. All proteins in the training data, with their candidate GO terms, are used to train the LTR model. LTR can thus effectively integrate multiple types of sequence-based evidence for AFP of no-knowledge proteins in a multilabel classification framework.
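The sketch below shows how such a pairwise ranking model can be trained with xgboost, which we used for LTR, with the settings of Section 4.3; the data here are random placeholders for the real Step 2 feature matrix:

```python
import numpy as np
import xgboost as xgb

# X: one 5-dimensional row per (query protein, candidate GO term) pair
# (the component scores from Step 2); y: 1 if the candidate GO term is a
# true annotation of that protein, 0 otherwise; groups: the number of
# candidates per protein, in row order, so that pairwise ranking losses
# are computed only within the candidate list of one protein.
X = np.random.rand(100, 5)           # placeholder for real features
y = np.random.randint(0, 2, 100)     # placeholder relevance labels
groups = [20] * 5                    # e.g. 5 proteins x 20 candidates

dtrain = xgb.DMatrix(X, label=y)
dtrain.set_group(groups)
params = {"objective": "rank:pairwise", "max_depth": 4}
ranker = xgb.train(params, dtrain, num_boost_round=100)

# At test time, the predicted scores of one protein's candidates are
# sorted to give the final ranked list of GO terms.
scores = ranker.predict(xgb.DMatrix(X[:20]))
```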

3.5 Competing methods

In our experiments, we compare GOLabeler with five methods: three ensemble approaches using the same component outputs as GOLabeler, namely one vote, weighted voting (WV) and consensus [all often used in other AFP work, e.g. Vidulin et al. (2016)], and two existing methods, BLAST (Altschul et al., 1997) and GoFDR (Gong et al., 2016). Note that GoFDR was a top performer in CAFA2.

3.5.1 One vote

One vote selects the most confident prediction among the five components:

S(G_i, P_j) = \max_k S_k(G_i, P_j).

3.5.2 Weighted voting (WV)

Weighted voting combines the predicted scores of the component methods linearly, with one weight per component:

S(G_i, P_j) = \frac{\sum_k \omega_k \cdot S_k(G_i, P_j)}{\sum_k \omega_k},

where ω_k is the weight assigned to the k-th component. In our experiments, the weights are set in proportion to the area under the precision-recall curve (AUPR) of each component (see Section 4.2 for more on AUPR).

3.5.3 Consensus

Consensus computes the score as follows:

S(G_i, P_j) = 1 - \prod_k \left(1 - \alpha \, S_k(G_i, P_j)\right),   (1)

where α ∈ [0, 1] is a constant that balances the components by their importance (we used α = 1, the most typical value).
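For reference, the three ensemble rules above can be written compactly as follows (an illustrative sketch; S denotes the vector of the five component scores for one (G_i, P_j) pair):

```python
import numpy as np

def one_vote(S):
    """S: the five component scores for one (G_i, P_j) pair."""
    return np.max(S)

def weighted_voting(S, w):
    """w: per-component weights, proportional to each component's AUPR."""
    return np.dot(w, S) / np.sum(w)

def consensus(S, alpha=1.0):
    """Equation (1); alpha = 1 is the value used in our experiments."""
    return 1.0 - np.prod(1.0 - alpha * S)
```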

3.5.4 BLAST

BLAST was used as a baseline method in both CAFA1 and CAFA2, so we use it as a competing method as well. As in BLAST-KNN, given a query protein P_j, the set H_j of proteins in D similar to the query is obtained with a cut-off (again an e-value of 0.001 in our experiments) on the bit-score B(P_j, p) between P_j and each protein p; the BLAST score is then

S(G_i, P_j) = \max_{p \in H_j} I(G_i, p) \cdot B(P_j, p).
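A one-function sketch of this baseline, reusing the hits and annotations structures of the BLAST-KNN sketch in Section 3.3.2:

```python
def blast_baseline_score(hits, annotations, go_term):
    """Max bit-score over BLAST hits annotated with go_term; contrast
    with BLAST-KNN, which uses a bit-score-weighted average instead."""
    return max((bit for p, bit in hits
                if go_term in annotations.get(p, ())),
               default=0.0)
```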

3.5.5 GoFDR

Among the top-performing methods in CAFA2, GoFDR is the only one with available source code (http://gofdr.tianlab.cn), so we chose it as a competing method (Gong et al., 2016). For a query protein, GoFDR runs BLAST or PSI-BLAST to obtain a multiple sequence alignment (MSA) containing the query sequence and finds the functionally discriminating residues (FDRs) of each GO term in the MSA, which are used to generate a position-specific scoring matrix (PSSM). GoFDR uses the PSSM to compute the score between the query protein and a GO term.

4 Experiments

4.1 Data

Data collection approximately followed the corresponding part of CAFA1 (Radivojac et al., 2013) and CAFA2 (Jiang et al., 2016):

  1. Protein sequences

    We downloaded the FASTA-format files of all proteins from UniProt (http://www.uniprot.org/downloads) (The UniProt Consortium, 2015).

  2. GO terms

We downloaded protein function annotations from SwissProt (http://www.uniprot.org/downloads) (Boutet et al., 2016), GOA (http://www.ebi.ac.uk/GOA) (Huntley et al., 2015) and GO (http://geneontology.org/page/download-annotations) (Ashburner et al., 2000) in October 2016. From these, we extracted all annotations with experimental evidence codes (‘EXP’, ‘IDA’, ‘IPI’, ‘IMP’, ‘IGI’, ‘IEP’, ‘TAS’ or ‘IC’) and merged them into a full annotation dataset (note that SwissProt does not record annotation dates, so we also downloaded the SwissProt data of January 2016 and January 2015).

We then generated the following four datasets, separated mainly by the time stamps at which proteins were annotated.

  1. Training: training for components

    All data annotated in 2014 or before.

  2. LTR1: training for LTR

    No-knowledge proteins among the data experimentally annotated in 2015 and not before.

  3. LTR2: training for LTR

    Limited-knowledge proteins among the data experimentally annotated in 2015 and not before.

  4. Testing: testing for competing methods

    All data experimentally annotated in 2016 (January to October of 2016, since we downloaded the data in October 2016) and not before 2016.

Note that this time-series manner of separating training and testing data is the same as in CAFA. We also used the same target species as CAFA3, an ongoing challenge for AFP, in LTR1, LTR2 and Testing. Table 1 shows the number of proteins in the four datasets. We used Testing (which we also call the benchmark) as the test set to examine the performance of the competing methods.

Table 1. Data statistics (number of proteins) for species with at least ten proteins in one GO domain, in each of the four datasets

Species                          | Training            | LTR1           | LTR2           | Testing
                                 | MFO   BPO   CCO     | MFO  BPO  CCO  | MFO  BPO  CCO  | MFO  BPO  CCO
HUMAN (Homo sapiens)             | 9087  11019 15977   | 150  216  718  | 261  454  161  | 42   64   67
MOUSE (Mus musculus)             | 5697  9262  8770    | 169  339  237  | 140  137  175  | 100  235  146
DROME (Drosophila melanogaster)  | 4646  10778 7609    | 43   579  172  | 132  224  327  | 125  339  188
ARATH (Arabidopsis thaliana)     | 3857  7238  8492    | 84   198  140  | 195  114  70   | 84   180  128
DANRE (Danio rerio)              | 2173  8374  1500    | 76   722  159  | 108  56   50   | 66   380  27
RAT (Rattus norvegicus)          | 4199  5128  4459    | 121  185  178  | 56   70   106  | 31   88   88
DICDI (Dictyostelium discoideum) | 4149  2180  3175    | 22   51   61   | 5    19   16   | 26   1    2
All species (not only the above) | 45543 77170 71388   | 724  2387 1679 | 1081 1206 956  | 497  1340 770

4.2 Performance evaluation measures

We used three measures, AUPR (area under the precision-recall curve), Fmax and Smin, to evaluate the predicted GO terms for each protein, i.e. in a multilabel classification setting. AUPR, like AUC (area under the receiver operating characteristic curve), is a very general evaluation criterion for classification. AUPR punishes false positives more than AUC, so it is used more frequently when obtaining labels is costly, as in experimental biology. Fmax and Smin are less general but have been used in CAFA [evaluation criteria have been actively discussed, e.g. Jiang et al. (2014)]. We explain Fmax and Smin below (notation follows Section 3):
F_{\max} = \max_{\tau} \left\{ \frac{2 \cdot pr(\tau) \cdot rc(\tau)}{pr(\tau) + rc(\tau)} \right\},

where pr(τ) and rc(τ) are precision and recall, respectively, at a cut-off value τ, defined as follows:

pr(\tau) = \frac{1}{h(\tau)} \sum_{j=1}^{h(\tau)} \frac{\sum_i \mathbf{1}(S(G_i, P_j) \geq \tau) \cdot I(G_i, P_j)}{\sum_i \mathbf{1}(S(G_i, P_j) \geq \tau)}, \qquad
rc(\tau) = \frac{1}{N_T} \sum_{j=1}^{N_T} \frac{\sum_i \mathbf{1}(S(G_i, P_j) \geq \tau) \cdot I(G_i, P_j)}{\sum_i I(G_i, P_j)},

where h(τ) is the number of proteins with a score no smaller than τ for at least one GO term, and \mathbf{1}(\cdot) is 1 if its argument is true and 0 otherwise.

S_{\min} = \min_{\tau} \left\{ \sqrt{ru(\tau)^2 + mi(\tau)^2} \right\},

where ru(τ) and mi(τ) are two types of error, called remaining uncertainty and misinformation, respectively, given as follows:

ru(\tau) = \frac{1}{N_T} \sum_{j=1}^{N_T} \sum_i ic(G_i) \cdot \mathbf{1}(S(G_i, P_j) < \tau) \cdot I(G_i, P_j), \qquad
mi(\tau) = \frac{1}{N_T} \sum_{j=1}^{N_T} \sum_i ic(G_i) \cdot \mathbf{1}(S(G_i, P_j) \geq \tau) \cdot (1 - I(G_i, P_j)),

where ic(G_i) is the information content of G_i, defined as

ic(G_i) = \log_2 \frac{1}{\Pr(G_i \mid \text{parents of } G_i \text{ in GO})},

where \Pr(G_i \mid \text{parents of } G_i \text{ in GO}) is the conditional probability of G_i given its parents in the GO structure [see Clark and Radivojac (2013) for more details].

In practice, we evaluated the top 100 GO terms predicted by each competing method for each GO domain (we used 100 because the number of GO terms per protein is clearly smaller than 100, and simply because the top-ranked GO terms are the important ones).
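An illustrative implementation of these definitions over score and ground-truth matrices follows (the grid of thresholds τ is an assumption; any sufficiently fine grid can be used):

```python
import numpy as np

def fmax(S, I, taus=np.linspace(0.01, 1.0, 100)):
    """S, I: (n_proteins, n_terms) prediction-score and 0/1 truth matrices."""
    best = 0.0
    for t in taus:
        pred = S >= t
        covered = pred.any(axis=1)          # proteins counted by h(tau)
        if not covered.any():
            continue
        tp = (pred & (I == 1)).sum(axis=1)
        pr = (tp[covered] / pred[covered].sum(axis=1)).mean()
        rc = (tp / np.maximum(I.sum(axis=1), 1)).mean()
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best

def smin(S, I, ic, taus=np.linspace(0.01, 1.0, 100)):
    """ic: information content per GO term, shape (n_terms,)."""
    vals = []
    for t in taus:
        pred = S >= t
        ru = (ic * (~pred & (I == 1))).sum(axis=1).mean()  # remaining uncertainty
        mi = (ic * (pred & (I == 0))).sum(axis=1).mean()   # misinformation
        vals.append(np.sqrt(ru ** 2 + mi ** 2))
    return min(vals)
```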

4.3 Implementation and parameter settings

We processed the FASTA-format data with Biopython (http://biopython.org/), used scikit-learn (http://scikit-learn.org/stable/index.html) for logistic regression, and used xgboost (Chen and Guestrin, 2016) for LTR.

4.3.1 GOLabeler

  1. BLAST-KNN

    BLAST Ver. 2.3.0+ was used with default parameters, except that the BLAST database was built from all proteins in D and the number of iterations was one.

  2. LTR

    We used ‘rank:pairwise’ as the objective (loss function) in xgboost. The maximum depth of the trees in MART (Multiple Additive Regression Trees) was set to 4 to avoid overfitting to the training data. We merged the top 30 predictions from each component, since this number provided the best performance in five-fold cross-validation over the LTR training data (out of the four values tested: 10, 30, 50 and 70).

4.3.2 Competing methods

  1. BLAST

    We used the same setting as BLAST-KNN.

  2. GoFDR

    We ran GoFDR over all required data [i.e. all annotations without ‘IEA’ or ‘RCA’ evidence, plus some annotations with IEA evidence, before 2016, from GOA (Huntley et al., 2015)].

4.4 Results

We resampled the test dataset with replacement 100 times (bootstrapping) to make the evaluation reliable. Besides the three evaluation measures, we used a paired t-test to statistically evaluate the performance difference between the best-performing method and all other methods in each setting; a difference was considered significant if the P value was smaller than 0.05 (see the Supplementary Material for detailed P values).
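A sketch of this protocol (metric is, e.g., the fmax function of Section 4.2; the score matrices are assumed inputs):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def bootstrap_compare(S_a, S_b, I, metric, n_rounds=100):
    """Compare two methods on the same test set by bootstrapping.

    S_a, S_b: (n_proteins, n_terms) score matrices of the two methods;
    I: the 0/1 ground-truth matrix; metric: e.g. the fmax function above.
    """
    n = I.shape[0]
    a_vals, b_vals = [], []
    for _ in range(n_rounds):
        idx = rng.integers(0, n, size=n)           # resample proteins
        a_vals.append(metric(S_a[idx], I[idx]))
        b_vals.append(metric(S_b[idx], I[idx]))
    t_stat, p_value = stats.ttest_rel(a_vals, b_vals)  # paired t-test
    return float(np.mean(a_vals)), float(np.mean(b_vals)), float(p_value)
```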

4.4.1 Comparison with component methods

We first compare the five component methods; the results are shown in Table 2. Out of the nine comparison settings (three evaluation criteria times three domains), LR-InterPro achieved the best performance among the five components five times, followed by BLAST-KNN with three. This suggests that LR-InterPro and BLAST-KNN are the two best component methods, i.e. that domain, family, motif and homology information is highly informative. The other three methods are less accurate, although they can excel in specific cases: among the components, LR-3mer achieved the highest Fmax in CCO. We then examined the performance of GOLabeler trained on LTR1, shown in the same table, where GOLabeler with only two components, called GOL (B + I), was also checked alongside GOLabeler with all five components, GOL (All). The table shows that GOL (All) outperformed all competing methods in eight of the nine settings. For example, GOL (All) achieved the highest Fmax of 0.580 in MFO, followed by GOL (B + I) at 0.578, BLAST-KNN at 0.573 and LR-InterPro at 0.556. This indicates the advantage of incorporating all component methods in GOLabeler, compared with using only a smaller number of components. Another finding is that, among MFO, BPO and CCO, BPO is the hardest task in AFP (consistent with the results of CAFA). For example, the AUPR of GOL (All) was 0.546 and 0.700 for MFO and CCO, respectively, but only 0.225 for BPO, implying that sequence information is limited for BPO in AFP. Hereafter, we report the result of GOL (All) as that of GOLabeler.

Table 2. Performance comparison with component methods

             | AUPR                | Fmax                | Smin
             | MFO   BPO   CCO     | MFO   BPO   CCO     | MFO    BPO     CCO
Naive        | 0.141 0.151 0.591   | 0.242 0.299 0.653   | 7.684  15.965  6.535
B-K          | 0.452 0.192 0.558   | 0.573 0.339 0.620   | 5.157  15.713  5.647
LR-3mer      | 0.144 0.152 0.600   | 0.255 0.301 0.664   | 7.587  15.934  6.415
LR-InterPro  | 0.536 0.198 0.636   | 0.556 0.351 0.654   | 5.248  15.655  5.783
LR-ProFET    | 0.173 0.096 0.550   | 0.330 0.265 0.633   | 7.831  17.030  6.380
GOL (B+I)    | 0.538 0.173 0.657   | 0.578 0.352 0.665   | 5.126  15.225  5.439
GOL (All)    | 0.546 0.225 0.700   | 0.580 0.370 0.687   | 5.077  15.177  5.518

Note: B-K, BLAST-KNN; GOL (B+I), GOLabeler with only BLAST-KNN and LR-InterPro as components; GOL (All), GOLabeler with all five components.

4.4.2 Comparison with other ensemble techniques and also performance improvement by making training size larger

We examined the performance of recent ensemble techniques with the same five component methods, and also the performance improvement from increasing the training data for ensemble learning (i.e. for weighted voting and GOLabeler). Table 3 summarizes the results under the nine experimental settings. Using LTR1 only (GOL-L1), GOLabeler achieved the best performance among the competing methods. Weighted voting was the next best, while one vote performed worst in all nine settings, implying that choosing a single component does not work well. By adding LTR2, the advantage of GOLabeler (GOL-L1 + 2) became more pronounced, achieving the best performance in all nine settings except one, whereas the performance of weighted voting increased in only five of the nine cases. This means that GOLabeler takes advantage of larger training data more effectively than weighted voting, and implies that the performance of GOLabeler can be improved further simply by increasing the annotation data in the future. Hereafter, GOLabeler and weighted voting refer to the models trained with all LTR training data (LTR1 and LTR2).

Table 3. Performance comparison with ensemble techniques, and improvement from adding LTR2 to LTR1

           | AUPR                | Fmax                | Smin
           | MFO   BPO   CCO     | MFO   BPO   CCO     | MFO    BPO     CCO
One vote   | 0.423 0.193 0.651   | 0.543 0.335 0.682   | 6.116  16.237  5.883
WV-L1      | 0.530 0.230 0.694   | 0.571 0.368 0.692   | 5.119  15.117  5.516
WV-L1+2    | 0.530 0.232 0.694   | 0.571 0.370 0.692   | 5.114  15.101  5.506
Consensus  | 0.454 0.222 0.670   | 0.543 0.360 0.692   | 5.687  15.510  5.660
GOL-L1     | 0.546 0.225 0.700   | 0.580 0.370 0.687   | 5.077  15.177  5.518
GOL-L1+2   | 0.549 0.236 0.697   | 0.586 0.372 0.691   | 5.032  15.050  5.479

Note: WV-L1, weighted voting with LTR1 only; WV-L1+2, weighted voting with LTR1 and LTR2; GOL-L1, GOLabeler with LTR1 only; GOL-L1+2, GOLabeler with LTR1 and LTR2.

4.4.3 Comparison with BLAST and GoFDR

Table 4 shows the performance of BLAST and GoFDR. GOLabeler outperformed both BLAST and GoFDR in all experimental settings except one, and the differences were statistically significant. Given that GoFDR achieved top performance in CAFA2, this demonstrates the high performance of GOLabeler even compared with the best performers in CAFA.

Table 4. Performance comparison with BLAST and GoFDR

           | AUPR                | Fmax                | Smin
           | MFO   BPO   CCO     | MFO   BPO   CCO     | MFO    BPO     CCO
BLAST      | 0.263 0.071 0.311   | 0.435 0.262 0.513   | 7.223  17.358  6.848
GoFDR      | 0.424 0.183 0.503   | 0.535 0.322 0.587   | 6.075  16.909  5.424
GOLabeler  | 0.549 0.236 0.697   | 0.586 0.372 0.691   | 5.032  15.050  5.479

4.4.4 Performance over ‘difficult’ proteins

The no-knowledge test dataset can be divided into easy and difficult proteins according to a 60% sequence identity threshold to the most similar known protein in the training data (Jiang et al., 2016). The number of difficult proteins in the test dataset is 370 in MFO, 870 in BPO and 532 in CCO. As mentioned in the Introduction, AFP for difficult proteins is especially important, since homology-based methods, such as BLAST-KNN, are often unable to predict their functions. Table 5 summarizes the results over the difficult proteins in the test dataset, comparing GOLabeler with the two best component methods, BLAST-KNN and LR-InterPro, and two ensemble methods, weighted voting and consensus. GOLabeler outperformed all competing methods in all nine settings, a performance advantage much clearer than in the case of using both difficult and easy proteins. This indicates that GOLabeler is particularly useful for AFP of difficult proteins, for which effective methods have so far been lacking.

Table 5. Performance comparison over ‘difficult’ proteins

             | AUPR                | Fmax                | Smin
             | MFO   BPO   CCO     | MFO   BPO   CCO     | MFO    BPO     CCO
B-K          | 0.400 0.203 0.510   | 0.553 0.350 0.603   | 5.194  15.410  5.658
LR-InterPro  | 0.505 0.217 0.639   | 0.534 0.363 0.663   | 5.306  15.074  5.581
Consensus    | 0.407 0.239 0.659   | 0.517 0.368 0.699   | 5.987  15.217  5.678
WV           | 0.497 0.248 0.702   | 0.547 0.379 0.701   | 5.264  14.612  5.423
GOLabeler    | 0.516 0.258 0.704   | 0.567 0.382 0.706   | 5.087  14.538  5.344

Note: B-K, BLAST-KNN; WV, weighted voting.

4.4.5 Case study

Finally, we show a specific example of the results obtained by GOLabeler and the competing methods, to illustrate the real effect of the performance differences on annotating unknown proteins. Table 6 lists the predicted GO terms (each method determines its predicted terms by its own cut-off value, chosen to achieve its best Fmax) in MFO for the protein Wor4p (UniProt accession: Q5ADX8), which, in the ground truth, is associated with 12 GO terms in MFO. Figure 2 shows the directed acyclic graph formed by these 12 GO terms. Note that Q5ADX8 is a difficult protein: it has no homologous proteins at an e-value cut-off of 0.001, so BLAST-KNN was unable to predict any GO terms. Naive predicted three GO terms, of which only one very general term, GO: 0005488 (binding), was correct. Checking the InterPro database, we found that Q5ADX8 matches the zinc finger C2H2-type domain (IPR013087) (https://www.ebi.ac.uk/interpro/protein/Q5ADX8), which is associated with GO: 0003676 (nucleic acid binding) (http://www.ebi.ac.uk/interpro/entry/IPR013087); this is exactly what LR-InterPro predicted. Compared with Naive, LR-InterPro, LR-3mer and LR-ProFET made more correct annotations, particularly for more specific terms such as GO: 0003676 (nucleic acid binding). The prediction by weighted voting was at the same level of specificity as these three component methods (its relatively poor performance here might be caused by BLAST-KNN, which predicted no GO terms at all). In fact, the GO terms predicted by LR-InterPro and weighted voting were all correct, but they numbered only five and four, respectively. GOLabeler, in contrast, predicted nine terms, of which eight were correct; the only wrong prediction, GO: 0005515 (protein binding), is a very general term that was also predicted by Naive, LR-3mer and LR-ProFET. More importantly, as shown in Figure 2, the prediction by GOLabeler was the most specific: even GO: 0044212 (transcription regulatory region DNA binding), which lies next to an end node in Figure 2, was correctly annotated. This example shows that the performance advantage of GOLabeler translates into sizeable differences in the quality of real function annotation.

Table 6. Predicted GO terms of Q5ADX8 in MFO by GOLabeler and competing methods

Method        | Predicted GO terms
Naive         | GO: 0005488*, GO: 0003824, GO: 0005515
GoFDR         | GO: 0005488*, GO: 0043169, GO: 0043167, GO: 0046872, GO: 0097159*, GO: 0003676*, GO: 1901363*, GO: 0003674
B-K           | (none)
LR-3mer       | GO: 0005488*, GO: 0005515, GO: 0003824, GO: 0097159*, GO: 1901363*, GO: 0003676*
LR-InterPro   | GO: 0005488*, GO: 0003677*, GO: 1901363*, GO: 0097159*, GO: 0003676*
LR-ProFET     | GO: 0003676*, GO: 0097159*, GO: 1901363*, GO: 0005488*, GO: 0003723, GO: 0003677*, GO: 0001010, GO: 0005515
WV            | GO: 0005488*, GO: 0097159*, GO: 1901363*, GO: 0003676*
GOLabeler     | GO: 0005488*, GO: 0003676*, GO: 0097159*, GO: 1901363*, GO: 0003677*, GO: 0005515, GO: 0044212*, GO: 0000975*, GO: 0001067*
Ground truth  | GO: 0005488, GO: 0097159, GO: 1901363, GO: 0003676, GO: 0001067, GO: 0003677, GO: 0000975, GO: 0003690, GO: 0043565, GO: 0044212, GO: 1990837, GO: 0000976

Note: Correctly predicted GO terms (those in the ground truth) are marked with *. B-K, BLAST-KNN; WV, weighted voting.

Fig. 2.

Predicted GO terms of Q5ADX8 in DAG of MFO by different methods

4.5 Computational efficiency

We ran GOLabeler on a server with an Intel(R) Core(TM) i7-6700K 4.00 GHz CPU and 64 GB RAM. Training all component methods and the ranking model took around 10 days. In prediction, most of the computation is spent on BLAST-KNN; annotating a thousand new proteins with GO took less than 6 h.

5 Conclusion and discussion

Sequence-based large-scale AFP (SAFP), particularly for difficult proteins, is an important problem with three challenging aspects: (i) a structured ontology, (ii) many labels per protein and (iii) large variation in the number of GO terms per protein. For this problem, we have proposed GOLabeler, which addresses the three aspects: (i) by using all corresponding GO terms in the DAG structure of GO, and (ii) and (iii) by learning to rank (LTR), a very effective multilabel classification framework that does not require fixing the number of GO terms per protein in advance. Our thorough and extensive experiments show the clear advantage of GOLabeler in predictive performance, from many viewpoints, over state-of-the-art SAFP techniques and ensemble approaches. Diverse information from sequences is very useful for SAFP. GOLabeler currently integrates five component classifiers, built from GO term frequency, sequence homology, trigrams, motifs and biophysical properties, among others. The LTR framework of GOLabeler is flexible and can incorporate any additional classifier, implying that further performance improvement is possible by adding more information from sequences; this could be done not only for SAFP but also for more general AFP, which would be interesting future work. Another possible direction is to adapt GOLabeler to improve AFP for specific species, especially those with very few annotated proteins, for which current AFP methods do not necessarily work well. Finally, with the development of structure prediction methods, it will be increasingly important to integrate protein sequences with predicted structures for function prediction.

Funding

This work has been supported in part by the National Natural Science Foundation of China (Nos. 61572139 and 31601074), MEXT KAKENHI 16H02868, JST ACCEL, Tekes (currently Business Finland) FiDiPro, the Academy of Finland AIPSE programme, and the Open Fund of Shanghai Key Laboratory of Intelligent Information Processing (No. IIPL-2016-005).

Conflict of Interest: none declared.

References

Altschul,S. et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
Ashburner,M. et al. (2000) Gene ontology: tool for the unification of biology. Nat. Genet., 25, 25–29.
Boutet,E. et al. (2016) UniProtKB/Swiss-Prot, the manually annotated section of the UniProt KnowledgeBase: how to use the entry view. In: Edwards,D. (ed.) Plant Bioinformatics: Methods and Protocols. Springer, New York, NY, pp. 23–54.
Chen,T. and Guestrin,C. (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16). ACM, New York, NY, USA, pp. 785–794.
Clark,W. and Radivojac,P. (2013) Information-theoretic evaluation of predicted ontological annotations. Bioinformatics, 29, i53–i61.
Cozzetto,D. et al. (2013) Protein function prediction by massive integration of evolutionary analyses and multiple data sources. BMC Bioinformatics, 14, S1.
Das,S. et al. (2015) Functional classification of CATH superfamilies: a domain-based approach for protein function annotation. Bioinformatics, 31, 3460–3467.
de Lima,M. et al. (2010) SUPERFAMILY 1.75 including a domain-centric gene ontology method. Nucleic Acids Res., 39 (Suppl. 1), D427.
Gillis,J. and Pavlidis,P. (2013) Characterizing the state of the art in the computational assignment of gene function: lessons from the first critical assessment of functional annotation (CAFA). BMC Bioinformatics, 14, S15.
Gong,Q. et al. (2016) GoFDR: a sequence alignment based method for predicting protein functions. Methods, 93, 3–14.
Hamp,T. et al. (2013) Homology-based inference sets the bar high for protein function prediction. BMC Bioinformatics, 14, S7.
Huntley,R. et al. (2015) The GOA database: gene ontology annotation updates for 2015. Nucleic Acids Res., 43, D1057–D1063.
Jiang,Y. et al. (2014) The impact of incomplete knowledge on the evaluation of protein function prediction: a structured-output learning perspective. Bioinformatics, 30, i609–i616.
Jiang,Y. et al. (2016) An expanded evaluation of protein function prediction methods shows an improvement in accuracy. Genome Biol., 17, 184.
Khan,I.K. et al. (2015) The PFP and ESG protein function prediction methods in 2014: effect of database updates and ensemble approaches. GigaScience, 4, 43.
Lan,L. et al. (2013) MS-kNN: protein function prediction by integrating multiple data sources. BMC Bioinformatics, 14 (Suppl. 3), S8.
Li,H. (2011) A short introduction to learning to rank. IEICE Trans., E94-D, 1854–1862.
Liu,K. et al. (2015) MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence. Bioinformatics, 31, i339–i347.
Ma,X. et al. (2014) Integrative approaches for predicting protein function and prioritizing genes for complex phenotypes using protein interaction networks. Brief. Bioinformatics, 15, 685.
Marchler-Bauer,A. et al. (2015) CDD: NCBI’s conserved domain database. Nucleic Acids Res., 43, D222–D226.
Mitchell,A. et al. (2015) The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res., 43, D213–D221.
Murzin,A. et al. (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol., 247, 536–540.
Ofer,D. and Linial,M. (2015) ProFET: feature engineering captures high-level protein functions. Bioinformatics, 31, 3429–3436.
Peng,S. et al. (2016) DeepMeSH: deep semantic representation for improving large-scale MeSH indexing. Bioinformatics, 32, i70–i79.
Radivojac,P. et al. (2013) A large-scale evaluation of computational protein function prediction. Nat. Methods, 10, 221–227.
Shehu,A. et al. (2016) A survey of computational methods for protein function prediction. In: Wong,K.C. (ed.) Big Data Analytics in Genomics, 1st edn. Springer, pp. 225–298.
Sillitoe,I. et al. (2015) CATH: comprehensive structural and functional annotations for genome sequences. Nucleic Acids Res., 43, D376–D381.
Sonnhammer,E. et al. (1997) Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins, 28, 405–420.
The UniProt Consortium (2015) UniProt: a hub for protein information. Nucleic Acids Res., 43, D204–D212.
Vidulin,V. et al. (2016) Extensive complementarity between gene function prediction methods. Bioinformatics, 32, 3645–3653.
Walker,M.G. et al. (1999) Prediction of gene function by genome-scale expression analysis: prostate cancer-associated genes. Genome Res., 9, 1198–1203.
Yuan,Q. et al. (2016) DrugE-Rank: improving drug-target interaction prediction of new candidate drugs or targets by ensemble learning to rank. Bioinformatics, 32, i18–i27.
Zhang,M. and Zhou,Z. (2014) A review on multi-label learning algorithms. IEEE Trans. Knowl. Data Eng., 26, 1819–1837.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)
Associate Editor: Jonathan Wren