BioC interoperability track overview

2

Liu

W.

Islamaj Dogan

R.

Kwon

D.

et al. . (

2014

)

BioC implementations in Go, Perl, Python and Ruby

.

Database

,

(Manuscript ID: DATABASE-2014-0031.R1, to appear in this special issue of Database)

.

3

Mao

Y.

Van Auken

K.

Li

D.

et al. . (

2014

)

Overview of the Gene Ontology Task at BioCreative IV

.

Database

,

(Manuscript ID: DATABASE-2014-0047, to appear in this special issue of Database)

.

4

Wiegers

T.C.

Davis

A.P.

Mattingly

C.J.

et al. . (

2014

)

Web services-based text-mining demonstratesbroad impacts for interoperability and process simplification

.

Database

,

bau050

.

5

Smith

L.

Rindflesch

T.

Wilbur

W.J.

(

2004

)

MedPost: a part-of-speech tagger for bioMedical text

.

Bioinformatics

,

20

,

2320

–

2321

.

6

Klein

D.

Manning

C.D.

(

2003

)

Accurate unlexicalized parsing

. In:

Proceedings of the 41st Annual Meeting on Association for Computational Linguistics

,

Vol. 1

.

Association for Computational Linguistics

,

Sapporo, Japan

, pp.

423

–

430

.

7

Comeau

D.C.

Liu

H.

Islamaj Dogan

R.

et al. . (

2014

)

Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus

.

Database

,

(Manuscript ID: DATABASE-2014-0030.R2, to appear in this special issue of Database)

.

8

Islamaj Dogan

R.

Comeau

D.C.

Yeganova

L.

et al. . (

2014

)

Finding abbreviations in biomedical literature: three BioC-compatible modules and four BioC formatted corpora

.

Database

,

doi: 10.1093/database/bau044

.

9

Schwartz

A.S.

Hearst

M.A.

(

2003

)

A simple algorithm for identifying abbreviation definitions in biomedical text

.

Pac. Symp. Biocomput.

,

451

-

462

.

10

Sohn

S.

Comeau

D.C.

Kim

W.

et al. . (

2008

)

Abbreviation definition identification based on automatic precision estimates

.

BMC Bioinformatics

,

9

,

402

.

11

Yeganova

L.

Comeau

D.C.

Wilbur

W.J.

(

2011

)

Machine learning with naturally labeled data for identifying abbreviation definitions

.

BMC Bioinformatics

,

12

(

Suppl. 3

),

S6

.

12

Khare

R.

Wei

C.-H.

Mao

Y.

et al. . (

2013

)

Improving interoperability of text mining tools with BioC

. In:

Proceedings of the Fourth BioCreative Challenge Evaluation Workshop

,

Vol. 1

, pp.

10

–

22

.

http://www.biocreative.org/resources/publications/biocreative-iv-proceedings/

.

13

Leaman

R.

Islamaj Dogan

R.

Lu

Z.

(

2013

)

DNorm: disease name normalization with pairwise learning to rank

.

Bioinformatics

,

29

,

2909

–

2917

.

14

Leaman

R.

Khare

R.

Lu

Z.

(

2013

)

NCBI at 2013 ShARe/CLEF eHealth shared task: disorder normalization in clinical notes with DNorm

.

Conference and Labs of the Evaluation Forum 2013 Working Notes

.

Valencia, Spain

.

http://www. clef2013.org/index.php?page=Pages/proceedings.php

.

15

Wei

C.H.

Harris

B.R.

Kao

H.Y.

et al. . (

2013

)

tmVar: a text mining approach for extracting sequence variants in biomedical literature

.

Bioinformatics

,

29

,

1433

–

1439

.

16

Wei

C.H.

Kao

H.Y.

Lu

Z.

(

2012

)

SR4GN: a species recognition software tool for gene normalization

.

PLoS One

,

7

,

e38460

.

17

Leaman

R.

Wei

C.H.

Lu

Z.

(

2013

)

NCBI at the BioCreative IV CHEMDNER Task: Recognizing chemical names in PubMed articles using tmChem

. In:

Proceedings of BioCreative IV

.

Bethesda, MD

.

http://www.biocreative.org/resources/publications/biocreative-iv-proceedings/

.

18

Wei

C.H.

Kao

H.Y.

(

2011

)

Cross-species gene normalization by species inference

.

BMC Bioinformatics

,

12

(

Suppl. 8

),

S5

.

19

Wei

C.H.

Kao

H.Y.

Lu

Z.

(

2013

)

PubTator: a web-based text mining tool for assisting biocuration

.

Nucleic Acids Res.

,

41

,

W518

–

W522

.

20

Stenetorp

P.

Pyysalo

S.

Topić

G.

et al. . (

2012

)

BRAT: a web-based tool for NLP-assisted text annotation

.

13th Conference of the European Chapter of the Association for Computational Linguistics

.

Association for Computational Linguistics

, pp.

102

–

107

.

21

Jimeno Yepes

A.M.N.

Verspoor

K.

(

2013

)

Brat2BioC: conversion tool between brat and BioC

. In:

Proceedings of the BioCreative IV Workshop

.

Bethesda, MD

,

Vol. 1

, pp.

46

–

53

.

http://www.biocreative.org/resources/publications/biocreative-iv-proceedings/

.

22

Rak

R.

Rowley

A.

Black

W.

et al. . (

2012

)

Argo: an integrative, interactive, text mining-based workbench supporting curation

.

Database

,

2012

,

bas010

.

23

Rak

R.

Batista-Navarro

R.

Rowley

A.

et al. . (

2013

)

NaCTeM’s BioC modules and resources for BioCreative IV

. In:

Proceedings of the Fourth BioCreative Challenge Evaluation Workshop

,

Vol. 1

, pp.

61

–

67

.

http://www.biocreative.org/resources/ publications/biocreative-iv-proceedings/

.

24

Tsai

R.T.

Chou

W.C.

Su

Y.S.

et al. . (

2007

)

BIOSMILE: a semantic role labeling system for biomedical verbs using a maximum-entropy model with automatically generated template features

.

BMC Bioinformatics

,

8

,

325

.

25

Peng

Y.

Tudor

C.O.

Torii

M.

et al. . (

2012

)

iSimp: a sentence simplification system for biomedical text

.

IEEE International Conference on Bioinformatics and Biomedicine (BIBM2012)

.

IEEE Computer Society

,

Philadelphia, PA, USA

, pp.

211

–

216

.

26

Davis

A.P.

Murphy

C.G.

Johnson

R.

et al. . (

2013

)

The comparative toxicogenomics database: update 2013

.

Nucleic Acids Res.

,

41

,

D1104

–

D1114

.

27

Verspoor

K.

Jimeno Yepes

A.

Cavedon

L.

et al. . (

2013

)

Annotating the biomedical literature for the human variome

.

Database

,

2013

,

bat019

.

28

Neves

M

Damaschun

A.

Kurtz

A.

et al. . (

2012

)

Annotating and evaluating text for stem cell research

. In:

Proceedings of the Third Workshop on Building and Evaluation Resources for Biomedical Text Mining (BioTxtM 2012) at Language Resources and Evaluation (LREC)

.

Istanbul, Turkey

, pp.

16

–

23

.

http://www.lrec-conf.org/proceedings/lrec2012/workshops/14.BioTxtM-Proceedings.pdf

.

29

Dogan

R.I.

Leaman

R.

Lu

Z.

(

2014

)

NCBI disease corpus: a resource for disease name recognition and concept normalization

.

J. Biomed. Inf.

,

47

,

1

–

10

.

Crossref

30

Pustejovsky

J.

Castano

J.

Cochran

B.

et al. . (

2001

)

Automatic extraction of acronym-meaning pairs from MEDLINE databases

.

Stud. Health Technol. Inf.

,

84

,

371

–

375

.

31

Kuo

C.J.

Ling

M.H.

Lin

K.T.

et al. . (

2009

)

BIOADI: a machine learning approach to identifying abbreviations and definitions in biological literature

.

BMC Bioinformatics

,

10

(

Suppl. 15

),

S7

.

32

Wiegers

T.C.

Davis

A.P.

Mattingly

C.J.

(

2013

)

Web services-based text mining demonstrates broad impacts for interoperability and process simplification

. In:

Fourth BioCreative Challenge Evaluation Workshop

, pp.

69

–

84

.

http://www. biocreative.org/resources/publications/biocreative-iv-proceedings/

.

33

Mao

Y.

Auken

K.V.

Li

D.

et al. . (

2013

)

The gene ontology task at BioCreative IV

. In:

Proceedings of the BioCreative IV Workshop

.

Bethesda, MD

.

http://www.biocreative.org/resources/ publications/biocreative-iv-proceedings/

.

34

Nobata

C.

Dobson

P.D.

Iqbal

S.A.

et al. . (

2011

)

Mining metabolites: extracting the yeast metabolome from the literature

.

Metabolomics

,

7

,

94

–

101

.

35

Peng

Y.

Tudor

C.O.

Torii

M.

et al. . (

2013

)

Enhancing the Interoperability of iSimp by Using the BioC Format

. In:

Fourth BioCreative Challenge Evaluation Workshop

, pp.

5

–

9

.

http://www.biocreative.org/resources/publications/biocreative-iv- proceedings/

.

36

Kim

J.-D.

Ohta

T.

Pyysalo

S.

et al. . (

2009

)

Overview of BioNLP'09 shared task on event extraction

. In:

Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing: Shared Task

.

Association for Computational Linguistics

,

Boulder, Colorado

, pp.

1

–

9

.

37

Kim

J.-D.

Pyysalo

S.

Ohta

T.

et al. . (

2011

)

Overview of BioNLP shared task 2011

. In:

Proceedings of the BioNLP Shared Task 2011 Workshop

.

Association for Computational Linguistics

,

Portland, Oregon

, pp.

1

–

6

.

38

Nédellec

C.

Bossy

R.

Kim

J.-D.

et al. . (

2013

)

Overview of BioNLP shared task 2013

. In:

Proceedings of the BioNLP Shared Task 2013 Workshop

.

Association for Computational Linguistics

,

Sofia, Bulgaria

, pp.

1

–

7

.

39

Tsai

R.T.H.

Chou

W.C.

Su

Y.S.

et al. . (

2007

)

BIOSMILE: a semantic role labeling system for biomedical verbs using a maximum-entropy model with automatically generated template features

.

BMC Bioinformatics

,

8

,

325

.

40

Chou

W.C.

Tsai

R.T.H.

Su

Y.S.

et al. . (

2006

)

A semi-automatic method for annotating a biomedical proposition bank

. In:

Proceedings of ACL Workshop on Frontiers in Linguistically Annotated Corpora

.

Association for Computational Linguistics

,

Sydney, Australia

, pp.

5

–

12

.

http://www.aclweb.org/anthology/W/W06/W06-0602

.

41

Dai

H.-J.

Huang

C.-H.

Lin

R.T.K.

et al. . (

2008

)

BIOSMILE web search: a web application for annotating biomedical entities and relations

.

Nucleic Acids Res.

,

36

,

W390

–

W398

.

42

Ferrucci

D.

Lally

A.

(

2004

)

UIMA: an architectural approach to unstructured information processing in the corporate research environment

.

Nat. Lang. Eng.

,

10

,

327

–

348

.

Crossref

43

Davis

A.P.

Wiegers

T.C.

Rosenstein

M.C.

et al. . (

2011

)

The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database

.

Database

,

2011

,

bar034

.

44

Davis

A.P.

Wiegers

T.C.

Johnson

R.J.

et al. . (

2013

)

Text mining effectively scores and ranks the literature for improving chemical-gene-disease curation at the comparative toxicogenomics database

.

PloS One

,

8

,

e58201

.

45

Wiegers

T.C.

Davis

A.P.

Mattingly

C.J.

(

2012

)

Collaborative biocuration-–text-mining development task for document prioritization for curation

.

Database

,

2012

,

bas037

.

46

Neveol

A.

Islamaj Dogan

R.

Lu

Z.

(

2011

)

Semi-automatic semantic annotation of PubMed queries: a study on quality, efficiency, satisfaction

.

J. Biomed. Inform.

,

44

,

310

–

318

.

47

Islamaj Dogan

R.

Murray

G.C.

Neveol

A.

et al. . (

2009

)

Understanding PubMed user search behavior through log analysis

.

Database

,

2009

,

bap018

.

48

Leaman

R.

Khare

R.

Lu

Z.

(

2013

)

NCBI at 2013 ShARe/CLEF eHealth Shared Task: disorder normalization in clinical notes with DNorm

.

Conference and Labs of the Evaluation Forum 2013 Working Notes

.

49

Lu

Z.

Wilbur

W.J.

(

2010

)

Overview of BioCreative III gene normalization

. In:

Proceedings of the BioCreative III workshop

.

Bethesda, MD

, pp.

24

–

45

.

http://www.biocreative.org/resources/ publications/bc-iii-workshop-proceedings/

.

50

Leaman

R.

Wei

C.-H.

Lu

Z.

(

2013

)

NCBI at the BioCreative IV CHEMDNER Task: recognizing chemical names in PubMed articles using tmChem

. In:

Proceedings of BioCreative IV

.

http://www.biocreative.org/resources/ publications/biocreative-iv-proceedings/

.

51

Van Landeghem

S.

Bjorne

J.

Wei

C.H.

et al. . (

2013

)

Large-scale event extraction from literature with multi-level gene normalization

.

PloS One

,

8

,

e55814

.

52

Wei

C.H.

Harris

B.R.

Li

D.

et al. . (

2012

)

Accelerating literature curation with text-mining tools: a case study of using PubTator to curate genes in PubMed abstracts

.

Database

,

2012

,

bas041

.

53

Wei

C.-H.

Kao

H.-Y.

Lu

Z.

(

2012

)

PubTator: A PubMed-like interactive curation system for document triage and literature curation

. In:

Proceedings of BioCreative 2012 workshop

.

Washington, DC

, pp.

145

–

150

.

http://www.biocreative.org/resources/publications/biocreative-2012-proceedings/

.

54

Auken

K.V.

Schaeffer

M.L.

McQuilton

P.

et al. . (

2013

)

Corpus Construction for the BioCreative IV GO Task

. In:

Proceedings of BioCreative IV

.

55

Arighi

C.N.

Carterette

B.

Cohen

K.B.

et al. . (

2013

)

An overview of the BioCreative 2012 Workshop Track III: interactive text mining task

.

Database

,

2013

,

bas056

.

56

Clark

S.

Curran

J.R.

(

2004

)

Parsing the WSJ using CCG and log-linear models

. In:

Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics

.

Association for Computational Linguistics

,

Barcelona, Spain

, pp.

103

.

57

Liu

H.

Christiansen

T.

Baumgartner

W.A.

Jr

et al. . (

2012

)

BioLemmatizer: a lemmatization tool for morphological processing of biomedical text

.

J Biomed. Semantics

,

3

,

3

.

58

Kwon

D.

Kim

S.

Shin

S.-Y.

Wilbur

W.J.

(

2013

)

BioQRator: a Web-Based interactive biomedical literature curating system

. In:

Fourth BioCreative Challenge Workshop

, pp.

241

–

246

.

http://www.biocreative.org/resources/publications/biocreative-iv- proceedings/

.