The InterPro BioMart: federated query and web service access to the InterPro Resource

Jones, Philip; Binns, David; McMenamin, Conor; McAnulla, Craig; Hunter, Sarah

doi:10.1093/database/bar033

Abstract

The InterPro BioMart provides users with query-optimized access to predictions of family classification, protein domains and functional sites, based on a broad spectrum of integrated computational models (‘signatures’) that are generated by the InterPro member databases: Gene3D, HAMAP, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. These predictions are provided for all protein sequences from both the UniProt Knowledge Base and the UniParc protein sequence archive. The InterPro BioMart is supplementary to the primary InterPro web interface (http://www.ebi.ac.uk/interpro), providing a web service and the ability to build complex, custom queries that can efficiently return thousands of rows of data in a variety of formats. This article describes the information available from the InterPro BioMart and illustrates its utility with examples of how to build queries that return useful biological information.

Database URL: http://www.ebi.ac.uk/interpro/biomart/martview.

Introduction

The InterPro Resource (http://www.ebi.ac.uk/interpro) (1) provides an integrated set of computational models (or signatures) for protein family classification and the prediction of structural and functional domains, sites and repeats. The predictive signatures are built by the 11 InterPro member databases that, together with the InterPro team at the EBI, comprise the InterPro Consortium. The member databases are Gene3D (2), HAMAP (3), PANTHER (4), Pfam (5), PIRSF (6), PRINTS (7), ProDom (8), PROSITE (9), SMART (10), SUPERFAMILY (11) and TIGRFAMs (12). The InterPro team at the EBI integrates the predictive signatures from these member databases into ‘InterPro Entries’. Each entry may include one or more signatures that either identify the same feature, or classify proteins into the same family. Additionally, entries are collated into two biologically principled hierarchies, one of which describes protein families, the other protein domains. InterPro entries are curated by a team of experts in a variety of fields in biology. The curation process includes the creation of entries, the structuring of the entry hierarchies, the provision of detailed abstracts describing each entry and the addition of useful cross-references to other databases and ontologies. An example InterPro Entry, as viewed on the main InterPro website, is illustrated in Figure 1. This example entry comprises two member database signatures, one from Pfam and the other from SUPERFAMILY. In total, this InterPro entry matches 2753 UniProtKB protein sequences.

Figure 1.

An example human-curated InterPro entry, illustrating the detailed description provided for the entry and cross references to the GO and the member database signatures from which the entry is composed.

The integration described above is useful because the individual member databases have distinct but overlapping interests and use a number of different algorithms and modeling techniques. From the perspective of the biologist or bioinformatician wishing to use these predictive techniques, InterPro allows consideration of all of the available signatures from a single resource, without the need to be concerned with differences or overlap between the foci of the individual member databases. As well as integrating the member database signatures, InterPro calculates matches to these signatures for the whole of the UniProt Knowledge Base (UniProtKB, http://www.uniprot.org) and the UniParc sequence archive (13). Figure 2 illustrates a set of matches to a single UniProtKB protein sequence, which matches three InterPro entries. The matches of InterPro signatures and entries to the sequences in UniProtKB are available from the main InterPro website as well as from the InterPro BioMart; however, at the time of writing, the matches to UniParc sequences are only available from the BioMart. It is expected that UniParc matches will be included in a future version of the main InterPro website. The InterPro BioMart is built on the technology developed by the BioMart project (http://www.biomart.org) (14, 15), a collaboration between the Ontario Institute for Cancer Research (OICR) and the European Bioinformatics Institute (EBI). The InterPro BioMart is available at http://www.ebi.ac.uk/interpro/biomart/martview. It is also incorporated into the BioMart Central Server at http://www.biomart.org/biomart/martview (16).

Figure 2.

A protein for which matches have been calculated by InterPro. For this sequence, InterPro provides a prediction of protein family membership, an overview of the domain organization and the details of matches to member database signatures. At the foot of the view can be seen associated GO terms, based upon the calculated matches to InterPro entries.

The adoption of BioMart as a mechanism to share the data in InterPro has been motivated by the benefits that BioMart brings: the ability to build complex filters on the data; the facility to select specifically which data types are returned (equivalent to the columns of a spreadsheet); the capacity of BioMart to handle queries that return many thousands of rows of data and the provision of a web service with an associated data federation mechanism.

Data content

The InterPro BioMart provides three data sources: ‘InterPro Entry Annotation’, ‘UniProtKB Protein Matches’ and ‘UniParc Protein Matches’.

Match information can be obtained from both the ‘InterPro Entry Annotation’ data source and the ‘UniProtKB Protein Matches’ data source. These two data sources provide a different slant on the contents of InterPro, as described below.

The ‘InterPro Entry Annotation’ data source focuses on descriptions of the InterPro entries and the hierarchical relationships between them. The user can therefore build filters using this annotation and retrieve more detailed information, such as assigned Gene Ontology (19) terms and cross-references to other, related databases. The ‘Query Examples’ section below illustrates a potential application of this data set.

The ‘UniProtKB Protein Matches’ data set is focused on the UniProtKB protein entity, allowing queries to be built based on attributes of the protein sequence, including options to filter on the taxonomic group annotated on the sequence. This data set also provides the opportunity to retrieve match information with respect to member database signatures as well as summarized match information, described as ‘supermatches’ in the BioMart. A ‘supermatch’ is determined where one or more member database signatures that have been integrated together into the same entry have overlapping matches to the protein in the same region of the sequence. The start and stop coordinates of the InterPro entry ‘supermatch’ are then calculated as the most extreme bounds of the matches of all the member databases’ signatures comprising the entry.
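The ‘supermatch’ calculation described above can be sketched as an interval merge. This is a minimal illustration and not the InterPro pipeline itself: it assumes inclusive residue coordinates, treats two matches as overlapping when they share at least one residue, and uses made-up entry and signature labels (`IPR_X`, `SIG_A` and so on).

```python
from collections import defaultdict
from typing import NamedTuple


class Match(NamedTuple):
    entry: str      # InterPro entry label (made up for this sketch)
    signature: str  # member database signature label (made up)
    start: int      # inclusive start coordinate on the protein
    stop: int       # inclusive stop coordinate on the protein


def supermatches(matches):
    """Merge overlapping signature matches of the same entry.

    Returns a list of (entry, start, stop) tuples, where start and stop
    are the most extreme bounds of each group of overlapping matches.
    """
    by_entry = defaultdict(list)
    for m in matches:
        by_entry[m.entry].append(m)

    result = []
    for entry, group in sorted(by_entry.items()):
        group.sort(key=lambda m: m.start)
        cur_start, cur_stop = group[0].start, group[0].stop
        for m in group[1:]:
            if m.start <= cur_stop:
                # Overlaps the current span: extend it if necessary.
                cur_stop = max(cur_stop, m.stop)
            else:
                # Gap found: close the current span and start a new one.
                result.append((entry, cur_start, cur_stop))
                cur_start, cur_stop = m.start, m.stop
        result.append((entry, cur_start, cur_stop))
    return result


example = [
    Match("IPR_X", "SIG_A", 10, 50),
    Match("IPR_X", "SIG_B", 40, 80),
    Match("IPR_X", "SIG_A", 120, 150),
    Match("IPR_Y", "SIG_C", 5, 20),
]
print(supermatches(example))
```

In this example the two overlapping matches for `IPR_X` collapse into one span, while the non-overlapping match further along the sequence remains a separate supermatch.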

Finally, the ‘UniParc Protein Matches’ data set provides equivalent information to the ‘UniProtKB Protein Matches’ data set, coordinated on sequences included in the UniParc database, a non-redundant, historical archive of protein sequences extracted from public databases. At the time of writing, the UniParc database includes 25.6 million unique sequences; the InterPro match calculation pipeline is run against all of these sequences and the results are made available from this BioMart data set. This service allows matches to be returned for sequences that are present in (for example) model organism protein sequence databases which are not yet represented in UniProtKB. For users interested in matches for specific protein sequences, this data set supports filtering by UniParc ID or sequence checksum (CRC-64 or MD5), as does the ‘UniProtKB Protein Matches’ data set. If the user wishes to query using protein accessions or identifiers from third-party sequence databases, various services are available that allow protein identifier cross-referencing, including the Protein Identifier Cross Reference service, PICR (http://www.ebi.ac.uk/Tools/picr/) (20) and the UniProt ID mapping service (http://www.uniprot.org/). Both of these services can be used to convert protein identifiers or accessions from a large number of protein sequence databases to UniParc sequence identifiers.
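As an illustration of the checksum filter mentioned above, an MD5 checksum can be computed locally for a sequence of interest. This is only a sketch: the normalization applied here (removing whitespace, converting to upper case) and the upper-case hexadecimal output are assumptions, and the exact form expected by the BioMart filter should be checked against the service. CRC-64 is not shown.

```python
import hashlib


def sequence_md5(sequence: str) -> str:
    """Return an MD5 checksum for a protein sequence.

    Assumption for this sketch: whitespace is removed and residues are
    upper-cased before hashing; the digest is returned as upper-case hex.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest().upper()


# The same sequence yields the same checksum regardless of layout or case.
print(sequence_md5("MKVLAA\nGIVG") == sequence_md5("mkvlaagivg"))
```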

The three InterPro BioMart data sources include matches to the full taxonomy range in UniProtKB or UniParc. In this respect, the InterPro BioMart is different in structure to the Ensembl BioMart (http://www.ensembl.org/biomart/martview) (17, 18) which is organized into species-specific data sets.

Services supported by the InterPro BioMart

The InterPro BioMart is used to extend the functionality of the primary InterPro web interface, providing BioMart ‘canned queries’ for InterPro entries and for matched proteins. This allows data to be downloaded in tab- or comma-separated values format, suitable for computational analysis.

The InterPro BioMart web service is the data source behind the InterPro Distributed Annotation System (DAS) service (21), available from http://www.ebi.ac.uk/das-srv/interpro/das. This DAS service provides four DAS sources that query the BioMart:

‘InterPro’, which contains all InterPro member database signature matches to UniProtKB protein sequences.
‘InterPro-matches-overview’, which provides the maximum extent of the matches from all signatures that are integrated into a single InterPro entry against UniProtKB protein sequences. These are the ‘supermatch’ matches described in the BioMart ‘UniProtKB Protein Matches’ data set.
‘InterPro-UniParc-matches’, which provides match information for protein sequences identified using UniParc identifiers.
‘InterPro-S4’, which is used to provide protein family classification to the new EBI Search Service and is therefore part of the wider programme of data integration at the EBI.

Query examples

In common with all BioMart implementations, the InterPro BioMart enables the construction of simple queries as well as complex, multi-faceted queries where the data is filtered on several criteria. Where multiple filters are applied, records are returned that meet all of the filter criteria (i.e. ‘AND’ logic is applied across the filters). The user is able to specify precisely which data attributes should be returned, equivalent to columns in a spreadsheet.

Users should be aware that the structure of a BioMart database, which is highly redundant to facilitate high query speed, can result in redundancy in the results reported. The presence of repeated rows of results in the output depends on the construction of the query and the structure of the underlying BioMart tables. The circumstances under which this may occur are not self-evident. The authors therefore recommend the use of the ‘Unique results only’ option when querying the BioMart, which removes repeated rows of results.
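For results that have already been downloaded without the ‘Unique results only’ option, repeated rows can also be removed client-side. The following is a minimal sketch that keeps the first occurrence of each row and preserves the original order; the example rows are made up.

```python
def unique_rows(rows):
    """Remove repeated rows, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # rows may be lists; tuples are hashable
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = [
    ["IPR_X", "SIG_A"],
    ["IPR_X", "SIG_A"],  # repeated row
    ["IPR_X", "SIG_B"],
]
print(unique_rows(rows))
```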

To demonstrate the utility of the InterPro BioMart, here we present several biologically relevant queries.

Query #1. ‘Which Pfam signatures does InterPro integrate into “family” entries?’

Data sets	Filters	Attributes
InterPro Entry Annotation	InterPro Entry Type: ‘Family’	InterPro Entry Accession
	Source Signature Database: ‘Pfam’	InterPro Entry Short Name
		Signature Accession
		Signature ID (Name)

The Pfam database contains a broad spectrum of hidden Markov models that can be used to predict both family classification and domain organization. The InterPro curation team has integrated >96% of Pfam signatures into InterPro at the time of writing. During integration, InterPro assigns a ‘type’ to an InterPro entry and by extension, its signatures, dependent on what is being represented (a Family, Domain, Site or Repeat). Using the BioMart, it is possible to return the full set of integrated Pfam signatures that InterPro considers to be of type ‘family’. This query can be easily modified to request signatures built by any of the member databases that fit into any of the available InterPro entry types. Entry type filters include ‘Active_site’, ‘Binding_site’, ‘Conserved_site’, ‘Domain’, ‘Family’, ‘PTM’ (Post Translational Modification) and ‘Repeat’. Each InterPro entry has exactly one type and consequently all integrated member database signatures also have one type, as assigned by the InterPro curation team.

This example query is illustrated with a series of screen shots. Figure 3 illustrates selection of the InterPro Entry Annotation data set. Following selection of this data set, the user is able to select filters and attributes (in whichever order they choose). Figure 4 illustrates the selection of the two filters applied in this query, which will restrict the rows of data returned. Figure 5 illustrates the selection of attributes, which are equivalent to the columns of a spreadsheet. Finally, Figure 6 illustrates the results that are obtained when the ‘Results’ button is pressed. Initially, the user is presented with the first 10 matching rows of data, giving an opportunity to refine the query prior to requesting the full set of results.

Figure 3.

Selecting a dataset in the InterPro BioMart.

Figure 4.

Building a filter with two components: include results for ‘Family’ entry types that comprise signatures from Pfam.

Figure 5.

Selecting the attributes to be included in the BioMart output (equivalent to the columns of a spreadsheet). The ordering of the columns is determined by the order in which the attributes are selected.

Figure 6.

Clicking the ‘Results’ button at the top of the interface provides the first 10 results matching the query, to allow the query to be modified or improved.
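Query #1 can also be expressed as an XML document for submission to the BioMart web service. The sketch below builds such a document using the general BioMart XML query layout (a `Query` element containing a `Dataset` with `Filter` and `Attribute` children), as we understand it. The internal data set, filter and attribute names used here (`entry`, `entry_type`, `signature_database` and so on) are hypothetical placeholders; the real internal names should be taken from the service itself, for example from the XML that the MartView interface generates for a query.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, filters, attributes, unique_rows=True):
    """Build a BioMart-style XML query string.

    All names passed in are supplied by the caller; nothing here is
    validated against a real BioMart configuration.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1" if unique_rows else "0",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        # Multiple filters are combined with AND logic by BioMart.
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        # Attribute order determines column order in the output.
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Hypothetical internal names standing in for the labels used in Query #1.
QUERY_1 = build_query(
    dataset="entry",
    filters={"entry_type": "Family", "signature_database": "Pfam"},
    attributes=[
        "entry_accession",
        "entry_short_name",
        "signature_accession",
        "signature_name",
    ],
)
print(QUERY_1)
```

Setting `uniqueRows` to `1` corresponds to the ‘Unique results only’ option recommended above.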

Query #2. ‘Which GO terms are mapped to PROSITE signatures in InterPro (i.e. can I retrieve a PROSITE2GO mapping)?’

Data sets	Filters	Attributes
InterPro Entry Annotation	Source Signature Database: ‘PROSITE patterns’ and ‘PROSITE Profiles’ (CTRL click to select both).	InterPro Entry Accession
		Signature Accession
		GO ID
		GO Term Name
		GO Root Term (Process/Component/Function)

A major use of InterPro is the association of GO terms to proteins via the signatures that they match. InterPro provides the ‘InterPro2GO’ mappings as a file that can be downloaded from the FTP site; however, it is difficult to extract subsets of information from this file. In the past, a frequent request from users was the provision of GO term mapping information for a particular member database. With the advent of the BioMart, it is now very easy to provide this information as illustrated above.
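Once the results of Query #2 have been downloaded in tab-separated format, they can be collapsed into a signature-to-GO mapping. The sketch below assumes that the columns appear in the order in which the attributes are listed for Query #2 (entry accession, signature accession, GO ID, GO term name, GO root term) and that no header row is present; the sample rows are invented placeholders, not real InterPro2GO data.

```python
import csv
import io
from collections import defaultdict


def signature_to_go(tsv_text):
    """Group GO terms by member database signature accession.

    Returns a dict mapping each signature accession to a sorted list of
    (GO ID, GO term name, GO root term) tuples.
    """
    mapping = defaultdict(set)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        _entry, signature, go_id, go_name, go_root = row[:5]
        if go_id:  # entries without a GO mapping have an empty GO ID
            mapping[signature].add((go_id, go_name, go_root))
    return {sig: sorted(terms) for sig, terms in mapping.items()}


# Invented sample rows for illustration only.
sample = (
    "IPR_X\tSIG_A\tGO:TERM1\tterm one\tFunction\n"
    "IPR_X\tSIG_A\tGO:TERM2\tterm two\tProcess\n"
    "IPR_X\tSIG_A\tGO:TERM1\tterm one\tFunction\n"
    "IPR_Y\tSIG_B\t\t\t\n"
)
print(signature_to_go(sample))
```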

Query #3. ‘Which metabolic pathways are associated with proteins matching the InterPro family “Chemokine receptor type 4” (CXCR4, IPR001277)?’

Data sets	Filters	Attributes
pathway		Pathway stable ID
		Pathway name
InterPro Entry Annotation	InterPro Entry ID = ‘IPR001277’	InterPro Entry Accession
		InterPro Entry Name
		UniProtKB Protein Accession
		UniProtKB Protein ID (Name)
		Source Signature Database
		Signature Accession
		Signature ID (Name)
		Match Start Position
		Match Stop Position

The InterPro BioMart is federated with the Reactome BioMart (22, 23) from which the ‘pathway’ data set derives. Reactome describes ‘reactions, pathways and biological processes’ and as such, can provide valuable biological insight if married to the data in InterPro.

Query #4. ‘In which tissues have proteins matching the InterPro family “Neural cell adhesion” (IPR009138) been identified by mass spectrometry?’

Data sets	Filters	Attributes
PRIDE		PRIDE Experiment Accession
		Experiment Title
		Sample Name
		Taxonomy Term (NEWT/NCBI Taxon)
		Taxonomy ID (NEWT/NCBI Taxon)
		Tissue Ontology Term (BRENDA)
		BRENDA ID (Tissue)
		Cell Type Term (CL)
		CL ID (Cell Type)
		Gene Ontology Term (GO)
		GO ID (Gene Ontology)
InterPro Entries	InterPro Entry ID = ‘IPR009138’	InterPro Entry Accession
		InterPro Entry Name

The InterPro BioMart is also federated with the PRIDE BioMart. PRIDE is the ‘Proteomics Identifications Database’, which contains identifications of proteins and peptides arising from mass spectrometry. The two BioMarts are linked via UniProtKB protein accessions, so this query returns information about identifications of the proteins that match the member database signatures integrated into InterPro Entry IPR009138.
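The linking of the two BioMarts via UniProtKB protein accessions is performed by the BioMart federation mechanism itself. Purely to illustrate the idea, the same join can be sketched on two small result sets held in memory; the accessions, experiment labels and tissue names below are invented, and the column layouts are assumptions made for this example.

```python
from collections import defaultdict


def join_on_accession(interpro_rows, pride_rows):
    """Pair InterPro rows with PRIDE rows sharing a UniProtKB accession.

    interpro_rows: iterable of (entry_accession, protein_accession)
    pride_rows:    iterable of (protein_accession, experiment, tissue)
    """
    pride_by_accession = defaultdict(list)
    for accession, experiment, tissue in pride_rows:
        pride_by_accession[accession].append((experiment, tissue))

    joined = []
    for entry, accession in interpro_rows:
        # Proteins with no PRIDE identification contribute no rows.
        for experiment, tissue in pride_by_accession.get(accession, []):
            joined.append((entry, accession, experiment, tissue))
    return joined


interpro_rows = [("IPR009138", "ACC_1"), ("IPR009138", "ACC_2")]
pride_rows = [
    ("ACC_1", "EXP_10", "tissue a"),
    ("ACC_1", "EXP_11", "tissue b"),
    ("ACC_3", "EXP_12", "tissue c"),
]
print(join_on_accession(interpro_rows, pride_rows))
```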

Discussion and future directions

The InterPro BioMart has proven a valuable addition to the InterPro software infrastructure, supporting new tools, such as the InterPro DAS service, as well as providing an efficient route to answer queries from the InterPro user community. The BioMart has furnished InterPro with a web service, for which robust APIs exist in several languages (including both Perl and Java).

Additionally, BioMart provides a substantial resource for bioinformaticians to query InterPro, alongside the federated databases UniProtKB, Reactome and PRIDE (see Table 1, which describes these bioinformatics resources).

Table 1.

External data sources included in the InterPro BioMart

Source	URL	BioMart URL	Description of contents
UniProtKB	http://www.uniprot.org	http://www.ebi.ac.uk/uniprot/biomart/martview	A comprehensive, high quality and freely accessible resource of protein sequence and functional information, comprising the human-curated Swiss-Prot data set and the automatically annotated TrEMBL data set.
PRIDE	http://www.ebi.ac.uk/pride	http://www.ebi.ac.uk/pride/biomart/martview	A database of identifications of proteins and peptides, arising from mass spectrometry-based proteomics.
Reactome Pathway Database	http://www.reactome.org	http://www.reactome.org/cgi-bin/mart	A human-curated database of biological pathways, focusing on human pathways, but providing automated prediction of pathways in other species.

It is intended to federate the InterPro BioMart with the new UniParc BioMart that is under development at the EBI. This will allow the InterPro BioMart to be queried using identifiers and accessions from a large variety of protein sequence databases other than the UniProtKB, including several model organism databases.

Funding

Biotechnology and Biological Sciences Research Council's Bioinformatics and Biological Resources Fund (grant number BB/F010508/1); European Union under the program ‘FP7 capacities: Scientific Data Repositories’; The working title for the project is IMproving Protein Annotation and Co-ordination using Technology (IMPACT) (grant number 213037). Funding for open access charge: European Union under the program ‘FP7 capacities: Scientific Data Repositories’; The working title for the project is IMproving Protein Annotation and Co-ordination using Technology (IMPACT) (grant number 213037).

Conflict of interest. None declared.

Acknowledgements

The authors would particularly like to acknowledge the continuing support of the InterPro Consortium member databases and the support of the BioMart development team who have given invaluable guidance and assistance with constructing the InterPro BioMart.

References

1. Hunter S, Apweiler R, Attwood TK, et al. InterPro: the integrative protein signature database, Nucleic Acids Res., 2009, vol. 37 (pg. D211-D215)

2. Lees J, Yeats C, Redfern O, et al. Gene3D: merging structure and function for a thousand genomes, Nucleic Acids Res., 2010, vol. 38 (pg. D296-D300)

3. Lima T, Auchincloss AH, Coudert E, et al. HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot, Nucleic Acids Res., 2009, vol. 37 (pg. D471-D478)

4. Thomas PD, Campbell MJ, Kejariwal A, et al. PANTHER: a library of protein families and subfamilies indexed by function, Genome Res., 2003, vol. 13 (pg. 2129-2141)

5. Finn RD, Mistry J, Tate J, et al. The Pfam protein families database, Nucleic Acids Res., 2010, vol. 38 (pg. D211-D222)

6. Wu CH, Nikolskaya A, Huang H, et al. PIRSF: family classification system at the Protein Information Resource, Nucleic Acids Res., 2004, vol. 32 (pg. D112-D114)

7. Attwood TK, Mitchell A, Gaulton A, et al. The PRINTS protein fingerprint database: functional and evolutionary applications. In: Dunn M, Jorde L, Little P, Subramaniam A (eds), Encyclopaedia of Genetics, Genomics, Proteomics and Bioinformatics, 2006, Hoboken, NJ, USA, John Wiley & Sons Ltd

8. Servant F, Bru C, Carrère S, et al. ProDom: automated clustering of homologous domains, Brief. Bioinf., 2002, vol. 3 (pg. 246-251)

9. Sigrist CJA, Cerutti L, de Castro E, et al. PROSITE, a protein domain database for functional characterization and annotation, Nucleic Acids Res., 2010, vol. 38 (pg. D161-D166)

10. Letunic I, Doerks T, Bork P. SMART 6: recent updates and new developments, Nucleic Acids Res., 2009, vol. 37 (pg. D229-D232)

11. Wilson D, Pethica R, Zhou Y, et al. SUPERFAMILY–sophisticated comparative genomics, data mining, visualization and phylogeny, Nucleic Acids Res., 2009, vol. 37 (pg. D380-D386)

12. Selengut JD, Haft DH, Davidsen T, et al. TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes, Nucleic Acids Res., 2007, vol. 35 (pg. D260-D264)

13. The UniProt Consortium. Ongoing and future developments at the Universal Protein Resource, Nucleic Acids Res., 2010, vol. 39 (pg. D214-D219)

14. Smedley D, Haider S, Ballester B, et al. BioMart–biological queries made easy, BMC Genomics, 2009, vol. 10 pg. 22

15. Zhang J, Haider S, Guberman J, et al. BioMart: a data federation framework for large collaborative projects, Database, 2011 (this special edition)

16. Guberman JM, et al. BioMart Central Portal: an open database network for biological community, Database, 2011 (this special edition)

17. Flicek P, Amode MR, Barrell D, et al. Ensembl 2011, Nucleic Acids Res., 2010, vol. 39 (pg. D800-D806)

18. Kinsella R, et al. The Ensembl Mart, Database, 2011 (this special edition)

19. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium, Nat. Genet., 2000, vol. 25 (pg. 25-29)

20. Côté RG, Jones P, Martens L, et al. The Protein Identifier Cross-Referencing (PICR) service: reconciling protein identifiers across multiple source databases, BMC Bioinformatics, 2007, vol. 8 pg. 401

21. Jenkinson AM, Albrecht M, Birney E, et al. Integrating biological data–the Distributed Annotation System, BMC Bioinformatics, 2008, vol. 9 Suppl 8 pg. S3

22. Croft D, O'Kelly G, Wu G, et al. Reactome: a database of reactions, pathways and biological processes, Nucleic Acids Res., 2010, vol. 39 (pg. D691-D697)

23. Haw R, Croft D, Yung CK, et al. The Reactome BioMart, Database, 2011 (this special edition)

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
