Developing a biocuration workflow for AgBase, a non-model organism database.

Lakshmi Pillai, Philippe Chouvarine, Catalina O. Tudor, Carl J. Schmidt, K. Vijay-Shanker, Fiona M. McCarthy

Research output: Contribution to journal > Article > peer-review

6 Scopus citations


AgBase provides annotation for agricultural gene products using the Gene Ontology (GO) and Plant Ontology, as appropriate. Unlike the literature for model organisms, the literature for agricultural species does not focus solely on gene function; to improve efficiency, we use text mining to identify literature for curation. The first component of our annotation interface is the gene prioritization interface, which ranks gene products for annotation. Biocurators select top-ranked genes and mark annotation for these genes as 'in progress' or 'completed'; links enable biocurators to move directly to our biocuration interface (BI). Our BI includes all current GO annotation for gene products and is the main interface for adding and modifying AgBase curation data. The BI also displays Extracting Genic Information from Text (eGIFT) results for each gene product. eGIFT is a web-based text-mining tool that associates genes with ranked, informative terms (iTerms) and with the articles and sentences containing them. Moreover, iTerms are linked to GO terms where they match either a GO term name or a synonym. This enables AgBase biocurators to rapidly identify literature for further curation based on possible GO terms. Because most agricultural species do not have standardized literature, eGIFT searches all gene names and synonyms to associate articles with genes. As many of the gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to the gene in question, and filtering is applied to remove abstracts that mention a gene only in passing. The BI is linked to our Journal Database (JDB), where the corresponding journal citations are stored. Just as importantly, biocurators also add citations that have no GO annotation to the JDB. The AgBase BI also supports bulk annotation upload to facilitate our Inferred from Electronic Annotation (IEA) of agricultural gene products. All annotations must pass standard GO Consortium quality checking before release in AgBase. Database URL:
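
The abstract states that iTerms are linked to GO terms where they match either a GO term name or a synonym. The following is a minimal sketch of that kind of linking, not the eGIFT or AgBase implementation. It assumes case-insensitive exact matching after whitespace normalization, which the abstract does not specify. The names `GoTerm`, `normalize`, `build_label_index`, and `link_iterms_to_go` are hypothetical, and the example records are toy data whose synonym lists are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoTerm:
    """Toy GO term record (hypothetical structure, not the AgBase schema)."""
    go_id: str
    name: str
    synonyms: tuple = ()


def normalize(text):
    """Lowercase and collapse runs of whitespace (an assumed matching rule)."""
    return " ".join(text.lower().split())


def build_label_index(go_terms):
    """Map each normalized term name or synonym to the GO IDs that carry it."""
    index = {}
    for term in go_terms:
        for label in (term.name, *term.synonyms):
            index.setdefault(normalize(label), set()).add(term.go_id)
    return index


def link_iterms_to_go(iterms, go_terms):
    """Return {iTerm: sorted list of GO IDs whose name or synonym matches}."""
    index = build_label_index(go_terms)
    return {iterm: sorted(index.get(normalize(iterm), ())) for iterm in iterms}


if __name__ == "__main__":
    # Illustrative records only; synonyms are not copied from the ontology.
    terms = [
        GoTerm("GO:0006915", "apoptotic process", ("apoptosis",)),
        GoTerm("GO:0006954", "inflammatory response"),
    ]
    for iterm, go_ids in link_iterms_to_go(["apoptosis", "chicken"], terms).items():
        print(iterm, go_ids)
```

An iTerm with no matching name or synonym maps to an empty list, so a curator-facing view could show such terms without a GO link rather than dropping them.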

Original language: English (US)
Pages (from-to): bas038
Journal: Database: the journal of biological databases and curation
State: Published - 2012

ASJC Scopus subject areas

  • Information Systems
  • Biochemistry, Genetics and Molecular Biology (all)
  • Agricultural and Biological Sciences (all)

