INFORMS Journal on Computing
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH
 QUICK SEARCH:   [advanced]


     


INFORMS JOURNAL ON COMPUTING,
Published online in Articles in Advance, October 2, 2009
DOI: 10.1287/ijoc.1090.0356
This Article
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Google Scholar
Right arrow Articles by Jupiter, D.
Right arrow Articles by VanBuren, V.

TreeHugger: A New Test for Enrichment of Gene Ontology Terms

Daniel Jupiter, Jessica Sahutoglu, Vincent VanBuren

Department of Systems Biology and Translational Medicine, College of Medicine, Texas A&M Health Science Center, Temple, Texas 76504
Washtenaw Community Health Organization, Ypsilanti, Michigan 48198
Department of Systems Biology and Translational Medicine, College of Medicine, Texas A&M Health Science Center, Temple, Texas 76504

djupiter{at}tamu.edu
sahutj{at}ewashtenaw.org
vanburen{at}tamu.edu

The Gene Ontology (GO) project provides a structured vocabulary of biological terms used by biological researchers as a tool for standardization of references to biological entities. Genes may be annotated with GO terms to indicate their roles or localizations in the cell. GO has been used in conjunction with high-throughput experimental methods, such as microarrays. In this setting, the interest is to determine whether sets of genes identified by the high-throughput experiment are enriched for GO terms: Do certain terms annotate more genes in the identified set than one might expect? Enriched terms are taken as a potential summary of the cellular function for the identified set of genes and may provide clues leading to new directions for investigation. Current methods for determining whether sets of genes are GO-enriched have certain well-known shortcomings. Many methods do not take the hierarchical structure of the ontology into account in determining enrichment. We address this drawback by introducing a new statistical test (TreeHugger) based on a novel per-gene scoring scheme for GO terms. Given a set of genes and a specified subset of those genes, our method determines enrichment of GO terms in the subset, taking into account the structure of the ontology and ascribing a lower weight to those terms that do not themselves directly annotate the given genes. Tests on simulated and real data indicate that our method is a conservative test for enrichment. Testing TreeHugger on a biological example reveals that it also reduces the redundancy caused by giving high scores to indirect annotations as provided by standard enrichment tests.

Key words: statistics; data analysis; probability; genomics; microarray
History: received June 2008; revised May 2009; accepted July 2009.







HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH
Copyright © 2009 by INFORMS.