Prediction of gene function in mammals


Gene function in mammals can be quickly and reliably predicted using a high-throughput analysis of patterns of RNA expression, according to an article published today in the Journal of Biology. This challenges the conventional view that tissue-specificity is the best predictor of function, and could speed up the quest to understand whole genomes, in humans and other mammals, by decades. The authors have made their mouse dataset openly accessible online to the research community.

Tim Hughes and colleagues from the University of Toronto, Canada, looked at the mouse genome using a technique previously only applied to simple organisms such as yeast and the nematode worm C. elegans. In yeast and other simple organisms, the expression of genes with similar functions tends to be coordinately regulated. In these organisms, identifying correlated expression of known and unknown genes can help predict the function of a novel gene. It has been assumed that this strategy could not be applied to mammals, and that genes expressed in the same tissue are instead the most likely to be functionally related, making tissue-specificity the best indicator of function.
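The idea of inferring function from correlated expression can be illustrated with a minimal sketch. It is not the authors' method: the gene names, expression values and the choice of Pearson correlation as the similarity measure are all assumptions made for illustration. The unknown gene simply inherits the annotation of the known gene whose expression profile it tracks most closely.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical genes with toy expression levels across five tissues.
# Names, values and annotations are invented for this example.
annotated = {
    "geneA": ([1.0, 2.0, 3.0, 4.0, 5.0], "RNA processing"),
    "geneB": ([5.0, 4.0, 3.0, 2.0, 1.0], "amino acid metabolism"),
}
unknown_profile = [2.0, 4.1, 5.9, 8.0, 10.2]


def predict_function(profile, known):
    """Return the annotation of the most strongly co-expressed known gene."""
    best = max(known, key=lambda g: pearson(profile, known[g][0]))
    return known[best][1], pearson(profile, known[best][0])


label, r = predict_function(unknown_profile, annotated)
print(label, round(r, 3))
```

A single nearest neighbour is the crudest form of this reasoning; it is shown only to make the principle concrete.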

In an experiment that challenges this view, Hughes and colleagues created and analysed a microarray panel of over 40,000 known mouse mRNAs, expressed in 55 tissues. Their results showed that genes from the same Gene Ontology 'Biological Process' (GO-BP) category - which indicates the physiological function of their encoded protein, such as 'response to temperature' or 'amino acid metabolism' - are transcriptionally co-regulated, independent of the tissue in which they are expressed.

To show that this approach could be used to predict novel gene function, the team then carried out a co-expression analysis on genes of unknown function. They analysed the microarray results using a machine-learning algorithm called a support vector machine (SVM). SVMs had never been used on this scale before: the programme analysed over 12,000 genes and predicted a function, out of 587 GO-BP categories, for each of them. A number of predictions resulting from the SVM analysis were confirmed by results that are already in the literature and, in the case of one gene of unknown function, P1W1, by directed experimentation. A highly conserved yeast homologue of the P1W1 protein was shown to act biochemically as would be expected for a protein with a role in RNA processing, as predicted by the algorithm.
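For readers unfamiliar with SVMs, the following is a minimal sketch of a linear SVM deciding whether a gene belongs to a single functional category on the basis of its expression profile. The article does not describe the kernel, parameters or software the team used, so everything here is an assumption: the toy profiles, the Pegasos-style subgradient training loop, and the omission of a bias term (the toy values are treated as log-ratios centred on zero). Covering many categories would need one such classifier per category, which is also an assumption about the set-up rather than a description of it.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(samples, labels, lam=0.01, epochs=200):
    """Train a linear SVM (hinge loss, L2 penalty) by Pegasos-style
    subgradient descent. Labels must be +1 or -1. Returns the weights."""
    w = [0.0] * len(samples[0])
    t = 0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            t += 1
            eta = 1.0 / (lam * t)      # decaying step size
            margin = y * dot(w, x)     # margin under the current weights
            # Shrink the weights (gradient of the L2 penalty).
            w = [(1.0 - eta * lam) * wi for wi in w]
            # Hinge-loss step, applied only if the margin is violated.
            if margin < 1.0:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 if the profile is assigned to the category, else -1."""
    return 1 if dot(w, x) >= 0 else -1


# Hypothetical log-ratio expression profiles across three tissues.
# +1 = annotated to the category of interest, -1 = not annotated to it.
profiles = [
    [2.0, 1.0, -1.0], [1.5, 1.2, -0.8], [1.8, 0.9, -1.2],
    [-1.0, -2.0, 1.5], [-1.2, -1.5, 1.0], [-0.8, -1.8, 1.3],
]
labels = [1, 1, 1, -1, -1, -1]

weights = train_linear_svm(profiles, labels)
unknown_gene = [1.6, 1.0, -1.0]   # invented profile of an uncharacterised gene
print(predict(weights, unknown_gene))
```

In practice an established SVM library would be preferable to a hand-written training loop; the loop is spelled out here only to show what the classifier learns from expression data.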

"We examined the extent to which [transcriptional co-expression] is effective for our data, and we show that it yields almost universally superior predictions of gene function in comparison to using information regarding simple tissue specificity or tissue restriction", say the authors.

This new, quick, high-throughput method for predicting mammalian gene function from the pattern of RNA expression alone could make predictions based on tissue specificity a thing of the past and revolutionize the field of functional genomics. The results of the study also hint at more complex transcriptional control in mammals, whereby transcription factors may regulate the transcription of functionally related genes across different tissues.

Source: Eurekalert & others

Last reviewed: By John M. Grohol, Psy.D. on 21 Feb 2009