Biologists often conduct multiple independent gene expression studies that all target the same biological system or pathway. Pooling information across these studies can help identify true target genes more accurately. Here, we introduce a Bayesian hierarchical model that combines gene expression data across studies to identify differentially expressed genes. Each study has several sources of variation, including replicate slides within repeated experiments. Our model produces a gene-specific posterior probability of differential expression, which is the basis for inference. We further extend the model to identify up- and down-regulated genes separately and to incorporate gene dependence information. We evaluate the models using both simulated data and biological data for the model organisms Bacillus subtilis and Geobacter sulfurreducens.
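As a rough illustration of what a gene-specific posterior probability of differential expression can look like, the Python sketch below pools per-study effect estimates for a single gene under a toy two-component model. It is not the hierarchical model described in this abstract: it assumes each study supplies one log-ratio estimate with a known standard error, that a differentially expressed gene has a true effect drawn from a zero-mean normal with standard deviation `tau`, and that a fraction `prior_de` of genes is differentially expressed. The function names, `tau`, and `prior_de` are all hypothetical choices made for the example.

```python
import math


def log_normal_pdf(x, var):
    """Log density at x of a zero-mean normal with variance `var`."""
    return -0.5 * (math.log(2 * math.pi * var) + x * x / var)


def posterior_prob_de(effects, std_errors, prior_de=0.1, tau=1.0):
    """Posterior probability that one gene is differentially expressed.

    Toy model (an assumption for illustration only):
      null gene: each study effect ~ Normal(0, se^2)
      DE gene:   true effect ~ Normal(0, tau^2),
                 each study effect ~ Normal(true effect, se^2)
    """
    # Precision-weighted pooling of the study-level estimates.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_precision = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_precision
    pooled_var = 1.0 / total_precision

    # The pooled estimate is sufficient for the true effect, so the Bayes
    # factor reduces to a ratio of two zero-mean normal densities.
    log_null = log_normal_pdf(pooled, pooled_var)
    log_alt = log_normal_pdf(pooled, pooled_var + tau ** 2)

    log_odds = math.log(prior_de / (1.0 - prior_de)) + log_alt - log_null
    return 1.0 / (1.0 + math.exp(-log_odds))


# Three studies, each reporting a log-ratio for the same gene.
consistent_gene = posterior_prob_de([2.0, 2.2, 1.8], [0.5, 0.5, 0.5])
flat_gene = posterior_prob_de([0.0, 0.0, 0.0], [0.5, 0.5, 0.5])
print(consistent_gene, flat_gene)
```

Under these assumptions, a gene whose effect is consistent across studies receives a posterior probability well above the prior, while a gene with no effect in any study falls below it. The sign of the pooled estimate could in principle be used to separate up- from down-regulated genes, but that step is not shown here.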
The Department of Quantitative Health Sciences and the Quantitative Methods Core will conduct monthly seminars to explore statistical issues of general interest.