Summer Journal Club:

Meeting time: 3:30pm, Thursdays, 3211 Digital Computer Lab.

Schedule:

May 11: Jun -- Context-Specific Bayesian Clustering for Gene Expression Data.

Another related paper by the same author:

The Bayesian Structural EM Algorithm.

May 18: Guixian -- Semi-Supervised Methods to Predict Patient Survival from Gene Expression Data.

Another related paper by the same author:

Prediction by supervised principal components.

May 25: Zhewen -- Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data.

Other related papers by the same author:

(1). Genome-wide discovery of transcriptional modules from DNA sequence and gene expression.

(2). Learning Module Networks.

June 1: Xin -- Evolutionary population genetics of promoters: Predicting binding sites and functional phylogenies.

June 8: Feng -- Dirichlet Process Mixture Models. Paper: Neal, 2000, Markov Chain Sampling Methods for Dirichlet Process Mixture Models.

Supplementary materials:

(1). The Dirichlet Process Mixture (DPM) Model.

(2). Dirichlet processes, Chinese restaurant processes and all that.

June 15: Jun -- Ab Initio Prediction of Transcription Factor Targets Using Structural Knowledge.

June 22: Guixian -- (1). Learning Module Networks. (2). The Bayesian Structural EM Algorithm.

July 13: Xinguang -- (1). Sharp developmental thresholds defined through bistability by antagonistic gradients of retinoic acid and FGF signaling.

(2). Autoinhibition with Transcriptional Delay: A Simple Mechanism for the Zebrafish Somitogenesis Oscillator.

July 20: Xin -- (1). Toward a Neutral Evolutionary Model of Gene Expression.

(2). A Neutral Model of Transcriptome Evolution.

July 26: (11:00 am) Hongling -- Fast Computation of Large Numbers of LOD Scores for Mapping Human Disease Genes.

Abstract:

  The LOD score method is commonly used in genetic linkage analysis to associate the function of genes with their locations on chromosomes. However, the method assumes that the genetic parameters are known at each locus, which may not be possible at disease loci. For diseases with an unknown mode of inheritance, one approach is to maximize the LOD score function over all genetic parameters to obtain a maximized maximum LOD score (MOD). Another approach is to integrate the LOD score across the genetic parameters to form a posterior probability of linkage (PPL). Whichever approach is used, a typical linkage analysis involves computing likelihoods over several parameters, so we may need to calculate billions of LOD scores in the course of a single analysis. These calculations form a significant bottleneck in disease gene mapping on the genome. Here, we propose an alternative computational approach: instead of repeating the usual LOD calculation for each set of fixed parameter values, we "compile" the calculation over a pedigree into a symbolic algebraic expression and then evaluate the compiled expression with different parameter values. Our initial results show that this approach can speed up traditional genetic linkage computation by a factor of 10 to 1000.
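
  The "compile once, evaluate many times" idea in the abstract can be illustrated with a toy sketch. This is not the speaker's implementation. It assumes the simplest textbook case, phase-known meioses with k recombinants out of n, where LOD(theta) = log10[theta^k (1 - theta)^(n - k) / 0.5^n]. The name compile_lod is hypothetical, and real pedigree likelihoods involve many more parameters.

```python
import math


def compile_lod(recombinants, meioses):
    """Toy 'compilation' step (hypothetical helper, not from the talk).

    Everything that does not depend on theta is computed once here;
    the returned closure only evaluates the remaining expression.
    Assumes phase-known meioses, the simplest textbook LOD formula.
    """
    k = recombinants
    m = meioses - recombinants
    # -log10(0.5 ** n): the null-hypothesis term, constant in theta
    const = meioses * math.log10(2.0)

    def lod(theta):
        if not 0.0 < theta < 1.0:
            raise ValueError("theta must lie strictly between 0 and 1")
        return k * math.log10(theta) + m * math.log10(1.0 - theta) + const

    return lod


# Compile once for a data set with 2 recombinants in 10 meioses ...
lod = compile_lod(2, 10)

# ... then evaluate the same expression over a grid of theta values.
thetas = [i / 100 for i in range(1, 51)]
scores = [lod(t) for t in thetas]
best_theta = max(thetas, key=lod)
```

  In this sketch the grid search over theta stands in for the maximization used to obtain a MOD-style score; the point is only that the per-evaluation work shrinks once the fixed part of the calculation has been factored out.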

July 27: Feng -- Clustering microarray gene expression data using weighted Chinese restaurant process.

 

Potentially interesting papers:

Cross-species analysis of biological networks by Bayesian alignment. Mike Lassig, PNAS 2006

A Systems Approach to Mapping DNA Damage Response Pathways. Trey Ideker, Science 2006