A time series analysis method using hidden variables for gene network reconstruction

Xi Wu

Abstract

DNA microarray technology can be used to obtain time series data covering thousands of genes over tens of time points. Faced with this volume of data, a fast and effective method is needed to extract useful information. We assume that the interactions between genes are static over the course of the time series. Even under this assumption, reconstructing those interactions is a difficult problem. Because the underlying interactions between genes are complicated, involving transcription, translation, and protein-protein interaction, constructing a model from physicochemical principles is almost impossible. Popular methods built on statistical or mathematical principles are discussed. Essentially, these methods minimize (or maximize) some criterion to obtain the values of the model parameters. In this thesis we focus mainly on linear equation models and on how to construct the gene network from them. One difficulty in reconstructing such models is the large number of genes relative to the small number of time points. To decrease the number of parameters in the linear equation model, new linear equation models with hidden variables are introduced. These models can effectively decrease the number of parameters and increase inference accuracy. For comparison, the well-known Boolean Network and Probabilistic Boolean Network are introduced and used to run simulations.