Mark Girolami and Rainer Breitling
Biologically Valid Linear Factor Models of Gene Expression
To appear in Bioinformatics, 2004
The identification of
physiological processes underlying and generating the expression patterns
observed in microarray experiments is a major challenge. Principal
Component Analysis (PCA) is a linear multivariate statistical method that
is regularly employed for that purpose as it provides a reduced-dimensional
representation for subsequent study of possible biological processes
responding to the particular experimental conditions. Making explicit the
data assumptions underlying PCA highlights their lack of biological
validity, thus making biological interpretation of the principal components
problematic. A microarray data representation which enables clear
biological interpretation is a desirable analysis tool. We address this
issue by employing the probabilistic interpretation of Principal Component
Analysis and proposing alternative Linear Factor Models which are based on
refined biological assumptions. A practical study on two well-understood
microarray data sets highlights the weakness of Principal Component
Analysis and the greater biological interpretability of the linear models
we have developed.
Matlab Code
Download the following zipped file: LFM_DEMO.zip
Unzip the file to your chosen directory, then start Matlab, ensuring that
the directory is on your Matlab path. To view the demo, type efm_demo at
the Matlab command line.
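The path setup described above might look like the following minimal sketch. The directory name LFM_DEMO under the current folder is only an assumption for illustration; substitute the directory you actually unzipped the files to.

```matlab
% Hypothetical location of the unzipped files (an assumption, not part of
% the distribution); replace with your own chosen directory.
demo_dir = fullfile(pwd, 'LFM_DEMO');

% Add that directory to the Matlab search path for the current session.
addpath(demo_dir);

% Run the demo using the command name given in the instructions.
efm_demo
```

addpath only changes the path for the current session, so the call needs repeating after restarting Matlab unless the path is saved.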