Quadratic regression

3/20/2023

Cluster analyses are used to analyze microarray time-course data for gene discovery and pattern recognition. However, these methods generally do not take advantage of the fact that time is a continuous variable, and existing clustering methods often group biologically unrelated genes together.

We propose a quadratic regression method for identifying differentially expressed genes and classifying genes by their temporal expression profiles in non-cyclic short time-course microarray data. The method treats time as a continuous variable and therefore preserves actual time information. We applied it to a microarray time-course study of gene expression at short time intervals following deafferentation of olfactory receptor neurons. Nine regression patterns were identified and shown to fit gene expression profiles better than k-means clusters. EASE analysis identified over-represented functional groups in each regression pattern and each k-means cluster, which further demonstrated that the regression method provided more biologically meaningful classifications of gene expression profiles than the k-means clustering method. Comparison with Peddada et al.'s order-restricted inference method showed that our method provides a different perspective on the temporal gene profiles. A reliability study indicated that regression patterns have the highest reliabilities.

Our results demonstrate that the proposed quadratic regression method improves gene discovery and pattern recognition for non-cyclic short time-course microarray data. With a freely accessible Excel macro, investigators can readily apply this method to their microarray data.

Microarray time-course experiments allow researchers to explore the temporal expression profiles of thousands of genes simultaneously. The premise for pattern analysis is that genes sharing similar expression profiles might be functionally related or co-regulated. Due to the large number of genes involved and the complexity of gene regulatory networks, clustering analyses are popular for analyzing microarray time-course data. Heuristic-based cluster analyses group genes based on distance measures; the most commonly used methods include hierarchical clustering, k-means clustering, self-organizing maps, and support vector machines. Because these heuristic-based clustering methods lack statistical properties, statistical models, especially analysis of variance (ANOVA) models and mixed models, are often implemented as a precursor to clustering to ensure that the genes used for clustering are statistically meaningful. Only genes identified as significantly regulated by the statistical models are used for further clustering. Fitting statistical models prior to clustering usually dramatically reduces the number of genes used for clustering, which in general improves the performance of the clustering method. An alternative is statistical model-based clustering, which assumes that the data come from a mixture of probability distributions, such as multivariate normal distributions, and describes each cluster using a probabilistic model.

In microarray time-course studies, the time dependency of gene expression levels is usually of primary interest. Since time can affect gene expression levels, it is important to preserve time information in time-course data analysis. However, most methods for analyzing microarray time-course data treat time as a nominal variable rather than a continuous variable, and thus ignore the actual times at which the samples were taken. Peddada et al. (2003) proposed a method for gene selection and clustering using order-restricted inference, which preserves the ordering of time but still treats time as nominal. Recently, a number of algorithms treating time as a continuous variable have been introduced. A 2002 study applied a piecewise regression model to identify differentially expressed genes. Both Luan and Li (2003) and Bar-Joseph et al. (2003) proposed B-spline-based approaches, which are appropriate for microarray data with relatively long time courses, but their application to short time-course data is questionable. New methods for analyzing short time-course microarray data are needed.

In this paper, we propose a model-based approach, step-down quadratic regression, for gene identification and pattern recognition in non-cyclic short time-course microarray data. This approach takes into account time information because time is treated as a continuous variable.
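To make the step-down idea concrete, here is a minimal Python sketch for a single gene. It is an illustration under stated assumptions, not the authors' implementation (they distribute an Excel macro): the function names `fit_poly` and `step_down_pattern`, the four pattern labels, the toy expression values, and the use of a single t-statistic cutoff `t_crit` are all hypothetical. In particular, the labels below are not the nine regression patterns reported in the paper, and a real analysis would use p-values with a multiple-testing adjustment across genes.

```python
import numpy as np


def fit_poly(t, y, degree):
    """Ordinary least squares fit of y on 1, t, ..., t**degree.

    Returns (coefficients, t_statistics), both ordered lowest power first.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.vander(t, degree + 1, increasing=True)  # columns: 1, t, t^2, ...
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - (degree + 1)                    # residual degrees of freedom
    sigma2 = resid @ resid / dof                   # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)          # covariance of the estimates
    return beta, beta / np.sqrt(np.diag(cov))


def step_down_pattern(t, y, t_crit):
    """Classify one gene's profile by stepping down from quadratic to linear.

    t_crit is an assumed cutoff on |t-statistic|, reused at both steps
    for simplicity.
    """
    # Step 1: fit the quadratic model and test the t^2 coefficient.
    beta, tstat = fit_poly(t, y, 2)
    if abs(tstat[2]) > t_crit:
        return "quadratic_convex" if beta[2] > 0 else "quadratic_concave"
    # Step 2: drop the quadratic term and test the linear coefficient.
    beta, tstat = fit_poly(t, y, 1)
    if abs(tstat[1]) > t_crit:
        return "linear_up" if beta[1] > 0 else "linear_down"
    # Neither term is significant: no time trend detected.
    return "flat"


# Toy data: six sampling times treated as a continuous variable.
times = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
noise = np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1])
quad_gene = (times - 2.5) ** 2 + noise
linear_gene = 1.0 + 0.8 * times + noise
flat_gene = 3.0 + noise

for name, y in [("quad", quad_gene), ("linear", linear_gene), ("flat", flat_gene)]:
    print(name, step_down_pattern(times, y, t_crit=3.18))
```

Because the design matrix is built from the actual sampling times, unevenly spaced time points enter the fit with their true spacing, which is the property a nominal treatment of time discards. The cutoff 3.18 is roughly the two-sided 5% critical value of a t distribution with 3 degrees of freedom; reusing it for the linear step, which has 4 degrees of freedom, is a slightly conservative simplification.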
