Faculty Publications

Incremental Genetic K-means Algorithm and its Application in Gene Expression Data Analysis

Yi Lu, Wayne State University
Shiyong Lu, Wayne State University
Farhad Fotouhi, Wayne State University
Youping Deng, University of Southern Mississippi
Susan J. Brown, Kansas State University

Document Type

Article

Publication Date

10-28-2004

Department

Biological Sciences

School

Biological, Environmental, and Earth Sciences

Abstract

Background

In recent years, clustering algorithms have been applied effectively in molecular biology to analyze gene expression data. Clustering algorithms such as K-means, hierarchical clustering, and self-organizing maps (SOM) partition genes into groups based on the similarity of their expression profiles, which helps identify functionally related genes. As the amount of laboratory data in molecular biology grows exponentially each year due to advanced technologies such as microarrays, new efficient and effective clustering methods must be developed to process it.

Results

In this paper, we propose a new clustering algorithm, the Incremental Genetic K-means Algorithm (IGKA). IGKA extends our previously proposed clustering algorithm, the Fast Genetic K-means Algorithm (FGKA), and outperforms it when the mutation probability is small. The main idea of IGKA is to calculate the objective value, the Total Within-Cluster Variation (TWCV), and the cluster centroids incrementally whenever the mutation probability is small. IGKA inherits the salient feature of FGKA of always converging to the global optimum. A C implementation is freely available at http://database.cs.wayne.edu/proj/FGKA/index.htm.
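The incremental idea can be illustrated with a minimal Python sketch. It is not the authors' C program, and every name in it (`ClusterState`, `build_state`, `move_point`, `twcv`, `twcv_from_scratch`) is hypothetical. It assumes TWCV is the sum of squared Euclidean distances from each point to its cluster centroid, and it relies on the standard identity that this sum equals the total squared norm of all points minus, for each non-empty cluster, the squared norm of the cluster's coordinate sum divided by its size. Under that assumption, reassigning one point only requires updating two per-cluster sums and counts rather than rescanning the data.

```python
from dataclasses import dataclass
from typing import List, Optional

Point = List[float]


@dataclass
class ClusterState:
    """Running statistics that make single-point reassignment cheap."""
    sums: List[List[float]]  # per-cluster coordinate sums
    counts: List[int]        # per-cluster point counts
    total_sq: float          # sum of squared norms of all points (never changes)


def build_state(points: List[Point], labels: List[int], k: int) -> ClusterState:
    """One full pass over the data, needed only once."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    total_sq = 0.0
    for p, lab in zip(points, labels):
        counts[lab] += 1
        for d, v in enumerate(p):
            sums[lab][d] += v
            total_sq += v * v
    return ClusterState(sums, counts, total_sq)


def move_point(state: ClusterState, p: Point, src: int, dst: int) -> None:
    """Reassign one point from cluster src to dst in O(dim) time."""
    if src == dst:
        return
    state.counts[src] -= 1
    state.counts[dst] += 1
    for d, v in enumerate(p):
        state.sums[src][d] -= v
        state.sums[dst][d] += v


def centroids(state: ClusterState) -> List[Optional[Point]]:
    """Centroid of each cluster, or None for an empty cluster."""
    return [[s / n for s in row] if n else None
            for row, n in zip(state.sums, state.counts)]


def twcv(state: ClusterState) -> float:
    """TWCV from the running statistics; cost is independent of the point count."""
    value = state.total_sq
    for row, n in zip(state.sums, state.counts):
        if n:
            value -= sum(s * s for s in row) / n
    return value


def twcv_from_scratch(points: List[Point], labels: List[int], k: int) -> float:
    """Reference computation that rescans every point; used only for comparison."""
    cents = centroids(build_state(points, labels, k))
    return sum((v - c) ** 2
               for p, lab in zip(points, labels)
               for v, c in zip(p, cents[lab]))
```

When only a few assignments change per generation, which is the small-mutation-probability case described above, calling `move_point` for each changed gene and then `twcv` avoids a full pass over the dataset. The closed form subtracts large, nearly equal quantities, so a production implementation would likely need to guard against floating-point cancellation.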

Conclusions

Our experiments indicate that, while IGKA has a convergence pattern similar to that of FGKA, it runs faster once the mutation probability falls below a certain point. Finally, we used IGKA to cluster a yeast dataset and found that it increased the enrichment of genes of similar function within clusters.

Comments

Published in BMC Bioinformatics, DOI 10.1186/1471-2105-5-172.

Publication Title

BMC Bioinformatics

Volume

5

Issue

172

Recommended Citation

Lu, Y., Lu, S., Fotouhi, F., Deng, Y., Brown, S. J. (2004). Incremental Genetic K-means Algorithm and its Application in Gene Expression Data Analysis. BMC Bioinformatics, 5(172), 1-10.
Available at: https://aquila.usm.edu/fac_pubs/8598

Included in

Bioinformatics Commons
