Document Type
Article
Publication Date
1-1-2007
Department
Biological Sciences
School
Biological, Environmental, and Earth Sciences
Abstract
Background
Since the high dimensionality of gene expression microarray data sets degrades the generalization performance of classifiers, feature selection, which selects relevant features and discards irrelevant and redundant features, has been widely used in the bioinformatics field. Multi-task learning is a novel technique to improve prediction accuracy of tumor classification by using information contained in such discarded redundant features, but which features should be discarded or used as input or output remains an open issue.
Results
We demonstrate a framework for automatically selecting features to be input, output, and discarded by using a genetic algorithm, and propose two algorithms: GA-MTL (Genetic algorithm based multi-task learning) and e-GA-MTL (an enhanced version of GA-MTL). Experimental results demonstrate that this framework is effective at selecting features for multi-task learning, and that GA-MTL and e-GA-MTL perform better than other heuristic methods.
Conclusions
Genetic algorithms are a powerful technique to select features for multi-task learning automatically; GA-MTL and e-GA-MTL are shown to improve generalization performance of classifiers on microarray data sets.
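The abstract describes a genetic algorithm that assigns each feature one of three roles: input, output, or discarded. The following is a minimal sketch of that idea, not the authors' GA-MTL or e-GA-MTL implementation. The role codes, the operators (one-point crossover, tournament selection, elitism), all parameter values, and the names `evolve` and `toy_fitness` are assumptions made for illustration. In a real setting the fitness function would presumably be the validation accuracy of a multi-task learner that takes the input-role genes as inputs and predicts the output-role genes as extra targets alongside the tumor class; here a toy stand-in is used so the example runs on its own.

```python
import random

# Hypothetical role codes: each position in a chromosome is one gene (feature).
DISCARD, INPUT, OUTPUT = 0, 1, 2
ROLES = (DISCARD, INPUT, OUTPUT)


def random_chromosome(n_features, rng):
    """Assign a random role to every feature."""
    return [rng.choice(ROLES) for _ in range(n_features)]


def crossover(a, b, rng):
    """One-point crossover; requires at least two features."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chrom, rate, rng):
    """Re-draw each role with probability `rate`; returns a new list."""
    return [rng.choice(ROLES) if rng.random() < rate else g for g in chrom]


def tournament(pop, scores, k, rng):
    """Return the fittest of k randomly chosen individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: scores[i])]


def evolve(fitness, n_features, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Search for a role assignment that maximizes `fitness`."""
    rng = random.Random(seed)
    pop = [random_chromosome(n_features, rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = [fitness(c) for c in pop]
        for chrom, score in zip(pop, scores):
            if score > best_score:
                best, best_score = chrom, score
        next_pop = [best]  # elitism: carry the best individual forward
        while len(next_pop) < pop_size:
            child = crossover(tournament(pop, scores, 3, rng),
                              tournament(pop, scores, 3, rng), rng)
            next_pop.append(mutate(child, mutation_rate, rng))
        pop = next_pop
    return best, best_score


# Toy stand-in for a real fitness function (assumption): reward agreement
# with an arbitrary target assignment instead of training a classifier.
TARGET = [INPUT, INPUT, OUTPUT, DISCARD, OUTPUT, INPUT,
          DISCARD, DISCARD, INPUT, OUTPUT, INPUT, DISCARD]


def toy_fitness(chrom):
    return sum(1 for g, t in zip(chrom, TARGET) if g == t)


best, score = evolve(toy_fitness, n_features=len(TARGET))
inputs = [i for i, g in enumerate(best) if g == INPUT]
outputs = [i for i, g in enumerate(best) if g == OUTPUT]
```

The three-valued encoding is the point of the sketch: a conventional feature-selection chromosome is binary (keep or discard), whereas here a feature that is not used as an input can still contribute as an auxiliary prediction target.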
Publication Title
BMC Genomics
Volume
9
Issue
S1
First Page
1
Last Page
12
Recommended Citation
Yang, J. Y., Li, G., Meng, H., Yang, M. Q., Deng, Y. (2007). Improving Prediction Accuracy of Tumor Classification by Reusing Genes Discarded During Gene Selection. BMC Genomics, 9(S1), 1-12.
Available at: https://aquila.usm.edu/fac_pubs/8473
Comments
Creative Commons Attribution License
Publisher's Version