A Review of Feature Reduction Methods for QSAR-Based Toxicity Prediction
Document Type
Article
Publication Date
5-21-2019
Department
Computing
School
Computing Sciences and Computer Engineering
Abstract
Thousands of molecular descriptors (1D to 4D) can be generated and used as features to model quantitative structure–activity or structure–toxicity relationships (QSAR or QSTR) for chemical toxicity prediction. This often results in models that suffer from the “curse of dimensionality”, a problem that can occur in machine learning practice when too many features are employed to train a model. Here we discuss different methods of eliminating redundant and irrelevant features to enhance prediction performance, increase interpretability, and reduce computational complexity. Several feature selection and extraction methods are summarized along with their strengths and shortcomings. We also highlight some commonly overlooked challenges, such as algorithm instability and selection bias, and offer possible solutions.
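As a rough illustration of what eliminating redundant and uninformative features can look like in practice, the sketch below applies two simple filters to a descriptor matrix: it drops near-constant columns, then greedily drops columns that are highly correlated with one already kept. This is not a method taken from the article. The function name `reduce_features`, both thresholds, and the synthetic data are assumptions chosen only for the example.

```python
import numpy as np

def reduce_features(X, var_threshold=1e-8, corr_threshold=0.95):
    """Return indices of columns kept after two simple filters (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # Filter 1: drop near-constant descriptors, which carry almost no information.
    keep = [j for j in range(X.shape[1]) if X[:, j].var() > var_threshold]
    # Filter 2: greedily drop descriptors that are highly correlated
    # (in absolute value) with a descriptor that was already kept.
    selected = []
    for j in keep:
        redundant = any(
            abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) > corr_threshold
            for k in selected
        )
        if not redundant:
            selected.append(j)
    return selected

# Synthetic stand-in for a descriptor matrix (assumed data, not from the article):
# column 1 is almost a rescaled copy of column 0, and column 2 is constant.
rng = np.random.default_rng(0)
a = rng.normal(size=50)
b = rng.normal(size=50)
X = np.column_stack([a, 2 * a + 0.001 * rng.normal(size=50), np.ones(50), b])
kept = reduce_features(X)
print(kept)
```

To limit the selection bias mentioned in the abstract, a filter like this would typically be fitted on the training portion of each cross-validation fold only, and the resulting column indices then applied to the held-out portion.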
Publication Title
Challenges and Advances in Computational Chemistry and Physics
Volume
30
First Page
119
Last Page
139
Recommended Citation
Idakwo, G., Luttrell, J., Chen, M., Hong, H., Gong, P., Zhang, C. (2019). A Review of Feature Reduction Methods for QSAR-Based Toxicity Prediction. Challenges and Advances in Computational Chemistry and Physics, 30, 119-139.
Available at: https://aquila.usm.edu/fac_pubs/16447