The Critical Feature Dimension and Critical Sampling Problems
Document Type
Conference Proceeding
Publication Date
1-1-2015
School
Computing Sciences and Computer Engineering
Abstract
Efficacious data mining methods are critical for knowledge discovery in various applications in the era of big data. Two issues of immediate concern in big data analytic tasks are how to select a critical subset of features and how to select a critical subset of data points for sampling. This position paper presents ongoing research by the authors that suggests: 1. the critical feature dimension problem is theoretically intractable, but simple heuristic methods may well be sufficient for practical purposes; 2. there are big data analytic problems where the success of data mining depends more on the critical feature dimension than on the specific features selected, so that a random selection of features based on the dataset's critical feature dimension will prove sufficient; and 3. the problem of critical sampling has the same intractable complexity as the critical feature dimension problem, but again simple heuristic methods may well be practicable in most applications.
Publication Title
ICPRAM 2015 - 4th International Conference on Pattern Recognition Applications and Methods, Proceedings
Volume
1
First Page
360
Last Page
366
Recommended Citation
Ribeiro, B., Sung, A., Suryakumar, D., Basnet, R. (2015). The Critical Feature Dimension and Critical Sampling Problems. ICPRAM 2015 - 4th International Conference on Pattern Recognition Applications and Methods, Proceedings, 1, 360-366.
Available at: https://aquila.usm.edu/fac_pubs/18794