Feature Shaping for Linear SVM Classifiers
Forman, George; Scholz, Martin; Rajaram, Shyamsundar
Keyword(s): text classification, machine learning, feature weighting, feature scaling, SVM
Abstract: Linear classifiers have been shown to be effective for many discrimination tasks. Irrespective of the learning algorithm itself, the final classifier multiplies each feature by a single weight. This suggests that ideally each input feature should be linearly correlated (or anti-correlated) with the target variable, whereas raw features may be highly non-linear. In this paper, we attempt to re-shape each input feature so that it is appropriate to use with a linear weight, and to scale the different features in proportion to their predictive value. We demonstrate that this pre-processing is beneficial for linear SVM classifiers on a large benchmark of text classification tasks as well as UCI datasets.
Additional Publication Information: Published and presented at the 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'09), Paris, France, June 28-July 1, 2009
External Posting Date: May 6, 2009. Approved for External Publication
Internal Posting Date: May 6, 2009
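
Illustrative sketch: the abstract describes re-shaping each feature so that a single linear weight suits it, then scaling features by predictive value. The Python below is one possible reading of that idea, not the paper's actual procedure. The equal-frequency binning, the smoothed per-bin log-odds, the correlation-based scale factor, and the names fit_shaper, shape, and pearson are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def fit_shaper(values, labels, n_bins=4, smoothing=1.0):
    """Learn bin edges and a smoothed log-odds score per bin (hypothetical helper)."""
    ordered = sorted(values)
    # Interior cut points at equal-frequency quantiles; duplicates collapse.
    edges = sorted({ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)})
    pos = [0.0] * (len(edges) + 1)
    neg = [0.0] * (len(edges) + 1)
    for v, y in zip(values, labels):
        b = bisect_right(edges, v)
        if y == 1:
            pos[b] += 1
        else:
            neg[b] += 1
    # Subtract the class prior so an uninformative bin scores about zero.
    prior = math.log((sum(pos) + smoothing) / (sum(neg) + smoothing))
    scores = [math.log((p + smoothing) / (n + smoothing)) - prior
              for p, n in zip(pos, neg)]
    return edges, scores


def shape(value, edges, scores):
    """Replace a raw feature value by the score of the bin it falls into."""
    return scores[bisect_right(edges, value)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


# Toy feature whose positives sit at both extremes: non-linear in the label.
values = list(range(12))
labels = [1 if v < 3 or v > 8 else 0 for v in values]

edges, scores = fit_shaper(values, labels)
shaped = [shape(v, edges, scores) for v in values]

raw_corr = pearson(values, labels)
shaped_corr = pearson(shaped, labels)

# Assumed scaling rule: weight the shaped feature by its |correlation| with the label.
scale = abs(shaped_corr)
scaled = [scale * s for s in shaped]

print("raw correlation:", raw_corr)
print("shaped correlation:", shaped_corr)
```

On this toy data the raw feature has no linear relationship with the label, while the shaped feature tracks it closely, which is the property a linear weight needs. In practice the bins and scores would be fit on training data only and then applied unchanged to test data.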