Unlabeled Data Can Degrade Classification Performance of Generative Classifiers
Cozman, Fabio G.; Cohen, Ira
Keyword(s): semi-supervised learning; labeled and unlabeled data problem; classification; maximum-likelihood estimation; EM algorithm
Abstract: This report analyzes the effect of unlabeled training data in generative classifiers. We are interested in classification performance when unlabeled data are added to an existing pool of labeled data. We show that there are situations where unlabeled data can degrade the performance of a classifier. We present an analysis of these situations and explain several seemingly disparate results in the literature.