
Technical Reports

HP Labs


The effect of unlabeled data on generative classifiers, with application to model selection

Cohen, Ira; Cozman, Fabio G.; Bronstein, Alexandre


Keyword(s): semi-supervised learning; labeled and unlabeled data problem; classification; machine learning

Abstract: In this paper we investigate the effect of unlabeled data on generative classifiers in semi-supervised learning. We first characterize situations where unlabeled data cannot change estimates obtained with labeled data, and argue that such situations are unusual in practice. We then report on a large set of experiments involving labeled and unlabeled data, and demonstrate that unlabeled data can degrade classification performance when modeling assumptions are incorrect. To improve classification performance, we propose a method to switch assumed model structure based on the effect of unlabeled data.

16 Pages
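
The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or experimental setup. It assumes a one-dimensional, two-class Gaussian generative model fitted by EM, with at least one labeled example per class, and every function name below is hypothetical. The final helper only checks whether adding unlabeled data lowers validation accuracy, which is the kind of signal the abstract suggests could trigger a switch of model structure.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_labeled(xs, ys):
    """Maximum-likelihood (prior, mean, variance) per class from labeled data only.
    Assumes both classes 0 and 1 appear in ys."""
    params = {}
    for c in (0, 1):
        pts = [x for x, y in zip(xs, ys) if y == c]
        mean = sum(pts) / len(pts)
        var = max(sum((p - mean) ** 2 for p in pts) / len(pts), 1e-3)  # variance floor
        params[c] = (len(pts) / len(xs), mean, var)
    return params

def posterior(x, params):
    """P(class = 1 | x) under the current generative model."""
    w0 = params[0][0] * gaussian_pdf(x, params[0][1], params[0][2])
    w1 = params[1][0] * gaussian_pdf(x, params[1][1], params[1][2])
    if w0 + w1 == 0.0:  # both densities underflowed; treat as undecided
        return 0.5
    return w1 / (w0 + w1)

def fit_semi_supervised(xs, ys, unlabeled, iterations=50):
    """EM over labeled and unlabeled data: labeled points keep their known
    class, unlabeled points receive soft class memberships."""
    params = fit_labeled(xs, ys)
    data = list(xs) + list(unlabeled)
    for _ in range(iterations):
        # E-step: responsibilities for class 1
        resp = [float(y) for y in ys] + [posterior(u, params) for u in unlabeled]
        # M-step: weighted maximum-likelihood updates
        updated = {}
        for c in (0, 1):
            w = [r if c == 1 else 1.0 - r for r in resp]
            total = sum(w)
            mean = sum(wi * x for wi, x in zip(w, data)) / total
            var = max(sum(wi * (x - mean) ** 2 for wi, x in zip(w, data)) / total, 1e-3)
            updated[c] = (total / len(data), mean, var)
        params = updated
    return params

def accuracy(params, xs, ys):
    """Fraction of points whose most probable class matches the label."""
    hits = sum((posterior(x, params) >= 0.5) == (y == 1) for x, y in zip(xs, ys))
    return hits / len(xs)

def unlabeled_data_helps(xs, ys, unlabeled, val_xs, val_ys):
    """Hypothetical check: True if adding unlabeled data does not lower
    validation accuracy relative to the labeled-only fit."""
    base = accuracy(fit_labeled(xs, ys), val_xs, val_ys)
    semi = accuracy(fit_semi_supervised(xs, ys, unlabeled), val_xs, val_ys)
    return semi >= base
```

If `unlabeled_data_helps` returns False on held-out data, one reading of the abstract is that the assumed model structure may be wrong for the data and a different structure is worth trying. The comparison rule and the choice of validation set here are assumptions made for illustration.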

© 2009 Hewlett-Packard Development Company, L.P.