Phase Transitions in High-Dimensional Pattern Classification

 AUTHOR = "Tad Hogg and J. O. Kephart",
 TITLE = "Phase Transitions in High-Dimensional Pattern Classification",
 JOURNAL = "Computer Systems Science and Engineering",
 VOLUME = "5",
 NUMBER = "4",
 PAGES = "223-232",
 MONTH = "October",
 YEAR = "1990"}


Recent technological advances in computer systems facilitate a wide range of tasks which utilize a great deal of information, either from existing databases or sensory inputs. The difficulties of programming such systems effectively necessitates a fundamental understanding of their behaviour. A theory of such systems is presented for the case of information-intensive categorization tasks. The performance of algorithms used in solving such problems is shown to change abruptly as the number of categorization classes, the variability of the input features, and the algorithm quality are varied. While this abrupt behaviour is not seen in current problems, it is important for larger cases made possible by recent developments. Under the proper conditions, a classification algorithm which performs poorly in small problems can perform extremely well for larger problems. The theory is illustrated by pattern recognition and database retrieval examples.