George Forman: External Publications

Research areas:
- Data Mining, Machine Learning, Text Classification
- Knowledge Management, Model-Based Reasoning, Diagnosis
- Parallel & Distributed Computing, Clustering
- Mobile Computing, Variable Resources
It's hard to keep this kind of page up-to-date. If you're looking for
something more recent than I have listed here, try the list of publications
collected for me by DB Trier or CiteSeer, or search HP Technical Reports.
Note to HP personnel: see also HP-internal tech reports.
I only list HP-external publications here.
- Classifying with Temporal Inductive Transfer for Recurrent Concept Drift.
G. Forman. NIPS'05 workshop: Inductive Transfer: 10 Years Later. HPL-2005-198.
- Feature Selection: We've barely scratched the surface. G. Forman.
Essay requested for IEEE Intelligent Systems, Trends and Controversies department.
HPL-2005-165.
- Counting Positives Accurately Despite Inaccurate Classification. G. Forman.
ECML'05. HPL-2005-96R1.
- Beware the Null Hypothesis:
Critical Value Tables for Evaluating Classifiers. G. Forman &
Ira Cohen. ECML'05. HPL-2005-70.
- Finding Similar Files in Large Document Repositories. G. Forman, K.
Eshghi, and S. Chiocchetti. KDD'05. HPL-2005-42R1.
- Learning from Little: Comparison of Classifiers Given Little Training.
G. Forman & Ira Cohen. ECML'04. HPL-2004-19R1.
- A Pitfall and Solution in Multi-Class Feature
Selection for Text Classification. G. Forman.
ICML'04.
HPL-2004-86. Introduces the SpreadFx/Round-Robin method.
- Feature Engineering for a Gene Regulation
Prediction Task. G. Forman. KDD Explorations, 4(2), 2003.
HPL-2002-318. This was an invited paper, following an honorable mention in
the 2002 KDD Cup data mining competition.
- An Extensive Empirical Study of Feature Selection Metrics for Text
Classification. G. Forman. Special Issue on Variable and Feature Selection,
Journal of Machine Learning Research, 3(Mar):1289-1305, 2003. HPL-2002-147R1.
- Incremental Machine Learning to Reduce Biochemistry Lab Costs in the Search
for Drug Discovery. G. Forman. Data Mining in Bioinformatics Workshop, 8th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining (KDD'02),
July 2002. HPL-2002-141.
- A Method for Discovering the Insignificance of One's Best Classifier and the
Unlearnability of a Classification Task. G. Forman.
Data Mining Lessons Learned Workshop, 19th
International Conference on Machine Learning (ICML), Sydney, Australia,
July 8-12, 2002. HPL-2002-123R2.
- Choose Your Words Carefully: An Empirical Study of Feature Selection Metrics for
Text Classification. G. Forman. In the Joint Proceedings of the 13th
European Conference on Machine Learning and the 6th European Conference on
Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD
'02), August 19-23, 2002. HPL-2002-88R2.
This paper gives a new analysis method for comparison studies that focuses
on which method or pair of methods is most likely to give the best result on
a single dataset, a different perspective from existing machine learning
papers that focus on average results over many datasets. It also led to
the discovery of a new feature selection metric that is superior with
respect to accuracy, recall and precision: Bi-Normal Separation.
- Distributed Clustering can be Accurate and
Efficient. G. Forman, B. Zhang. ACM KDD Explorations special issue on Scalable Data Mining
Algorithms, January 2001. HPL-2000-158.
- Accurate Recasting of Parameter Estimation Algorithms using Sufficient Statistics for Efficient Parallel Speed-up Demonstrated for Center-Based Data Clustering
Algorithms. B. Zhang, M. Hsu, G. Forman. 4th European Conference on Principles and Practice of Knowledge Discovery in Databases,
pp. 243-254, Lyon, France, September 13-16, 2000. HPL-2000-94.
- Linear Speed-Up for a Parallel Non-Approximate Recasting of Center-Based Clustering Algorithms, including K-Means, K-Harmonic Means, and
EM. G. Forman, B. Zhang. ACM SIGKDD Workshop on Distributed and Parallel Knowledge Discovery, KDD-2000, Boston, MA, August 20, 2000. HPL-2000-93.
- A Method based on Genetic Programming for Learning Text Classifiers Applied in the Domain of Spam E-mail
Filtering. G. Forman, M. Hopkins, E. Reeber, J. Suermondt.
Submission to ACM KDD Explorations. HPL-2000-140.
- Practical Optimization Criteria for
Diagnostic Knowledge Representation. P. Cornwell, J. Suermondt, G.
Forman, E. Kirshenbaum, A. Seetharaman. AI in Equipment Maintenance
Service and Support, AAAI Spring Symposium, Stanford, CA, March 1999.
- Automated End-To-End System Diagnosis of Networked Printing Services Using
Model-Based Reasoning. G. Forman, M. Jain, M. Mansouri-Samani, J. Martinka, A.
Snoeren. Distributed Systems: Operation & Management, October 1998. HPL-98-41R1.
- Wanted: Programming Support for Ensuring Responsiveness
Despite Resource Variability and Volatility. Workshop on Computing and
Communication in the Presence of Mobility, April 1998. HPL-98-15. Also appears in
"Mobility: Processes, Computers, and Agents," eds. F. Douglis, D.
Milojicic, R. Wheeler, Addison-Wesley, 1999.
This was the topic of an invited panel presentation at the 20th International Conference on Software Engineering (ICSE), April 1998.
- Dissertation: Obtaining Responsiveness in Resource-Variable
Environments. Computer Science & Engineering Dept., Univ. of Washington, 1996.
- Survey: The Challenges of Mobile Computing. G. Forman, J. Zahorjan. IEEE Computer,
27(4):38-47, April 1994. Also appears in
"Mobility: Processes, Computers, and Agents," eds. F. Douglis, D.
Milojicic, R. Wheeler, Addison-Wesley, 1999.
- ZPL vs. HPF: A Comparison of Performance and Programming Style. C. Lin, L. Snyder,
R. Anderson, B. Chamberlain, S. Choi, G. Forman, E. Lewis, W. Weathersby.
Tech Report UW-CSE-95-11-05, Department of Computer Science and Engineering,
University of Washington, 1994.
- The Ariadne Debugger: Scalable Application of Event-Based Abstraction.
J. Cuny, G. Forman, A. Hough, J. Kundu, C. Lin, L. Snyder, D. Stemple. ACM/ONR Workshop on
Parallel and Distributed Debugging, San Diego, CA, May 1993. SIGPLAN
Notices, 28(12): 85-95, Dec. 1993.
- A Distributed Operating System for the K2 Based on Amoeba.
Tech Report 89/14, Swiss Federal Institute of Technology, Zürich, Switzerland, 1989.