Technical Reports

HP Labs


pFilter: Global Information Filtering and Dissemination

Tang, Chunqiang; Xu, Zhichen



Abstract: Due to the overwhelming amount of information on the Internet, it is becoming increasingly difficult for people to find relevant information in a timely fashion. Information filtering and dissemination systems allow users to register persistent queries called user profiles. These systems detect new content, match it against the profiles, and continuously notify users when relevant information becomes available. Existing systems, however, either are not scalable or do not support matching of unstructured documents. Unstructured documents, such as text, HTML, or multimedia files, account for a significant percentage of the content on the Internet. To address the limits of existing systems, we describe pFilter, a global-scale decentralized information filtering and dissemination system for unstructured documents. To handle potentially billions of documents for millions of subscribers, pFilter connects potentially millions of computers, in national (and international) computing Grids or on ordinary desktops, into a structured peer-to-peer overlay network. Nodes in the overlay collectively publish and collect documents, build indexes, register profiles, and filter and disseminate information. To enable efficient and accurate matching between profiles and documents without flooding either documents or profiles, profiles in the overlay are organized around their vector representations (based on modern information retrieval algorithms), so that the search space of a new document is centered on related profiles. In pFilter, we also introduce a new application-level multicast algorithm that allows documents to be efficiently disseminated to a large number of interested parties.
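
The abstract does not say which vector representation or matching rule pFilter uses. As a rough illustration of the general idea of matching a new document against registered profiles through vector representations, the sketch below uses a plain bag-of-words term-frequency vector and cosine similarity. Everything in it is an assumption made for illustration, including the function names, the threshold value, and the sample profiles. It is not pFilter's algorithm, and it runs on a single machine rather than over a peer-to-peer overlay.

```python
import math
from collections import Counter


def term_vector(text):
    """Hypothetical helper: bag-of-words term-frequency vector for a text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_profiles(document, profiles, threshold=0.3):
    """Return the users whose profile is similar enough to the document.

    The threshold is an arbitrary illustrative value, not one from the report.
    """
    doc_vec = term_vector(document)
    return sorted(
        user
        for user, query in profiles.items()
        if cosine(doc_vec, term_vector(query)) >= threshold
    )


# Illustrative persistent queries ("user profiles"); the contents are made up.
profiles = {
    "alice": "peer-to-peer overlay network",
    "bob": "image compression codec",
}

# A newly published document is matched against every registered profile.
matched = match_profiles(
    "a structured peer-to-peer overlay network for filtering", profiles
)
print(matched)
```

This sketch compares the document with every profile. The abstract describes the overlay as organizing profiles around their vector representations precisely so that a new document does not have to be checked against all of them.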

9 Pages
