HP Labs Israel's Open Innovation collaborations bring together researchers and entrepreneurs in academia, government and business, with a focus on co-innovation in the region.
The Innovation Research Program offers universities worldwide the opportunity to participate in joint research with leading HP Labs scientists on a competitive basis.
Collaborative analysis of multiple user profiles
Organizations that provide services often have access to the behavior of users over time. Such data may come in the form of user feedback, user actions, user requests, etc. Understanding and predicting the behavior of users across time is of key interest to organizations in a wide array of settings. It can serve to analyze user satisfaction with products, diagnose system faults, improve throughput and reduce costs. The goal of this project is to design, analyze and implement novel algorithms for prediction and structure discovery of simultaneous behavior profiles of large numbers of users. We constructed a scheme for using temporal information in user actions. Specifically, we use suffix trees, which can represent information in arbitrarily long sequences. We have studied two settings. In the first, the set of users is fixed and the training data consists of several sequences from each user. In the second, more challenging setting, user labels are not known and new users keep arriving to the system. The prediction mechanism therefore cannot rely on previous sequences of the user at hand, and must instead identify the user type in an online manner.
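The idea of predicting from suffixes of the action history can be illustrated with a minimal sketch. It is not the project's algorithm: the class name `SuffixPredictor`, the `max_depth` cap and the toy action names are assumptions made for illustration, and a plain dictionary keyed by context stands in for a real suffix tree. The predictor counts which action followed each recent context and, at prediction time, backs off from the longest matching suffix to shorter ones.

```python
from collections import Counter, defaultdict


class SuffixPredictor:
    """Predict the next action from the longest previously seen suffix of the history."""

    def __init__(self, max_depth=3):
        # Hypothetical cap on context length; a real suffix tree need not have one.
        self.max_depth = max_depth
        # Maps a context (tuple of recent actions) to counts of the action that followed it.
        self.counts = defaultdict(Counter)

    def train(self, sequence):
        for i, action in enumerate(sequence):
            # Record the action under every suffix of its history, up to max_depth.
            for depth in range(min(i, self.max_depth) + 1):
                context = tuple(sequence[i - depth:i])
                self.counts[context][action] += 1

    def predict(self, history):
        # Back off from the longest suffix of the history to the empty context.
        for depth in range(min(len(history), self.max_depth), -1, -1):
            context = tuple(history[len(history) - depth:])
            if context in self.counts:
                return self.counts[context].most_common(1)[0][0]
        return None  # nothing has been observed yet


# Toy usage with made-up action names.
model = SuffixPredictor(max_depth=3)
model.train(["login", "search", "buy", "login", "search", "buy", "login", "browse"])
next_action = model.predict(["login", "search"])
```

In the setting where user labels are unknown, one such model per user type could be maintained and weighted online, but that step is beyond this sketch.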
Learning the Experts for Online Sequence Prediction. Elad Eban, Amir Globerson, Shai Shalev-Shwartz, Aharon Birnbaum. Accepted to ICML 2012.
Mining Massive and Complex Graph Data
Our research goal is to develop novel and scalable methods for integrated mining and analysis of very large, multi-attributed graphs and networks. A configuration management database (CMDB) is used to manage and query the IT infrastructure of an organization. It stores information about the so-called configuration items (CIs): servers, software, running processes, storage systems, printers, routers, etc. As such, it can be considered a large multi-attributed graph, where the nodes represent the various CIs and the edges represent the connections between the CIs (e.g., the processes on a particular server, along with starting and ending times). A CMDB provides a wealth of information about the largely undocumented IT practices of a large organization, and thus mining the CMDB graph for frequent subgraph patterns can reveal the de facto infrastructure patterns. Once mined, these patterns can be used either to set the default IT policies or to refine them if found unsatisfactory. Thus, the discovery of infrastructure patterns is an important real-world application of subgraph mining in the IT domain.
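As a rough illustration of what a frequent pattern in such a graph looks like, the sketch below counts only the simplest case: single typed edges of the form (CI type, relation, CI type). It is not the mining method of the publication; the function name `frequent_edge_patterns`, the relation labels and the toy CMDB contents are assumptions, and support is taken here to be the raw number of occurrences. Full subgraph mining would grow larger patterns from these frequent edges.

```python
from collections import Counter


def frequent_edge_patterns(node_types, edges, min_support):
    """Return typed edge patterns that occur at least min_support times.

    node_types maps a CI identifier to its type; edges is a list of
    (source CI, relation, target CI) triples.
    """
    counts = Counter()
    for source, relation, target in edges:
        # Abstract away the concrete CIs and keep only their types.
        pattern = (node_types[source], relation, node_types[target])
        counts[pattern] += 1
    return {pattern: n for pattern, n in counts.items() if n >= min_support}


# Hypothetical toy CMDB: two servers hosting databases, one also linked to a printer.
node_types = {
    "srv1": "server",
    "srv2": "server",
    "db1": "database",
    "db2": "database",
    "prn1": "printer",
}
edges = [
    ("srv1", "hosts", "db1"),
    ("srv2", "hosts", "db2"),
    ("srv1", "connects_to", "prn1"),
]
patterns = frequent_edge_patterns(node_types, edges, min_support=2)
```

With these toy inputs, only the server-hosts-database pattern reaches the support threshold, which is the kind of de facto practice the text describes.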
Pranay Anchuri, Mohammed J. Zaki, Omer Barkol, Ruth Bergman, Yifat Felder, Shahar Golan, Arik Sityon: Infrastructure Pattern Discovery in Configuration Management Databases via Large Sparse Graph Mining. 2011 IEEE 11th International Conference on Data Mining.
Large Scale Data Analysis and Processing via Subspace Clustering
Michael Elad at Israel Institute of Technology and Yacov Hel-Or at Interdisciplinary Center (IDC) Herzliya
We propose novel subspace clustering (SC) techniques for analyzing and processing very large collections of data. SC is an unsupervised learning technique that has gained substantial interest in the past decade, with applications in computer vision, data mining, bioinformatics and more. SC algorithms seek an inner structure in data collections in the form of clusters of subspaces. The key ingredient behind the success of SC-based applications is the union-of-subspaces model: data samples are modeled as drawn from a union of low-dimensional subspaces. In contrast to previous works that provide very good results for small-to-medium scale data collections, we propose to leverage Sparse-Representations (SR) theory to solve this problem for very large (possibly corrupted) data collections with possibly millions of data samples as well as for continuous data streams.
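The union-of-subspaces model can be stated compactly. The notation below (samples x_i, subspaces S_k, ambient dimension D, data matrix X, coefficient matrix C) is assumed for illustration. The second line is the well-known sparse self-representation program from the sparse subspace clustering literature, shown as one way SR theory connects to SC and not necessarily the formulation proposed in this project.

```latex
% Each sample lies in one of K low-dimensional subspaces of R^D.
\[
  x_i \in \bigcup_{k=1}^{K} S_k, \qquad S_k \subset \mathbb{R}^{D}, \quad \dim(S_k) = d_k \ll D
\]
% Each sample is written as a sparse combination of the other samples
% (columns of X); the zero diagonal rules out the trivial solution C = I.
\[
  \min_{C} \; \|C\|_1 \quad \text{subject to} \quad X = XC, \;\; \operatorname{diag}(C) = 0
\]
```

In this formulation, the nonzero entries of C tend to link samples from the same subspace, so clusters can be read off from the graph that C induces.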
Integrated Print Quality Evaluation
We are conducting research on integrated print quality evaluation (IPQE). The project has evolved into three major thrusts: (1) development of a framework for accurate simulation of HP Indigo print quality defects in the presence of real customer content; (2) development of a framework for conducting psychophysical experiments to assess the visibility of print defects in the presence of customer content; and (3) development of a new Masking Mediated Print Defect Visibility Predictor (MMPDVP) that, on a pixel-by-pixel basis, combines information about defect strength and the masking ability of the customer content. The MMPDVP is trained with the data from the subjective experiments.
Approximate Dynamic Programming for Scheduling in Print Service
Graphical Interface Interpretation Using Graph Grammars
Prof. Kang Zhang at University of Texas at Dallas (UTD) and Prof. Jun Kong at North Dakota State University (NDSU)
Recent advances in human-computer interaction technology have made computers and the Internet more accessible than ever. To better serve end users, typical interaction patterns and their effectiveness need to be captured and systematically analyzed. Distinct from existing approaches, this proposal will develop a robust and scalable approach to extracting interface semantics using graph grammars. We use a state-of-the-art graph grammar formalism, the Spatial Graph Grammar (SGG), to perform semantic grouping and interpretation of the segmented screen objects.
An Integrated Framework for the Knowledge-based and Graph-theoretical Analysis of Time-oriented Event Sequences
Prof. Ron Pinter at Israel Institute of Technology