Approximate Graph Mining with Label Costs
- Author(s): Anchuri, Pranay; Zaki, Mohammed J.; Barkol, Omer; Golan, Shahar; Shamy, Moshe
- HP Laboratories
- Keyword(s): data mining; graph analysis; cmdb; approximation techniques
Abstract: Many real-world graphs have complex labels on the nodes and edges. Mining only exact patterns yields limited insights, since it may be hard to find exact matches. However, in many domains it is relatively easy to compute some cost (or distance) between different labels. Using this information, it becomes possible to mine a much richer set of approximate subgraph patterns, which preserve the topology but allow bounded label mismatches. We present novel and scalable methods to efficiently solve the approximate isomorphism problem. We show that the mined approximate patterns yield interesting patterns in several real-world graphs ranging from IT and protein interaction networks to protein structures.
- External Posting Date: June 6, 2013 [Fulltext]. Approved for External Publication
- Internal Posting Date: June 6, 2013 [Fulltext]
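
The notion of an approximate match described in the abstract (topology preserved exactly, label mismatches allowed up to a bounded cost) can be illustrated with a small brute-force sketch. This is not the method of the report, whose algorithms are not described on this page. The function name `approx_embeddings`, the threshold `alpha`, the use of summed node-label cost, and the toy labels and costs are all assumptions made for illustration.

```python
from itertools import permutations

def approx_embeddings(p_labels, p_edges, g_labels, g_edges, cost, alpha):
    """Yield (mapping, total_cost) for each injective mapping of pattern
    nodes to graph nodes that preserves every pattern edge and whose summed
    label cost is at most alpha. Brute force, for illustration only."""
    g_adj = {frozenset(e) for e in g_edges}  # treat edges as undirected
    p_nodes = sorted(p_labels)
    for image in permutations(sorted(g_labels), len(p_nodes)):
        m = dict(zip(p_nodes, image))
        # Topology must be preserved exactly: every pattern edge maps to a graph edge.
        if any(frozenset((m[u], m[v])) not in g_adj for u, v in p_edges):
            continue
        # Labels may differ, but the accumulated mismatch cost is bounded.
        total = sum(cost(p_labels[u], g_labels[m[u]]) for u in p_nodes)
        if total <= alpha:
            yield m, total

# Hypothetical label costs (symmetric; identical labels cost 0).
_COSTS = {frozenset("AB"): 1.0, frozenset("AC"): 0.6, frozenset("BC"): 0.4}

def label_cost(a, b):
    return 0.0 if a == b else _COSTS[frozenset((a, b))]

# Pattern: a single edge A - B. Graph: path x(A) - y(C) - z(B).
pattern_labels, pattern_edges = {0: "A", 1: "B"}, [(0, 1)]
graph_labels = {"x": "A", "y": "C", "z": "B"}
graph_edges = [("x", "y"), ("y", "z")]

exact = list(approx_embeddings(pattern_labels, pattern_edges,
                               graph_labels, graph_edges, label_cost, alpha=0.0))
approx = list(approx_embeddings(pattern_labels, pattern_edges,
                                graph_labels, graph_edges, label_cost, alpha=0.5))
```

In this toy graph no edge joins an A node to a B node, so `exact` is empty, while a cost bound of 0.5 admits one embedding in which B is matched to the nearby label C. Enumerating all injective mappings is exponential in the pattern size, so this sketch only conveys the definition and says nothing about scalability.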