Scalable Pattern Matching over Compressed Graphs via Dedensification
Antonio Maccioni*, Roma Tre University; Daniel Abadi, Yale University
One of the most common operations on graph databases is graph pattern matching (e.g., graph isomorphism and more general types of “subgraph pattern matching”). In fact, in some graph query languages every single query is expressed as a graph matching operation. Consequently, there has been a signiﬁcant amount of research eﬀort in optimizing graph matching operations in graph database systems. As graph databases have scaled in recent years, so too has recent work on scaling graph matching operations. However, the performance of recent proposals for scaling graph pattern matching is limited by the presence of high-degree nodes. These high-degree nodes result in an explosion of intermediate result sizes during query execution, and therefore signiﬁcant performance bottlenecks. In this paper we present a de-densiﬁcation technique that losslessly compresses the neighborhood around high-degree nodes. Furthermore, we introduce a query processing technique that enables direct operation of graph query processing operations over the com-pressed data, without ever having to decompress the data. For pattern matching operations, we show how this technique can be implemented as a layer above existing graph database systems, so that the end-user can beneﬁt from this technique without requiring modiﬁcations to the core graph database engine code. Our technique reduces the size of the intermediate result sets during query processing, and thereby improves query performance.
Filed under: Big Data