SDREGION: fast spotting of changing communities in biological networks
Serene W.H. Wong (University Health Network); Chiara Pastrello (University Health Network); Max Kotlyar (University Health Network); Christos Faloutsos (Carnegie Mellon University); Igor Jurisica (University of Toronto)
Given a large, dynamic graph, how can we trace the activities of groups of vertices over time? Given a dynamic biological graph modeling a given disease progression, which genes interact closely at the early stage of the disease, and their interactions are being disrupted in the latter stage of the disease? Which genes interact sparsely at the early stage of the disease, and their interactions increase as the disease progresses? Knowing the answers to these questions is important as they give insights to the underlying molecular mechanism to disease progression, and potential treatments that target these mechanisms can be developed. There are three main contributions to this paper. First, we designed a novel algorithm, SDREGION, that identifies subgraphs that decrease or increase in density monotonically over time, referred to as d-regions or i-regions, respectively. We introduced the objective function, -density, for identifying d-(i-)regions. Second, SDREGION is a generic algorithm, applicable across several real datasets. In this manuscript, we showed its effectiveness, and made observations in the modeling of the progression of lung cancer. In particular, we observed that SDREGION identified d-(i-)regions that capture mechanisms that align with literature. Importantly, findings that were identified but were not retrospectively validated by literature may provide novel mechanisms in tumor progression that will guide future biological experiments. Third, SDREGION is scalable with a time complexity of O(mlogn + nlogn) where m is the number of edges, and n is the number of vertices in a given dynamic graph.