Bipartite Network Community Detection: Algorithms and Applications
Pesántez Cabrera, Paola Gabriela
Methods to efficiently uncover and extract community structures are required in a vast number of applications where networked data and their interactions can be modeled as graphs, and where observing tightly knit groups of vertices ("communities") can offer insights into the structural and functional building blocks of the underlying network. Classical applications of community detection have largely focused on unipartite networks, i.e., graphs built out of a single type of object. However, with the increased availability of data from various sources, there is a growing need to handle heterogeneous networks, which are built out of multiple types of objects. In this dissertation, we address the problem of identifying communities in bipartite networks, i.e., networks where interactions are observed between two different types of objects, with special interest in biological and ecological networks (e.g., genes and diseases, drugs and protein complexes, plants and pollinators, hosts and pathogens). Toward detecting communities in such bipartite networks, we make the following contributions: i) (metrics) we propose a variant of bipartite modularity called Murata+, and we extend this variant to handle not just inter-type but also intra-type edge information of the network; ii) (algorithms) we present an efficient algorithm called biLouvain that implements a set of heuristics for fast and precise community detection in large bipartite networks; and iii) (experiments) we present a thorough experimental evaluation of our algorithm, including a comparison with other state-of-the-art methods for identifying communities in bipartite networks. Experimental results show that our biLouvain algorithm identifies robust community structures whose quality (as measured by bipartite modularity) is comparable to or better than that of existing methods, while reducing the time-to-solution by between one and four orders of magnitude.
The implementation of our algorithm and heuristics is publicly available as open source at https://github.com/paolapesantez/biLouvain.
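To make the notion of bipartite modularity concrete, the following is a minimal illustrative sketch in Python. It does not reproduce Murata+ or biLouvain, which the abstract names but does not define; instead it computes Barber's bipartite modularity, a different and widely used formulation, for an unweighted bipartite graph without repeated edges. The function name `barber_modularity`, the edge-list input format, and the toy plant-pollinator data are assumptions made for this example only.

```python
from collections import defaultdict


def barber_modularity(edges, community):
    """Barber's bipartite modularity (illustrative; not Murata+).

    edges: iterable of (u, v) pairs, u from one vertex type and v from the other.
    community: dict mapping every vertex to a community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    inside = defaultdict(int)     # edges with both endpoints in the community
    left_deg = defaultdict(int)   # degree sum of first-type vertices per community
    right_deg = defaultdict(int)  # degree sum of second-type vertices per community

    for u, v in edges:
        cu, cv = community[u], community[v]
        left_deg[cu] += 1
        right_deg[cv] += 1
        if cu == cv:
            inside[cu] += 1

    # Q = sum over communities of (observed fraction - expected fraction)
    labels = set(left_deg) | set(right_deg)
    return sum(
        inside[c] / m - (left_deg[c] * right_deg[c]) / (m * m)
        for c in labels
    )


# Hypothetical toy network: two plants, each visited by two pollinators.
edges = [("p1", "a"), ("p1", "b"), ("p2", "c"), ("p2", "d")]
two_groups = {"p1": 0, "a": 0, "b": 0, "p2": 1, "c": 1, "d": 1}
one_group = {v: 0 for v in two_groups}

print(barber_modularity(edges, two_groups))
print(barber_modularity(edges, one_group))
```

On this toy input, separating the two plant-centered groups scores higher than merging all vertices into one community, which is the behavior a modularity-style objective is meant to reward.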