TY - JOUR
T1 - Use of local group information to identify communities in networks
AU - Soundarajan, Sucheta
AU - Hopcroft, John E.
PY - 2015/4/1
Y1 - 2015/4/1
N2 - The recent interest in networks has inspired a broad range of work on algorithms and techniques to characterize, identify, and extract communities from networks. Such efforts are complicated by a lack of consensus on what a "community" truly is, and these disagreements have led to a wide variety of mathematical formulations for describing communities. Often, these mathematical formulations, such as modularity and conductance, have been founded in the general principle that communities, like a G(n, p) graph, are "round," with connections throughout the entire community, and so algorithms were developed to optimize such mathematical measures. More recently, a variety of algorithms have been developed that, rather than expecting connectivity through the entire community, seek out very small groups of well-connected nodes and then connect these groups into larger communities. In this article, we examine seven real networks, each containing external annotation that allows us to identify "annotated communities." A study of these annotated communities gives insight into why the second category of community detection algorithms may be more successful than the first category. We then present a flexible algorithm template that is based on the idea of joining together small sets of nodes. In this template, we first identify very small, tightly connected "subcommunities" of nodes, each corresponding to a single node's "perception" of the network around it. We then create a new network in which each node represents such a subcommunity, and then identify communities in this new network. Because each node can appear in multiple subcommunities, this method allows us to detect overlapping communities.When evaluated on real data, we show that our template outperforms many other state-of-the-art algorithms.
AB - The recent interest in networks has inspired a broad range of work on algorithms and techniques to characterize, identify, and extract communities from networks. Such efforts are complicated by a lack of consensus on what a "community" truly is, and these disagreements have led to a wide variety of mathematical formulations for describing communities. Often, these mathematical formulations, such as modularity and conductance, have been founded in the general principle that communities, like a G(n, p) graph, are "round," with connections throughout the entire community, and so algorithms were developed to optimize such mathematical measures. More recently, a variety of algorithms have been developed that, rather than expecting connectivity through the entire community, seek out very small groups of well-connected nodes and then connect these groups into larger communities. In this article, we examine seven real networks, each containing external annotation that allows us to identify "annotated communities." A study of these annotated communities gives insight into why the second category of community detection algorithms may be more successful than the first category. We then present a flexible algorithm template that is based on the idea of joining together small sets of nodes. In this template, we first identify very small, tightly connected "subcommunities" of nodes, each corresponding to a single node's "perception" of the network around it. We then create a new network in which each node represents such a subcommunity, and then identify communities in this new network. Because each node can appear in multiple subcommunities, this method allows us to detect overlapping communities.When evaluated on real data, we show that our template outperforms many other state-of-the-art algorithms.
KW - Communities
KW - Social networks
UR - http://www.scopus.com/inward/record.url?scp=84927515534&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84927515534&partnerID=8YFLogxK
U2 - 10.1145/2700404
DO - 10.1145/2700404
M3 - Article
AN - SCOPUS:84927515534
SN - 1556-4681
VL - 9
JO - ACM Transactions on Knowledge Discovery from Data
JF - ACM Transactions on Knowledge Discovery from Data
IS - 3
M1 - 21
ER -