Use of local group information to identify communities in networks

Sucheta Soundarajan, John E. Hopcroft

Research output: Contribution to journalArticle

10 Citations (Scopus)

Abstract

The recent interest in networks has inspired a broad range of work on algorithms and techniques to characterize, identify, and extract communities from networks. Such efforts are complicated by a lack of consensus on what a "community" truly is, and these disagreements have led to a wide variety of mathematical formulations for describing communities. Often, these mathematical formulations, such as modularity and conductance, have been founded in the general principle that communities, like a G(n, p) graph, are "round," with connections throughout the entire community, and so algorithms were developed to optimize such mathematical measures. More recently, a variety of algorithms have been developed that, rather than expecting connectivity through the entire community, seek out very small groups of well-connected nodes and then connect these groups into larger communities. In this article, we examine seven real networks, each containing external annotation that allows us to identify "annotated communities." A study of these annotated communities gives insight into why the second category of community detection algorithms may be more successful than the first category. We then present a flexible algorithm template that is based on the idea of joining together small sets of nodes. In this template, we first identify very small, tightly connected "subcommunities" of nodes, each corresponding to a single node's "perception" of the network around it. We then create a new network in which each node represents such a subcommunity, and then identify communities in this new network. Because each node can appear in multiple subcommunities, this method allows us to detect overlapping communities.When evaluated on real data, we show that our template outperforms many other state-of-the-art algorithms.

Original languageEnglish (US)
Article number21
JournalACM Transactions on Knowledge Discovery from Data
Volume9
Issue number3
DOIs
StatePublished - Apr 1 2015
Externally publishedYes

Fingerprint

Joining

Keywords

  • Communities
  • Social networks

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Use of local group information to identify communities in networks. / Soundarajan, Sucheta; Hopcroft, John E.

In: ACM Transactions on Knowledge Discovery from Data, Vol. 9, No. 3, 21, 01.04.2015.

Research output: Contribution to journalArticle

@article{aa6c7424d3c14653b1bf339db215b24b,
title = "Use of local group information to identify communities in networks",
abstract = "The recent interest in networks has inspired a broad range of work on algorithms and techniques to characterize, identify, and extract communities from networks. Such efforts are complicated by a lack of consensus on what a {"}community{"} truly is, and these disagreements have led to a wide variety of mathematical formulations for describing communities. Often, these mathematical formulations, such as modularity and conductance, have been founded in the general principle that communities, like a G(n, p) graph, are {"}round,{"} with connections throughout the entire community, and so algorithms were developed to optimize such mathematical measures. More recently, a variety of algorithms have been developed that, rather than expecting connectivity through the entire community, seek out very small groups of well-connected nodes and then connect these groups into larger communities. In this article, we examine seven real networks, each containing external annotation that allows us to identify {"}annotated communities.{"} A study of these annotated communities gives insight into why the second category of community detection algorithms may be more successful than the first category. We then present a flexible algorithm template that is based on the idea of joining together small sets of nodes. In this template, we first identify very small, tightly connected {"}subcommunities{"} of nodes, each corresponding to a single node's {"}perception{"} of the network around it. We then create a new network in which each node represents such a subcommunity, and then identify communities in this new network. Because each node can appear in multiple subcommunities, this method allows us to detect overlapping communities.When evaluated on real data, we show that our template outperforms many other state-of-the-art algorithms.",
keywords = "Communities, Social networks",
author = "Sucheta Soundarajan and Hopcroft, {John E.}",
year = "2015",
month = "4",
day = "1",
doi = "10.1145/2700404",
language = "English (US)",
volume = "9",
journal = "ACM Transactions on Knowledge Discovery from Data",
issn = "1556-4681",
publisher = "Association for Computing Machinery (ACM)",
number = "3",

}

TY - JOUR

T1 - Use of local group information to identify communities in networks

AU - Soundarajan, Sucheta

AU - Hopcroft, John E.

PY - 2015/4/1

Y1 - 2015/4/1

N2 - The recent interest in networks has inspired a broad range of work on algorithms and techniques to characterize, identify, and extract communities from networks. Such efforts are complicated by a lack of consensus on what a "community" truly is, and these disagreements have led to a wide variety of mathematical formulations for describing communities. Often, these mathematical formulations, such as modularity and conductance, have been founded in the general principle that communities, like a G(n, p) graph, are "round," with connections throughout the entire community, and so algorithms were developed to optimize such mathematical measures. More recently, a variety of algorithms have been developed that, rather than expecting connectivity through the entire community, seek out very small groups of well-connected nodes and then connect these groups into larger communities. In this article, we examine seven real networks, each containing external annotation that allows us to identify "annotated communities." A study of these annotated communities gives insight into why the second category of community detection algorithms may be more successful than the first category. We then present a flexible algorithm template that is based on the idea of joining together small sets of nodes. In this template, we first identify very small, tightly connected "subcommunities" of nodes, each corresponding to a single node's "perception" of the network around it. We then create a new network in which each node represents such a subcommunity, and then identify communities in this new network. Because each node can appear in multiple subcommunities, this method allows us to detect overlapping communities.When evaluated on real data, we show that our template outperforms many other state-of-the-art algorithms.

AB - The recent interest in networks has inspired a broad range of work on algorithms and techniques to characterize, identify, and extract communities from networks. Such efforts are complicated by a lack of consensus on what a "community" truly is, and these disagreements have led to a wide variety of mathematical formulations for describing communities. Often, these mathematical formulations, such as modularity and conductance, have been founded in the general principle that communities, like a G(n, p) graph, are "round," with connections throughout the entire community, and so algorithms were developed to optimize such mathematical measures. More recently, a variety of algorithms have been developed that, rather than expecting connectivity through the entire community, seek out very small groups of well-connected nodes and then connect these groups into larger communities. In this article, we examine seven real networks, each containing external annotation that allows us to identify "annotated communities." A study of these annotated communities gives insight into why the second category of community detection algorithms may be more successful than the first category. We then present a flexible algorithm template that is based on the idea of joining together small sets of nodes. In this template, we first identify very small, tightly connected "subcommunities" of nodes, each corresponding to a single node's "perception" of the network around it. We then create a new network in which each node represents such a subcommunity, and then identify communities in this new network. Because each node can appear in multiple subcommunities, this method allows us to detect overlapping communities.When evaluated on real data, we show that our template outperforms many other state-of-the-art algorithms.

KW - Communities

KW - Social networks

UR - http://www.scopus.com/inward/record.url?scp=84927515534&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84927515534&partnerID=8YFLogxK

U2 - 10.1145/2700404

DO - 10.1145/2700404

M3 - Article

AN - SCOPUS:84927515534

VL - 9

JO - ACM Transactions on Knowledge Discovery from Data

JF - ACM Transactions on Knowledge Discovery from Data

SN - 1556-4681

IS - 3

M1 - 21

ER -