TY - JOUR
T1 - MCS+
T2 - An Efficient Algorithm for Crawling the Community Structure in Multiplex Networks
AU - Laishram, Ricky
AU - Wendt, Jeremy D.
AU - Soundarajan, Sucheta
N1 - Funding Information:
R. Laishram and S. Soundarajan are supported by the U. S. Army Research Office under grant number #W911NF1810047. J. D. Wendt’s work is supported by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multi-mission laboratory managed and operated by the National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energys National Nuclear Security Administration under contract DE-NA0003525. S. Soundarajan was additionally supported by NSF award 1908048. This research was supported in part through computational resources provided by Syracuse University. Authors’ addresses: R. Laishram, Syracuse University, Syracuse, NY 13210; email: rlaishra@syr.edu; J. D. Wendt, Sandia National Laboratories, Albuquerque, NM 87123; email: jdwendt@sandia.edu; S. Soundarajan, Syracuse University, Syracuse, NY 13244; email: susounda@syr.edu. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor, or affiliate of the United States government. As such, the United States government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for government purposes only. © 2021 Association for Computing Machinery. 1556-4681/2021/06-ART11 $15.00 https://doi.org/10.1145/3451527
Publisher Copyright:
© 2021 Association for Computing Machinery.
PY - 2021/7
Y1 - 2021/7
N2 - In this article, we consider the problem of crawling a multiplex network to identify the community structure of a layer-of-interest. A multiplex network is one where there are multiple types of relationships between the nodes. In many multiplex networks, some layers might be easier to explore (in terms of time, money etc.). We propose MCS+, an algorithm that can use the information from the easier to explore layers to help in the exploration of a layer-of-interest that is expensive to explore. We consider the goal of exploration to be generating a sample that is representative of the communities in the complete layer-of-interest. This work has practical applications in areas such as exploration of dark (e.g., criminal) networks, online social networks, biological networks, and so on. For example, in a terrorist network, relationships such as phone records, e-mail records, and so on are easier to collect; in contrast, data on the face-To-face communications are much harder to collect, but also potentially more valuable. We perform extensive experimental evaluations on real-world networks, and we observe that MCS+ consistently outperforms the best baseline-the similarity of the sample that MCS+ generates to the real network is up to three times that of the best baseline in some networks. We also perform theoretical and experimental evaluations on the scalability of MCS+ to network properties, and find that it scales well with the budget, number of layers in the multiplex network, and the average degree in the original network.
AB - In this article, we consider the problem of crawling a multiplex network to identify the community structure of a layer-of-interest. A multiplex network is one where there are multiple types of relationships between the nodes. In many multiplex networks, some layers might be easier to explore (in terms of time, money etc.). We propose MCS+, an algorithm that can use the information from the easier to explore layers to help in the exploration of a layer-of-interest that is expensive to explore. We consider the goal of exploration to be generating a sample that is representative of the communities in the complete layer-of-interest. This work has practical applications in areas such as exploration of dark (e.g., criminal) networks, online social networks, biological networks, and so on. For example, in a terrorist network, relationships such as phone records, e-mail records, and so on are easier to collect; in contrast, data on the face-To-face communications are much harder to collect, but also potentially more valuable. We perform extensive experimental evaluations on real-world networks, and we observe that MCS+ consistently outperforms the best baseline-the similarity of the sample that MCS+ generates to the real network is up to three times that of the best baseline in some networks. We also perform theoretical and experimental evaluations on the scalability of MCS+ to network properties, and find that it scales well with the budget, number of layers in the multiplex network, and the average degree in the original network.
KW - Multplex networks
KW - community detection
KW - multi-Armed bandit
KW - network crawling
UR - http://www.scopus.com/inward/record.url?scp=85111129015&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111129015&partnerID=8YFLogxK
U2 - 10.1145/3451527
DO - 10.1145/3451527
M3 - Article
AN - SCOPUS:85111129015
SN - 1556-4681
VL - 16
JO - ACM Transactions on Knowledge Discovery from Data
JF - ACM Transactions on Knowledge Discovery from Data
IS - 1
M1 - 3451527
ER -