TY - GEN
T1 - Measuring the sampling robustness of complex networks
AU - Areekijseree, Katchaguy
AU - Soundarajan, Sucheta
N1 - Publisher Copyright: © 2019 Association for Computing Machinery.
PY - 2019/8/27
Y1 - 2019/8/27
N2 - When studying a network, it is often of interest to understand the robustness of that network to noise. Network robustness has been studied in a variety of contexts, examining network properties such as the number of connected components and the lengths of shortest paths. In this work, we present a new network robustness measure, which we refer to as ‘sampling robustness’. The goal of the sampling robustness measure is to quantify the extent to which a network sample collected from a graph with errors is a good representation of a network sample collected from that same graph without errors. These errors may be introduced by humans or by the system (e.g., mistakes from respondents or a bug in an API program), and may affect the performance of a data collection algorithm and the quality of the obtained sample. Thus, when data analysts analyze the sampled network, they may wish to know whether such errors will affect future analysis results. We demonstrate that sampling robustness depends on a few easily computed properties of the network: the leading eigenvalue, average node degree, and clustering coefficient. In addition, we introduce regression models for estimating sampling robustness given an obtained sample. Our models estimate sampling robustness with MSE < 0.0015 and achieve an R-squared of up to 75%.
UR - http://www.scopus.com/inward/record.url?scp=85078826314&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078826314&partnerID=8YFLogxK
U2 - 10.1145/3341161.3342873
DO - 10.1145/3341161.3342873
M3 - Conference contribution
AN - SCOPUS:85078826314
T3 - Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019
SP - 294
EP - 301
BT - Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019
A2 - Spezzano, Francesca
A2 - Chen, Wei
A2 - Xiao, Xiaokui
PB - Association for Computing Machinery, Inc
T2 - 11th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2019
Y2 - 27 August 2019 through 30 August 2019
ER -