TY - GEN
T1 - ϵ-WGX
T2 - 9th ACM Web Science Conference, WebSci 2017
AU - Soundarajan, Sucheta
AU - Gallagher, Brian
AU - Eliassi-Rad, Tina
AU - Pinar, Ali
N1 - Publisher Copyright:
© 2017 ACM.
PY - 2017/6/25
Y1 - 2017/6/25
N2 - No matter how meticulously constructed, network datasets are often partially observed and incomplete. For example, most of the publicly available data from online social networking services (such as Facebook and Twitter) are collected via apps, users who make their accounts public, and/or the resources available to the researcher/practitioner. Such incompleteness can lead to inaccurate findings. We introduce the Adaptive Edge Probing problem. Suppose that one has observed a networked phenomenon via some form of sampling and has a budget to enhance the incomplete network by asking for additional information about specific nodes, with the ultimate goal of obtaining the most valuable information about the network as a whole. Which nodes should be further explored? We present e-WGX, a network-based explore-exploit algorithm for identifying which nodes in the incomplete network to probe. Aggregated over multiple datasets and a wide range of probing budgets, we find that e-WGX outperforms other explore-exploit strategies and baseline probing strategies. For example, for the task of adding as many nodes as possible, over incomplete networks observed via four popular sampling methods, e-WGX outperforms the best comparison strategy by 12%-23% on average.
AB - No matter how meticulously constructed, network datasets are often partially observed and incomplete. For example, most of the publicly available data from online social networking services (such as Facebook and Twitter) are collected via apps, users who make their accounts public, and/or the resources available to the researcher/practitioner. Such incompleteness can lead to inaccurate findings. We introduce the Adaptive Edge Probing problem. Suppose that one has observed a networked phenomenon via some form of sampling and has a budget to enhance the incomplete network by asking for additional information about specific nodes, with the ultimate goal of obtaining the most valuable information about the network as a whole. Which nodes should be further explored? We present e-WGX, a network-based explore-exploit algorithm for identifying which nodes in the incomplete network to probe. Aggregated over multiple datasets and a wide range of probing budgets, we find that e-WGX outperforms other explore-exploit strategies and baseline probing strategies. For example, for the task of adding as many nodes as possible, over incomplete networks observed via four popular sampling methods, e-WGX outperforms the best comparison strategy by 12%-23% on average.
KW - Adaptive probing
KW - Graph exploration
KW - Incomplete networks
UR - http://www.scopus.com/inward/record.url?scp=85026781099&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85026781099&partnerID=8YFLogxK
U2 - 10.1145/3091478.3091492
DO - 10.1145/3091478.3091492
M3 - Conference contribution
AN - SCOPUS:85026781099
T3 - WebSci 2017 - Proceedings of the 2017 ACM Web Science Conference
SP - 161
EP - 170
BT - WebSci 2017 - Proceedings of the 2017 ACM Web Science Conference
PB - Association for Computing Machinery, Inc
Y2 - 25 June 2017 through 28 June 2017
ER -