TY - JOUR
T1 - Sampling dark networks to locate people of interest
AU - Wijegunawardana, Pivithuru
AU - Ojha, Vatsal
AU - Gera, Ralucca
AU - Soundarajan, Sucheta
N1 - Publisher Copyright: © 2018, This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.
PY - 2018/12/1
Y1 - 2018/12/1
N2 - Dark networks, which describe networks with covert entities and connections such as those representing illegal activities, are of great interest to intelligence analysts. However, before studying such a network, one must first collect appropriate network data. Collecting accurate network data in such a setting is a challenging task, as data collectors will make inferences, which may be incorrect, based on available intelligence data, which may itself be misleading. In this paper, we consider the problem of how to effectively sample dark networks, in which sampling queries may return incorrect information, with the specific goal of locating people of interest. We present RedLearn and RedLearnRS, two algorithms for crawling dark networks with the goal of maximizing the identification of nodes of interest, given a limited sampling budget. RedLearn assumes that a query on a node can accurately return whether a node represents a person of interest, while RedLearnRS dispenses with that assumption. We consider realistic error scenarios, which describe how individuals in a dark network may attempt to conceal their connections. We evaluate and present results on several real-world networks, including dark networks, as well as various synthetic dark network structures proposed in the criminology literature. Our analysis shows that RedLearn and RedLearnRS meet or outperform other sampling strategies.
KW - Dark networks
KW - Lying scenarios
KW - Nodes of interest
KW - Sampling
UR - http://www.scopus.com/inward/record.url?scp=85042869428&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85042869428&partnerID=8YFLogxK
U2 - 10.1007/s13278-018-0487-0
DO - 10.1007/s13278-018-0487-0
M3 - Article
AN - SCOPUS:85042869428
SN - 1869-5450
VL - 8
JO - Social Network Analysis and Mining
JF - Social Network Analysis and Mining
IS - 1
M1 - 15
ER -