Abstract
Dark networks, which describe networks with covert entities and connections such as those representing illegal activities, are of great interest to intelligence analysts. However, before studying such a network, one must first collect appropriate network data. Collecting accurate network data in such a setting is a challenging task, as data collectors will make inferences, which may be incorrect, based on available intelligence data, which may itself be misleading. In this paper, we consider the problem of how to effectively sample dark networks, in which sampling queries may return incorrect information, with the specific goal of locating people of interest. We present RedLearn and RedLearnRS, two algorithms for crawling dark networks with the goal of maximizing the identification of nodes of interest, given a limited sampling budget. RedLearn assumes that a query on a node accurately returns whether that node represents a person of interest, while RedLearnRS dispenses with that assumption. We consider realistic error scenarios, which describe how individuals in a dark network may attempt to conceal their connections. We evaluate and present results on several real-world networks, including dark networks, as well as various synthetic dark network structures proposed in the criminology literature. Our analysis shows that RedLearn and RedLearnRS meet or outperform other sampling strategies.
Original language | English (US) |
---|---|
Article number | 15 |
Journal | Social Network Analysis and Mining |
Volume | 8 |
Issue number | 1 |
State | Published - Dec 1 2018 |
Keywords
- Dark networks
- Lying scenarios
- Nodes of interest
- Sampling
ASJC Scopus subject areas
- Information Systems
- Communication
- Media Technology
- Human-Computer Interaction
- Computer Science Applications