DE-crawler: A densification-expansion algorithm for online data collection

Katchaguy Areekijseree, Sucheta Soundarajan

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Over the past two decades, online social networks have attracted a great deal of attention from researchers. However, before one can gain insight into the behavior or structure of a network, one must first collect appropriate data. Data collection poses several challenges, such as API or bandwidth limits, which require the data collector to carefully consider which queries to make. Many network crawling methods have been proposed; however, their performance depends on network structure. In particular, our previous work in [1] has shown that existing algorithms tend to either (1) Do well at exploring dense areas of a network, but have difficulty in transitioning to new areas of the network, or (2) Easily move between network regions, but fail to fully explore each region. In this work, we introduce DE-Crawler, a novel network crawler that attempts to capture the best of both worlds. DE-Crawler consists of two main stages: Densification, in which the crawler aims to find as many nodes as possible in the current dense region (or community), and Expansion, in which the crawler tries to escape from its current region and move to another dense region. We show that DE-Crawler performs well across networks with different structural properties, outperforming baseline algorithms by up to 28%.

Original languageEnglish (US)
Title of host publicationProceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018
EditorsAndrea Tagarelli, Chandan Reddy, Ulrik Brandes
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages164-169
Number of pages6
ISBN (Electronic)9781538660515
DOIs
StatePublished - Oct 24 2018
Event10th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018 - Barcelona, Spain
Duration: Aug 28 2018Aug 31 2018

Other

Other10th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018
CountrySpain
CityBarcelona
Period8/28/188/31/18

Fingerprint

Densification
Application programming interfaces (API)
Structural properties
Bandwidth
Data collection
social network
community
performance

ASJC Scopus subject areas

  • Sociology and Political Science
  • Communication
  • Computer Networks and Communications
  • Information Systems and Management

Cite this

Areekijseree, K., & Soundarajan, S. (2018). DE-crawler: A densification-expansion algorithm for online data collection. In A. Tagarelli, C. Reddy, & U. Brandes (Eds.), Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018 (pp. 164-169). [8508311] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ASONAM.2018.8508311

DE-crawler : A densification-expansion algorithm for online data collection. / Areekijseree, Katchaguy; Soundarajan, Sucheta.

Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018. ed. / Andrea Tagarelli; Chandan Reddy; Ulrik Brandes. Institute of Electrical and Electronics Engineers Inc., 2018. p. 164-169 8508311.

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Areekijseree, K & Soundarajan, S 2018, DE-crawler: A densification-expansion algorithm for online data collection. in A Tagarelli, C Reddy & U Brandes (eds), Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018., 8508311, Institute of Electrical and Electronics Engineers Inc., pp. 164-169, 10th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018, Barcelona, Spain, 8/28/18. https://doi.org/10.1109/ASONAM.2018.8508311
Areekijseree K, Soundarajan S. DE-crawler: A densification-expansion algorithm for online data collection. In Tagarelli A, Reddy C, Brandes U, editors, Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018. Institute of Electrical and Electronics Engineers Inc. 2018. p. 164-169. 8508311 https://doi.org/10.1109/ASONAM.2018.8508311
Areekijseree, Katchaguy ; Soundarajan, Sucheta. / DE-crawler : A densification-expansion algorithm for online data collection. Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018. editor / Andrea Tagarelli ; Chandan Reddy ; Ulrik Brandes. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 164-169
@inproceedings{fe575b8ecd184903b8853fb879a7055f,
title = "DE-crawler: A densification-expansion algorithm for online data collection",
abstract = "Over the past two decades, online social networks have attracted a great deal of attention from researchers. However, before one can gain insight into the behavior or structure of a network, one must first collect appropriate data. Data collection poses several challenges, such as API or bandwidth limits, which require the data collector to carefully consider which queries to make. Many network crawling methods have been proposed; however, their performance depends on network structure. In particular, our previous work in [1] has shown that existing algorithms tend to either (1) Do well at exploring dense areas of a network, but have difficulty in transitioning to new areas of the network, or (2) Easily move between network regions, but fail to fully explore each region. In this work, we introduce DE-Crawler, a novel network crawler that attempts to capture the best of both worlds. DE-Crawler consists of two main stages: Densification, in which the crawler aims to find as many nodes as possible in the current dense region (or community), and Expansion, in which the crawler tries to escape from its current region and move to another dense region. We show that DE-Crawler performs well across networks with different structural properties, outperforming baseline algorithms by up to 28{\%}.",
author = "Katchaguy Areekijseree and Sucheta Soundarajan",
year = "2018",
month = "10",
day = "24",
doi = "10.1109/ASONAM.2018.8508311",
language = "English (US)",
pages = "164--169",
editor = "Andrea Tagarelli and Chandan Reddy and Ulrik Brandes",
booktitle = "Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - DE-crawler

T2 - A densification-expansion algorithm for online data collection

AU - Areekijseree, Katchaguy

AU - Soundarajan, Sucheta

PY - 2018/10/24

Y1 - 2018/10/24

N2 - Over the past two decades, online social networks have attracted a great deal of attention from researchers. However, before one can gain insight into the behavior or structure of a network, one must first collect appropriate data. Data collection poses several challenges, such as API or bandwidth limits, which require the data collector to carefully consider which queries to make. Many network crawling methods have been proposed; however, their performance depends on network structure. In particular, our previous work in [1] has shown that existing algorithms tend to either (1) Do well at exploring dense areas of a network, but have difficulty in transitioning to new areas of the network, or (2) Easily move between network regions, but fail to fully explore each region. In this work, we introduce DE-Crawler, a novel network crawler that attempts to capture the best of both worlds. DE-Crawler consists of two main stages: Densification, in which the crawler aims to find as many nodes as possible in the current dense region (or community), and Expansion, in which the crawler tries to escape from its current region and move to another dense region. We show that DE-Crawler performs well across networks with different structural properties, outperforming baseline algorithms by up to 28%.

AB - Over the past two decades, online social networks have attracted a great deal of attention from researchers. However, before one can gain insight into the behavior or structure of a network, one must first collect appropriate data. Data collection poses several challenges, such as API or bandwidth limits, which require the data collector to carefully consider which queries to make. Many network crawling methods have been proposed; however, their performance depends on network structure. In particular, our previous work in [1] has shown that existing algorithms tend to either (1) Do well at exploring dense areas of a network, but have difficulty in transitioning to new areas of the network, or (2) Easily move between network regions, but fail to fully explore each region. In this work, we introduce DE-Crawler, a novel network crawler that attempts to capture the best of both worlds. DE-Crawler consists of two main stages: Densification, in which the crawler aims to find as many nodes as possible in the current dense region (or community), and Expansion, in which the crawler tries to escape from its current region and move to another dense region. We show that DE-Crawler performs well across networks with different structural properties, outperforming baseline algorithms by up to 28%.

UR - http://www.scopus.com/inward/record.url?scp=85057321980&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057321980&partnerID=8YFLogxK

U2 - 10.1109/ASONAM.2018.8508311

DO - 10.1109/ASONAM.2018.8508311

M3 - Conference contribution

AN - SCOPUS:85057321980

SP - 164

EP - 169

BT - Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2018

A2 - Tagarelli, Andrea

A2 - Reddy, Chandan

A2 - Brandes, Ulrik

PB - Institute of Electrical and Electronics Engineers Inc.

ER -