Predicted max degree sampling: Sampling in directed networks to maximize node coverage through crawling

Ricky Laishram, Katchaguy Areekijseree, Sucheta Soundarajan

Research output: Chapter in Book/Entry/Poem › Conference contribution

Abstract

Sampling through crawling is an important research topic in social network analysis. However, there is very little existing work on sampling through crawling in directed networks. In this paper, we present a new method of sampling a directed network, with the objective of maximizing node coverage. Our proposed method, Predicted Max Degree (PMD) Sampling, works by predicting, in each iteration, which k open nodes are most likely to have the highest number of unobserved neighbors. These nodes are queried, and the process repeats until the available budget has been used up. We compared PMD against three baseline algorithms on three networks and saw large improvements: with a budget of 2000, PMD found 15%, 87.4% and 170.2% more nodes than the closest baseline algorithm in the wiki-Votes, soc-Slashdot and web-Google networks, respectively.
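
The query loop described in the abstract can be sketched in Python as below. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how PMD predicts the number of unobserved neighbors, so the sketch substitutes a simple stand-in score (the in-degree an open node has accumulated within the sample so far). The names crawl_top_k and toy_graph, and the choice to count one unit of budget per queried node, are hypothetical.

```python
from collections import defaultdict


def crawl_top_k(graph, seed, budget, k):
    """Crawl a directed graph by repeatedly querying the k best-scoring open nodes.

    graph  : dict mapping node -> list of out-neighbors; querying a node
             reveals exactly this list (assumption of this sketch).
    budget : maximum number of node queries (assumption: one query costs 1).
    k      : number of open nodes queried per iteration.
    Returns the set of all nodes observed by the crawl.
    """
    observed_in = defaultdict(int)  # in-degree seen so far inside the sample
    seen = {seed}                   # every node observed so far
    open_nodes = {seed}             # observed but not yet queried
    queries = 0

    while open_nodes and queries < budget:
        batch_size = min(k, budget - queries)
        # Stand-in predictor (NOT the paper's): rank open nodes by observed
        # in-degree, breaking ties by name so the result is deterministic.
        batch = sorted(open_nodes, key=lambda n: (-observed_in[n], str(n)))
        for node in batch[:batch_size]:
            open_nodes.discard(node)
            queries += 1
            for nbr in graph.get(node, ()):
                observed_in[nbr] += 1
                if nbr not in seen:
                    seen.add(nbr)
                    open_nodes.add(nbr)
    return seen


# Hypothetical toy graph, used only to exercise the loop.
toy_graph = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["e", "f", "g"],
}

if __name__ == "__main__":
    print(sorted(crawl_top_k(toy_graph, "a", budget=3, k=1)))
```

Swapping in a different predictor only requires changing the sort key; the open/queried bookkeeping and the budget check stay the same.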

Original language: English (US)
Title of host publication: Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
Editors: Ronay Ak, George Karypis, Yinglong Xia, Xiaohua Tony Hu, Philip S. Yu, James Joshi, Lyle Ungar, Ling Liu, Aki-Hiro Sato, Toyotaro Suzumura, Sudarsan Rachuri, Rama Govindaraju, Weijia Xu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4008-4010
Number of pages: 3
ISBN (Electronic): 9781467390040
DOIs
State: Published - 2016
Event: 4th IEEE International Conference on Big Data, Big Data 2016 - Washington, United States
Duration: Dec 5 2016 → Dec 8 2016

Publication series

Name: Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016

Other

Other: 4th IEEE International Conference on Big Data, Big Data 2016
Country/Territory: United States
City: Washington
Period: 12/5/16 → 12/8/16

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Hardware and Architecture
