Max-node sampling: An expansion-densification algorithm for data collection

Katchaguy Areekijseree, Ricky Laishram, Sucheta Soundarajan

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

In this work, we propose Max-Node sampling, a novel sampling algorithm for data collection. The goal of Max-Node is to maximize the number of nodes observed in the sample, given a budget constraint. Max-Node is based on the intuition that networks contain many densely connected regions (i.e., communities), that may be only weakly connected to another, and to maximize the number of nodes observed, it is critical to transition between communities. The two key phases of our algorithm are Expansion and Densification. The goal of the Expansion phase is to transition to unobserved regions, while the Densification phase aims to collect as many nodes in the current community. We conduct experiments on several real networks, and show an improvement of up to 40% vs. the baselines.

Original languageEnglish (US)
Title of host publicationProceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages3944-3946
Number of pages3
ISBN (Electronic)9781467390040
DOIs
StatePublished - Feb 2 2017
Event4th IEEE International Conference on Big Data, Big Data 2016 - Washington, United States
Duration: Dec 5 2016Dec 8 2016

Other

Other4th IEEE International Conference on Big Data, Big Data 2016
CountryUnited States
CityWashington
Period12/5/1612/8/16

Fingerprint

Densification
Sampling
Experiments

Keywords

  • Algorithms
  • Complex Network
  • Data Collection
  • Data Crawling
  • Large Graph
  • Network Sampling

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Hardware and Architecture

Cite this

Areekijseree, K., Laishram, R., & Soundarajan, S. (2017). Max-node sampling: An expansion-densification algorithm for data collection. In Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016 (pp. 3944-3946). [7841070] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/BigData.2016.7841070

Max-node sampling : An expansion-densification algorithm for data collection. / Areekijseree, Katchaguy; Laishram, Ricky; Soundarajan, Sucheta.

Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016. Institute of Electrical and Electronics Engineers Inc., 2017. p. 3944-3946 7841070.

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Areekijseree, K, Laishram, R & Soundarajan, S 2017, Max-node sampling: An expansion-densification algorithm for data collection. in Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016., 7841070, Institute of Electrical and Electronics Engineers Inc., pp. 3944-3946, 4th IEEE International Conference on Big Data, Big Data 2016, Washington, United States, 12/5/16. https://doi.org/10.1109/BigData.2016.7841070
Areekijseree K, Laishram R, Soundarajan S. Max-node sampling: An expansion-densification algorithm for data collection. In Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016. Institute of Electrical and Electronics Engineers Inc. 2017. p. 3944-3946. 7841070 https://doi.org/10.1109/BigData.2016.7841070
Areekijseree, Katchaguy ; Laishram, Ricky ; Soundarajan, Sucheta. / Max-node sampling : An expansion-densification algorithm for data collection. Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 3944-3946
@inproceedings{80fa87a3ba97446d9d37b46f21a3a563,
title = "Max-node sampling: An expansion-densification algorithm for data collection",
abstract = "In this work, we propose Max-Node sampling, a novel sampling algorithm for data collection. The goal of Max-Node is to maximize the number of nodes observed in the sample, given a budget constraint. Max-Node is based on the intuition that networks contain many densely connected regions (i.e., communities), that may be only weakly connected to another, and to maximize the number of nodes observed, it is critical to transition between communities. The two key phases of our algorithm are Expansion and Densification. The goal of the Expansion phase is to transition to unobserved regions, while the Densification phase aims to collect as many nodes in the current community. We conduct experiments on several real networks, and show an improvement of up to 40{\%} vs. the baselines.",
keywords = "Algorithms, Complex Network, Data Collection, Data Crawling, Large Graph, Network Sampling",
author = "Katchaguy Areekijseree and Ricky Laishram and Sucheta Soundarajan",
year = "2017",
month = "2",
day = "2",
doi = "10.1109/BigData.2016.7841070",
language = "English (US)",
pages = "3944--3946",
booktitle = "Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Max-node sampling

T2 - An expansion-densification algorithm for data collection

AU - Areekijseree, Katchaguy

AU - Laishram, Ricky

AU - Soundarajan, Sucheta

PY - 2017/2/2

Y1 - 2017/2/2

N2 - In this work, we propose Max-Node sampling, a novel sampling algorithm for data collection. The goal of Max-Node is to maximize the number of nodes observed in the sample, given a budget constraint. Max-Node is based on the intuition that networks contain many densely connected regions (i.e., communities), that may be only weakly connected to another, and to maximize the number of nodes observed, it is critical to transition between communities. The two key phases of our algorithm are Expansion and Densification. The goal of the Expansion phase is to transition to unobserved regions, while the Densification phase aims to collect as many nodes in the current community. We conduct experiments on several real networks, and show an improvement of up to 40% vs. the baselines.

AB - In this work, we propose Max-Node sampling, a novel sampling algorithm for data collection. The goal of Max-Node is to maximize the number of nodes observed in the sample, given a budget constraint. Max-Node is based on the intuition that networks contain many densely connected regions (i.e., communities), that may be only weakly connected to another, and to maximize the number of nodes observed, it is critical to transition between communities. The two key phases of our algorithm are Expansion and Densification. The goal of the Expansion phase is to transition to unobserved regions, while the Densification phase aims to collect as many nodes in the current community. We conduct experiments on several real networks, and show an improvement of up to 40% vs. the baselines.

KW - Algorithms

KW - Complex Network

KW - Data Collection

KW - Data Crawling

KW - Large Graph

KW - Network Sampling

UR - http://www.scopus.com/inward/record.url?scp=85015173935&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85015173935&partnerID=8YFLogxK

U2 - 10.1109/BigData.2016.7841070

DO - 10.1109/BigData.2016.7841070

M3 - Conference contribution

AN - SCOPUS:85015173935

SP - 3944

EP - 3946

BT - Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -