TY - JOUR
T1 - Partitioning Communication Streams Into Graph Snapshots
AU - Wendt, Jeremy D.
AU - Field, Richard
AU - Phillips, Cynthia
AU - Prasadan, Arvind
AU - Wilson, Tegan
AU - Soundarajan, Sucheta
AU - Bhowmick, Sanjukta
N1 - Funding Information:
The work of Jeremy D. Wendt, Richard Field, Jr., Cynthia Phillips, Tegan Wilson, and Arvind Prasadan was supported by the Laboratory Directed Research and Development Program at Sandia National Laboratories, a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under Contract DE-NA0003525. The work of Sucheta Soundarajan was supported by the U.S. Army Research Office under Grant W911NF1810047. The views expressed in this article do not necessarily represent the views of the U.S. Department of Energy or the United States Government. The work of Sanjukta Bhowmick was supported by NSF under Awards 1725566 and 1900765.
Publisher Copyright:
© 2013 IEEE.
PY - 2023/3/1
Y1 - 2023/3/1
AB - We present EASEE (Edge Advertisements into Snapshots using Evolving Expectations) for partitioning streaming communication data into static graph snapshots. Given streaming communication events (A talks to B), EASEE identifies when events suffice for a static graph (a snapshot). EASEE uses combinatorial statistical models to adaptively find when a snapshot is stable, while watching for significant data shifts - indicating a new snapshot should begin. If snapshots are not found carefully, they poorly represent the underlying data - and downstream graph analytics fail: We show a community detection example. We demonstrate EASEE's strengths against several real-world datasets, and its accuracy against known-answer synthetic datasets. Synthetic datasets' results show that (1) EASEE finds known-answer data shifts very quickly; and (2) ignoring these shifts drastically affects analytics on resulting snapshots. We show that previous work misses these shifts. Further, we evaluate EASEE against seven real-world datasets (330K to 2.5B events), and find snapshot-over-time behaviors missed by previous works. Finally, we show that the resulting snapshots' measured properties (e.g., graph density) are altered by how snapshots are identified from the communication event stream. In particular, EASEE's snapshots do not generally 'densify' over time, contradicting previous influential results that used simpler partitioning methods.
KW - Datasets
KW - graph sampling
KW - network evolution
KW - social networks
UR - http://www.scopus.com/inward/record.url?scp=85144025601&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85144025601&partnerID=8YFLogxK
U2 - 10.1109/TNSE.2022.3223614
DO - 10.1109/TNSE.2022.3223614
M3 - Article
AN - SCOPUS:85144025601
SN - 2327-4697
VL - 10
SP - 809
EP - 826
JO - IEEE Transactions on Network Science and Engineering
JF - IEEE Transactions on Network Science and Engineering
IS - 2
ER -