TY - GEN
T1 - NetProtect
T2 - 13th ACM Web Science Conference, WebSci 2021
AU - Laishram, Ricky
AU - Hozhabrierdi, Pegah
AU - Wendt, Jeremy
AU - Soundarajan, Sucheta
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/6/21
Y1 - 2021/6/21
N2 - In many network applications, it may be desirable to conceal certain target nodes from detection by a data collector, who is using a crawling algorithm to explore a network. For example, in a computer network, the network administrator may wish to protect those computers (target nodes) with sensitive information from discovery by a hacker who has exploited vulnerable machines and entered the network. These networks are often protected by hiding the machines (nodes) from external access, and allow only fixed entry points into the system (protection against external attacks). However, in this protection scheme, once one of the entry points is breached, the safety of all internal machines is jeopardized (i.e., the external attack turns into an internal attack). In this paper, we view this problem from the perspective of the data protector. We propose the Node Protection Problem: given a network with known entry points, which edges should be removed/added so as to protect as many target nodes from the data collector as possible? A trivial way to solve this problem would be to simply disconnect either the entry points or the target nodes - but that would make the network non-functional. Accordingly, we impose certain constraints: for each node, only (1 - r) fraction of its edges can be removed, and the resulting network must not be disconnected. We propose two novel scoring mechanisms - the Frequent Path Score and the Shortest Path Score. Using these scores, we propose NetProtect, an algorithm that selects edges to be removed or added so as to best impede the progress of the data collector. We show experimentally that NetProtect outperforms baseline node protection algorithms across several real-world networks. In some datasets, With 1% of the edges removed by NetProtect, we found that the data collector requires up to 6 (4) times the budget compared to the next best baseline in order to discover 5 (50) nodes.
AB - In many network applications, it may be desirable to conceal certain target nodes from detection by a data collector, who is using a crawling algorithm to explore a network. For example, in a computer network, the network administrator may wish to protect those computers (target nodes) with sensitive information from discovery by a hacker who has exploited vulnerable machines and entered the network. These networks are often protected by hiding the machines (nodes) from external access, and allow only fixed entry points into the system (protection against external attacks). However, in this protection scheme, once one of the entry points is breached, the safety of all internal machines is jeopardized (i.e., the external attack turns into an internal attack). In this paper, we view this problem from the perspective of the data protector. We propose the Node Protection Problem: given a network with known entry points, which edges should be removed/added so as to protect as many target nodes from the data collector as possible? A trivial way to solve this problem would be to simply disconnect either the entry points or the target nodes - but that would make the network non-functional. Accordingly, we impose certain constraints: for each node, only (1 - r) fraction of its edges can be removed, and the resulting network must not be disconnected. We propose two novel scoring mechanisms - the Frequent Path Score and the Shortest Path Score. Using these scores, we propose NetProtect, an algorithm that selects edges to be removed or added so as to best impede the progress of the data collector. We show experimentally that NetProtect outperforms baseline node protection algorithms across several real-world networks. In some datasets, With 1% of the edges removed by NetProtect, we found that the data collector requires up to 6 (4) times the budget compared to the next best baseline in order to discover 5 (50) nodes.
KW - adversary
KW - graph
KW - network
KW - perturbation
UR - http://www.scopus.com/inward/record.url?scp=85109033323&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85109033323&partnerID=8YFLogxK
U2 - 10.1145/3447535.3462500
DO - 10.1145/3447535.3462500
M3 - Conference contribution
AN - SCOPUS:85109033323
T3 - ACM International Conference Proceeding Series
SP - 93
EP - 101
BT - WebSci 2021 - Proceedings of the 13th ACM Web Science Conference
PB - Association for Computing Machinery
Y2 - 21 June 2021 through 25 June 2021
ER -