TY - GEN
T1 - Differential Privacy of Hierarchical Census Data
T2 - 25th International Conference on Principles and Practice of Constraint Programming, CP 2019
AU - Fioretto, Ferdinando
AU - Van Hentenryck, Pascal
N1 - Publisher Copyright:
© 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
N2 - This paper is motivated by applications of a Census Bureau interested in releasing aggregate socio-economic data about a large population without revealing sensitive information. The released information can be the number of individuals living alone, the number of cars they own, or their salary brackets. Recent events have identified some of the privacy challenges faced by these organizations. To address them, this paper presents a novel differential-privacy mechanism for releasing hierarchical counts of individuals satisfying a given property. The counts are reported at multiple granularities (e.g., the national, state, and county levels) and must be consistent across levels. The core of the mechanism is an optimization model that redistributes the noise introduced to attain privacy in order to meet the consistency constraints between the hierarchical levels. The key technical contribution of the paper shows that this optimization problem can be solved in polynomial time by exploiting the structure of its cost functions. Experimental results on very large, real datasets show that the proposed mechanism provides improvements up to two orders of magnitude in terms of computational efficiency and accuracy with respect to other state-of-the-art techniques.
AB - This paper is motivated by applications of a Census Bureau interested in releasing aggregate socio-economic data about a large population without revealing sensitive information. The released information can be the number of individuals living alone, the number of cars they own, or their salary brackets. Recent events have identified some of the privacy challenges faced by these organizations. To address them, this paper presents a novel differential-privacy mechanism for releasing hierarchical counts of individuals satisfying a given property. The counts are reported at multiple granularities (e.g., the national, state, and county levels) and must be consistent across levels. The core of the mechanism is an optimization model that redistributes the noise introduced to attain privacy in order to meet the consistency constraints between the hierarchical levels. The key technical contribution of the paper shows that this optimization problem can be solved in polynomial time by exploiting the structure of its cost functions. Experimental results on very large, real datasets show that the proposed mechanism provides improvements up to two orders of magnitude in terms of computational efficiency and accuracy with respect to other state-of-the-art techniques.
UR - http://www.scopus.com/inward/record.url?scp=85075730941&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075730941&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-30048-7_37
DO - 10.1007/978-3-030-30048-7_37
M3 - Conference contribution
AN - SCOPUS:85075730941
SN - 9783030300470
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 639
EP - 655
BT - Principles and Practice of Constraint Programming - 25th International Conference, CP 2019, Proceedings
A2 - Schiex, Thomas
A2 - de Givry, Simon
PB - Springer
Y2 - 30 September 2019 through 4 October 2019
ER -