TY - GEN
T1 - A hybrid multi-group privacy-preserving approach for building decision trees
AU - Teng, Zhouxuan
AU - Du, Wenliang
N1 - Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2007
Y1 - 2007
N2 - In this paper, we study the privacy-preserving decision tree building problem on vertically partitioned data. We make two contributions. First, we propose a novel hybrid approach that takes advantage of the strengths of two existing approaches, randomization and secure multi-party computation (SMC), to balance the accuracy and efficiency constraints. Compared to these two existing approaches, our proposed approach achieves much better accuracy than the randomization approach and much lower computation cost than the SMC approach. Second, we propose a multi-group scheme that makes it flexible for data miners to control the balance between data mining accuracy and privacy. We partition attributes into groups and develop a scheme that conducts group-based randomization to achieve better data mining accuracy. We have implemented and evaluated the proposed schemes for the ID3 decision tree algorithm.
KW - Privacy
KW - Randomization
KW - SMC
UR - http://www.scopus.com/inward/record.url?scp=38049174606&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38049174606&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-71701-0_30
DO - 10.1007/978-3-540-71701-0_30
M3 - Conference contribution
AN - SCOPUS:38049174606
SN - 9783540717003
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 296
EP - 307
BT - Advances in Knowledge Discovery and Data Mining - 11th Pacific-Asia Conference, PAKDD 2007, Proceedings
PB - Springer Verlag
T2 - 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007
Y2 - 22 May 2007 through 25 May 2007
ER -