OptRR: Optimizing randomized response schemes for privacy-preserving data mining

Zhengli Huang, Wenliang Du

Research output: Chapter in Book/Entry/Poem › Conference contribution

50 Scopus citations

Abstract

The randomized response (RR) technique is a promising way to disguise private categorical data in Privacy-Preserving Data Mining (PPDM). Although a number of RR-based methods have been proposed for various data mining computations, no study has systematically compared them to find optimal RR schemes. The comparison is difficult because two PPDM schemes must be judged on two conflicting metrics: privacy and utility. A scheme that is optimal on one metric is usually the worst on the other. In this paper, we first describe a method to quantify privacy and utility. We formulate the quantification as estimation problems and use estimation theory to derive the measures. We then use an evolutionary multi-objective optimization method to find optimal disguise matrices for the randomized response technique. The experimental results show that our scheme performs much better than the existing RR schemes.
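
To make the idea of a disguise matrix concrete, the following is a minimal sketch in Python. It is not the paper's method or code: the Warner-style matrix, the helper names (`warner_matrix`, `disguise`, `solve`, `estimate_distribution`), and the sample distribution are all assumptions chosen for illustration. It shows only how categorical values might be randomized with a column-stochastic matrix and how the original distribution could then be estimated by inverting that matrix. The paper's privacy and utility metrics and its evolutionary search are not reproduced here.

```python
import random

def warner_matrix(k, p):
    """Hypothetical k x k disguise matrix (columns sum to 1): report the true
    value with probability p, otherwise one of the other k-1 values uniformly."""
    off = (1.0 - p) / (k - 1)
    return [[p if i == j else off for j in range(k)] for i in range(k)]

def disguise(value, matrix, rng):
    """Report category i with probability matrix[i][value]."""
    k = len(matrix)
    weights = [matrix[i][value] for i in range(k)]
    return rng.choices(range(k), weights=weights)[0]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.
    Assumes a is invertible (for warner_matrix, p must differ from 1/k)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def estimate_distribution(reports, matrix):
    """Estimate the original distribution pi from disguised reports,
    using observed = matrix * pi, hence pi = matrix^{-1} * observed."""
    k = len(matrix)
    n = len(reports)
    observed = [reports.count(i) / n for i in range(k)]
    return solve(matrix, observed)

# Illustrative data: 3 categories with an assumed true distribution.
rng = random.Random(0)
true_dist = [0.5, 0.3, 0.2]
M = warner_matrix(3, 0.7)
data = rng.choices(range(3), weights=true_dist, k=20000)
reports = [disguise(v, M, rng) for v in data]
estimate = estimate_distribution(reports, M)
```

In this sketch, moving `p` toward 1 makes the estimate more accurate but hides each individual value less, while moving it toward 1/k does the opposite. That is the privacy and utility conflict the abstract refers to, and it is why a search over disguise matrices is framed as a multi-objective problem.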

Original language: English (US)
Title of host publication: Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
Pages: 705-714
Number of pages: 10
State: Published - 2008
Event: 2008 IEEE 24th International Conference on Data Engineering, ICDE'08 - Cancun, Mexico
Duration: Apr 7 2008 to Apr 12 2008

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627

Other

Other: 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
Country/Territory: Mexico
City: Cancun
Period: 4/7/08 to 4/12/08

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
