Auditing disclosure by relevance ranking

Rakesh Agrawal, Alexandre Evfimievski, Jerry Kiernan, Raja Velu

Research output: Chapter in Book/Entry/Poem › Conference contribution

6 Scopus citations

Abstract

Numerous widely publicized cases of theft and misuse of private information underscore the need for audit technology to identify the sources of unauthorized disclosure. We present an auditing methodology that ranks potential disclosure sources according to their proximity to the leaked records. Given a sensitive table that contains the disclosed data, our methodology prioritizes by relevance the past queries to the database that could have potentially been used to produce the sensitive table. We provide three conceptually different measures of proximity between the sensitive table and a query result. One measure is inspired by information retrieval in text processing, another is based on statistical record linkage, and the third computes the derivation probability of the sensitive table in a tree-based generative model. We also analyze the characteristics of the three measures and the corresponding ranking algorithms.
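
The ranking idea in the abstract can be illustrated with a toy sketch. It is not one of the paper's three measures: the proximity score used here (the average, over leaked records, of the best attribute-wise match found in a query result) is an assumption chosen for brevity, and the names record_similarity, proximity, and rank_queries are hypothetical. It only shows the overall shape of the task, which is to score each logged query result against the sensitive table and sort by that score.

```python
from typing import Dict, List, Tuple

# A record is a tuple of attribute values; all records share one schema (assumption).
Record = Tuple[str, ...]


def record_similarity(a: Record, b: Record) -> float:
    """Fraction of attribute positions on which two records agree."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def proximity(sensitive: List[Record], result: List[Record]) -> float:
    """Toy proximity: mean over leaked records of the best match in a query result."""
    if not sensitive or not result:
        return 0.0
    best_matches = (max(record_similarity(s, r) for r in result) for s in sensitive)
    return sum(best_matches) / len(sensitive)


def rank_queries(
    sensitive: List[Record], logged_results: Dict[str, List[Record]]
) -> List[Tuple[str, float]]:
    """Return (query id, score) pairs, most relevant query first."""
    scored = [(qid, proximity(sensitive, rows)) for qid, rows in logged_results.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: a leaked table and the logged results of three past queries.
sensitive_table = [("alice", "1980", "NY"), ("bob", "1975", "CA")]
query_log = {
    "q1": [("alice", "1980", "NY"), ("bob", "1975", "CA")],
    "q2": [("alice", "1980", "TX")],
    "q3": [("carol", "1990", "WA")],
}
ranking = rank_queries(sensitive_table, query_log)
```

In this made-up log, q1 reproduces the leaked table exactly and ranks first, q2 matches one leaked record on two of three attributes, and q3 shares nothing with the leak. A measure closer to the abstract's description would also weight rare attribute values more heavily or model how the leaked table could have been derived from a query result.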

Original language: English (US)
Title of host publication: SIGMOD 2007
Subtitle of host publication: Proceedings of the ACM SIGMOD International Conference on Management of Data
Pages: 79-90
Number of pages: 12
State: Published - 2007
Event: SIGMOD 2007: ACM SIGMOD International Conference on Management of Data - Beijing, China
Duration: Jun 12 2007 to Jun 14 2007

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Other

Other: SIGMOD 2007: ACM SIGMOD International Conference on Management of Data
Country/Territory: China
City: Beijing
Period: 6/12/07 to 6/14/07

Keywords

  • Derivation probability
  • Hippocratic database
  • Information retrieval
  • Privacy
  • Record linkage

ASJC Scopus subject areas

  • Software
  • Information Systems
