TY - GEN
T1 - Auditing disclosure by relevance ranking
AU - Agrawal, Rakesh
AU - Evfimievski, Alexandre
AU - Kiernan, Jerry
AU - Velu, Raja
PY - 2007
Y1 - 2007
AB - Numerous widely publicized cases of theft and misuse of private information underscore the need for audit technology to identify the sources of unauthorized disclosure. We present an auditing methodology that ranks potential disclosure sources according to their proximity to the leaked records. Given a sensitive table that contains the disclosed data, our methodology prioritizes by relevance the past queries to the database that could have potentially been used to produce the sensitive table. We provide three conceptually different measures of proximity between the sensitive table and a query result. One measure is inspired by information retrieval in text processing, another is based on statistical record linkage, and the third computes the derivation probability of the sensitive table in a tree-based generative model. We also analyze the characteristics of the three measures and the corresponding ranking algorithms.
KW - Derivation probability
KW - Hippocratic database
KW - Information retrieval
KW - Privacy
KW - Record linkage
UR - http://www.scopus.com/inward/record.url?scp=35448936516&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35448936516&partnerID=8YFLogxK
U2 - 10.1145/1247480.1247491
DO - 10.1145/1247480.1247491
M3 - Conference contribution
AN - SCOPUS:35448936516
SN - 1595936866
SN - 9781595936868
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 79
EP - 90
BT - SIGMOD 2007
T2 - SIGMOD 2007: ACM SIGMOD International Conference on Management of Data
Y2 - 12 June 2007 through 14 June 2007
ER -