On the approximability of geometric and geographic generalization and the min-max bin covering problem

Wenliang Du, David Eppstein, Michael T. Goodrich, George S. Lueker

Research output: Chapter in Book/Entry/PoemConference contribution

4 Scopus citations

Abstract

We study the problem of abstracting a table of data about individuals so that no selection query can identify fewer than k individuals. We show that it is impossible to achieve arbitrarily good polynomial-time approximations for a number of natural variations of the generalization technique, unless P = NP, even when the table has only a single quasi-identifying attribute that represents a geographic or unordered attribute: - Zip-codes: nodes of a planar graph generalized into connected subgraphs - GPS coordinates: points in R 2 generalized into non-overlapping rectangles - Unordered data: text labels that can be grouped arbitrarily. These hard single-attribute instances of generalization problems contrast with the previously known NP-hard instances, which require the number of attributes to be proportional to the number of individual records (the rows of the table). In addition to impossibility results, we provide approximation algorithms for these difficult single-attribute generalization problems, which, of course, apply to multiple-attribute instances with one that is quasi-identifying. Incidentally, the generalization problem for unordered data can be viewed as a novel type of bin packing problem-min-max bin covering-which may be of independent interest.

Original languageEnglish (US)
Title of host publicationAlgorithms and Data Structures - 11th International Symposium, WADS 2009, Proceedings
Pages242-253
Number of pages12
DOIs
StatePublished - 2009
Event11th International Symposium on Algorithms and Data Structures, WADS 2009 - Banff, AB, Canada
Duration: Aug 21 2009Aug 23 2009

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume5664 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other11th International Symposium on Algorithms and Data Structures, WADS 2009
Country/TerritoryCanada
CityBanff, AB
Period8/21/098/23/09

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'On the approximability of geometric and geographic generalization and the min-max bin covering problem'. Together they form a unique fingerprint.

Cite this