Privacy-preserving GWAS analysis on federated genomic datasets

Scott D. Constable, Yuzhe Tang, Shuang Wang, Xiaoqian Jiang, Steve Chapin

Research output: Contribution to journal › Article › peer-review

48 Scopus citations

Abstract

Background: The biomedical community benefits from the increasing availability of genomic data to support meaningful scientific research, e.g., Genome-Wide Association Studies (GWAS). However, a high-quality GWAS usually requires a large number of samples, which can exceed what a single institution is able to provide. Federated genomic data analysis holds the promise of enabling cross-institution collaboration for effective GWAS, but because data are exchanged across institutional boundaries, it raises concerns about patient privacy and the confidentiality of medical information, which inhibit its practical use.

Methods: We present a privacy-preserving GWAS framework on federated genomic datasets. Our method layers the GWAS computations on top of secure multi-party computation (MPC) systems. This approach allows two parties in a distributed system to jointly perform secure GWAS computations without exposing their private data.

Results: We demonstrate our technique by implementing a framework for minor allele frequency counting and χ² statistic calculation, which are among the typical computations used in GWAS. For efficient prototyping, we use a state-of-the-art MPC framework, i.e., Portable Circuit Format (PCF) [1]. Our experimental results show promise in realizing both efficient and secure cross-institution GWAS computations.
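To make the two computations named in the Results concrete, the following is a minimal plaintext sketch of what the two parties would jointly evaluate. It is not the authors' implementation and offers no privacy: the counts are pooled in the clear purely as a reference for the arithmetic that an MPC circuit would carry out on hidden inputs. The `AlleleCounts` layout, the function names, and the choice of the standard 2×2 allelic Pearson χ² test are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleCounts:
    """Allele counts at one site held by one institution (hypothetical layout)."""
    case_a: int      # copies of allele A among cases
    case_b: int      # copies of allele B among cases
    control_a: int   # copies of allele A among controls
    control_b: int   # copies of allele B among controls


def pool(x: AlleleCounts, y: AlleleCounts) -> AlleleCounts:
    """Add the two parties' counts. In an MPC setting this sum would be
    computed inside the secure protocol rather than in the clear."""
    return AlleleCounts(
        x.case_a + y.case_a,
        x.case_b + y.case_b,
        x.control_a + y.control_a,
        x.control_b + y.control_b,
    )


def minor_allele_frequency(c: AlleleCounts) -> float:
    """Frequency of the less common allele across cases and controls."""
    a = c.case_a + c.control_a
    b = c.case_b + c.control_b
    total = a + b
    if total == 0:
        raise ValueError("no alleles observed at this site")
    return min(a, b) / total


def chi_square(c: AlleleCounts) -> float:
    """Pearson chi-square for the 2x2 table (case/control by allele A/B),
    assuming the standard allelic test: n(ad - bc)^2 / (row and column totals)."""
    a, b, cc, d = c.case_a, c.case_b, c.control_a, c.control_b
    n = a + b + cc + d
    denom = (a + b) * (cc + d) * (a + cc) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: a row or column is empty
    return n * (a * d - b * cc) ** 2 / denom


if __name__ == "__main__":
    # Made-up counts for two institutions, for illustration only.
    site = pool(AlleleCounts(30, 10, 20, 20), AlleleCounts(10, 10, 10, 10))
    print(minor_allele_frequency(site), chi_square(site))
```

Both functions use only integer additions and multiplications followed by a single division, which is the kind of arithmetic that can be expressed as a circuit for a two-party secure computation.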

Original language: English (US)
Article number: S2
Journal: BMC Medical Informatics and Decision Making
Volume: 15
Issue number: 5
State: Published - Dec 21 2015

Keywords

  • GWAS
  • Genomic data privacy protection
  • Secure multi-party computation
  • Statistical analysis

ASJC Scopus subject areas

  • Health Policy
  • Health Informatics
  • Computer Science Applications
