Custom execution environments with containers in Pegasus-enabled scientific workflows

Karan Vahi, Michael Zink, Mats Rynge, George Papadimitriou, Duncan Brown, Rajiv Mayani, Rafael Ferreira Da Silva, Ewa Deelman, Anirban Mandal, Eric Lyons

Research output: Chapter in Book/Entry/Poem › Conference contribution

7 Scopus citations

Abstract

Science reproducibility is a cornerstone feature in scientific workflows. In most cases, this has been implemented as a way to exactly reproduce the computational steps taken to reach the final results. While these steps are often completely described, including the input parameters, datasets, and codes, the environment in which these steps are executed is only described at a higher level with endpoints and operating system name and versions. Though this may be sufficient for reproducibility in the short term, systems evolve and are replaced over time, breaking the underlying workflow reproducibility. A natural solution to this problem is containers, as they are well defined, have a lifetime independent of the underlying system, and can be user-controlled so that they can provide custom environments if needed. This paper highlights some unique challenges that may arise when using containers in distributed scientific workflows. Further, this paper explores how the Pegasus Workflow Management System implements container support to address such challenges.

Original language: English (US)
Title of host publication: Proceedings - IEEE 15th International Conference on eScience, eScience 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 281-290
Number of pages: 10
ISBN (Electronic): 9781728124513
State: Published - Sep 2019
Event: 15th IEEE International Conference on eScience, eScience 2019 - San Diego, United States
Duration: Sep 24 2019 → Sep 27 2019

Publication series

Name: Proceedings - IEEE 15th International Conference on eScience, eScience 2019

Conference

Conference: 15th IEEE International Conference on eScience, eScience 2019
Country/Territory: United States
City: San Diego
Period: 9/24/19 → 9/27/19

Keywords

  • Containers
  • Distributed computing
  • Docker
  • Pegasus
  • Reproducibility
  • Scientific workflows
  • Shifter
  • Singularity

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Ecological Modeling
  • Modeling and Simulation
