TY - JOUR
T1 - MIDST
T2 - An enhanced development environment that improves the maintainability of a data science analysis
AU - Saltz, Jeffrey S.
AU - Heckman, Robert
AU - Crowston, Kevin G.
AU - Hegde, Yatish
N1 - Publisher Copyright: © 2020, SciKA. Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020
Y1 - 2020
N2 - With the increasing ability to generate actionable insight from data, the field of data science has seen significant growth. As more teams develop data science solutions, the analytical code they develop will need to be enhanced in the future, by an existing or a new team member. Thus, the importance of being able to easily maintain and enhance the code required for an analysis will increase. However, to date, there has been minimal research on the maintainability of an analysis done by a data science team. To help address this gap, data science maintainability was explored by (1) creating a data science maintainability model, (2) creating a new tool, called MIDST (Modular Interactive Data Science Tool), that aims to improve data science maintainability, and then (3) conducting a mixed method experiment to evaluate MIDST. The new tool aims to improve the ability of a team member to update and rerun an existing data science analysis by providing a visual data flow view of the analysis within an integrated code and computational environment. Via an analysis of the quantitative and qualitative survey results, the experiment found that MIDST does help improve the maintainability of an analysis. Thus, this research demonstrates the importance of enhanced tools to help improve the maintainability of data science projects.
KW - Data science
KW - Data science development environment
KW - Maintainability
KW - Project management
KW - Visual programming
UR - http://www.scopus.com/inward/record.url?scp=85093967677&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85093967677&partnerID=8YFLogxK
U2 - 10.12821/ijispm080301
DO - 10.12821/ijispm080301
M3 - Article
AN - SCOPUS:85093967677
VL - 8
SP - 5
EP - 22
JO - International Journal of Information Systems and Project Management
JF - International Journal of Information Systems and Project Management
SN - 2182-7796
IS - 3
ER -