This paper explores the skills needed to be a data scientist. Specifically, we report on a mixed-methods study of a project-based data science class, in which we evaluated how effectively students divided a project into appropriately sized modular tasks, a skill we term task modularity. Our results suggest that while data science students appreciate the value of task modularity, they struggle to achieve it effectively. As a first step, based on our study, we identified six task decomposition best practices. However, these best practices do not fully address the gap of how to enable data science students to use task modularity effectively. We note that while computer science/information systems programs typically teach modularity (e.g., the decomposition process and abstraction), there remains a need to identify a corresponding model for teaching modularity to data science students.
|Original language||English (US)|
|Title of host publication||Proceedings of the 52nd Hawaii International Conference on System Sciences|
|State||Published - Jan 2019|