Abstract
Large cyberinfrastructure-enabled data repositories generate massive amounts of metadata, enabling big data analytics to leverage the intersection of technological and methodological advances in data science for the quantitative study of science. This paper introduces a definition of big metadata in the context of scientific data repositories and discusses the challenges that big metadata analytics faces because such metadata is messy, heterogeneous, and lacks structures suitable for analytics. A methodological framework is proposed that contains conceptual and computational workflows intersecting through collaborative documentation. The workflow-based methodological framework promotes transparency and contributes to research reproducibility. The paper also describes the experience and lessons learned from a four-year big metadata project involving all aspects of the workflow-based methodologies. The methodological framework presented in this paper is a timely contribution to the field of scientometrics and the science of science and policy, as the potential value of big metadata is drawing more attention from the research and policymaker communities.
Original language | English (US) |
---|---|
Pages (from-to) | 36-45 |
Number of pages | 10 |
Journal | Proceedings of the Association for Information Science and Technology |
Volume | 54 |
Issue number | 1 |
State | Published - Jan 2017 |
Keywords
- big metadata analytics
- methodology
- scientometrics
- workflows
ASJC Scopus subject areas
- General Computer Science
- Library and Information Sciences