Metadata in Trustworthy AI: From Data Quality to ML Modeling

Research output: Contribution to journal › Conference Article › peer-review

Abstract

Metadata play a significant role in making AI models trustworthy by providing information on input, output, models, pipelines, and other artifacts to meet the requirements for trustworthy AI. This concept paper focuses on the role metadata play in an AI lifecycle and how metadata research can ride out this AI wave with innovative creations. Specifically, we explore metadata's role and potential related to data quality and ML models. The multidimensionality of metadata for data in AI is driving metadata to be micro-specific, embedded in data and models, highly computational, and fast-moving or agile. While there are no universally agreed-upon metadata schemas for documenting the artifacts in ML model development, there are some common areas or types of metadata for ML models. Data quality and ML models are tightly connected and can impact one another in significant ways. Trustworthy AI must rely on quality data and responsible, ethical, reproducible, verifiable ML models, and the assurance of these data and ML model properties relies on metadata. The complex, fast-paced, and highly computational nature of metadata for AI artifacts (datasets, models, pipelines, algorithms, lineages, etc.) is making conventional metadata development processes and methods outdated, while also prompting some innovative metadata creations.

Keywords

  • data quality
  • metadata for AI
  • ML model metadata
  • trustworthy AI

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Software
