Improving Data Science Projects by Enriching Analytical Models with Domain Knowledge

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Domain knowledge is very important to support the development of analytic models. However, in today's data science projects, domain knowledge is typically documented, but not captured and integrated with the actual analytic model. This raises problems in interoperability and traceability of the relevant domain knowledge that is used to develop an analytic model. To address this challenge, this paper proposes a Knowledge Enriched Analytic Model (KEAM) to enrich analytic models with domain knowledge. To explore the proposed methodology and its benefits, a case study explores the utilization of KEAM to support the development of a Bayesian Network model within the smart manufacturing domain. The case study shows that the efficiency in developing an analytic model is improved by using the proposed KEAM.

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
Editors: Yang Song, Bing Liu, Kisung Lee, Naoki Abe, Calton Pu, Mu Qiao, Nesreen Ahmed, Donald Kossmann, Jeffrey Saltz, Jiliang Tang, Jingrui He, Huan Liu, Xiaohua Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2828-2837
Number of pages: 10
ISBN (Electronic): 9781538650356
DOI: https://doi.org/10.1109/BigData.2018.8622364
State: Published - Jan 22 2019
Event: 2018 IEEE International Conference on Big Data, Big Data 2018 - Seattle, United States
Duration: Dec 10 2018 to Dec 13 2018

Publication series

Name: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018

Conference

Conference: 2018 IEEE International Conference on Big Data, Big Data 2018
Country: United States
City: Seattle
Period: 12/10/18 to 12/13/18

Fingerprint

Analytical models
Bayesian networks
Interoperability

Keywords

  • analytic model
  • interoperability
  • knowledge
  • smart manufacturing
  • traceability

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems

Cite this

Zhang, H., Roy, U., & Saltz, J. (2019). Improving Data Science Projects by Enriching Analytical Models with Domain Knowledge. In Y. Song, B. Liu, K. Lee, N. Abe, C. Pu, M. Qiao, N. Ahmed, D. Kossmann, J. Saltz, J. Tang, J. He, H. Liu, ... X. Hu (Eds.), Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018 (pp. 2828-2837). [8622364] (Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2018.8622364

