An efficient convolutional neural network for coronary heart disease prediction

Aniruddha Dutta, Tamal Batabyal, Meheli Basu, Scott T. Acton

Research output: Contribution to journal › Article › peer-review

172 Scopus citations

Abstract

This study proposes an efficient neural network with convolutional layers to classify significantly class-imbalanced clinical data. The data are curated from the National Health and Nutrition Examination Survey (NHANES) with the goal of predicting the occurrence of Coronary Heart Disease (CHD). While most existing machine learning models that have been applied to this class of data remain vulnerable to class imbalance even after class-specific weights are adjusted, our simple two-layer CNN is resilient to the imbalance and performs comparably well on both classes. Given a highly imbalanced dataset, it is often challenging to achieve a high class 1 accuracy (the true CHD prediction rate) together with a high class 0 accuracy as the test data size increases. We adopt a two-step approach. First, we employ least absolute shrinkage and selection operator (LASSO) based feature weight assessment, followed by majority-voting based identification of important features. Next, the important features are homogenized by a fully connected layer, a crucial step before the output of that layer is passed to successive convolutional stages. We also propose a per-epoch training routine, akin to simulated annealing, to boost classification accuracy. Despite the high class imbalance in the NHANES dataset, the investigation shows that the proposed CNN architecture correctly classifies 77% of CHD cases and 81.8% of non-CHD cases on a test set comprising 85.70% of the total dataset. This result suggests that the proposed architecture can be generalized to other healthcare studies with a similar order of features and imbalances. While the recall values obtained from other machine learning methods, such as SVM and random forest, are comparable to that of the proposed CNN model, our model predicts the negative (non-CHD) cases with higher accuracy.
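The majority-voting step over LASSO feature weights can be illustrated with a minimal sketch. The abstract does not state the exact voting rule, so this assumes a feature is kept when its LASSO coefficient is nonzero in more than half of the repeated fits. The function name, the feature names, and the coefficient values are all hypothetical and serve only as an illustration.

```python
from collections import Counter


def majority_vote_features(coef_runs, feature_names, tol=1e-8):
    """Keep features whose coefficient is nonzero in more than half of the runs.

    coef_runs: list of coefficient vectors, one per repeated LASSO fit
               (assumed to be computed elsewhere).
    """
    votes = Counter()
    for coefs in coef_runs:
        for name, weight in zip(feature_names, coefs):
            if abs(weight) > tol:  # LASSO drives unimportant weights to zero
                votes[name] += 1
    threshold = len(coef_runs) / 2  # strict majority (an assumption)
    return [name for name in feature_names if votes[name] > threshold]


# Hypothetical feature names and made-up coefficients from three fits.
names = ["age", "systolic_bp", "cholesterol", "bmi"]
runs = [
    [0.42, 0.10, 0.00, 0.00],
    [0.39, 0.00, 0.05, 0.00],
    [0.45, 0.12, 0.00, 0.00],
]
selected = majority_vote_features(runs, names)
print(selected)
```

Here "age" is selected in all three fits and "systolic_bp" in two, so both pass the strict-majority threshold, while the other two features are dropped.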
Our model architecture points a way toward better investigative tools, improved medical treatment, and lower diagnostic costs through the incorporation of a smart diagnostic system into healthcare. The balanced accuracy of our model (79.5%) is also higher than the individual accuracies of the SVM and random forest classifiers. The CNN classifier achieves high specificity and test accuracy along with high recall and area under the curve (AUC).
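For readers unfamiliar with the metric, balanced accuracy for a binary classifier is the mean of the recall on the positive class (sensitivity) and the recall on the negative class (specificity). The sketch below uses the standard definition, not code from the paper, and the function names are illustrative. Applied to the rounded rates quoted in the abstract (77% and 81.8%), it gives roughly 79.4%, close to the reported 79.5%; the small gap presumably comes from rounding of the quoted rates.

```python
def rates_from_counts(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of CHD cases correctly identified
    specificity = tn / (tn + fp)  # fraction of non-CHD cases correctly identified
    return sensitivity, specificity


def balanced_accuracy(sensitivity, specificity):
    """Mean of the two per-class recalls; insensitive to class imbalance."""
    return (sensitivity + specificity) / 2.0


# Rounded per-class rates quoted in the abstract.
result = balanced_accuracy(0.77, 0.818)
print(f"{result:.3f}")
```

Because each class contributes equally regardless of its size, this metric is a more informative summary than plain accuracy when one class dominates the dataset.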

Original language: English (US)
Article number: 113408
Journal: Expert Systems with Applications
Volume: 159
State: Published - Nov 30 2020
Externally published: Yes

Keywords

  • Artificial Intelligence
  • Convolutional neural network
  • Coronary heart disease
  • LASSO regression
  • Machine learning
  • NHANES

ASJC Scopus subject areas

  • General Engineering
  • Computer Science Applications
  • Artificial Intelligence
