Normalization and dropout for stochastic computing-based deep convolutional neural networks

Ji Li, Zihao Yuan, Zhe Li, Ao Ren, Caiwen Ding, Jeffrey Draper, Shahin Nazarian, Qinru Qiu, Bo Yuan, Yanzhi Wang

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Recently, Deep Convolutional Neural Networks (DCNNs) have been recognized as the most effective models for pattern recognition and classification tasks. With the fast-growing Internet of Things (IoT) and wearable devices, it becomes attractive to implement DCNNs in embedded and portable systems. However, DCNNs have high power consumption and complex topologies, so novel computing paradigms are urgently required to deploy them in systems with limited area and power supply. Recent works have demonstrated that Stochastic Computing (SC) can radically simplify the hardware implementation of arithmetic units and has the potential to bring the success of DCNNs to embedded systems. This paper introduces normalization and dropout, two techniques essential to state-of-the-art DCNNs, into existing SC-based DCNN frameworks. In this work, the feature extraction block of DCNNs is implemented using an approximate parallel counter, a near-max pooling block, and an SC-based rectified linear activation unit. A novel SC-based normalization design is proposed, consisting of a square-and-summation unit, an activation unit, and a division unit. The dropout technique is integrated into the training phase, and the learned weights are adjusted accordingly during the hardware implementation. Experimental results on AlexNet with the ImageNet dataset show that the SC-based DCNN with the proposed normalization and dropout techniques achieves a 3.26% top-1 and a 3.05% top-5 accuracy improvement over the SC-based DCNN without these two essential techniques, confirming the effectiveness of our normalization and dropout designs.
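For readers who want a concrete handle on the abstract, the Python sketch below illustrates three of the underlying ideas in software, under stated assumptions rather than as the paper's actual hardware design: bipolar stochastic encoding, in which a value x in [-1, 1] becomes a bitstream whose bits are 1 with probability (x + 1)/2, so that multiplication reduces to an XNOR gate and summation to a parallel counter; AlexNet-style local response normalization, a plausible match for the square-and-summation, activation, and division units named above, using AlexNet's published constants rather than values from this paper; and the standard dropout inference rule of scaling trained weights by the keep probability, one common reading of "the learned weights are adjusted during the hardware implementation." All function names and constants are illustrative assumptions, not taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def to_bipolar_stream(x, length=4096):
        # Bipolar SC encoding: x in [-1, 1] maps to a bitstream whose bits
        # are 1 with probability (x + 1) / 2.
        return rng.random(length) < (x + 1) / 2

    def from_bipolar_stream(bits):
        # Decode a bipolar stream: x = 2 * P(bit = 1) - 1.
        return 2.0 * bits.mean() - 1.0

    def sc_multiply(x, y, length=4096):
        # In bipolar SC, a single XNOR gate multiplies two independent streams.
        sx, sy = to_bipolar_stream(x, length), to_bipolar_stream(y, length)
        return from_bipolar_stream(~(sx ^ sy))

    def sc_inner_product(xs, ws, length=4096):
        # XNOR each operand pair into n parallel product bitstreams, then count
        # ones per cycle with a parallel counter. The count here is exact; the
        # paper's approximate parallel counter (APC) trades accuracy for area.
        prods = np.stack([~(to_bipolar_stream(x, length) ^ to_bipolar_stream(w, length))
                          for x, w in zip(xs, ws)])
        return 2.0 * prods.sum() / length - len(xs)

    def lrn(a, k=2.0, alpha=1e-4, beta=0.75, n=5):
        # AlexNet-style local response normalization across channels:
        #   b_i = a_i / (k + alpha * sum_{j near i} a_j ** 2) ** beta
        # The square-and-sum, activation, and division stages correspond to the
        # three SC normalization units in the abstract; the constants are
        # AlexNet's defaults, not values taken from this paper.
        b = np.empty_like(a)
        for i in range(a.shape[0]):
            lo, hi = max(0, i - n // 2), min(a.shape[0], i + n // 2 + 1)
            b[i] = a[i] / (k + alpha * np.sum(a[lo:hi] ** 2, axis=0)) ** beta
        return b

    def dropout_adjusted_weights(w, keep_prob=0.5):
        # Standard dropout inference rule: scale trained weights by the keep
        # probability so expected pre-activations match the training phase.
        return w * keep_prob

    xs, ws = [0.5, -0.6, 0.2], [0.3, 0.8, -0.7]
    print(sc_inner_product(xs, ws))    # approx. 0.15 - 0.48 - 0.14 = -0.47
    feature_maps = rng.standard_normal((8, 4, 4))
    print(lrn(feature_maps).shape)     # (8, 4, 4)

Note that every SC result is noisy at finite stream length; lengthening the stream trades latency for accuracy, which is the basic tension that SC-based DCNN designs must manage.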

Original language: English (US)
Journal: Integration, the VLSI Journal
ISSN: 0167-9260
Publisher: Elsevier
DOI: 10.1016/j.vlsi.2017.11.002
State: Accepted/In press - Jan 1 2017

Fingerprint

  • Neural networks
  • Pattern recognition
  • Chemical activation
  • Hardware
  • Embedded systems
  • Feature extraction
  • Electric power utilization
  • Topology
  • Internet of things

Keywords

  • Deep convolutional neural networks
  • Deep learning
  • Dropout
  • Normalization

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

Li, J., Yuan, Z., Li, Z., Ren, A., Ding, C., Draper, J., Nazarian, S., Qiu, Q., Yuan, B., & Wang, Y. (2017). Normalization and dropout for stochastic computing-based deep convolutional neural networks. Integration, the VLSI Journal. https://doi.org/10.1016/j.vlsi.2017.11.002