7.5 A 65nm 0.39-to-140.3TOPS/W 1-to-12b Unified Neural Network Processor Using Block-Circulant-Enabled Transpose-Domain Acceleration with 8.1× Higher TOPS/mm² and 6T HBST-TRAM-Based 2D Data-Reuse Architecture

Jinshan Yue, Ruoyang Liu, Wenyu Sun, Zhe Yuan, Zhibo Wang, Yung-Ning Tu, Yi-Ju Chen, Ao Ren, Yanzhi Wang, Meng-Fan Chang, Xueqing Li, Huazhong Yang, Yongpan Liu

Research output: Chapter in Book/Entry/Poem › Conference contribution

43 Scopus citations

Abstract

Energy-efficient neural-network (NN) processors have been proposed for battery-powered deep-learning applications, where convolutional (CNN), fully-connected (FC) and recurrent NNs (RNN) are the three major workloads. To support all of them, previous solutions [1-3] use either area-inefficient heterogeneous architectures, including separate CNN and RNN cores, or an energy-inefficient reconfigurable architecture. A block-circulant algorithm [4] can unify CNN/FC/RNN workloads with transpose-domain acceleration, as shown in Fig. 7.5.1. Once NN weights are trained using the block-circulant pattern, all workloads are transformed into consistent matrix-vector multiplications (MVM), which can potentially achieve 8-to-128× storage savings and an O(n²)-to-O(n log n) reduction in computation complexity.
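As a rough illustration of why the block-circulant form yields the O(n²)-to-O(n log n) reduction, the NumPy sketch below (not taken from the paper; `block_circulant_mvm`, the block layout, and the FFT-based evaluation are illustrative assumptions) multiplies a block-circulant weight matrix by a vector: each b×b circulant block is represented by its defining b-length vector, so each block-level product collapses to an element-wise multiply in the FFT domain.

```python
import numpy as np

def block_circulant_mvm(weight_blocks, x, block_size):
    """Multiply a block-circulant weight matrix by a vector via FFT.

    weight_blocks: array of shape (rows, cols, block_size); entry
        weight_blocks[i, j] is the defining (first) column of one
        b x b circulant block.
    x: input vector of length cols * block_size.
    Returns a vector of length rows * block_size.
    """
    rows, cols, b = weight_blocks.shape
    assert b == block_size and x.size == cols * block_size
    x_blocks = x.reshape(cols, block_size)

    # FFT of each circulant block's defining vector and of each input block.
    W_f = np.fft.fft(weight_blocks, axis=-1)      # (rows, cols, b)
    X_f = np.fft.fft(x_blocks, axis=-1)           # (cols, b)

    # Each circulant-by-vector product is a circular convolution, i.e.
    # an element-wise multiply in the FFT domain; accumulate over the
    # column blocks, then transform back.
    Y_f = (W_f * X_f[None, :, :]).sum(axis=1)     # (rows, b)
    y = np.fft.ifft(Y_f, axis=-1).real            # (rows, b)
    return y.reshape(rows * block_size)

# Example (hypothetical sizes): 4 row-blocks x 8 column-blocks of size 16,
# i.e. a 64x128 weight matrix stored as only 4*8*16 values.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8, 16))
x = rng.standard_normal(8 * 16)
y = block_circulant_mvm(W, x, 16)                # length 64
```

Storing only the defining vector of each block is also where the storage saving comes from: a b×b block needs b values instead of b², a factor-of-b reduction that matches the 8-to-128× figure for block sizes in that range.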

Original language: English (US)
Title of host publication: 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 138-140
Number of pages: 3
ISBN (Electronic): 9781538685310
DOIs
State: Published - Mar 6, 2019
Event: 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019 - San Francisco, United States
Duration: Feb 17, 2019 - Feb 21, 2019

Publication series

Name: Digest of Technical Papers - IEEE International Solid-State Circuits Conference
Volume: 2019-February
ISSN (Print): 0193-6530

Conference

Conference: 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019
Country/Territory: United States
City: San Francisco
Period: 2/17/19 - 2/21/19

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Electrical and Electronic Engineering
