TY - GEN
T1 - DeepAdapter
T2 - 38th IEEE Conference on Computer Communications, INFOCOM 2020
AU - Huang, Yakun
AU - Qiao, Xiuquan
AU - Tang, Jian
AU - Ren, Pei
AU - Liu, Ling
AU - Pu, Calton
AU - Chen, Junliang
N1 - Funding Information:
Yakun Huang, Xiuquan Qiao, Pei Ren and Junliang Chen are with the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, 100876, China. Email: {ykhuang, qiaoxq, renpei, cjl}@bupt.edu.cn. Xiuquan Qiao is the corresponding author. Jian Tang is with the Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY 13244, USA. Email: jtang02@syr.edu. Ling Liu and Calton Pu are with the School of Computer Science, Georgia Institute of Technology, Atlanta, GA 30332, USA. Email: {ling.liu, calton.pu}@cc.gatech.edu. This research was supported in part by the National Key R&D Program of China under Grant 2018YFE0205503, in part by the National Natural Science Foundation of China (NSFC) under Grant 61671081, in part by the Funds for International Cooperation and Exchange of NSFC under Grant 61720106007, in part by the 111 Project under Grant B18008, in part by the Fundamental Research Funds for the Central Universities under Grant 2018XKJC01, and in part by the BUPT Excellent Ph.D. Students Foundation under Grant CX2019135.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
N2 - Deep learning shows great promise in providing more intelligence to the mobile web, but insufficient infrastructure, heavy models, and intensive computation limit the use of deep learning in mobile web applications. In this paper, we present DeepAdapter, a collaborative framework that ties the mobile web with an edge server and a remote cloud server to allow executing deep learning on the mobile web with lower processing latency, lower mobile energy, and higher system throughput. DeepAdapter provides a context-aware pruning algorithm that incorporates the latency, the network condition, and the computing capability of the mobile device to better fit the resource constraints of the mobile web. It also provides a model cache update mechanism that improves the model request hit rate for mobile web users. At runtime, it matches an appropriate model with the mobile web user and provides a collaborative mechanism to ensure accuracy. Our results show that DeepAdapter decreases average latency by 1.33x, reduces average mobile energy consumption by 1.4x, and improves system throughput by 2.1x with considerable accuracy. Its context-aware pruning algorithm also improves inference accuracy by up to 0.3% with a smaller and faster model.
AB - Deep learning shows great promise in providing more intelligence to the mobile web, but insufficient infrastructure, heavy models, and intensive computation limit the use of deep learning in mobile web applications. In this paper, we present DeepAdapter, a collaborative framework that ties the mobile web with an edge server and a remote cloud server to allow executing deep learning on the mobile web with lower processing latency, lower mobile energy, and higher system throughput. DeepAdapter provides a context-aware pruning algorithm that incorporates the latency, the network condition, and the computing capability of the mobile device to better fit the resource constraints of the mobile web. It also provides a model cache update mechanism that improves the model request hit rate for mobile web users. At runtime, it matches an appropriate model with the mobile web user and provides a collaborative mechanism to ensure accuracy. Our results show that DeepAdapter decreases average latency by 1.33x, reduces average mobile energy consumption by 1.4x, and improves system throughput by 2.1x with considerable accuracy. Its context-aware pruning algorithm also improves inference accuracy by up to 0.3% with a smaller and faster model.
UR - http://www.scopus.com/inward/record.url?scp=85090284646&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090284646&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM41043.2020.9155379
DO - 10.1109/INFOCOM41043.2020.9155379
M3 - Conference contribution
AN - SCOPUS:85090284646
T3 - Proceedings - IEEE INFOCOM
SP - 834
EP - 843
BT - INFOCOM 2020 - IEEE Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 6 July 2020 through 9 July 2020
ER -