TY - GEN
T1 - DeepAdapter
T2 - 38th IEEE Conference on Computer Communications, INFOCOM 2020
AU - Huang, Yakun
AU - Qiao, Xiuquan
AU - Tang, Jian
AU - Ren, Pei
AU - Liu, Ling
AU - Pu, Calton
AU - Chen, Junliang
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
N2 - Deep learning shows great promise in providing more intelligence to the mobile web, but insufficient infrastructure, heavy models, and intensive computation limit the use of deep learning in mobile web applications. In this paper, we present DeepAdapter, a collaborative framework that ties the mobile web with an edge server and a remote cloud server to allow executing deep learning on the mobile web with lower processing latency, lower mobile energy, and higher system throughput. DeepAdapter provides a context-aware pruning algorithm that incorporates the latency, the network condition, and the computing capability of the mobile device to better fit the resource constraints of the mobile web. It also provides a model cache update mechanism that improves the model request hit rate for mobile web users. At runtime, it matches an appropriate model with the mobile web user and provides a collaborative mechanism to ensure accuracy. Our results show that DeepAdapter decreases average latency by 1.33x, reduces average mobile energy consumption by 1.4x, and improves system throughput by 2.1x with considerable accuracy. Its context-aware pruning algorithm also improves inference accuracy by up to 0.3% with a smaller and faster model.
AB - Deep learning shows great promise in providing more intelligence to the mobile web, but insufficient infrastructure, heavy models, and intensive computation limit the use of deep learning in mobile web applications. In this paper, we present DeepAdapter, a collaborative framework that ties the mobile web with an edge server and a remote cloud server to allow executing deep learning on the mobile web with lower processing latency, lower mobile energy, and higher system throughput. DeepAdapter provides a context-aware pruning algorithm that incorporates the latency, the network condition, and the computing capability of the mobile device to better fit the resource constraints of the mobile web. It also provides a model cache update mechanism that improves the model request hit rate for mobile web users. At runtime, it matches an appropriate model with the mobile web user and provides a collaborative mechanism to ensure accuracy. Our results show that DeepAdapter decreases average latency by 1.33x, reduces average mobile energy consumption by 1.4x, and improves system throughput by 2.1x with considerable accuracy. Its context-aware pruning algorithm also improves inference accuracy by up to 0.3% with a smaller and faster model.
UR - http://www.scopus.com/inward/record.url?scp=85090284646&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090284646&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM41043.2020.9155379
DO - 10.1109/INFOCOM41043.2020.9155379
M3 - Conference contribution
AN - SCOPUS:85090284646
T3 - Proceedings - IEEE INFOCOM
SP - 834
EP - 843
BT - INFOCOM 2020 - IEEE Conference on Computer Communications
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 6 July 2020 through 9 July 2020
ER -