TY - JOUR
T1 - vqSGD: Vector-Quantized Stochastic Gradient Descent
T2 - 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021
AU - Gandikota, Venkata
AU - Kane, Daniel
AU - Maity, Raj Kumar
AU - Mazumdar, Arya
N1 - Funding Information:
This research is supported in part by NSF awards CCF 1642658, CCF 1934846, and CCF 1909046.
Publisher Copyright:
Copyright © 2021 by the author(s)
PY - 2021
Y1 - 2021
N2 - In this work, we present a family of vector quantization schemes vqSGD (Vector-Quantized Stochastic Gradient Descent) that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization. In the process we derive the following fundamental information theoretic fact: Θ(d/R²) bits are necessary and sufficient (up to an additive O(log d) term) to describe an unbiased estimator ĝ(g) for any g in the d-dimensional unit sphere, under the constraint that ‖ĝ(g)‖₂ ≤ R almost surely. In particular, we consider a randomized scheme based on the convex hull of a point set, that returns an unbiased estimator of a d-dimensional gradient vector with almost surely bounded norm. We provide multiple efficient instances of our scheme, that are near optimal, and require only o(d) bits of communication at the expense of tolerable increase in error. The instances of our quantization scheme are obtained using the properties of binary error-correcting codes and provide a smooth tradeoff between the communication and the estimation error of quantization. Furthermore, we show that vqSGD also offers some automatic privacy guarantees.
AB - In this work, we present a family of vector quantization schemes vqSGD (Vector-Quantized Stochastic Gradient Descent) that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization. In the process we derive the following fundamental information theoretic fact: Θ(d/R²) bits are necessary and sufficient (up to an additive O(log d) term) to describe an unbiased estimator ĝ(g) for any g in the d-dimensional unit sphere, under the constraint that ‖ĝ(g)‖₂ ≤ R almost surely. In particular, we consider a randomized scheme based on the convex hull of a point set, that returns an unbiased estimator of a d-dimensional gradient vector with almost surely bounded norm. We provide multiple efficient instances of our scheme, that are near optimal, and require only o(d) bits of communication at the expense of tolerable increase in error. The instances of our quantization scheme are obtained using the properties of binary error-correcting codes and provide a smooth tradeoff between the communication and the estimation error of quantization. Furthermore, we show that vqSGD also offers some automatic privacy guarantees.
UR - http://www.scopus.com/inward/record.url?scp=85119896091&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119896091&partnerID=8YFLogxK
M3 - Conference Article
AN - SCOPUS:85119896091
SN - 2640-3498
VL - 130
SP - 2197
EP - 2205
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
Y2 - 13 April 2021 through 15 April 2021
ER -