VqSGD: Vector Quantized Stochastic Gradient Descent

Venkata Gandikota, Daniel Kane, Raj Kumar Maity, Arya Mazumdar

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

In this work, we present a family of vector quantization schemes vqSGD (Vector-Quantized Stochastic Gradient Descent) that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization. In the process we derive the following fundamental information theoretic fact: $\Theta\left(\frac{d}{R^2}\right)$ bits are necessary and sufficient (up to an additive $O(\log d)$ term) to describe an unbiased estimator $\hat{\boldsymbol{g}}(\boldsymbol{g})$ for any $\boldsymbol{g}$ in the $d$-dimensional unit sphere, under the constraint that $\|\hat{\boldsymbol{g}}(\boldsymbol{g})\|_2 \le R$ almost surely, $R > 1$. In particular, we consider a randomized scheme based on the convex hull of a point set, that returns an unbiased estimator of a $d$-dimensional gradient vector with almost surely bounded norm. We provide multiple efficient instances of our scheme, that are near optimal, and require $o(d)$ bits of communication at the expense of tolerable increase in error. The instances of our quantization scheme are obtained using well-known families of binary error-correcting codes and provide a smooth tradeoff between the communication and the estimation error of quantization. Furthermore, we show that vqSGD also offers automatic privacy guarantees.
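The convex-hull idea in the abstract can be illustrated with a small sketch. The point set used here, the 2d vertices ±√d·e_i of a scaled cross-polytope, is an assumption chosen for illustration; the abstract does not name a specific point set, and the function names `quantize` and `dequantize` are hypothetical. Any g with ‖g‖₂ ≤ 1 lies in the convex hull of these vertices, so g can be written as a convex combination of them, and sampling one vertex with the combination's weights gives an unbiased estimate whose norm is always √d.

```python
import math
import random


def quantize(g, rng=random):
    """Sample one vertex of the scaled cross-polytope {+-sqrt(d) * e_i}.

    Illustrative assumption: ||g||_2 <= 1. Returns (index, sign), which
    identifies one of 2d vertices.
    """
    d = len(g)
    scale = math.sqrt(d)
    l1 = sum(abs(x) for x in g)
    # Cauchy-Schwarz gives ||g||_1 <= sqrt(d) * ||g||_2 <= sqrt(d), so the
    # leftover probability mass is non-negative (clamped for rounding error).
    slack = max(0.0, 1.0 - l1 / scale)
    weights = []
    for sign in (1.0, -1.0):
        for x in g:
            # The leftover mass is spread evenly over all 2d vertices; the
            # vertices sum to zero, so this does not bias the estimate.
            w = slack / (2 * d)
            if x * sign > 0:
                w += abs(x) / scale
            weights.append(w)
    k = rng.choices(range(2 * d), weights=weights, k=1)[0]
    return k % d, (1 if k < d else -1)


def dequantize(index, sign, d):
    """Rebuild the sampled vertex sign * sqrt(d) * e_index as a list."""
    out = [0.0] * d
    out[index] = sign * math.sqrt(d)
    return out
```

In this sketch each message names one of 2d vertices, so it fits in ⌈log₂(2d)⌉ bits, and the reconstructed vector has norm exactly √d. Averaging the reconstructions from many workers (or many draws) recovers g in expectation.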

Original language: English (US)
Pages (from-to): 4573-4587
Number of pages: 15
Journal: IEEE Transactions on Information Theory
Volume: 68
Issue number: 7
State: Published - Jul 1 2022
Externally published: Yes

Keywords

  • Vector quantization
  • communication efficiency
  • mean estimation
  • stochastic gradient descent (SGD)

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
