FedDkw – Federated Learning with Dynamic Kullback–Leibler-divergence Weight

Bibliographic Information

Published
2023-04-28
DOI
  • 10.1145/3594779
Publisher
Association for Computing Machinery (ACM)

Description

Federated learning (FL) has emerged as a promising framework for collaborative machine learning. Data heterogeneity, i.e., non-IID data, is one of the best-known bottlenecks of FL and seriously hampers its convergence rate and model accuracy. Although many works aim to deal with this problem, none of them directly exploits the data heterogeneity information, and thus they cannot handle the problem effectively in general. In this paper, we try to answer the following fundamental question: can this data heterogeneity information be utilized to enhance system performance? To this end, we propose FedDkw – Federated Learning with Dynamic KL-divergence Weight. Specifically, in every round, while uploading its model to the server, each participating client also piggybacks the distribution of its training data. The server maintains a global distribution, which is updated after receiving the distributions of all clients in each round. The weight of each client i is then proportional to \(\frac{1}{KL(i)}\), where KL(i) is the Kullback-Leibler (KL) divergence between the client’s distribution and the global counterpart. Clients whose distributions are closer to the global one are thus assigned larger weights, which coincides with our intuition. Furthermore, since uploading the clients’ data distributions in FedDkw may introduce security risks, we further propose FedDkw++ to avoid this procedure. In particular, after each client uploads its model, the server uses its local data as the input to the model, and the obtained outcome is used as an estimate of the client’s distribution (we name this method “data distribution inference”). We have conducted extensive experiments in various scenarios. The results show that our algorithm converges significantly faster than the current state-of-the-art algorithms under heterogeneous data.
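
The weighting rule in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The following are assumptions made only for illustration: distributions are discrete label histograms, the divergence is taken in the direction KL(client || global), the global distribution is a sample-size-weighted average of the client distributions, a small `eps` guards against division by zero when a client matches the global distribution exactly, and all function names are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def global_distribution(client_dists, client_sizes):
    """Assumed update rule: sample-size-weighted average of client distributions."""
    total = sum(client_sizes)
    num_classes = len(client_dists[0])
    return [
        sum(n * d[c] for d, n in zip(client_dists, client_sizes)) / total
        for c in range(num_classes)
    ]


def kl_weights(client_dists, global_dist, eps=1e-8):
    """Weights proportional to 1 / KL(i), normalized so they sum to 1."""
    # eps is an assumption: it keeps the inverse finite when KL(i) is zero.
    inverse = [1.0 / (kl_divergence(d, global_dist) + eps) for d in client_dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def aggregate(client_params, weights):
    """Weighted average of flat parameter vectors, one vector per client."""
    dim = len(client_params[0])
    return [sum(w * p[j] for p, w in zip(client_params, weights)) for j in range(dim)]


# Usage: three clients with two classes; the first is closest to the global mix.
dists = [[0.6, 0.4], [0.9, 0.1], [0.1, 0.9]]
sizes = [100, 100, 100]
global_dist = global_distribution(dists, sizes)
weights = kl_weights(dists, global_dist)
print(weights)  # the first client receives the largest weight
print(aggregate([[0.0, 2.0], [2.0, 4.0], [4.0, 6.0]], weights))
```

Because the weight grows without bound as KL(i) approaches zero, a client whose distribution nearly matches the global one can dominate the aggregate; the abstract does not say how this case is handled, so the `eps` term above is only one possible guard.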

Journal

Details (about detailed information)

  • CRID
    1872835442519044736
  • DOI
    10.1145/3594779
  • ISSN
    2375-4702
    2375-4699
  • Data Source
    • OpenAIRE
