
FedVC

Virtual Clients for Federated Learning with Imbalanced Distribution

We are in an early-release beta. Expect some adventures and rough edges.

Table of Contents

  • Introduction
  • Motivation

Introduction

Federated Learning (FL) has emerged as a practical paradigm for collaborative training under privacy and communication constraints. However, FL faces a range of challenges arising from the diversity of edge devices: beyond differences in computing performance and communication efficiency, the data distributed across these devices is often imbalanced. While there have been numerous studies on these problems, many introduce additional hyperparameters and communication costs. To address these challenges, we propose FedVC, an efficient FL training framework that bridges the gap between balanced and imbalanced data distributions by constructing virtual clients. In extensive experiments, our algorithm outperforms classical FedAvg and state-of-the-art methods for non-IID data. In highly heterogeneous settings in particular, FedVC converges significantly more stably and accurately than other approaches.
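The core idea of equalizing imbalanced clients can be sketched as follows. This is a minimal illustration, not the repository's actual implementation: the function name `make_virtual_clients` and the specific policy (split oversized clients into fixed-size shards, upsample undersized clients with replacement) are assumptions made for the example.

```python
import numpy as np

def make_virtual_clients(client_data, virtual_size, rng=None):
    """Turn real clients with imbalanced data volumes into fixed-size
    virtual clients, so every training unit holds the same amount of data.

    `client_data` maps a client id to a list/array of sample indices.
    Illustrative policy (an assumption, not the paper's method): large
    clients are split into several virtual clients; small clients are
    upsampled with replacement to reach `virtual_size`.
    """
    rng = rng or np.random.default_rng(0)
    virtual = []
    for cid, samples in client_data.items():
        samples = np.asarray(samples)
        if len(samples) >= virtual_size:
            # Shuffle, then cut into as many full-size shards as fit.
            perm = rng.permutation(samples)
            for k in range(len(samples) // virtual_size):
                virtual.append((cid, perm[k * virtual_size:(k + 1) * virtual_size]))
        else:
            # Too few samples: resample with replacement up to the target size.
            virtual.append((cid, rng.choice(samples, size=virtual_size, replace=True)))
    return virtual

# Example: two imbalanced clients (10 vs. 3 samples) become three
# virtual clients of exactly 5 samples each.
clients = {"a": list(range(10)), "b": list(range(10, 13))}
vcs = make_virtual_clients(clients, virtual_size=5)
```

A server could then sample these virtual clients uniformly each round, so that per-round updates are computed on equally sized local datasets regardless of how skewed the real clients are.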

Motivation

Data in federated learning can be partitioned across clients in many ways. Assume there exists an optimal partition $\mathcal{D}^{\star}$ that achieves the best convergence results (i.e., the highest accuracy and best generalization); the goal is then to approximate the effect of $\mathcal{D}^{\star}$ in every client's training process. Training on $\mathcal{D}^{\star}$ can be regarded as static, whereas under another partition $\mathcal{D}$ we can apply dynamic training methods to approximate the effect of $\mathcal{D}^{\star}$. Further discussion is needed to fully characterize such dynamic training and its relationship to the $\mathcal{D}^{\star}$ setting.

To begin, consider training under $\mathcal{D}^{\star}$. In many open-source machine learning datasets, such as CIFAR-10 and CIFAR-100, the data is uniformly distributed, with an equal number of samples per label. When the distribution is non-uniform, however, the appropriate weight for each sample under $\mathcal{D}^{\star}$ is unknown. Fortunately, existing work such as MW-Net addresses this issue by learning the appropriate weights automatically.
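As a toy illustration of the reweighting idea: MW-Net learns per-sample weights adaptively with a meta-network, but a static inverse-class-frequency rule already shows how weighting can make an imbalanced dataset behave like a balanced one in a weighted loss. The function below is a simplified stand-in for illustration only, not MW-Net itself.

```python
import numpy as np

def inverse_frequency_weights(labels):
    """Assign each sample a weight inversely proportional to its class
    frequency, so a weighted loss treats every class as if it had the
    same total mass. (MW-Net learns such weights adaptively; this static
    rule is only a simple stand-in to illustrate reweighting.)
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    freq = dict(zip(classes, counts))
    w = np.array([1.0 / freq[y] for y in labels])
    # Normalize so the weights average to 1 over the dataset.
    return w * len(labels) / w.sum()

# Imbalanced labels: class 0 has 4 samples, class 1 has 2, class 2 has 1.
labels = [0, 0, 0, 0, 1, 1, 2]
w = inverse_frequency_weights(labels)
```

After normalization, the total weight assigned to each class is identical, so a weighted empirical loss approximates the balanced-distribution objective that $\mathcal{D}^{\star}$ would provide.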
