Mixed Sampling Algorithm Based on Federated Learning
First published: 2021-04-15
Abstract: As machine learning has developed, its applications have spread to nearly every aspect of daily life, but they are often limited by a lack of data or by poor data quality. Federated learning is a machine learning approach that trains models over decentralized datasets while protecting data privacy, which effectively enlarges the pool of training data. However, differences in sample distribution and data quality across clients can degrade the final federated model, so addressing data imbalance in federated learning is critical. This paper proposes a federated mixed sampling algorithm. Taking advantage of federated model training, it introduces a Gaussian mixture model to fit a minority-class sample distribution that is closer to the real distribution, and combines this with cluster-based downsampling of the majority class to obtain class-balanced training data, improving the global model by improving the performance of each local model. Comparative experiments show that the method effectively handles data imbalance in federated learning.
Keywords: computer application technology; machine learning; federated learning
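The abstract names two sampling steps: Gaussian-mixture oversampling of the minority class and cluster-based downsampling of the majority class. The sketch below shows one way these two steps might be combined on a single client's local data. It is not the authors' implementation. The function name `mixed_sample`, the midpoint target size, the use of k-means centroids as majority-class representatives, and the choice of scikit-learn are all assumptions made for illustration. How the paper coordinates the mixture model across clients is not described in the abstract and is not modeled here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture


def mixed_sample(X, y, minority_label, n_components=2, seed=0):
    """Balance a two-class local dataset (hypothetical helper).

    Assumes the minority class has fewer rows than the majority class.
    Both classes are brought to the midpoint of the two class sizes.
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    is_min = y == minority_label
    X_min, X_maj = X[is_min], X[~is_min]
    majority_label = y[~is_min][0]
    target = (len(X_min) + len(X_maj)) // 2  # assumed balancing target

    # Oversampling: fit a Gaussian mixture to the minority class and
    # draw synthetic minority samples from the fitted distribution.
    gmm = GaussianMixture(n_components=n_components, random_state=seed)
    gmm.fit(X_min)
    n_new = target - len(X_min)
    if n_new > 0:
        X_syn, _ = gmm.sample(n_new)
    else:
        X_syn = np.empty((0, X.shape[1]))

    # Downsampling: cluster the majority class into `target` groups and
    # use each cluster centroid as one representative sample.
    km = KMeans(n_clusters=target, n_init=10, random_state=seed).fit(X_maj)
    X_rep = km.cluster_centers_

    X_bal = np.vstack([X_min, X_syn, X_rep])
    y_bal = np.concatenate([
        np.full(len(X_min) + len(X_syn), minority_label),
        np.full(len(X_rep), majority_label),
    ])
    return X_bal, y_bal
```

In a federated setting, each client would presumably apply a step like this to its own data before local training, so that only model updates, not raw samples, leave the client. Replacing centroids with the real sample nearest each centroid is an equally plausible reading of "cluster-based downsampling".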