Federated learning is a framework that allows worker nodes to train a model collaboratively without sharing their local data. Because the participating nodes are decentralized, system heterogeneity and data heterogeneity severely degrade both the accuracy and the convergence speed of the models that a federated learning system produces. To cope with severe heterogeneity, personalized federated learning has been proposed. In contrast to the traditional paradigm of aggregating a single global model, personalized federated learning allows each node in the system to hold the model best suited to its local data characteristics and system resources. However, existing work has not recognized the gains of letting nodes train models with different structures, nor does it offer an effective method that solves model structure allocation and client knowledge sharing at the same time. Designing a personalized federated learning system that supports heterogeneous model training, tailoring each model architecture to client characteristics while maximizing the client's local inference benefit, has therefore become an urgent problem.

The main work and contributions of this thesis are as follows.

First, this thesis proposes model-heterogeneous federated learning based on neural architecture search. To match each model architecture to the client's local data distribution while still sharing knowledge across heterogeneous models, the thesis uses differentiable neural architecture search to assist the training of heterogeneous models. Within a comparable convergence time, this method overcomes the impact of data heterogeneity, achieves higher model inference accuracy, and enables more efficient knowledge sharing.

Second, this thesis proposes personalized federated learning based on clustering by model structure similarity. Under the assumption that the data distributions of nodes in a federated learning system form similar clusters, the thesis first measures the relationship between data distribution similarity and the similarity of model parameters and model architectures, and finds that the coupling of model parameters with architecture parameters correlates most strongly with the data distribution and can therefore guide the clustering of client nodes. Building on this conclusion, the thesis designs DC-FedNAS, a personalized federated learning algorithm based on model structure similarity clustering, which achieves higher model inference accuracy by mining the similarity between clients.

Third, this thesis proposes FedMH, a personalized federated learning algorithm with customized model aggregation weights. FedMH lets the server node judge how closely clients are related based on their inference results and accordingly generate different aggregation weights for different clients, avoiding the homogenization of client models that uniform averaging weights would cause. Evaluation on multiple image classification datasets shows that FedMH captures the correlation between pairs of clients more accurately and achieves higher model inference accuracy.
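The abstract does not detail how differentiable architecture search relaxes the choice of model structure. As a rough illustration only (a DARTS-style mixed operation with toy one-dimensional candidate operations and NumPy standing in for a deep learning framework; every name here is hypothetical, not from the thesis), each client can replace the discrete choice among candidate operations with a softmax over learnable architecture parameters, so both model weights and architecture parameters become trainable and shareable:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy candidate operations on one edge of a search cell.
def identity(x): return x
def double(x):   return 2.0 * x
def negate(x):   return -x

CANDIDATE_OPS = [identity, double, negate]

def mixed_op(x, alpha):
    """DARTS-style mixed operation: a softmax-weighted sum of all
    candidate operations, so the architecture choice is differentiable
    with respect to the architecture parameters alpha."""
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS))

alpha = np.zeros(3)            # uniform architecture weights at the start
x = np.array([1.0, -2.0])
y = mixed_op(x, alpha)         # uniform mix: (x + 2x - x) / 3 = (2/3) * x
```

After local training, the learned `alpha` concentrates on the operations that fit the client's data, which is how an architecture adapted to the local distribution emerges.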
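The second and third contributions share one idea: derive client-specific aggregation weights from a similarity measure between clients. A minimal sketch of that idea, using the cosine similarity of each client's coupled (model parameters ++ architecture parameters) vector as the signal, might look as follows. This is an assumption-laden simplification, not the thesis's algorithm: it assumes all clients' parameter vectors have the same length, and names such as `personalized_aggregate` are hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two flat parameter vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def personalized_aggregate(weight_vecs, arch_vecs, temperature=0.1):
    """For every client, aggregate all clients' model weights with
    client-specific coefficients derived from the similarity of the
    coupled (weights ++ architecture) vectors."""
    coupled = [np.concatenate([w, a]) for w, a in zip(weight_vecs, arch_vecs)]
    n = len(coupled)
    sim = np.array([[cosine(coupled[i], coupled[j]) for j in range(n)]
                    for i in range(n)])
    # A row-wise softmax turns similarities into per-client aggregation
    # weights, so similar clients contribute more to each other's model
    # instead of every client receiving the same uniform average.
    e = np.exp(sim / temperature)
    coeff = e / e.sum(axis=1, keepdims=True)
    personalized = coeff @ np.stack(weight_vecs)
    return personalized, coeff

# Three toy clients: 0 and 1 have similar parameters, 2 differs.
weights = [np.array([1.0, 1.0, 0.0, 0.0]),
           np.array([1.0, 0.9, 0.0, 0.1]),
           np.array([0.0, 0.0, 1.0, 1.0])]
archs = [np.array([0.8, 0.2]),
         np.array([0.7, 0.3]),
         np.array([0.1, 0.9])]
personalized, coeff = personalized_aggregate(weights, archs)
```

In this sketch, client 0's aggregation row gives far more weight to the similar client 1 than to the dissimilar client 2, which is the effect both DC-FedNAS's clustering and FedMH's customized weights aim for; FedMH itself measures relatedness from inference results rather than from parameters.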