An Efficient Algorithm for Clustered Federated Learning using Nature-Inspired Optimization Technique
Abstract
The emergence of Federated Learning (FL) and its enhanced version, Clustered Federated
Learning (CFL), marks an important milestone in developing privacy-preserving,
decentralized, and adaptive machine learning frameworks for real-world applications.
Although they hold significant potential, real-world implementations of FL
and CFL encounter challenges, chiefly statistical heterogeneity, resource
variability, and static hyperparameter configurations that impede model convergence and
overall model performance. In CFL, while client clustering improves personalization and
model relevance, performance remains limited by the absence of
dynamic optimization algorithms designed for varying client conditions. These issues become
more pronounced in key areas such as cross-institutional healthcare, IoT-based monitoring,
consumer IoT devices, and autonomous systems, where maintaining data privacy, preserving
adaptability, and minimizing communication overhead are essential. Adapting
learning processes to diverse environments while maintaining privacy is a major challenge
for widespread implementation. This highlights the need for adaptive,
optimization-based CFL methods that can operate effectively in resource-limited and
non-IID environments.
This thesis addresses these limitations through the development of nature-inspired,
optimization-based CFL approaches, particularly ReGen-CFL, GWO-CFL, and WoCFL, that
incorporate evolutionary and swarm-based methodologies for dynamically tuning the hyperparameters,
such as learning rates, at the cluster level. The proposed approaches use
density-based clustering techniques, such as DBSCAN and OPTICS, to group clients according
to similarities in their training behaviors and data distributions. After the clusters
are formed, the hyperparameters of each cluster are optimized using nature-inspired optimization
techniques so that the device models are trained and fine-tuned more effectively.
Through the iterative optimization of hyperparameters within every cluster,
the proposed methods attain higher accuracy and faster
convergence, and require fewer communication rounds to reach a target
accuracy in non-IID settings, compared to traditional FL and CFL methods. Experimental
validation on CIFAR-10, MNIST, and healthcare datasets such as HAM-10000 and
COVID-19 indicates that the proposed methods are robust, scalable, and adaptable, consistently
attaining improved performance across different configurations of local epochs and
batch sizes.
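
The per-cluster tuning step described above can be illustrated with a small sketch. It is not the thesis's GWO-CFL implementation: it assumes the clients have already been grouped (for example by DBSCAN or OPTICS), uses a basic one-dimensional Grey Wolf Optimizer, and replaces the real validation loss of each cluster with a toy quadratic. The names `gwo_minimize`, `tune_cluster_learning_rates`, and `cluster_losses`, along with the search bounds, are hypothetical and chosen only for illustration.

```python
import random


def gwo_minimize(objective, low, high, n_wolves=8, n_iters=30, seed=0):
    """Minimize a 1-D objective on [low, high] with a basic Grey Wolf Optimizer."""
    rng = random.Random(seed)
    wolves = [rng.uniform(low, high) for _ in range(n_wolves)]
    best = min(wolves, key=objective)
    for t in range(n_iters):
        # The three fittest wolves (alpha, beta, delta) guide the pack.
        leaders = sorted(wolves, key=objective)[:3]
        a = 2.0 * (1 - t / n_iters)  # shrinks linearly from 2 toward 0
        new_wolves = []
        for x in wolves:
            candidates = []
            for leader in leaders:
                A = 2 * a * rng.random() - a   # exploration/exploitation factor
                C = 2 * rng.random()           # random weight on the leader
                D = abs(C * leader - x)        # distance to the leader
                candidates.append(leader - A * D)
            pos = sum(candidates) / len(candidates)
            new_wolves.append(min(max(pos, low), high))  # clip to bounds
        wolves = new_wolves
        # Keep the best position seen so far (never gets worse).
        best = min(wolves + [best], key=objective)
    return best


def tune_cluster_learning_rates(cluster_losses, low=-4.0, high=-1.0):
    """Return {cluster_id: learning_rate}, searching over log10(learning rate)."""
    return {
        cid: 10 ** gwo_minimize(loss, low, high)
        for cid, loss in cluster_losses.items()
    }


# Toy stand-ins (assumption): each cluster's "validation loss" is a quadratic
# in log10(lr) with a different optimum, mimicking heterogeneous clusters.
cluster_losses = {
    0: lambda log_lr: (log_lr + 2.0) ** 2,
    1: lambda log_lr: (log_lr + 3.0) ** 2,
}
learning_rates = tune_cluster_learning_rates(cluster_losses)
print(learning_rates)
```

In an actual CFL round, the objective for each cluster would presumably be a validation loss obtained by training that cluster's model with the candidate learning rate, which is far more expensive than the toy function used here.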
Furthermore, the proposed designs prioritize privacy preservation by ensuring that
no raw data is shared throughout the training process. They also improve communication
efficiency by reaching the target accuracy in fewer communication rounds, reducing the
overhead associated with decentralized learning. These CFL variants connect the theoretical
advantages of federated learning with the real-world challenges faced in heterogeneous,
resource-constrained settings. This study strengthens distributed learning systems by
facilitating adaptive, cluster-specific optimization while ensuring robust data security,
supporting CFL's broader adoption in applications that require high accuracy,
fast convergence, communication efficiency, and robust data
privacy.
