News

Federated Learning in NOUS: Pioneering Privacy-Preserving AI for Decentralized Insights

Authors: Andreas El Saer, Petrina Troulitaki

In a world where data privacy is paramount, the demand for innovative solutions that balance collaboration and security has never been greater. Enter Federated Learning (FL), a groundbreaking approach to decentralised machine learning that empowers devices to learn collectively without disclosing any sensitive information. 

Federated Learning (FL) is a decentralised machine learning approach that enables the collaborative training of a shared model across multiple devices or entities without the need to exchange raw data. In contrast to traditional centralised approaches where data are transferred to a central server, FL keeps the data locally on individual devices while sharing only model updates, such as weights or gradients, with a central aggregator. In this way, FL addresses the fundamental problems of privacy, ownership, scalability, and performance while leveraging the collective insights from distributed datasets. 
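The round-trip described above, local training on private data followed by aggregation of model updates only, can be sketched with a minimal FedAvg-style loop. This is illustrative NumPy code: the linear model, the client data, and all hyperparameters are invented for the example.

```python
import numpy as np

def local_update(w, X, y, lr=0.5, epochs=5):
    """One client's local training: plain gradient descent on a
    linear model, using only that client's private data."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE loss
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """The aggregator sees only the model updates, never the raw data,
    and combines them weighted by dataset size (FedAvg)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two hypothetical clients hold different samples of the same y = 2x signal.
rng = np.random.default_rng(0)
X1, X2 = rng.uniform(0, 1, (20, 1)), rng.uniform(0, 1, (30, 1))
y1, y2 = 2 * X1[:, 0], 2 * X2[:, 0]

global_w = np.zeros(1)
for _ in range(20):  # federation rounds
    w1 = local_update(global_w, X1, y1)
    w2 = local_update(global_w, X2, y2)
    global_w = federated_average([w1, w2], [len(y1), len(y2)])
# global_w converges toward the true coefficient (about 2)
```

The essential property is visible in the loop: only `w1` and `w2` travel to the aggregator, while `X1, y1` and `X2, y2` stay on their respective clients.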

FL can be categorised [1] into Model-Centric and Data-Centric approaches. In the more prevalent Model-Centric FL, the focus is on improving a central model using distributed data. This category includes Cross-Device FL, where millions of devices collaborate to enhance a model (e.g., Google’s implementation), and Cross-Silo FL, where fewer organisations (such as hospitals or banks) jointly train a shared model while retaining local sensitive data. Additionally, FL can be classified in terms of data partitioning: Horizontal FL, where data sets share the same features but differ in samples (e.g., across mobile devices), and Vertical FL, where data sets share the same samples but differ in features (e.g., banks and retailers collaborating). Emerging forms like Federated Transfer Learning and Data-Centric FL extend the possibilities, focusing on small data problems and enabling external organisations to build models on private data without access to the data itself.

Key techniques underpinning FL’s privacy and functionality include Differential Privacy (introducing noise to updates to obscure sensitive information), Secure Aggregation (ensuring updates cannot be reverse engineered), and Private Set Intersection (aligning features across distributed data sets). These advancements allow FL to unlock the potential of decentralised data without compromising individual privacy. 
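Two of these techniques can be made concrete with a short sketch. The function names, clipping bound, and noise scale below are assumptions chosen for the example, not any specific library's API.

```python
import numpy as np

def dp_sanitize(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Differential Privacy sketch (Gaussian mechanism): clip the
    update's L2 norm, then add calibrated Gaussian noise so that no
    single record can be confidently inferred from the shared update."""
    rng = rng or np.random.default_rng()
    scale = min(1.0, clip_norm / max(np.linalg.norm(update), 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return update * scale + noise

# Secure Aggregation sketch (pairwise masking): two clients add opposite
# random masks, so the server can recover the sum of the updates but
# cannot read either individual update.
rng = np.random.default_rng(7)
update_a, update_b = np.array([0.2, -0.4]), np.array([0.1, 0.3])
mask = rng.normal(size=2)
masked_sum = (update_a + mask) + (update_b - mask)  # the masks cancel exactly
```

In deployed systems the pairwise masks are derived cryptographically and survive client dropouts; the cancellation above shows only the core idea.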

In the NOUS project, FL enables data-driven insights in domains such as energy. More specifically, FL is integrated to provide decentralised, privacy-preserving machine learning capabilities tailored to the specific needs of edge environments. Instead of relying on traditional centralised models, Federated Learning enables each edge node, or device participating in the federation, to train models locally on its private data, sharing only the updates needed to drive global model training. Within the NOUS project, this is instrumental for the deployment of personalised Federated Learning [2], where the primary model is first trained on a global scale and then adapted to the context-specific parameters and unique datasets of individual edge nodes. By accounting for these locality features, NOUS ensures that the generated insights are not only accurate but also applicable to specific contexts, such as energy consumption patterns in particular regions.
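The two-stage idea of personalised FL, a globally trained model followed by local adaptation, can be sketched as follows. This is a toy linear model with invented regional data, not NOUS's actual implementation.

```python
import numpy as np

def personalise(global_w, X_local, y_local, lr=0.5, steps=100):
    """Personalised FL sketch: start from the globally trained weights,
    then take additional gradient steps on this node's own data so the
    model reflects its local context."""
    w = global_w.copy()
    for _ in range(steps):
        grad = 2 * X_local.T @ (X_local @ w - y_local) / len(y_local)
        w -= lr * grad
    return w

# Hypothetical scenario: the global model captures the average relation
# y = 2x, but one region's consumption actually follows y = 2.5x.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (50, 1))
y = 2.5 * X[:, 0]

global_w = np.array([2.0])              # weights after global federation
local_w = personalise(global_w, X, y)   # adapted to the regional pattern
```

Starting from the global weights rather than from scratch lets the node benefit from the federation's collective knowledge while still fitting its own distribution.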

To date, key milestones include identifying FL’s role in pivotal NOUS use cases. In these scenarios, FL’s ability to maintain data privacy while enabling collaborative learning among distributed devices is transformative. For instance, in energy prediction, FL could aggregate insights from geographically dispersed energy sources, weather patterns, and market data to deliver highly accurate forecasts while ensuring that privacy and data protection policies are respected.

Naturally, the implementation of FL in NOUS comes with its own set of challenges. A major issue is the inherent heterogeneity in data, devices, and network capabilities across the distributed nodes. Addressing it requires robust strategies for managing non-IID (non-independent and identically distributed) data, ensuring fault tolerance, and handling resource constraints on edge devices. To mitigate these challenges, NOUS has adopted innovative techniques, including adaptive model aggregation strategies that account for data variability and resource-aware training mechanisms. These solutions ensure that even resource-constrained devices can contribute to the global model. Additionally, by incorporating mechanisms like Differential Privacy or Secure Aggregation, NOUS enhances the security and confidentiality of the shared model updates.
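One simple form such resource-aware aggregation could take is sketched below. The weighting scheme (dataset size multiplied by the fraction of local training a device managed to complete) is an illustrative assumption, not NOUS's actual algorithm.

```python
import numpy as np

def adaptive_aggregate(updates, sizes, completed_fractions):
    """Resource-aware aggregation sketch: weight each node's update by
    its dataset size and by how much of its local training it completed,
    so constrained edge devices still contribute without dominating."""
    weights = np.array([n * f for n, f in zip(sizes, completed_fractions)], float)
    weights /= weights.sum()
    return sum(w * u for w, u in zip(weights, updates))

u_fast = np.array([1.0, 1.0])   # well-resourced node, finished all epochs
u_slow = np.array([3.0, 3.0])   # constrained node, finished half its epochs
agg = adaptive_aggregate([u_fast, u_slow],
                         sizes=[100, 100],
                         completed_fractions=[1.0, 0.5])
```

Here the constrained node's update is discounted but not discarded, which is the property the paragraph above describes: every device contributes in proportion to what it could actually compute.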

Looking ahead, NOUS aims to refine its FL implementation to further enhance personalisation and scalability. Key objectives include the development of more sophisticated model aggregation algorithms that optimise learning across diverse nodes and the exploration of novel privacy-preserving techniques to fortify data security. Anticipated milestones include the deployment of FL-driven solutions in pilot use cases, such as regional energy forecasting or perception for connected vehicles. These steps will pave the way for a scalable, privacy-conscious platform capable of addressing the evolving needs of NOUS stakeholders while setting a benchmark for future AI-driven edge systems.

References

[1] A. Gooday, “Understanding Federated Learning terminology,” OpenMined Blog, 21-Sep-2020. [Online]. Available: https://blog.openmined.org/federated-learning-types/. [Accessed: 18-Dec-2024].
[2] A. Z. Tan, H. Yu, L. Cui, and Q. Yang, “Towards personalized federated learning,” IEEE Trans. Neural Netw. Learn. Syst., vol. 34, no. 12, pp. 9587–9603, 2023.