Federated Learning
Federated learning is a machine learning paradigm in which a shared global model is trained collaboratively across multiple devices or institutional nodes, each of which retains its own data locally. Rather than transmitting raw data to a central server, participants compute model updates (gradients or weight deltas) on their local data and share only these updates for aggregation. The central server combines the updates — typically through a weighted average — to improve the global model, which is then redistributed to participants. Raw personal or sensitive data never leaves the originating device or institution.
The paradigm was formally proposed by Google researchers McMahan and colleagues in a paper titled "Communication-Efficient Learning of Deep Networks from Decentralized Data," first circulated as a preprint in 2016 and published at AISTATS in 2017. The paper coined the term federated learning and introduced the Federated Averaging (FedAvg) algorithm.[^1] Google subsequently deployed federated learning for keyboard next-word prediction on Android devices, providing one of the first large-scale production implementations.
How Federated Learning Works
A standard federated learning round proceeds as follows. The central server initialises a global model and distributes it to a selected subset of participating clients. Each client trains the model locally on its own dataset for one or more gradient steps, then transmits the resulting model update back to the server. The server aggregates these updates — often as a weighted average proportional to each client's dataset size — to produce an improved global model. This cycle repeats across many rounds until the model converges.[^1]
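The round described above can be sketched in a few lines of Python. This is a minimal illustration rather than the reference FedAvg implementation: the model is assumed to be a two-parameter linear fit, every client participates in every round, and the names `local_update`, `federated_average`, and `run_rounds` are hypothetical.

```python
from typing import List, Sequence, Tuple

Weights = List[float]
Dataset = Sequence[Tuple[float, float]]  # (x, y) pairs held by one client


def local_update(global_weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 1) -> Weights:
    """Hypothetical client step: fit y = w*x + b by SGD on local data only."""
    w, b = global_weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Server step: average weights, weighted by each client's dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: Sequence[Dataset], rounds: int = 100) -> Weights:
    """Repeat: broadcast global model, train locally, aggregate."""
    global_weights = [0.0, 0.0]  # server initialises the global model
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights
```

Only the weight vectors cross the client boundary in `run_rounds`; the `(x, y)` pairs are read solely inside `local_update`. A production system would additionally sample a subset of clients per round and tolerate clients that fail to report.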
This architecture aligns naturally with data protection principles including data minimisation and purpose limitation, as individual records never traverse the network. However, it introduces new engineering challenges: heterogeneous client hardware (ranging from smartphones to hospital servers), non-identically distributed data across clients (statistical heterogeneity), unreliable client availability, and the risk that model updates themselves may leak information about local data through membership inference or gradient inversion attacks.
Privacy-Enhancing Techniques
Several complementary methods strengthen the privacy guarantees of federated learning.
Differential privacy adds calibrated random noise to model updates before they are transmitted to the server, providing a formal mathematical bound on how much information about any individual training record can be inferred from the shared update.[^2]
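As a rough sketch of the mechanism, a client can bound the influence of its update by clipping it to a fixed L2 norm and then adding Gaussian noise scaled to that bound. The function name `privatize_update` and the parameters `clip_norm` and `noise_multiplier` are assumptions made for illustration; translating a noise level into a concrete privacy budget requires a separate accounting step that is not shown here.

```python
import math
import random
from typing import List, Optional


def privatize_update(update: List[float],
                     clip_norm: float = 1.0,
                     noise_multiplier: float = 1.0,
                     rng: Optional[random.Random] = None) -> List[float]:
    """Clip an update to L2 norm <= clip_norm, then add Gaussian noise.

    Illustrative only: the noise standard deviation is assumed to be
    noise_multiplier * clip_norm, and no privacy accounting is performed.
    """
    rng = rng or random.Random()
    norm = math.sqrt(sum(v * v for v in update))
    # Scale down only if the update exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [v * scale for v in update]
    sigma = noise_multiplier * clip_norm
    return [v + rng.gauss(0.0, sigma) for v in clipped]
```

Clipping matters as much as the noise: without a bound on the update's norm, no fixed amount of noise can limit how much a single record shifts the result.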
Secure aggregation uses cryptographic protocols so that the server computes the aggregate of client updates without ever seeing individual clients' updates in plaintext, protecting against a curious or compromised server.
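One common way to achieve this is pairwise masking: each pair of clients agrees on a random mask that one adds and the other subtracts, so the masks cancel in the sum. The sketch below is a simplified illustration under several assumptions: updates are already quantised to integers, a single seeded generator stands in for pairwise key agreement, and no client drops out. The helper names are hypothetical.

```python
import random
from typing import Dict, List, Sequence

MODULUS = 2 ** 32  # all arithmetic is done modulo this value


def pairwise_masks(client_ids: Sequence[str], dim: int,
                   seed: int = 0) -> Dict[str, List[int]]:
    """Build one mask per client such that all masks sum to zero mod MODULUS.

    Assumption: a seeded RNG stands in for secrets that each pair of
    clients would really derive through key agreement.
    """
    rng = random.Random(seed)
    masks = {cid: [0] * dim for cid in client_ids}
    for a in range(len(client_ids)):
        for b in range(a + 1, len(client_ids)):
            i, j = client_ids[a], client_ids[b]
            shared = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS  # i adds
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS  # j subtracts
    return masks


def mask_update(update: List[int], mask: List[int]) -> List[int]:
    """Client side: hide the quantised update behind the client's mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates: Sequence[List[int]]) -> List[int]:
    """Server side: sum masked updates; the pairwise masks cancel out."""
    dim = len(masked_updates[0])
    return [sum(u[k] for u in masked_updates) % MODULUS for k in range(dim)]
```

The server sees only masked vectors, yet `aggregate` returns the true sum. Practical protocols add machinery for recovering from clients that disconnect after masks have been agreed.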
Homomorphic encryption allows computations to be performed directly on encrypted model updates, though at significant computational cost.
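To make the idea concrete, the following toy uses the Paillier cryptosystem, an additively homomorphic scheme chosen here purely for illustration; the source does not name a specific scheme. The primes are deliberately tiny and insecure, updates are assumed to be small non-negative integers, and the function names are hypothetical.

```python
import math
import random
from typing import Optional, Tuple

PublicKey = Tuple[int, int]   # (n, g)
PrivateKey = Tuple[int, int]  # (lam, mu)


def keygen(p: int = 1009, q: int = 1013) -> Tuple[PublicKey, PrivateKey]:
    """Toy key generation with tiny primes (insecure, for illustration)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                 # standard simple choice of generator
    mu = pow(lam, -1, n)      # modular inverse; valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(m: int, pub: PublicKey,
            rng: Optional[random.Random] = None) -> int:
    """Encrypt an integer 0 <= m < n."""
    n, g = pub
    rng = rng or random.Random()
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def add_encrypted(c1: int, c2: int, pub: PublicKey) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def decrypt(c: int, pub: PublicKey, priv: PrivateKey) -> int:
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n
```

An aggregator holding only the public key can combine ciphertexts with `add_encrypted` without learning any individual value; only the holder of the private key can decrypt the sum. Each operation is a modular exponentiation or multiplication on large integers, which is where the computational cost arises.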
These techniques involve trade-offs: added noise typically reduces model accuracy, while cryptographic protocols and encryption increase computation and communication cost. Practitioners must calibrate the privacy-utility balance for each deployment context.
Cross-Device vs. Cross-Silo Federated Learning
Federated learning deployments fall into two broad categories.
Cross-device federated learning involves large numbers of mobile or IoT devices — potentially millions of smartphones — each holding relatively small personal datasets. Participants are ephemeral and unreliable; any given device may drop out mid-round due to connectivity loss or battery constraints. Google's keyboard prediction and Apple's on-device Siri personalisation are canonical examples.
Cross-silo federated learning involves a smaller number of institutional participants — hospitals, banks, or government agencies — each holding large, carefully curated datasets. Participants are reliable and may jointly negotiate privacy guarantees. Applications include collaborative fraud detection models across financial institutions and joint clinical trial analysis across hospital networks.[^3]
Applications
Federated learning has found adoption across several data-sensitive sectors. In healthcare, hospitals train diagnostic models on patient data without pooling records, enabling collaborative research that would otherwise be blocked by patient privacy regulations. In financial services, banks jointly train fraud detection and anti-money laundering models on transaction data from multiple institutions without sharing customer records. In telecommunications, network operators optimise service quality models using device-level signal data while complying with telecommunications regulations. In autonomous driving, vehicles share model updates derived from driving encounters without transmitting raw sensor footage, improving perception models across an entire fleet.
References
[^1]: McMahan, H. B., Moore, E., Ramage, D., Hampson, S., & Agüera y Arcas, B. (2017). Communication-Efficient Learning of Deep Networks from Decentralized Data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS).
[^2]: Dwork, C., & Roth, A. (2014). The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, 9(3–4), 211–407.
[^3]: Kairouz, P., McMahan, H. B., Avent, B., et al. (2021). Advances and Open Problems in Federated Learning. Foundations and Trends in Machine Learning, 14(1–2), 1–210.