Federated Learning
Definition
Federated learning (FL), introduced by Google in 2016, is a distributed ML paradigm where model training occurs locally across many participants (edge devices, hospitals, banks), with only gradient updates—never raw data—shared with a central coordinator. The coordinator aggregates these updates using federated averaging (FedAvg) to produce a global model that benefits from all participants' data without any single party seeing another's data. FL enables privacy-preserving collaborative learning across data silos that cannot legally or ethically share raw data: healthcare networks, financial institutions, telecom companies, and consumer devices.
Why It Matters
Federated learning solves the fundamental tension in collaborative AI: the model improves with more data, but data from multiple parties often cannot be centralized due to privacy regulations, competitive sensitivity, or technical constraints. GDPR's data minimization principle, HIPAA's data sharing restrictions, and competitive concerns about sharing proprietary customer data all create barriers to centralized ML. FL enables these parties to collaborate on model training while keeping their data local. For edge AI applications (smartphone keyboards, wearable health monitors), FL enables learning from millions of users without centralizing their personal data.
How It Works
A federated learning round proceeds as follows:
1. The central server sends the current global model to a selected subset of participants.
2. Each participant trains the model locally on its own data for several epochs.
3. Each participant computes its model update (the difference between the updated local model and the global model).
4. Participants send only these updates, never raw data, to the server.
5. The server aggregates the updates using FedAvg (a weighted average by local dataset size).
6. The server applies the aggregated update to the global model and begins the next round.

Communication-efficiency optimizations (gradient compression, client sampling) reduce bandwidth. Combined with differential privacy, FL provides both distributed training and formal privacy guarantees.
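The round above can be sketched end to end with a toy linear model. `local_update` and `fedavg`, along with the learning rate, epoch count, and client sizes, are illustrative choices for this sketch, not a production implementation.

```python
import numpy as np

def local_update(global_w, X, y, lr=0.1, epochs=5):
    """Steps 2-3: train locally, return only the model delta (no raw data)."""
    w = global_w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # least-squares gradient
        w -= lr * grad
    return w - global_w

def fedavg(global_w, updates, sizes):
    """Step 5: average client updates, weighted by local dataset size."""
    weights = np.asarray(sizes, dtype=float) / sum(sizes)
    return global_w + sum(wt * u for wt, u in zip(weights, updates))

# One simulated round with two clients holding different amounts of local data
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
global_w = np.zeros(2)
clients = []
for n in (80, 20):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w))
updates = [local_update(global_w, X, y) for X, y in clients]
sizes = [len(y) for _, y in clients]
global_w = fedavg(global_w, updates, sizes)  # global model improves, raw X/y stayed local
```

Note that the server only ever sees `updates` and `sizes`; the weighting in step 5 means a client with four times the data contributes four times the influence on the global model.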
[Diagram: "Federated Learning — No Data Leaves the Device." A central server aggregates model updates only; Devices A, B, and C each train locally and send only gradient updates (∇) to the server. Raw data is never shared, preserving privacy across distributed devices.]
Real-World Example
A consortium of 12 regional banks wanted to improve fraud detection but couldn't share transaction data across banks due to regulatory restrictions and competitive concerns. They implemented federated learning: each bank trained the shared model on its local fraud data weekly and submitted noise-perturbed updates (epsilon = 5 differential privacy) to a central aggregation service. The federated model, trained on all 12 banks' patterns, achieved a 94% fraud detection rate vs. 87% for the best single-bank model, capturing cross-bank fraud patterns (fraudsters who systematically target multiple banks) that no individual bank could detect alone.
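A sketch of how a bank might noise-protect its weekly update before submission, using the Gaussian mechanism (clip the update's norm, then add calibrated noise). The `clip_norm` and `noise_multiplier` values here are hypothetical; translating them into a formal guarantee like the epsilon = 5 budget above requires a privacy accountant, which is omitted.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Gaussian-mechanism sketch: bound the update's L2 norm, then add noise.

    clip_norm and noise_multiplier are illustrative; mapping them to a
    concrete (epsilon, delta) guarantee needs a privacy accountant.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))  # bound sensitivity
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise
```

The server then averages the noisy updates exactly as in plain FedAvg; because the noise is zero-mean, it partially cancels across the 12 participants while still masking any single bank's contribution.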
Common Mistakes
- ✕Assuming federated learning solves all privacy concerns—gradient updates can leak information about training data through gradient inversion attacks; differential privacy is needed for strong guarantees
- ✕Ignoring communication cost—each federated round requires sending model parameters to and from all participants; communication efficiency is a primary engineering challenge
- ✕Treating federated learning as plug-and-play—non-IID (non-independent and identically distributed) data across clients causes convergence issues requiring specialized FL algorithms
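To make the non-IID problem concrete, FL experiments commonly simulate skewed clients with a Dirichlet label partition: a small concentration parameter `alpha` gives each client a lopsided label mix. The helper below is a sketch of that split (the function name and defaults are this example's, not a standard API), not an FL algorithm itself.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha=0.5, rng=None):
    """Split example indices across clients with Dirichlet label skew.

    Smaller alpha -> more skewed (more non-IID) client datasets;
    large alpha approaches a uniform (IID) split.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    clients = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        props = rng.dirichlet([alpha] * n_clients)  # per-client share of class c
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in zip(clients, np.split(idx, cuts)):
            client.extend(part.tolist())
    return clients

# Example: 100 examples, 2 classes, 5 clients; alpha=0.1 gives heavy skew
labels = np.array([0] * 50 + [1] * 50)
client_indices = dirichlet_partition(labels, n_clients=5, alpha=0.1)
```

Running plain FedAvg on such a split is where the convergence issues mentioned above appear, motivating specialized algorithms that correct for client drift.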
Related Terms
Differential Privacy
Differential privacy is a mathematical privacy guarantee that adds calibrated noise to data or model outputs, ensuring that the presence or absence of any individual's data cannot be inferred from a model's published parameters or statistics.
Data Privacy
Data privacy in AI governs how personal information is collected, stored, and used to train and operate AI systems—requiring organizations to protect individuals' rights, minimize data collection, and obtain proper consent.
Responsible AI
Responsible AI is a framework of organizational practices and principles—encompassing fairness, transparency, privacy, safety, and accountability—that guide how teams build and deploy AI systems that are trustworthy and beneficial.
Model Deployment
Model deployment is the process of moving a trained ML model from development into a production environment where it can serve real users—encompassing packaging, testing, infrastructure provisioning, and release management.
Transfer Learning
Transfer learning leverages knowledge from a model trained on one task or dataset to accelerate and improve learning on a related task—dramatically reducing the labeled data and compute required to build high-performing domain-specific models.