
Federated Learning

Also known as: FL, Collaborative Learning
A training approach in which a model is trained across multiple devices or organizations without sharing raw data. Instead of sending data to a central server, each participant trains a local copy of the model on its own data and sends only model updates (gradients) to a central coordinator. The coordinator aggregates the updates from all participants to improve the global model.

Why It Matters

Federated learning enables AI training on data that cannot be centralized due to privacy, regulatory, or competitive concerns. Hospitals can collaboratively train diagnostic models without sharing patient records. Companies can improve a shared model without exposing proprietary data. It is the most practical approach to privacy-preserving AI training at scale.

Deep Dive

The standard federated learning algorithm (FedAvg): (1) the server sends the current model to selected participants, (2) each participant trains the model on their local data for several steps, (3) participants send their updated model weights (not data) to the server, (4) the server averages the updates and creates a new global model, (5) repeat. The key property: raw data never leaves the participant's device.
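The five steps above can be sketched with a toy linear-regression example. Everything here (the data setup, function names, learning rate, round count) is illustrative, not taken from any FL library; the point is that only weight vectors travel between clients and server:

```python
# Minimal FedAvg sketch on a toy linear-regression problem (hypothetical setup).
import numpy as np

rng = np.random.default_rng(0)

def local_train(weights, X, y, lr=0.1, steps=5):
    """Step 2: a few local SGD steps on one client's private data."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # MSE gradient
        w -= lr * grad
    return w

def fedavg_round(global_w, clients):
    """One communication round: broadcast, train locally, average."""
    updates, sizes = [], []
    for X, y in clients:                 # raw (X, y) never leaves the client
        updates.append(local_train(global_w, X, y))  # step 3: weights only
        sizes.append(len(y))
    # Step 4: average weighted by local dataset size (the FedAvg rule)
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

# Three clients whose private data all come from the true model w* = [2, -1]
true_w = np.array([2.0, -1.0])
clients = []
for n in (40, 60, 100):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=n)))

w = np.zeros(2)
for _ in range(20):                      # step 5: repeat for 20 rounds
    w = fedavg_round(w, clients)
print(np.round(w, 2))                    # converges toward [2, -1]
```

The server only ever sees the averaged weight vectors, never the per-client `(X, y)` arrays; weighting the average by dataset size is what the original FedAvg paper prescribes.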

Challenges

Non-IID data: participants often have very different data distributions (a hospital in Tokyo has different patient demographics than one in São Paulo). This makes training unstable, since updates from different participants may conflict.

Communication cost: sending model updates (potentially billions of parameters) over the network is expensive, especially for mobile devices.

Free-riders: participants who receive the improved model but contribute low-quality updates.

These challenges make federated learning harder than centralized training, though each has active solutions.
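A quick way to see the non-IID conflict: when two clients' local data imply different optima, their local update directions disagree, so a plain average partially cancels them. A toy illustration (the two-client setup is invented for this sketch):

```python
# Two clients whose private data imply different optimal weights (non-IID),
# so their local update directions are far from aligned.
import numpy as np

def grad(w, X, y):
    """MSE gradient for linear regression."""
    return 2 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(1)
w = np.zeros(2)  # current global model

# Client A's data fits w* = [1, 1]; client B's fits w* = [-1, 1]
XA = rng.normal(size=(50, 2)); yA = XA @ np.array([1.0, 1.0])
XB = rng.normal(size=(50, 2)); yB = XB @ np.array([-1.0, 1.0])

gA = -grad(w, XA, yA)  # client A's descent direction
gB = -grad(w, XB, yB)  # client B's descent direction
cos = gA @ gB / (np.linalg.norm(gA) * np.linalg.norm(gB))
print(round(cos, 2))   # well below 1: the two updates largely disagree
```

With IID data the two directions would be nearly identical (cosine close to 1); here they are nearly orthogonal, which is why non-IID participation slows or destabilizes convergence.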

Real-World Use

Apple uses federated learning for keyboard prediction (learning from what you type without sending your texts to Apple). Google uses it for search suggestion improvement. Healthcare consortiums use it for multi-hospital model training. The technique is most valuable when: the data is truly sensitive (medical, financial), regulation prevents data sharing (GDPR, HIPAA), or the data is too large to centralize (billions of mobile device interactions).

Related Concepts
