Federated Learning: Definition & Meaning - AI Wiki

A training approach in which a model is trained across multiple devices or organizations without sharing raw data. Instead of sending data to a central server, each participant trains a local copy of the model on its own data and sends only model updates (gradients or updated weights) to a central coordinator. The coordinator aggregates the updates from all participants to improve the global model.

Why It Matters

Federated learning enables AI training on data that cannot be centralized because of privacy, regulatory, or competitive concerns. Hospitals can collaboratively train a diagnostic model without sharing patient records. Companies can improve a shared model without exposing proprietary data. It is one of the most practical approaches to privacy-preserving AI training at scale.

Deep Dive

The standard federated learning algorithm is Federated Averaging (FedAvg): (1) the server sends the current global model to a selected subset of participants, (2) each participant trains the model on its local data for several steps, (3) participants send their updated model weights (not data) to the server, (4) the server averages the received weights, typically weighting each participant by its number of local examples, to produce a new global model, (5) the process repeats for many rounds. The key property: raw data never leaves the participant's device.
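
The five steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a production implementation: the model is a one-variable linear fit, the three client datasets are synthetic, and the names `local_train`, `fed_avg`, and `make_client` are hypothetical helpers invented for this example.

```python
import random

def local_train(weights, data, lr=0.1, epochs=5):
    """Step 2: a participant runs a few epochs of SGD on its own data (y ~ w*x + b)."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)

def fed_avg(updates):
    """Step 4: average client weights, weighted by each client's example count."""
    total = sum(n for _, n in updates)
    w = sum(wt[0] * n for wt, n in updates) / total
    b = sum(wt[1] * n for wt, n in updates) / total
    return (w, b)

def make_client(rng, n, lo, hi):
    """Hypothetical private dataset: points near y = 2x + 1 with x drawn from [lo, hi]."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
# Three participants with different sizes and different input ranges (assumed for illustration)
clients = [
    make_client(rng, 20, 0.0, 1.0),
    make_client(rng, 50, 0.5, 1.5),
    make_client(rng, 30, -1.0, 0.0),
]

global_model = (0.0, 0.0)
for _ in range(50):                               # step 5: repeat for many rounds
    updates = []
    for data in clients:                          # step 1: server sends the current model
        local = local_train(global_model, data)   # step 2: local training
        updates.append((local, len(data)))        # step 3: only weights and a count are returned
    global_model = fed_avg(updates)               # step 4: server aggregates

print(global_model)
```

The server-side code never touches `data`; it only sees each client's weights and example count. In this toy setup the global model should end up close to w = 2, b = 1, the line the synthetic data was generated from.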

Challenges

Non-IID data: participants often have very different data distributions (a hospital in Tokyo has different patient demographics than one in São Paulo). This can make training unstable, because updates from different participants may pull the global model in conflicting directions. Communication cost: sending model updates (potentially billions of parameters) over the network every round is expensive, especially for mobile devices. Free-riders: participants who receive the improved model but contribute low-quality updates. These challenges make federated learning harder than centralized training, though each is an area of active work with known mitigations.
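
To put the communication cost in rough numbers, the back-of-the-envelope calculation below estimates the upload for a single participant in a single round. All figures are assumptions chosen for illustration: a dense, uncompressed update of one billion parameters, 4 bytes per parameter (32-bit floats), and a 10 Mbit/s uplink. The function names are hypothetical.

```python
def update_size_bytes(num_params, bytes_per_param=4):
    """Size of one dense, uncompressed model update."""
    return num_params * bytes_per_param

def upload_seconds(num_bytes, uplink_mbps):
    """Time to send num_bytes over a link of uplink_mbps megabits per second."""
    return num_bytes * 8 / (uplink_mbps * 1_000_000)

size = update_size_bytes(1_000_000_000)      # 4_000_000_000 bytes, i.e. 4 GB
secs = upload_seconds(size, uplink_mbps=10)  # 3200.0 seconds, roughly 53 minutes

print(size, secs)
```

Under these assumptions a single round costs each participant about 4 GB of upload, which is why sending fewer or smaller updates matters so much on mobile devices.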

Real-World Use

Apple uses federated learning for keyboard prediction (learning from what you type without sending your texts to Apple). Google uses it in its Gboard keyboard, including to improve query suggestions. Healthcare consortia use it for multi-hospital model training. The technique is most valuable when the data is truly sensitive (medical, financial), when regulation restricts data sharing (GDPR, HIPAA), or when the data is too large to centralize (billions of mobile device interactions).
