What is AIWiki Malaysia?

AIWiki Malaysia is a free, open AI knowledge base covering artificial intelligence concepts, tools, models, and use cases — written specifically for Malaysian professionals and students. It is maintained by AITG Sdn Bhd, an AI company based in Penang.

Who maintains AIWiki Malaysia?

AIWiki Malaysia is maintained by AITG Sdn Bhd (Registration: 202601016521 (1678618-W)), an AI company headquartered in George Town, Penang, Malaysia. The editorial team continuously updates and expands the knowledge base.

What topics does AIWiki Malaysia cover?

AIWiki Malaysia covers a wide range of AI topics including large language models (LLMs), AI agents, machine learning fundamentals, prompt engineering, AI automation, generative AI tools, Malaysian AI regulations, local vendor landscape, and real-world AI use cases relevant to the Malaysian market.

How do I search for AI topics on AIWiki Malaysia?

You can use the search bar at the top of the site to find articles by keyword or topic. Articles are also organised by category, so you can browse by subject area such as Models, Tools, Concepts, or Use Cases.

Is AIWiki Malaysia available in Bahasa Malaysia?

Yes. AIWiki Malaysia publishes content in both English and Bahasa Malaysia to serve the full breadth of the Malaysian professional and student community. Language availability is indicated on each article page.

How can I submit a topic or suggest an article?

You can suggest topics or submit article ideas by contacting the AIWiki Malaysia team at admin@aiteragrid.com. AITG Sdn Bhd reviews all submissions and publishes content that meets editorial accuracy standards.

Federated Learning

Federated learning is a machine learning paradigm in which a model is trained across multiple decentralised devices or servers holding local data, without exchanging the raw data itself, preserving privacy while enabling collaborative model improvement.

6 min read | Last updated May 2026 | Foundations

Federated learning is a machine learning paradigm in which a shared global model is trained collaboratively across multiple devices or institutional nodes, each of which retains its own data locally. Rather than transmitting raw data to a central server, participants compute model updates (gradients or weight deltas) on their local data and share only these updates for aggregation. The central server combines the updates — typically through a weighted average — to improve the global model, which is then redistributed to participants. Raw personal or sensitive data never leaves the originating device or institution.

The paradigm was formally proposed by Google researchers McMahan and colleagues in a paper titled "Communication-Efficient Learning of Deep Networks from Decentralized Data," first circulated as a preprint in 2016 and published at AISTATS in 2017. The paper also coined the term federated learning and introduced the Federated Averaging (FedAvg) algorithm.[^1] Google subsequently deployed federated learning for keyboard next-word prediction on Android devices, providing one of the first large-scale production implementations.

How Federated Learning Works

A standard federated learning round proceeds as follows. The central server initialises a global model and distributes it to a selected subset of participating clients. Each client trains the model locally on its own dataset for one or more gradient steps, then transmits the resulting model update back to the server. The server aggregates these updates — often as a weighted average proportional to each client's dataset size — to produce an improved global model. This cycle repeats across many rounds until the model converges.[^1]
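The round described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation from the cited paper: the model is a bare weight vector for a linear regressor, and the names `local_update`, `fed_avg`, and `run_round`, the learning rate, and the two toy clients are all hypothetical choices made for the example.

```python
from typing import List, Tuple

Vector = List[float]
Dataset = List[Tuple[Vector, float]]  # (features, target) pairs


def local_update(weights: Vector, data: Dataset, lr: float = 0.1, steps: int = 5) -> Vector:
    """Run a few full-batch gradient steps on one client's local data."""
    w = list(weights)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(data)
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def fed_avg(client_weights: List[Vector], client_sizes: List[int]) -> Vector:
    """Average client models, weighting each by its local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def run_round(global_w: Vector, clients: List[Dataset]) -> Vector:
    """One federated round: broadcast, train locally, aggregate."""
    updates = [local_update(global_w, data) for data in clients]
    sizes = [len(data) for data in clients]
    return fed_avg(updates, sizes)


# Two hypothetical clients whose private data both follow y = 2 * x.
clients = [
    [([1.0], 2.0), ([2.0], 4.0)],
    [([3.0], 6.0)],
]
w = [0.0]
for _ in range(20):
    w = run_round(w, clients)
```

Only the trained weights cross the client boundary in `run_round`; the `(features, target)` pairs are read solely inside `local_update`. The sketch sends full weights rather than deltas, which gives the same average when every client starts from the same global model.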

This architecture aligns naturally with data protection principles including data minimisation and purpose limitation, as individual records never traverse the network. However, it introduces new engineering challenges: heterogeneous client hardware (ranging from smartphones to hospital servers), non-identically distributed data across clients (statistical heterogeneity), unreliable client availability, and the risk that model updates themselves may leak information about local data through membership inference or gradient inversion attacks.
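The leakage risk mentioned above is easiest to see in the simplest possible case. The following sketch assumes a single linear unit with squared-error loss and a client that computes its gradient on one record. Under those assumptions the record can be recovered in closed form, because the weight gradient is the input scaled by the error and the bias gradient is the error itself. Function names and the sample record are hypothetical. Attacks on deep networks are iterative and approximate rather than exact like this.

```python
from typing import List, Tuple


def single_example_gradient(
    w: List[float], b: float, x: List[float], target: float
) -> Tuple[List[float], float]:
    """Gradient of 0.5 * (w.x + b - target)^2 with respect to w and b."""
    err = sum(wi * xi for wi, xi in zip(w, x)) + b - target
    grad_w = [err * xi for xi in x]
    grad_b = err
    return grad_w, grad_b


def invert_gradient(grad_w: List[float], grad_b: float) -> List[float]:
    """Recover the input: grad_w = err * x and grad_b = err, so x = grad_w / grad_b."""
    if grad_b == 0.0:
        raise ValueError("zero error: the gradient carries no signal to invert")
    return [g / grad_b for g in grad_w]


private_x = [0.5, -1.25, 3.0]  # hypothetical private record held by a client
grad_w, grad_b = single_example_gradient(
    w=[0.1, 0.2, 0.3], b=0.0, x=private_x, target=1.0
)
recovered = invert_gradient(grad_w, grad_b)  # matches private_x up to rounding
```

Averaging gradients over larger local batches and applying the privacy techniques in the next section both make this kind of reconstruction harder.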

Privacy-Enhancing Techniques

Several complementary methods strengthen the privacy guarantees of federated learning.

Differential privacy adds calibrated random noise to model updates before they are transmitted to the server, providing a formal mathematical bound on how much information about any individual training record can be inferred from the shared update.[^2]
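A common way to apply this idea is to clip each update to a maximum L2 norm and then add Gaussian noise scaled to that norm. The sketch below shows only that mechanical step, on the client side as the paragraph above describes. The names `clip_update` and `privatise_update` and the parameter values are assumptions for illustration, and the code does not compute the resulting privacy budget, which requires a separate accounting step.

```python
import math
import random
from typing import List


def clip_update(update: List[float], clip_norm: float) -> List[float]:
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def privatise_update(
    update: List[float],
    clip_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Clip, then add Gaussian noise with std = noise_multiplier * clip_norm."""
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [u + rng.gauss(0.0, sigma) for u in clipped]


rng = random.Random(0)  # seeded only so the example is repeatable
raw_update = [3.0, 4.0]  # L2 norm is 5.0
clipped = clip_update(raw_update, clip_norm=1.0)  # direction kept, norm reduced to 1.0
noisy = privatise_update(raw_update, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single client can move the aggregate, which is what lets the noise scale be tied to a formal guarantee. A larger `noise_multiplier` strengthens privacy at the cost of a noisier global model.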

Secure aggregation uses cryptographic protocols so that the server computes the aggregate of client updates without ever seeing individual clients' updates in plaintext, protecting against a curious or compromised server.
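One way to build intuition for secure aggregation is pairwise masking: each pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel in the sum while each individual upload looks random. The sketch below is a toy version under strong assumptions: a single seeded generator stands in for pairwise key agreement, updates are already quantised to integers, and no client drops out. All names are hypothetical, and production protocols add key exchange and dropout recovery that are omitted here.

```python
import random
from typing import Dict, List

MODULUS = 2 ** 32  # all arithmetic is done modulo a fixed integer


def pairwise_masks(client_ids: List[int], dim: int, seed: int) -> Dict[int, List[int]]:
    """Build one mask per client such that all masks sum to zero mod MODULUS."""
    masks = {cid: [0] * dim for cid in client_ids}
    rng = random.Random(seed)  # stands in for pairwise key agreement (assumption)
    for i, a in enumerate(client_ids):
        for b in client_ids[i + 1:]:
            shared = [rng.randrange(MODULUS) for _ in range(dim)]
            # Client a adds the shared mask and client b subtracts it.
            masks[a] = [(m + s) % MODULUS for m, s in zip(masks[a], shared)]
            masks[b] = [(m - s) % MODULUS for m, s in zip(masks[b], shared)]
    return masks


def mask_update(update: List[int], mask: List[int]) -> List[int]:
    """What a client actually uploads: its update plus its mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates: List[List[int]]) -> List[int]:
    """Server-side sum; the pairwise masks cancel, leaving the true total."""
    dim = len(masked_updates[0])
    return [sum(u[j] for u in masked_updates) % MODULUS for j in range(dim)]


updates = {1: [10, 20], 2: [1, 2], 3: [5, 5]}  # hypothetical integer-quantised updates
masks = pairwise_masks(list(updates), dim=2, seed=42)
masked = [mask_update(updates[c], masks[c]) for c in updates]
total = aggregate(masked)  # [16, 27], the element-wise sum of the raw updates
```

The server in this sketch only ever handles `masked` and `total`, never the entries of `updates`.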

Homomorphic encryption allows computations to be performed directly on encrypted model updates, though at significant computational cost.
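To make the idea concrete, the sketch below uses a toy version of the Paillier cryptosystem, an additively homomorphic scheme in which multiplying two ciphertexts yields an encryption of the sum of their plaintexts. It is an illustration only and is not secure: the primes are tiny, the updates are assumed to be integer-quantised, and key distribution is ignored. All names and values are hypothetical. Requires Python 3.8 or later for `pow(x, -1, n)`.

```python
import math
import random

# Toy primes for illustration only; real deployments use far larger keys.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # decryption constant; this form is valid for g = n + 1


def encrypt(m: int, rng: random.Random) -> int:
    """Paillier encryption with generator g = n + 1."""
    while True:
        r = rng.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m % N, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Decrypt and map the result back to a signed integer."""
    m = ((pow(c, LAM, N_SQ) - 1) // N * MU) % N
    return m - N if m > N // 2 else m


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N_SQ


rng = random.Random(7)
updates = [5, -3, 10]  # hypothetical integer-quantised update values
ciphertexts = [encrypt(u, rng) for u in updates]
encrypted_sum = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_sum = add_encrypted(encrypted_sum, c)
plain_sum = decrypt(encrypted_sum)  # 12, computed without decrypting any single update
```

Each ciphertext is a number modulo n squared, and every encryption needs large modular exponentiations, which hints at where the computational cost noted above comes from.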

These techniques involve trade-offs: adding noise or encryption overhead typically reduces model accuracy or increases communication cost, and practitioners must calibrate the privacy-utility balance for each deployment context.

Cross-Device vs. Cross-Silo Federated Learning

Federated learning deployments fall into two broad categories.

Cross-device federated learning involves large numbers of mobile or IoT devices — potentially millions of smartphones — each holding relatively small personal datasets. Participants are ephemeral and unreliable; any given device may drop out mid-round due to connectivity loss or battery constraints. Google's keyboard prediction and Apple's on-device Siri personalisation are canonical examples.
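The unreliability of cross-device participants shapes how a round is orchestrated: the server samples more devices than it strictly needs, aggregates whatever arrives, and abandons the round if too few report. The sketch below illustrates that control flow only. The function names, the scalar update, the unweighted mean, and the `flaky_device` stand-in are assumptions for the example, not a description of any vendor's system.

```python
import random
from typing import Callable, List, Optional


def run_cross_device_round(
    client_ids: List[int],
    sample_size: int,
    min_reports: int,
    train: Callable[[int], Optional[float]],
    rng: random.Random,
) -> Optional[float]:
    """Sample clients, collect whatever updates arrive, and average them.

    `train` returns a scalar update, or None if the device dropped out.
    Returns None when too few devices report and the round is abandoned.
    """
    selected = rng.sample(client_ids, sample_size)
    reports = [u for u in (train(cid) for cid in selected) if u is not None]
    if len(reports) < min_reports:
        return None
    return sum(reports) / len(reports)


def flaky_device(cid: int) -> Optional[float]:
    """Hypothetical device: odd-numbered devices drop out, the rest report 1.0."""
    return None if cid % 2 else 1.0


rng = random.Random(0)  # seeded only so the example is repeatable
result = run_cross_device_round(
    client_ids=list(range(100)),
    sample_size=10,
    min_reports=3,
    train=flaky_device,
    rng=rng,
)
```

In a cross-silo setting the same loop would typically select every participant and treat a missing report as an error rather than an expected event.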

Cross-silo federated learning involves a smaller number of institutional participants — hospitals, banks, or government agencies — each holding large, carefully curated datasets. Participants are reliable and may jointly negotiate privacy guarantees. Applications include collaborative fraud detection models across financial institutions and joint clinical trial analysis across hospital networks.[^3]

Applications

Federated learning has been adopted in several data-sensitive sectors. In healthcare, hospitals train diagnostic models on patient data without pooling records, enabling collaborative research that patient privacy regulations would otherwise block. In financial services, banks jointly train fraud detection and anti-money laundering models on transaction data from multiple institutions without sharing customer records. In telecommunications, network operators optimise service quality models using device-level signal data while complying with telecommunications regulations. In the automotive sector, autonomous vehicles share model updates derived from driving encounters without transmitting raw sensor footage, improving perception models across an entire fleet.

Malaysian Context — Federated Learning, PDPA, and Financial Services

Federated learning is gaining relevance in Malaysia as regulators and industry grapple with the tension between extracting AI value from sensitive data and complying with data protection obligations. Malaysia's Personal Data Protection Act 2010 (PDPA) — currently undergoing amendments to extend to the public sector and introduce new obligations — places restrictions on the transfer and processing of personal data, making federated learning an architecturally attractive solution for organisations that need to train AI models on user or patient data without centralising it.

Bank Negara Malaysia (BNM) has highlighted responsible AI and privacy-preserving technologies in its Financial Sector Blueprint and AI governance expectations for financial institutions. Malaysian banks including Maybank, CIMB, RHB, and Hong Leong Bank are investing in AI-powered credit scoring, fraud detection, and customer personalisation. Federated learning provides a pathway for these institutions to develop richer cross-institutional models — particularly for fraud detection where patterns span multiple banks — without running afoul of data protection obligations or creating systemic data breach risks.

Malaysia's Ministry of Health has identified collaborative AI model development as a priority for improving diagnostic capabilities in public hospitals, particularly for diseases with high national prevalence such as diabetes, hypertension, and tuberculosis. Federated learning enables Hospital Kuala Lumpur, Hospital Penang, and the network of district hospitals to train shared diagnostic models without transmitting patient records to a central server, supporting compliance with the Private Healthcare Facilities and Services Act 1998 and anticipated national health data governance regulations.

Research into federated learning in the Malaysian and Southeast Asian context is active at Universiti Malaya, Universiti Teknologi Malaysia, and Multimedia University (MMU). Malaysia's AI Roadmap and the MyDigital Blueprint identify privacy-preserving AI as a technology priority, and MDEC has facilitated cross-sector working groups exploring federated approaches in fintech and digital health. The Human Resources Development Corporation (HRD Corp) has begun to include federated learning modules in its approved AI upskilling programmes as demand grows among regulated industries.

Regional collaboration is also relevant: the ASEAN Data Management Framework and ASEAN Model AI Governance Framework acknowledge that cross-border AI training raises complex data localisation questions, and federated learning represents a technical means by which ASEAN member states might collaborate on shared AI models without requiring data to leave national jurisdictions.

References

[^1]: McMahan, H. B., Moore, E., Ramage, D., Hampson, S., & Agüera y Arcas, B. (2017). Communication-Efficient Learning of Deep Networks from Decentralized Data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS).
[^2]: Dwork, C., & Roth, A. (2014). The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, 9(3–4), 211–407.
[^3]: Kairouz, P., McMahan, H. B., Avent, B., et al. (2021). Advances and Open Problems in Federated Learning. Foundations and Trends in Machine Learning, 14(1–2), 1–210.

Tags: privacy, distributed learning, machine learning, data protection

Type	Distributed / privacy-preserving machine learning
Proposed by	Google (McMahan et al.)
Introduced	2016
Key benefit	Model training without centralising raw data
Privacy techniques	Differential privacy, secure aggregation, homomorphic encryption
Related	Edge AI, differential privacy, MLOps
