What is AIWiki Malaysia?

AIWiki Malaysia is a free, open AI knowledge base covering artificial intelligence concepts, tools, models, and use cases — written specifically for Malaysian professionals and students. It is maintained by AITG Sdn Bhd, an AI company based in Penang.

Transfer Learning

Transfer learning is a machine learning technique in which a model pre-trained on one task or dataset is adapted for a different but related task, enabling high performance with significantly less data and compute than training from scratch.

6 min read | Last updated May 2026 | Foundations

Transfer learning is a training paradigm in which a model pre-trained on a large, general-purpose dataset is subsequently adapted to a different but related task, typically using a smaller, specialised dataset. Rather than initialising all parameters randomly and learning from scratch, training starts from the representations acquired during pre-training. This head start reduces both the amount of task-specific labelled data required and the training time and compute needed. Transfer learning has become the dominant approach in applied machine learning for both computer vision and natural language processing, and it underpins most modern AI products and research systems.

Conceptual Origins

The intuition underlying transfer learning draws on the observation that knowledge acquired in one context is often applicable in another. In humans, a person who knows one Romance language learns subsequent Romance languages far more quickly because of shared grammar, vocabulary, and structure. In neural networks, features learned to distinguish cats from dogs — curves, textures, part-whole relationships — are relevant to distinguishing other visual categories. This analogy motivated early work in transfer learning for neural networks, which demonstrated in the 2010s that features learned by CNNs on ImageNet transferred effectively to other image classification tasks with far less target-domain data.

The development of BERT (Bidirectional Encoder Representations from Transformers) in 2018 extended transfer learning decisively into natural language processing. Pre-training a transformer on large text corpora using self-supervised objectives produced representations so general that fine-tuning on a wide range of downstream NLP tasks — sentiment analysis, named entity recognition, question answering — consistently outperformed task-specific models trained from scratch. GPT and its successors demonstrated that auto-regressive language modelling pre-training transfers even more broadly, powering open-ended generation and in-context learning.

How Transfer Learning Works

Transfer learning involves two distinct phases. In the pre-training phase, a model is trained on a large, diverse dataset — ImageNet for vision, a multi-hundred-billion-token text corpus for language — using an objective that encourages the model to learn rich, generalisable representations. In the fine-tuning phase, the pre-trained model is updated on a smaller target dataset using a task-specific objective, adjusting the representations to the demands of the new domain.
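The two phases can be illustrated with a deliberately tiny sketch. It assumes a one-dimensional linear model trained by plain gradient descent; the datasets, learning rate, and step counts are invented for illustration and do not come from any real system. The point is only that starting from pre-trained parameters leaves less to learn on the small target dataset than starting from zero.

```python
def fit(data, w=0.0, b=0.0, lr=0.05, steps=20):
    """Gradient descent on mean squared error for the model y = w*x + b."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(data, w, b):
    """Mean squared error of the model on a dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


# Phase 1: "pre-training" on a larger source dataset (toy rule: y = 2x + 1).
source = [(k / 10, 2 * (k / 10) + 1) for k in range(-20, 21)]
w_pre, b_pre = fit(source, steps=500)

# Phase 2: fine-tuning on a small, related target dataset (toy rule: y = 2x + 1.5).
target = [(-1.0, -0.5), (0.0, 1.5), (1.0, 3.5)]
w_ft, b_ft = fit(target, w=w_pre, b=b_pre, steps=5)  # warm start
w_scratch, b_scratch = fit(target, steps=5)          # same budget, from zero

# Compare both models on held-out target-domain points.
held_out = [(k / 4, 2 * (k / 4) + 1.5) for k in range(-8, 9)]
loss_finetuned = mse(held_out, w_ft, b_ft)
loss_scratch = mse(held_out, w_scratch, b_scratch)
print(f"fine-tuned: {loss_finetuned:.3f}, from scratch: {loss_scratch:.3f}")
```

With the same five-step budget, the warm-started model ends with a lower held-out error than the one trained from zero, because pre-training has already recovered the slope shared by both tasks.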

The degree to which pre-trained weights are updated during fine-tuning varies. In feature extraction (sometimes called linear probing), all pre-trained layers are frozen and only a new output head is trained on top of the fixed representations; this requires very little target-domain data but cannot adapt the model's internal representations to domain-specific patterns. Full fine-tuning updates all layers and typically achieves the strongest task performance, but it requires more data and compute and is more prone to overfitting on small datasets. Intermediate approaches, such as freezing early layers and fine-tuning later ones, balance the two extremes.
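The three freezing strategies differ only in which layers receive gradient updates. The sketch below is a minimal, framework-free illustration: the layer names, the scalar "weights", and the helpers `select_trainable` and `sgd_step` are all hypothetical. Deep learning frameworks express the same idea with per-parameter flags, for example `requires_grad` in PyTorch.

```python
def select_trainable(layer_names, strategy, n_frozen=0):
    """Return the names of layers whose weights are updated during fine-tuning.

    layer_names is ordered from input to output; the last entry is the new task head.
    """
    if strategy == "feature_extraction":
        return layer_names[-1:]        # only the new head is trained
    if strategy == "partial":
        return layer_names[n_frozen:]  # freeze the first n_frozen layers
    if strategy == "full":
        return list(layer_names)       # every layer is trained
    raise ValueError(f"unknown strategy: {strategy}")


def sgd_step(params, grads, trainable, lr=0.1):
    """Apply one gradient step, leaving frozen layers untouched."""
    return {
        name: value - lr * grads[name] if name in trainable else value
        for name, value in params.items()
    }


# Toy model: one scalar stands in for each layer's weights.
layers = ["conv1", "conv2", "conv3", "head"]
params = {name: 1.0 for name in layers}
grads = {name: 0.5 for name in layers}

probe = sgd_step(params, grads, select_trainable(layers, "feature_extraction"))
partial = sgd_step(params, grads, select_trainable(layers, "partial", n_frozen=2))
full = sgd_step(params, grads, select_trainable(layers, "full"))

for label, result in [("feature extraction", probe), ("partial", partial), ("full", full)]:
    changed = [name for name in layers if result[name] != params[name]]
    print(f"{label}: updated {changed}")
```

Under feature extraction only `head` changes, under the partial strategy `conv3` and `head` change, and under full fine-tuning every layer changes.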

Domain Adaptation and Negative Transfer

When the source domain (where the model was pre-trained) and the target domain (the application task) are very similar, transfer is highly effective. As the domains diverge, the benefit of transfer diminishes. When domains are sufficiently different, transfer learning can occasionally hurt performance compared to random initialisation, a phenomenon called negative transfer. In practice, negative transfer is rare with large pre-trained models because the scale and diversity of pre-training data tend to produce representations that generalise broadly.
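One simple way to check for negative transfer is to train the same architecture twice, once from pre-trained weights and once from random initialisation, and compare validation scores. The sketch below assumes a higher score is better; the function names, the tolerance parameter, and the accuracy values are hypothetical and are not measured results.

```python
def transfer_gain(score_transfer, score_scratch):
    """Validation-score difference between the transferred and from-scratch models."""
    return score_transfer - score_scratch


def is_negative_transfer(score_transfer, score_scratch, tolerance=0.0):
    """True when the transferred model is worse by more than the tolerance."""
    return transfer_gain(score_transfer, score_scratch) < -tolerance


# Hypothetical validation accuracies for three scenarios.
similar_domain = is_negative_transfer(0.91, 0.78)                # transfer helps
distant_domain = is_negative_transfer(0.64, 0.70)                # transfer hurts
within_noise = is_negative_transfer(0.69, 0.70, tolerance=0.02)  # gap inside tolerance

print(similar_domain, distant_domain, within_noise)
```

The tolerance is there because small gaps between runs can come from random seeds alone, so a margin based on run-to-run variance is a sensible guard before concluding that transfer hurt.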

Domain adaptation methods specifically address the case where labelled target-domain data is scarce. Continued pre-training on unlabelled target-domain text before fine-tuning often improves downstream performance on specialised tasks such as biomedical NLP (BioBERT), legal text analysis (Legal-BERT), or financial document understanding.

Applications

In computer vision, transfer learning enables the rapid deployment of image classification, object detection, and medical image analysis systems with as few as hundreds of labelled examples, by fine-tuning models pre-trained on ImageNet. ResNet, EfficientNet, and Vision Transformer (ViT) checkpoints are the standard starting points.

In NLP, virtually all production language models, including the Claude, GPT, and Gemini families, rely on transfer learning at their core: a general pre-trained model is adapted for instruction following or specific domains using supervised fine-tuning, RLHF, or parameter-efficient methods such as LoRA. Speech recognition systems similarly transfer acoustic representations learned from large multilingual corpora to low-resource languages.
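To give a sense of why parameter-efficient methods such as LoRA are attractive, the arithmetic below compares trainable parameter counts for a single weight matrix. LoRA keeps the pre-trained matrix frozen and learns a low-rank update formed as the product of two small matrices. The layer dimensions and rank used here are illustrative assumptions, not figures for any particular model.

```python
def full_param_count(d_out, d_in):
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_param_count(d_out, d_in, rank):
    """Trainable parameters for a rank-r update B @ A, with B: d_out x r and A: r x d_in."""
    return rank * (d_out + d_in)


# Illustrative dimensions for one projection matrix (assumed, not model-specific).
d_out, d_in, rank = 4096, 4096, 8

full = full_param_count(d_out, d_in)        # 16,777,216
lora = lora_param_count(d_out, d_in, rank)  # 65,536
fraction = lora / full                      # 0.00390625, i.e. 1/256

print(f"full: {full:,}  LoRA: {lora:,}  fraction: {fraction:.4%}")
```

Under these assumptions the low-rank update trains about 0.39% of the parameters that full fine-tuning of the same matrix would.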

In tabular and structured data domains, transfer learning is less mature but growing: models pre-trained on diverse tabular datasets have been shown to transfer effectively to downstream prediction tasks.

Malaysian Context — Transfer Learning in Malaysian AI Applications

Transfer learning is the de facto approach for AI development in Malaysia across virtually every sector, because training large models from scratch requires computational resources and data volumes that are beyond the reach of most organisations. Instead, Malaysian enterprises and researchers routinely fine-tune pre-trained models on local data, achieving strong task performance at a fraction of the cost of ground-up training.

In Malaysian healthcare, transfer learning has been applied to medical imaging applications where labelled data is expensive to acquire. Research groups at Universiti Malaya Medical Centre (UMMC), Hospital Kuala Lumpur, and Malaysian university hospitals have published work on transfer learning for chest X-ray analysis, diabetic retinopathy screening, and histopathology slide classification, demonstrating that ImageNet-pre-trained CNNs and vision transformers achieve competitive diagnostic performance with datasets of a few thousand Malaysian patient images.

Malaysia's banking and financial services sector uses transfer-learned NLP models for a wide range of applications including customer complaint classification, Bahasa Malaysia regulatory document analysis, and anti-money laundering transaction monitoring. Institutions such as Maybank and CIMB have built in-house data science teams that maintain fine-tuned BERT and multilingual transformer checkpoints trained on financial and regulatory corpora in both Bahasa Malaysia and English. Bank Negara Malaysia's AI governance guidance requires institutions to validate that fine-tuned models perform adequately on local data distributions — an implicit recognition that models pre-trained on predominantly English or Western data may need substantial fine-tuning for Malaysian contexts.

MDEC's AI Roadmap and the National AI Office Malaysia's initiatives recognise transfer learning as central to Malaysia's AI capability development strategy. Rather than pursuing independent foundation model pre-training — which would require investments comparable to those of major US and Chinese laboratories — the strategy centres on building Malaysian expertise in fine-tuning, domain adaptation, and responsible deployment of international foundation models. HRD Corp-certified training programmes delivered by Malaysian AI education providers increasingly cover transfer learning workflows using popular frameworks such as Hugging Face Transformers, reflecting employer demand for practitioners skilled in adapting pre-trained models to local datasets and use cases.

References

Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10), 1345–1359.
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. NAACL-HLT 2019.
Yosinski, J., Clune, J., Bengio, Y., & Lipson, H. (2014). How transferable are features in deep neural networks? NeurIPS 2014.
Gururangan, S., Marasović, A., Swayamdipta, S., et al. (2020). Don't stop pretraining: Adapt language models to domains and tasks. ACL 2020.
IBM. (2025). What is transfer learning? IBM Think. https://www.ibm.com/think/topics/transfer-learning
Malaysia Digital Economy Corporation. (2024). Malaysia AI Governance Framework. MDEC, Putrajaya.

Tags: transfer learning, pre-training, fine-tuning, domain adaptation

Type	Machine learning training paradigm
Key mechanism	Reuse of pre-trained model weights for a new task
Approaches	Feature extraction, Fine-tuning, Domain adaptation
Enabled by	Large-scale pre-training (BERT, ResNet, GPT, etc.)
Related	Fine-tuning, LoRA, Few-shot learning, Deep learning