Fine-Tuning

The process of further training a pre-trained machine learning model on a smaller, task-specific dataset to adapt its weights for a particular domain, task, or desired behaviour.

6 min read | Last updated May 2026 | Applications

Fine-tuning is the process of taking a neural network model that has already been trained on a large, general dataset and continuing to train it on a smaller, more specific dataset to adapt its behaviour for a particular task or domain. The technique sits at the intersection of transfer learning and supervised training, and it has become the dominant paradigm for deploying large language models (LLMs) and other foundation models in production settings.

The core intuition is that a large pre-trained model has already learned general representations of language, vision, or other modalities at significant computational expense. Fine-tuning allows practitioners to leverage these representations without starting from scratch, dramatically reducing the data, time, and compute required to achieve strong performance on a new task.

Full Fine-Tuning

In full fine-tuning, all parameters of the pre-trained model are updated during training on the target dataset. This approach can yield the best task-specific performance because every weight is free to adapt. However, it is computationally intensive: fine-tuning a model with 70 billion parameters requires storing not only the model weights but also optimizer states and gradients, placing heavy demands on GPU memory.[^1] Full fine-tuning also risks catastrophic forgetting, the degradation of the model's general capabilities as it specialises. A low learning rate, learning rate warmup, and regularisation such as weight decay help mitigate this.
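As a rough illustration of why gradients and optimizer states dominate the memory budget, the sketch below estimates training memory from a per-parameter byte count. The byte counts are assumptions (16-bit weights and gradients plus two 32-bit Adam moment estimates per parameter), not figures taken from the cited source, and the estimate ignores activations and framework overhead.

```python
def full_finetune_memory_gb(n_params, weight_bytes=2, grad_bytes=2, optimizer_bytes=8):
    """Estimate GPU memory (GB) to hold weights, gradients and optimizer state.

    Assumed defaults: 16-bit weights (2 bytes), 16-bit gradients (2 bytes),
    and two 32-bit Adam moment estimates (8 bytes) per parameter.
    Activations and framework overhead are not included.
    """
    bytes_per_param = weight_bytes + grad_bytes + optimizer_bytes
    return n_params * bytes_per_param / 1e9


def inference_memory_gb(n_params, weight_bytes=2):
    """Memory (GB) for the weights alone, as needed for 16-bit inference."""
    return n_params * weight_bytes / 1e9


n_params = 70_000_000_000  # the 70-billion-parameter example from the text
print(f"weights only:     {inference_memory_gb(n_params):.0f} GB")
print(f"full fine-tuning: {full_finetune_memory_gb(n_params):.0f} GB")
```

Under these assumptions, a 70-billion-parameter model needs about 140 GB for the weights alone and about 840 GB once gradients and optimizer state are added. Real figures depend on the optimizer, the precision used, and the batch size.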

Parameter-Efficient Fine-Tuning (PEFT)

Parameter-efficient fine-tuning (PEFT) methods address the memory and compute costs of full fine-tuning by keeping most of the base model frozen and training only a small subset of parameters. IBM defines PEFT as a technique in which only a small portion of an LLM's parameters are selectively modified, adding new layers or modifying existing ones in a task-specific manner, with performance comparable to full fine-tuning at a fraction of the cost.[^2]

PEFT encompasses several families of methods, the most widely adopted being Low-Rank Adaptation (LoRA) and its variants.

LoRA (Low-Rank Adaptation)

LoRA injects trainable low-rank matrices into each transformer layer. The key insight is that the change to a model's weight matrix during fine-tuning tends to lie in a low-dimensional subspace—meaning it can be approximated by the product of two smaller matrices. For a weight matrix W of shape (d × k), LoRA adds a perturbation ΔW = BA, where B has shape (d × r) and A has shape (r × k), with rank r much smaller than d or k.[^3]

This reduces the number of trainable parameters by orders of magnitude. The memory savings are largest when LoRA is combined with quantisation: regular 16-bit fine-tuning of LLaMA 65B requires more than 780 GB of GPU memory, whereas QLoRA (a variant of LoRA that quantises the frozen base model to 4 bits) can fine-tune the same model on a single 48 GB GPU.[^4] LoRA adapters can be saved separately from the base model and swapped in at inference time, enabling a single hosted base model to serve many fine-tuned variants.
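The low-rank update can be sketched in a few lines of plain Python. The shapes follow the notation used for LoRA here (W is d × k, B is d × r, A is r × k); the concrete sizes, the random seed, and the helper names are illustrative assumptions. Initialising B to zero, as the LoRA paper does, means the adapted model starts out identical to the base model.

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_param_counts(d, k, r):
    """Return (parameters in W, trainable parameters in B and A)."""
    return d * k, r * (d + k)


d, k, r = 6, 4, 2  # tiny illustrative sizes; real layers are far larger
rng = random.Random(0)

W = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(d)]     # frozen, d x k
A = [[rng.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]  # trainable, r x k
B = [[0.0] * r for _ in range(d)]                               # trainable, d x r, zero init

delta_W = matmul(B, A)  # low-rank perturbation, d x k
W_adapted = [[w + dw for w, dw in zip(w_row, dw_row)]
             for w_row, dw_row in zip(W, delta_W)]

full, lora = lora_param_counts(4096, 4096, 8)
print(f"4096 x 4096 layer: {full} weights, {lora} trainable with rank 8")
```

For a 4096 × 4096 weight matrix and rank 8, the adapter trains 65,536 parameters instead of 16,777,216, a 256-fold reduction for that layer. During training only A and B receive gradient updates; W stays fixed.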

QLoRA

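QLoRA extends LoRA by additionally quantising the frozen base model weights to 4-bit precision, substantially reducing the memory footprint of the base model itself. It introduces a new data type (NF4, or 4-bit NormalFloat) designed for normally distributed weights, which limits the quantisation error introduced by this compression, and double quantisation, which compresses the quantisation constants themselves to save further memory. The LoRA adapters are still trained in higher precision, with gradients passed through the frozen 4-bit weights.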
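The idea behind a normal-distribution-aware 4-bit format can be illustrated with a simplified sketch. This is not the exact NF4 definition from the QLoRA paper: the level set below is built from evenly spaced normal quantiles as an assumption, and the function names and block contents are hypothetical. It shows blockwise quantisation, where each block stores one scale plus a 4-bit code per weight.

```python
import random
from statistics import NormalDist


def normal_quantile_levels(bits=4):
    """Build 2**bits levels in [-1, 1] from quantiles of a standard normal.

    Simplified stand-in for NF4; the real data type is constructed differently.
    """
    n = 2 ** bits
    probs = [(i + 0.5) / n for i in range(n)]  # avoid 0 and 1, where inv_cdf is infinite
    quantiles = [NormalDist().inv_cdf(p) for p in probs]
    largest = max(abs(q) for q in quantiles)
    return [q / largest for q in quantiles]


def quantize_block(block, levels):
    """Return (codes, scale): one level index per weight plus the block's absmax scale."""
    scale = max(abs(w) for w in block) or 1.0
    codes = [min(range(len(levels)), key=lambda i: abs(levels[i] - w / scale))
             for w in block]
    return codes, scale


def dequantize_block(codes, scale, levels):
    """Reconstruct approximate weights from codes and the block scale."""
    return [levels[c] * scale for c in codes]


levels = normal_quantile_levels()
rng = random.Random(0)
weights = [rng.gauss(0, 0.02) for _ in range(64)]  # one illustrative block of weights

codes, scale = quantize_block(weights, levels)
restored = dequantize_block(codes, scale, levels)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"max abs error: {max_error:.5f} (block scale {scale:.5f})")
```

Because the levels are denser near zero, where most normally distributed weights fall, the rounding error for typical weights is smaller than with evenly spaced levels. Double quantisation would go one step further and quantise the per-block `scale` values as well.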

Other PEFT Techniques

Prefix tuning prepends learnable vectors to the attention keys and values of every transformer layer, conditioning the model without modifying its original weights. Adapter layers insert small trainable bottleneck modules into each transformer block. Prompt tuning is a lighter variant that learns a small set of soft prompt embeddings prepended to the input only, while leaving the full model frozen. Each approach involves different trade-offs between parameter count, convergence speed, and task performance.

Instruction Fine-Tuning

Instruction fine-tuning is a supervised fine-tuning variant in which the model is trained on a curated dataset of (instruction, response) pairs, teaching it to follow natural-language directives. OpenAI's InstructGPT (2022) and subsequent ChatGPT models are prominent examples; the process typically combines supervised fine-tuning with reinforcement learning from human feedback (RLHF) in a multi-stage pipeline.[^5]

Instruction fine-tuning is distinct from domain fine-tuning: the former shapes the model's behavioural style and ability to follow instructions, while the latter injects specialised knowledge.
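A minimal sketch of how one (instruction, response) pair might be turned into a supervised training example is shown below. The prompt template, the toy whitespace tokenizer, and the function names are hypothetical; the value -100 is the default `ignore_index` of PyTorch's cross-entropy loss, used here as a common convention for excluding prompt tokens from the loss so that only the response is learned.

```python
IGNORE_INDEX = -100  # label value that a cross-entropy loss can be told to skip

vocab = {}


def toy_tokenize(text):
    """Toy tokenizer (assumption): split on whitespace and assign incremental ids."""
    return [vocab.setdefault(token, len(vocab)) for token in text.split()]


def build_example(instruction, response, tokenize):
    """Concatenate prompt and response; supervise only the response tokens."""
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"  # hypothetical template
    prompt_ids = tokenize(prompt)
    response_ids = tokenize(response)
    return {
        "input_ids": prompt_ids + response_ids,
        "labels": [IGNORE_INDEX] * len(prompt_ids) + response_ids,
    }


example = build_example("Translate to Malay: good morning", "selamat pagi", toy_tokenize)
print(example["input_ids"])
print(example["labels"])
```

A real pipeline would use the base model's own tokenizer and chat template, and would typically append an end-of-sequence token to the response so the model learns when to stop.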

Evaluation and Overfitting

A persistent risk in fine-tuning on small datasets is overfitting—the model memorises training examples rather than generalising. Common mitigation strategies include early stopping based on validation loss, data augmentation, learning rate scheduling, and mixing a small proportion of general-purpose data into the fine-tuning set to preserve breadth.
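Early stopping on validation loss can be sketched as follows. The function name, the `patience` and `min_delta` parameters, and the loss values are illustrative assumptions rather than part of any particular training framework.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Hypothetical losses: improvement stalls after epoch 2, a sign of overfitting.
losses = [1.00, 0.80, 0.70, 0.72, 0.75, 0.90]
best_epoch, stop_epoch = early_stopping_epoch(losses, patience=2)
print(f"best checkpoint: epoch {best_epoch}, training stopped at epoch {stop_epoch}")
```

With these values the best checkpoint is epoch 2 and training stops at epoch 4. In practice the checkpoint saved at the best epoch is the one that gets deployed.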

Malaysian Context — Fine-Tuning for Local Languages and Industries

Fine-tuning has attracted significant interest in Malaysia as organisations seek to adapt global foundation models to local linguistic and regulatory requirements. Bahasa Malaysia presents specific challenges: it is morphologically rich, uses many loanwords from Arabic, English, and regional languages, and is frequently code-switched with English in digital communication. Off-the-shelf English LLMs perform sub-optimally on Malay text, creating demand for domain-specific fine-tuning.

Agmo Group's Merdeka LLM project is one of the most visible examples, applying instruction fine-tuning to a Malay-language corpus to produce a model better aligned with Malaysian vocabulary, cultural context, and government terminology. Similarly, research groups at Universiti Malaya and Universiti Teknologi Malaysia (UTM) have published work on fine-tuning transformer models for Malay sentiment analysis, legal document summarisation, and medical text classification.

In the financial sector, Maybank and CIMB have leveraged fine-tuned models for fraud detection narrative generation, customer complaint classification, and automated regulatory report drafting. Malaysia Airlines uses a fine-tuned language model within its MHchat service to manage natural-language booking and rebooking requests.[^6] AirAsia integrates fine-tuned GPT-based models through Azure OpenAI Service in its "Ask Bo" assistant.

The Malaysia AI Governance Framework, published by the government, requires organisations deploying AI in high-risk domains—such as financial services, healthcare, and public administration—to document training data provenance and model adaptation methodology. Fine-tuning documentation therefore forms part of compliance evidence for organisations regulated by Bank Negara Malaysia (BNM) and the Securities Commission Malaysia (SC). HRD Corp has approved training grants for fine-tuning workshops, recognising it as a critical industry skill for Malaysian AI practitioners.

References

[^1] Databricks. (2024). Efficient Fine-Tuning with LoRA: A Guide to Optimal Parameter Selection for Large Language Models. https://www.databricks.com/blog/efficient-fine-tuning-lora-guide-llms
[^2] IBM. (2024). What is parameter-efficient fine-tuning (PEFT)? IBM Think. https://www.ibm.com/think/topics/parameter-efficient-fine-tuning
[^3] Hu, E., Shen, Y., Wallis, P., et al. (2022). LoRA: Low-Rank Adaptation of Large Language Models. ICLR 2022. arXiv:2106.09685.
[^4] Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023). QLoRA: Efficient Finetuning of Quantized LLMs. arXiv:2305.14314.
[^5] Ouyang, L., Wu, J., Jiang, X., et al. (2022). Training language models to follow instructions with human feedback. arXiv:2203.02155.
[^6] Fintechnews Malaysia. (2025). What Malaysian Banks Are Getting Right (and Wrong) About AI. https://fintechnews.my/52657/banking/what-malaysian-banks-are-getting-right-and-wrong-about-ai/

Tags: fine-tuning, transfer-learning, LoRA, PEFT, LLM

Type	Model adaptation technique
Prerequisite	Pre-trained base model
Common variants	Full fine-tuning, LoRA, QLoRA, PEFT
Key use	Domain adaptation, instruction following, style alignment
Related	Transfer learning, RLHF, LoRA, Large Language Models
