Continual Learning

Continual learning is a machine learning paradigm in which models incrementally acquire knowledge from sequential tasks or data streams without forgetting previously learned information, addressing the stability-plasticity trade-off inherent in neural networks.

7 min read | Last updated June 2026 | Foundations

Continual learning is a subfield of machine learning concerned with training models that can acquire knowledge from a sequence of tasks or a non-stationary data stream, retaining previously learned capabilities while incorporating new information. The field directly addresses catastrophic forgetting, the well-documented tendency of neural networks to abruptly lose previously acquired knowledge when trained on new data. Continual learning is considered essential for building AI systems that can adapt to evolving environments without the expense and inefficiency of retraining from scratch on all accumulated data.

The Catastrophic Forgetting Problem

Standard neural network training assumes all training data is available simultaneously and drawn independently from a fixed distribution. When this assumption is violated — for example, when a model is first trained on task A and then fine-tuned on task B — the gradient updates for task B overwrite the weight configurations that encoded task A's knowledge. This phenomenon, termed catastrophic forgetting (also called catastrophic interference), was first described by McCloskey and Cohen in 1989 and remains a central challenge in neural network research.

Catastrophic forgetting arises from the distributed nature of neural representations: the same weights that encode knowledge about task A are also modified to encode task B, creating interference between the two. The severity depends on the similarity between tasks, the architecture of the network, and the learning rate used during adaptation.
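The interference described above can be reproduced in a deliberately tiny setting. The following is a minimal sketch, not a realistic benchmark: it assumes a model with a single shared weight and two hypothetical, conflicting regression tasks (`task_a` and `task_b` are invented for illustration). Because both tasks must use the same weight, training on task B alone undoes what was learned for task A.

```python
# Toy illustration: a single shared weight w fits y = w * x by gradient descent.
# Task A and task B are hypothetical and deliberately conflicting.

def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def train(w, data, lr=0.05, epochs=200):
    """Plain full-batch gradient descent on the MSE loss."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

xs = [-2.0, -1.0, 1.0, 2.0]
task_a = [(x, 2.0 * x) for x in xs]    # task A: y = 2x
task_b = [(x, -2.0 * x) for x in xs]   # task B: y = -2x

w = train(0.0, task_a)
loss_a_before = mse(w, task_a)         # near zero: task A is learned

w = train(w, task_b)                   # sequential training on task B only
loss_a_after = mse(w, task_a)          # large: task A has been overwritten

print(f"task A loss before B: {loss_a_before:.4f}, after B: {loss_a_after:.4f}")
```

In real networks the tasks are rarely this directly opposed, so forgetting is usually partial rather than total, but the mechanism of shared weights being pulled towards the newest objective is the same.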

The Stability-Plasticity Dilemma

Continual learning systems must navigate a fundamental tension known as the stability-plasticity dilemma. A plastic system adapts quickly to new data but risks overwriting existing knowledge. A stable system preserves existing knowledge but resists learning new patterns efficiently. Biological neural systems resolve this dilemma through mechanisms such as synaptic consolidation, complementary memory systems (the hippocampus for rapid encoding and the cortex for slow consolidation), and sleep-based memory replay. Artificial systems must find computational analogues to these biological strategies.

Continual Learning Scenarios

Researchers have defined three canonical scenarios to evaluate continual learning methods.

Task-incremental learning (Task-IL) is the simplest scenario: the system is given a task identifier at test time and must perform the correct task. The challenge is isolating task-specific knowledge.

Domain-incremental learning (Domain-IL) presents the same type of task across different data distributions (for example, classifying objects in different visual domains), without the task identifier at test time.

Class-incremental learning (Class-IL) is the most challenging scenario: the model must classify among all classes seen so far, without knowing which subset of classes the current input belongs to. New classes are added over time, and the model must not regress on earlier classes.
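The practical difference between Task-IL and Class-IL shows up at prediction time. The sketch below assumes a hypothetical model that outputs one logit per class across three two-class tasks; the `TASK_CLASSES` mapping and the example logits are invented for illustration. Domain-IL is not shown because there the output space stays the same and only the input distribution changes.

```python
# Hypothetical setup: 6 classes learned over 3 tasks, two classes per task.
TASK_CLASSES = {0: [0, 1], 1: [2, 3], 2: [4, 5]}

def predict_class_il(logits):
    """Class-IL: choose among every class seen so far."""
    return max(range(len(logits)), key=lambda c: logits[c])

def predict_task_il(logits, task_id):
    """Task-IL: the task identifier restricts the choice to that task's classes."""
    return max(TASK_CLASSES[task_id], key=lambda c: logits[c])

# Example logits for an input whose true class is 1 (task 0).
# The model is biased towards the most recently learned classes (4 and 5).
logits = [0.2, 1.1, 0.3, 0.4, 1.5, 0.9]

print(predict_task_il(logits, task_id=0))  # 1
print(predict_class_il(logits))            # 4
```

With the task identifier the bias towards recent classes is masked out and the prediction is correct; without it the same logits yield a wrong answer, which is one reason Class-IL is considered the hardest of the three scenarios.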

Approaches to Continual Learning

Regularisation-Based Methods

Regularisation approaches add penalty terms to the loss function that discourage changes to weights that were important for previous tasks. Elastic Weight Consolidation (EWC), proposed by Kirkpatrick et al. in 2017, estimates the importance of each weight using the diagonal of the Fisher information matrix and penalises deviations from earlier values in proportion to their importance. Synaptic Intelligence (SI) is a related approach that tracks weight importance online during training rather than computing it post-hoc.
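As a rough sketch of the EWC idea, the penalty added to the new task's loss can be written as (λ/2) · Σ F_i (θ_i − θ*_i)², where θ* are the weights after the previous task and F_i is a per-weight importance estimate. The code below assumes the importance values are already available as a list (`fisher_diag`); how they are estimated is outside this example, and the numbers are made up.

```python
def ewc_penalty(params, old_params, fisher_diag, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2."""
    return 0.5 * lam * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, fisher_diag)
    )

def ewc_penalty_grad(params, old_params, fisher_diag, lam):
    """Gradient of the penalty with respect to each parameter."""
    return [
        lam * f * (p - p_old)
        for p, p_old, f in zip(params, old_params, fisher_diag)
    ]

old_params = [1.0, -0.5]      # weights after task A (made-up values)
fisher_diag = [10.0, 0.1]     # first weight mattered much more for task A
params = [1.5, 0.0]           # candidate weights while training on task B

penalty = ewc_penalty(params, old_params, fisher_diag, lam=2.0)
grad = ewc_penalty_grad(params, old_params, fisher_diag, lam=2.0)
print(penalty, grad)
```

Both weights moved by the same amount (0.5), yet the gradient pulling the first weight back is a hundred times larger, so the optimiser is steered towards using the less important weight to fit the new task.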

Rehearsal-Based Methods

Rehearsal methods maintain a memory buffer containing a subset of examples from previous tasks and mix these stored examples with new training data to prevent forgetting. Experience Replay directly replays stored samples. Generative Replay uses a generative model trained on past tasks to synthesise pseudo-samples, avoiding the need to store real data (relevant for privacy-sensitive applications). Dark Experience Replay (DER) stores the model's logits alongside the buffered samples and uses them as distillation targets during replay, preserving richer information than labels alone at a similar memory cost.
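A minimal sketch of an experience replay buffer follows. It assumes reservoir sampling as the policy for deciding which examples to keep, which is one common choice rather than the only one; the class name `ReplayBuffer`, the helper `mixed_batch`, and the example tuples are hypothetical.

```python
import random

class ReplayBuffer:
    """Fixed-size memory filled by reservoir sampling over a data stream."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep the new example with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples without replacement."""
        return self.rng.sample(self.items, min(k, len(self.items)))

def mixed_batch(new_examples, buffer, n_replay):
    """Combine current-task examples with replayed ones for one training step."""
    return list(new_examples) + buffer.sample(n_replay)

buffer = ReplayBuffer(capacity=5)
for i in range(100):
    buffer.add(("task_a", i))          # stream of past-task examples

batch = mixed_batch([("task_b", 0), ("task_b", 1)], buffer, n_replay=3)
print(batch)
```

Reservoir sampling keeps every example seen so far in the buffer with equal probability, so the memory stays representative of the whole stream without knowing its length in advance.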

Architecture-Based Methods

Architectural approaches allocate different parameters for different tasks, avoiding interference by construction. Progressive Neural Networks add new network columns for each new task and freeze previously learned columns, preserving prior knowledge at the cost of growing model size. PackNet and HAT (Hard Attention to the Task) use masks to identify and protect task-specific subnetworks within a fixed-capacity architecture.
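The core mechanism behind mask-based isolation can be sketched in a few lines. This example assumes the assignment of weights to tasks (`owner`) is already given; methods such as PackNet and HAT differ precisely in how they derive that assignment, which is not modelled here. All names and values are hypothetical.

```python
def masked_update(weights, grads, owner, task_id, lr=0.1):
    """Apply a gradient step only to weights assigned to task_id.

    owner[i] is the task that owns weight i; weights owned by other
    tasks are left untouched, so training cannot overwrite them.
    """
    return [
        w - lr * g if o == task_id else w
        for w, g, o in zip(weights, grads, owner)
    ]

weights = [0.5, -1.0, 0.0, 0.0]
owner = [0, 0, 1, 1]            # first two weights belong to task 0
grads = [1.0, 1.0, 1.0, -2.0]   # made-up gradients from a task 1 batch

updated = masked_update(weights, grads, owner, task_id=1)
print(updated)
```

The weights owned by task 0 come out unchanged regardless of the gradient, which is what "avoiding interference by construction" means in practice; the cost is that each task can use only its own share of a fixed capacity.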

Prompt-Based Methods

Recent work has adapted continual learning for pre-trained transformer models by learning task-specific prompt vectors while freezing the backbone model. Methods such as L2P (Learning to Prompt), DualPrompt, and CODA-Prompt achieve strong continual learning performance on image classification benchmarks by concentrating task-specific information in small prompt parameters, leaving the large pre-trained backbone untouched.
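One ingredient of pool-based prompt methods is choosing which prompts to attach to a given input. The sketch below illustrates a key-query lookup in the spirit of L2P, under simplifying assumptions: the pool, keys, and prompt vectors are tiny hypothetical lists, the query stands in for a feature produced by the frozen backbone, and the training of keys and prompts is omitted entirely.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def select_prompts(query, pool, top_k):
    """Return the names of the top_k prompts whose keys best match the query."""
    ranked = sorted(
        pool, key=lambda name: cosine(query, pool[name]["key"]), reverse=True
    )
    return ranked[:top_k]

# Hypothetical pool: each entry pairs a key with a (here tiny) prompt vector.
pool = {
    "p0": {"key": [1.0, 0.0], "prompt": [0.1, 0.2]},
    "p1": {"key": [0.0, 1.0], "prompt": [0.3, 0.4]},
    "p2": {"key": [1.0, 1.0], "prompt": [0.5, 0.6]},
}
query = [0.9, 0.1]   # stand-in for a frozen backbone's feature of the input

chosen = select_prompts(query, pool, top_k=2)
print(chosen)  # ['p0', 'p2']
```

Only the selected prompts (and their keys) would receive gradient updates, so inputs from different tasks tend to train different prompts while the backbone stays fixed.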

| Approach | Key Idea | Memory Overhead | Scalability |
|---|---|---|---|
| EWC | Penalise important weight changes | Low | Moderate |
| Experience Replay | Store and replay past samples | Medium | Good |
| Generative Replay | Synthesise past data | Model size | Good |
| Progressive Networks | Separate columns per task | High (grows) | Limited |
| Prompt-based | Task-specific prompts, frozen backbone | Very low | High |

Continual Learning for Large Language Models

With the rise of large language models, continual learning has taken on new importance. Deployed LLMs become stale as the world changes: new entities emerge, facts change, and user needs evolve. Continual learning for LLMs aims to update model knowledge incrementally without full retraining (which can cost millions of dollars). Continual fine-tuning with methods such as LoRA combined with regularisation or replay has shown promise for domain adaptation without catastrophic forgetting of general capabilities.
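To make the LoRA part concrete, the sketch below shows the low-rank update on a single weight matrix: the pre-trained matrix W stays frozen and the layer computes W x + (alpha / r) · B (A x), where only the small matrices A and B are trained. The matrices here are toy values chosen for illustration, and the regularisation or replay that the text pairs with LoRA is not shown.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(x, W, A, B, alpha, r):
    """y = W x + (alpha / r) * B (A x), with W frozen and only A, B trainable."""
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, low_rank)]

W = [[1.0, 2.0], [3.0, 4.0]]   # frozen pre-trained weight (2 x 2), toy values
x = [1.0, 1.0]

# B starts at zero, so the adapted layer initially matches the original one.
A = [[0.5, -0.5]]              # r x d_in with rank r = 1
B = [[0.0], [0.0]]             # d_out x r
y0 = lora_forward(x, W, A, B, alpha=2.0, r=1)

# After (hypothetical) training, A and B shift the output without touching W.
A = [[1.0, 0.0]]
B = [[1.0], [0.0]]
y1 = lora_forward(x, W, A, B, alpha=2.0, r=1)

print(y0, y1)  # [3.0, 7.0] [5.0, 7.0]
```

Because W is never modified, the original behaviour can be recovered by dropping A and B, which limits but does not by itself eliminate forgetting of general capabilities.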

Malaysian Context — Adaptive AI Systems in Malaysia

Continual learning is relevant to Malaysian organisations deploying AI systems that must adapt to changing data distributions without incurring the cost and disruption of periodic full retraining. In the banking sector, Maybank and CIMB deploy fraud detection models that must continuously adapt to new fraud patterns. Traditional batch retraining introduces lag between the emergence of new fraud patterns and the model's ability to detect them; continual learning methods that update models incrementally from streaming transaction data reduce this lag.

In Malaysian healthcare, hospitals affiliated with the Ministry of Health Malaysia and private hospital groups such as KPJ Healthcare are exploring continual learning for clinical decision support systems that must adapt as new clinical evidence accumulates and patient population characteristics shift. The ability to update models on new clinical data without forgetting established medical knowledge is a key requirement in safety-critical medical AI applications.

MDEC's AI Centre of Excellence and Malaysian universities including Universiti Malaya and Universiti Teknologi Malaysia are conducting research on continual learning for Bahasa Malaysia natural language processing, addressing the challenge of updating language models with new vocabulary, slang, and domain terminology that emerges over time in Malaysian digital communications.

Malaysian manufacturing firms in the electrical and electronics sector, particularly in Penang's industrial ecosystem, are applying continual learning concepts to predictive maintenance AI systems. Production conditions change as equipment ages, materials vary, and production parameters are adjusted; models that can adapt to these shifts without retraining from scratch provide operational advantages in quality control and equipment uptime.

HRD Corp has funded AI upskilling programmes that include modules on adaptive machine learning, where continual learning techniques are presented as practical solutions to the model maintenance challenges Malaysian practitioners face in deploying production AI systems.

References

McCloskey, M., and Cohen, N. J. (1989). Catastrophic interference in connectionist networks: The sequential learning problem. Psychology of Learning and Motivation, 24, 109-165. Academic Press.
Kirkpatrick, J., Pascanu, R., Rabinowitz, N., et al. (2017). Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(13), 3521-3526.
van de Ven, G. M., Soures, N., and Kudithipudi, D. (2024). Continual Learning and Catastrophic Forgetting. arXiv:2403.05175.
Wang, Z., Zhang, Z., Lee, C. Y., et al. (2022). Learning to Prompt for Continual Learning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

Tags: lifelong-learning catastrophic-forgetting neural-networks incremental-learning

Also known as	Lifelong learning, incremental learning, sequential learning
Core challenge	Catastrophic forgetting
Key trade-off	Stability vs. plasticity
Related fields	Transfer learning, meta-learning, online learning
Active research area	Yes (2025)