What is AIWiki Malaysia?

AIWiki Malaysia is a free, open AI knowledge base covering artificial intelligence concepts, tools, models, and use cases — written specifically for Malaysian professionals and students. It is maintained by AITG Sdn Bhd, an AI company based in Penang.

Who maintains AIWiki Malaysia?

AIWiki Malaysia is maintained by AITG Sdn Bhd (Registration: 202601016521 (1678618-W)), an AI company headquartered in George Town, Penang, Malaysia. The editorial team continuously updates and expands the knowledge base.

What topics does AIWiki Malaysia cover?

AIWiki Malaysia covers a wide range of AI topics including large language models (LLMs), AI agents, machine learning fundamentals, prompt engineering, AI automation, generative AI tools, Malaysian AI regulations, local vendor landscape, and real-world AI use cases relevant to the Malaysian market.

How do I search for AI topics on AIWiki Malaysia?

You can use the search bar at the top of the site to find articles by keyword or topic. Articles are also organised by category, so you can browse by subject area such as Models, Tools, Concepts, or Use Cases.

Is AIWiki Malaysia available in Bahasa Malaysia?

Yes. AIWiki Malaysia publishes content in both English and Bahasa Malaysia to serve the full breadth of the Malaysian professional and student community. Language availability is indicated on each article page.

How can I submit a topic or suggest an article?

You can suggest topics or submit article ideas by contacting the AIWiki Malaysia team at admin@aiteragrid.com. AITG Sdn Bhd reviews all submissions and publishes content that meets editorial accuracy standards.

Anomaly Detection

A class of machine learning techniques that identifies rare events, observations, or patterns that differ significantly from the majority of data, used for fraud, intrusion, and fault detection.

6 min read | Last updated May 2026 | Applications

Anomaly detection, also called outlier detection or novelty detection, is the task of identifying data points, events, or observations that deviate significantly from the expected pattern of the majority. Anomalies are typically rare, costly, or indicative of an underlying issue — a fraudulent transaction, a failing machine, a network intrusion, or a medical abnormality. Because anomalies are by definition uncommon, labelled training data is usually scarce, which makes the task fundamentally different from standard supervised classification and shapes the choice of algorithm and evaluation metric.

Problem framing

Anomaly detection problems are commonly framed in one of three ways. Unsupervised detection assumes no labels and learns the structure of "normal" data from the bulk of the dataset, flagging points that do not fit. Semi-supervised detection is given a clean set of normal examples and trained to recognise deviations from them. Supervised detection assumes both normal and anomalous labels exist; it reduces to a heavily imbalanced classification problem, often addressed with resampling, cost-sensitive loss, or focal loss.
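For the supervised framing, the focal loss mentioned above can be sketched for a single example. This is a minimal illustration, assuming a binary setup where p is the predicted probability of the anomaly class and lies strictly between 0 and 1. The function name and the alpha value of 0.75 are illustrative assumptions, not taken from any particular library.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.75):
    """Binary focal loss for one example.

    p: predicted probability of the anomaly class, strictly in (0, 1)
    y: true label, 1 for anomaly and 0 for normal
    gamma: focusing parameter; larger values shrink the loss of easy examples
    alpha: weight on the anomaly class (illustrative value, an assumption)
    """
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class weight for the true class
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy normal example (model is confident and right) versus
# a hard anomaly (model gives the true class only 10 percent).
easy_normal = focal_loss(p=0.05, y=0)
hard_anomaly = focal_loss(p=0.10, y=1)
```

With gamma set to 0 the focusing term disappears and the expression reduces to class-weighted cross-entropy, which is one way to see that focal loss and cost-sensitive loss are closely related.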

A second axis is whether anomalies are point (a single observation), contextual (anomalous only in a specific context such as time of day), or collective (a sequence or group of points that together are abnormal even if individually unremarkable).
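A contextual anomaly can be illustrated by scoring each value against the baseline of its own context rather than the whole dataset. The sketch below assumes hypothetical (context, value) records, such as a reading tagged "day" or "night", and uses a median-based score per context; the 3.5 threshold is a commonly used convention, not a requirement.

```python
import statistics
from collections import defaultdict

def contextual_flags(records, threshold=3.5):
    """Flag values that are unusual for their own context.

    records: list of (context, value) pairs (hypothetical input format).
    Uses the modified Z-score (median and median absolute deviation)
    computed separately for each context.
    """
    by_context = defaultdict(list)
    for context, value in records:
        by_context[context].append(value)

    baselines = {}
    for context, values in by_context.items():
        med = statistics.median(values)
        mad = statistics.median(abs(v - med) for v in values)
        baselines[context] = (med, mad)

    flags = []
    for context, value in records:
        med, mad = baselines[context]
        # A MAD of zero means the context has no spread to compare against.
        flags.append(mad > 0 and 0.6745 * abs(value - med) / mad > threshold)
    return flags

day = [100, 98, 102, 101, 99, 100, 103, 97]
night = [10, 12, 11, 9, 10, 11, 100]
records = [("day", v) for v in day] + [("night", v) for v in night]
flagged = [i for i, f in enumerate(contextual_flags(records)) if f]
```

Here the value 100 is ordinary during the day but is the only record flagged, because it is far from the night-time baseline.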

Algorithms

Classical statistical methods rely on assumptions about the underlying distribution. Z-score, modified Z-score, and Mahalanobis distance flag points beyond a chosen threshold. These methods are interpretable and cheap but degrade quickly in high dimensions.

Distance- and density-based methods such as k-nearest neighbours (kNN), Local Outlier Factor (LOF), and DBSCAN exploit the geometry of the feature space. LOF, introduced by Breunig and colleagues in 2000, remains a standard baseline.
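The simplest member of this family scores each point by the distance to its k-th nearest neighbour. The sketch below is a brute-force illustration of that kNN distance score on made-up 2D points. It is not an implementation of LOF, which additionally compares each point's local density with that of its neighbours.

```python
import math

def knn_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbour.

    Brute force, O(n^2) distance computations; larger score = more isolated.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Five points close together and one far away (illustrative data).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (0.3, 0.0), (5.0, 5.0)]
scores = knn_scores(points, k=2)
most_anomalous = max(range(len(points)), key=scores.__getitem__)
```

The far-away point receives the largest score. In practice a spatial index would replace the brute-force loop, and the choice of k is a tuning decision.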

Isolation Forest, introduced by Liu and colleagues in 2008, builds an ensemble of random trees and scores points by how few splits are needed to isolate them. It scales well and is robust to high dimensionality.
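The isolation idea can be sketched in plain Python. The code below is a simplified illustration, not the reference implementation: the function names, the dictionary-based tree, and the synthetic data are assumptions made for this example. It follows the scoring rule from Liu and colleagues, where the score is 2 raised to the negative mean path length divided by a normalising constant c(n).

```python
import math
import random

def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Split on a random feature at a random threshold until isolated or too deep."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # cannot split on this feature; treat as a leaf (simplification)
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {"feature": feature, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(row, node, depth=0):
    """Number of splits to reach a leaf, plus c(size) for unresolved leaf members."""
    if "feature" not in node:
        return depth + c(node["size"])
    child = node["left"] if row[node["feature"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)

def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Return one score per row in (0, 1]; values near 1 mean easy to isolate."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c(sample_size)
    return [2.0 ** (-sum(path_length(r, t) for t in trees) / n_trees / norm)
            for r in rows]

# Synthetic data: a Gaussian cluster plus one distant point appended last.
data_rng = random.Random(42)
rows = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
rows.append((10.0, 10.0))
scores = anomaly_scores(rows)
```

The distant point is separated from the cluster within a split or two in most trees, so it ends up with the shortest average path and the highest score. The tree count of 100 and subsample size of 256 follow the defaults suggested in the original paper; with only 201 rows here, every tree sees the full dataset.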

One-class support vector machines (OC-SVM) and Support Vector Data Description (SVDD) construct a boundary around the normal data in feature space.

Deep learning approaches have become dominant for high-dimensional, structured, or sequential data. Autoencoders are trained to reconstruct normal inputs; high reconstruction error indicates anomaly. Variational autoencoders, GAN-based detectors, and transformer-based time-series models extend this idea. Graph neural networks are increasingly used for anomaly detection in financial transaction networks where the relational structure carries signal.

Evaluation

Because anomalies are rare, accuracy is misleading — a detector that always predicts "normal" can achieve 99.9 percent accuracy on a dataset with one anomaly in a thousand. The area under the precision-recall curve (AUPRC) and precision at K are typically more informative than the area under the ROC curve (AUROC). In production systems, alert volume and analyst workload are practical constraints: a detector that overwhelms analysts with false positives is unusable regardless of its theoretical score.
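The accuracy trap and the ranking metrics can be checked with a few lines of Python. This is a minimal sketch on made-up labels and scores; average precision is used here as a common step-wise estimate of AUPRC, and the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_at_k(y_true, scores, k):
    """Share of true anomalies among the k highest-scored items."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(y_true[i] for i in ranked[:k]) / k

def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each true anomaly."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(ranked, start=1):
        if y_true[i]:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# One anomaly in a thousand, and a "detector" that always says normal.
labels = [0] * 999 + [1]
always_normal = [0] * 1000
trivial_accuracy = accuracy(labels, always_normal)

# A small ranking example: anomalies sit at indices 1 and 4.
y_true = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
y_score = [0.1, 0.9, 0.8, 0.2, 0.7, 0.3, 0.1, 0.2, 0.4, 0.05]
p_at_3 = precision_at_k(y_true, y_score, k=3)
ap = average_precision(y_true, y_score)
```

The always-normal detector reaches 0.999 accuracy while catching nothing. In the ranking example, two of the top three scores are true anomalies, giving precision at 3 of about 0.67 and average precision of about 0.83. Precision at K maps directly onto analyst workload: K is the number of alerts the team can review.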

Applications

Financial fraud is the most economically significant application. Card-not-present fraud, account takeover, money laundering, and synthetic identity fraud are detected by ensembles that combine rule-based features, supervised classifiers, and unsupervised anomaly scores. According to an industry survey by Feedzai (2025), around 90 percent of global banks use AI and machine learning in fraud prevention as of 2025.

Cybersecurity uses anomaly detection for intrusion detection (network and host), insider threat detection, and malware classification. The signal-to-noise ratio is challenging; modern systems pair anomaly scores with threat intelligence and behaviour analytics.

Predictive maintenance monitors vibration, temperature, current, and acoustic signatures of industrial equipment to predict failures before they occur. Petrochemical, power, and manufacturing operators use these systems to reduce unplanned downtime.

Healthcare applications include early warning scores for clinical deterioration, anomaly detection in ECG and EEG signals, and rare disease screening.

Malaysian Context — anomaly detection in banking, telco, and government

Anomaly detection is one of the most heavily deployed AI techniques in the Malaysian financial sector. Maybank, CIMB, Public Bank, RHB, and Hong Leong Bank all operate fraud-detection systems that combine supervised classifiers with unsupervised anomaly scores, applied to card and online banking transactions. These systems must satisfy Bank Negara Malaysia (BNM) expectations under the Risk Management in Technology (RMiT) policy document and the BNM e-KYC and fraud frameworks. Scoring typically has to run in real time, within the few seconds available before a transaction is authorised.

The rise of authorised push-payment fraud — particularly scam-driven transfers via DuitNow — has made anomaly detection a national-level concern. BNM and PayNet have introduced cooling-off rules and additional friction for high-risk transactions, and banks increasingly score the recipient account in addition to the sender's behaviour. The National Scam Response Centre (NSRC), a multi-agency initiative, coordinates rapid response between banks and the Royal Malaysia Police.

Telecommunications providers including Telekom Malaysia, Maxis, CelcomDigi, and U Mobile use anomaly detection for SIM swap fraud, international revenue share fraud (IRSF), and network anomaly detection. The Malaysian Communications and Multimedia Commission (MCMC) has rules around fraud reporting that drive investment in detection capability.

In cybersecurity, the National Cyber Security Agency (NACSA) and CyberSecurity Malaysia, together with the Cyber999 incident response centre, encourage critical national information infrastructure (CNII) operators to deploy anomaly-based detection in their security operations centres. The Cyber Security Act 2024 codified obligations for CNII operators that effectively require continuous monitoring.

Industrial operators such as Petronas, Tenaga Nasional Berhad (TNB), and the major palm oil mill operators apply anomaly detection for predictive maintenance and process safety. Training in these techniques is supported by HRD Corp and conducted at MDEC's MyDigital Workforce centres and university executive programmes in Penang, Cyberjaya, and Kuala Lumpur.

Limitations

Anomaly detection inherits all the challenges of operating on long-tail data. Concept drift — the gradual change of what "normal" looks like — requires continuous retraining. Adversarial actors actively shape their behaviour to look normal, particularly in fraud and intrusion contexts. Explainability is essential because false positives have real costs (a frozen account, a blocked transaction) but is difficult to produce for deep methods. Many production systems combine multiple detectors with human-in-the-loop review rather than relying on a single model.
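The effect of concept drift can be shown with a toy comparison between a baseline fitted once and a baseline that is refreshed on a rolling window. The sketch below is a deliberately simple stand-in for retraining, on a synthetic series; the window size, threshold, and function names are assumptions made for this illustration.

```python
import statistics
from collections import deque

def static_flags(series, window=10, threshold=3.0):
    """Fit mean and standard deviation once on the first `window` points."""
    mean = statistics.fmean(series[:window])
    std = statistics.pstdev(series[:window])
    return [abs(x - mean) / std > threshold for x in series]

def rolling_flags(series, window=10, threshold=3.0):
    """Score each point against the `window` points immediately before it."""
    history = deque(maxlen=window)
    flags = []
    for x in series:
        if len(history) == window:
            mean = statistics.fmean(history)
            std = statistics.pstdev(history)
            flags.append(std > 0 and abs(x - mean) / std > threshold)
        else:
            flags.append(False)  # not enough history yet
        history.append(x)
    return flags

# Synthetic series: slow upward drift plus alternating +1/-1 noise, no real anomalies.
series = [10 + 0.2 * t + (1 if t % 2 == 0 else -1) for t in range(60)]
static_count = sum(static_flags(series))
rolling_count = sum(rolling_flags(series))
```

On this series the static baseline ends up flagging most of the later points even though nothing abnormal happens, while the rolling baseline flags none. The same adaptivity is also a weakness: an adversary who changes behaviour slowly enough can be absorbed into the baseline, which is one reason production systems pair adaptive detectors with human review.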

References

Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly Detection: A Survey. ACM Computing Surveys.
Breunig, M. M., Kriegel, H.-P., Ng, R. T., & Sander, J. (2000). LOF: Identifying Density-Based Local Outliers. SIGMOD.
Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2008). Isolation Forest. ICDM.
Bank Negara Malaysia. (2023). Risk Management in Technology (RMiT) Policy Document. bnm.gov.my.
Feedzai. (2025). AI Trends in Fraud and Financial Crime Report 2025. feedzai.com.

Tags: anomaly-detection, outlier-detection, fraud-detection, unsupervised-learning, cybersecurity

Type	Machine learning task
Synonyms	Outlier detection, novelty detection
Common approaches	Statistical, distance-based, density-based, isolation, deep autoencoders, graph methods
Learning paradigm	Mostly unsupervised or semi-supervised; sometimes supervised
Key metrics	Precision at K, recall, AUROC, AUPRC
Typical use cases	Fraud detection, intrusion detection, predictive maintenance, healthcare monitoring