Support Vector Machine

A support vector machine (SVM) is a supervised machine learning algorithm that finds the optimal hyperplane separating data points of different classes by maximising the margin between the boundary and the nearest training examples.

7 min read | Last updated May 2026 | Foundations

A support vector machine (SVM) is a supervised machine learning algorithm used for classification and regression tasks. Introduced by Vladimir Vapnik and colleagues at Bell Laboratories in the early 1990s, SVMs became one of the most widely applied machine learning algorithms in the decade preceding the deep learning era, and remain practically relevant today for tabular data problems, text classification, and scenarios where training data is limited.[^1] The core principle of an SVM is to identify the decision boundary — a hyperplane in the feature space — that most cleanly separates data points belonging to different classes, with the maximum possible margin between the boundary and the closest training examples from each class.

Geometric Intuition

To understand SVMs, it is helpful to begin with a two-dimensional example. Consider a set of data points in a plane, each coloured either red or blue, and the task of drawing a line that separates all red points from all blue points. Many such lines may exist. An SVM selects the line that is as far as possible from both the nearest red point and the nearest blue point; that is, the line that maximises the margin between the two classes. The data points lying exactly on the margin boundaries are called support vectors, and they are the only training examples that determine the position of the decision boundary: removing any other training point would leave the boundary unchanged.[^2]
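The comparison between candidate lines can be sketched in a few lines of Python. This is only an illustration under assumptions: the four toy points, the two candidate lines, and the helper names `signed_distance` and `margin` are hypothetical and do not come from any library.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, points, labels):
    """Smallest label-adjusted distance; negative if any point is misclassified."""
    return min(y * signed_distance(w, b, x) for x, y in zip(points, labels))

# Hypothetical toy data: red = -1, blue = +1
points = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 3.0)]
labels = [-1, -1, 1, 1]

# Two candidate lines, each given as (w, b); both separate the classes
line_a = ((1.0, 0.0), -3.0)   # the vertical line x1 = 3
line_b = ((1.0, 1.0), -5.0)   # the diagonal line x1 + x2 = 5

margin_a = margin(*line_a, points, labels)
margin_b = margin(*line_b, points, labels)

print(f"margin of line A: {margin_a:.3f}")
print(f"margin of line B: {margin_b:.3f}")
```

Both lines classify every point correctly, but line B keeps a larger distance to its nearest points, (2, 1) and (4, 3), which play the role of support vectors in this toy example. An SVM would therefore prefer line B over line A.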

In a higher-dimensional feature space (say, ten features rather than two), the decision boundary is a hyperplane: a flat surface with one dimension fewer than the feature space. The maximum-margin hyperplane is found by solving a constrained quadratic optimisation problem. Its dual formulation depends on the training examples only through dot products between pairs of examples, a property that becomes critical when the kernel trick is applied.
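For readers who want to see the optimisation problem written out, the standard textbook hard-margin formulation is sketched below. The notation is an assumption introduced here, not taken from the article: training examples $x_i$ with labels $y_i \in \{-1, +1\}$, weight vector $w$, offset $b$, and dual variables $\alpha_i$.

```latex
% Primal problem: maximise the margin 1/||w|| by minimising ||w||^2
\min_{w,\,b} \; \tfrac{1}{2}\,\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1 \quad \text{for all } i

% Dual problem: the data enter only through the dot products x_i . x_j
\max_{\alpha} \; \sum_{i} \alpha_i
  - \tfrac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j \, y_i y_j \, (x_i \cdot x_j)
\quad \text{subject to} \quad
\alpha_i \ge 0, \qquad \sum_{i} \alpha_i y_i = 0

% Recovering the hyperplane: only examples with alpha_i > 0 contribute
w = \sum_{i} \alpha_i \, y_i \, x_i
```

The examples with $\alpha_i > 0$ are the support vectors, and the term $x_i \cdot x_j$ is the only place where the training data appear in the dual.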

The Kernel Trick

Many real-world datasets are not linearly separable — no hyperplane can cleanly divide the two classes in the original feature space. The kernel trick addresses this by implicitly projecting data into a higher-dimensional space where linear separation becomes possible, without explicitly computing the coordinates in that higher space. This is achieved by replacing the dot products in the SVM optimisation with a kernel function that computes similarity between pairs of points.[^3]
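A small numerical check can make the idea concrete. The sketch below assumes a homogeneous degree-2 polynomial kernel on two-dimensional inputs; the function names `poly2_kernel` and `phi` and the sample points are hypothetical, chosen only for illustration.

```python
import math

def dot(u, v):
    """Ordinary dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: (x . z)^2."""
    return dot(x, z) ** 2

def phi(x):
    """Explicit 3-D feature map for 2-D input that matches poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, z = (1.0, 2.0), (3.0, -1.0)

implicit = poly2_kernel(x, z)    # computed entirely in the original 2-D space
explicit = dot(phi(x), phi(z))   # computed after mapping into the 3-D space

print(implicit, explicit)
```

Both routes give the same similarity value (up to floating-point rounding), but the kernel never constructs the higher-dimensional coordinates. For kernels whose feature space is very large or infinite-dimensional, only the implicit route is practical.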

Common kernel functions include the Radial Basis Function (RBF, also known as the Gaussian kernel), the polynomial kernel, and the sigmoid kernel. The RBF kernel is the most widely used default because it introduces a notion of local similarity (points that are nearby in the original feature space have high kernel values) and produces smooth, non-linear decision boundaries. By selecting an appropriate kernel, SVMs can model highly complex class boundaries while still finding the maximum-margin hyperplane in the feature space induced by the kernel.
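The local-similarity behaviour of the RBF kernel is easy to see numerically. The following is a minimal sketch using the common form exp(-gamma * ||x - z||^2); the width parameter `gamma`, its default of 0.5, and the sample points are assumptions made for this example.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

anchor = (0.0, 0.0)

same = rbf_kernel(anchor, anchor)       # identical points
near = rbf_kernel(anchor, (0.1, 0.1))   # a close neighbour
far = rbf_kernel(anchor, (3.0, 3.0))    # a distant point

print(f"same={same:.4f} near={near:.4f} far={far:.6f}")
```

Identical points score exactly 1, close neighbours score just below 1, and distant points score close to 0. A larger `gamma` makes the similarity fall off faster, which tends to produce more tightly curved decision boundaries.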

Soft Margin and Regularisation

The formulations described above assume that perfect separation is possible. In practice, real datasets contain noise and overlapping classes, and insisting on perfect separation leads to overfitting. The soft-margin SVM introduces a regularisation parameter C that controls the trade-off between maximising the margin and minimising training errors. A small value of C allows more training examples to fall on the wrong side of the margin (a wider, more tolerant margin), reducing overfitting at the cost of some training accuracy. A large value of C penalises misclassifications heavily, producing a narrower margin that fits the training data more tightly. Selecting the optimal C value is typically done via cross-validation.[^4]
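The effect of C can be illustrated by evaluating the usual soft-margin objective, 0.5 * ||w||^2 + C * (sum of hinge losses), for two candidate classifiers. Everything specific below is a hypothetical setup for illustration: the one-dimensional toy data with one noisy positive example, the two hand-picked (w, b) pairs, and the function name `soft_margin_objective`.

```python
def soft_margin_objective(w, b, points, labels, C):
    """Soft-margin SVM objective: 0.5 * ||w||^2 + C * sum of hinge losses."""
    regulariser = 0.5 * sum(wi * wi for wi in w)
    hinge_total = sum(
        max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, y in zip(points, labels)
    )
    return regulariser + C * hinge_total

# Hypothetical 1-D data; the positive example at 1.5 sits close to the negatives
points = [(0.0,), (1.0,), (1.5,), (4.0,), (5.0,)]
labels = [-1, -1, 1, 1, 1]

wide = ((1.0,), -2.5)    # wide margin, misclassifies the noisy point
tight = ((4.0,), -5.0)   # narrow margin, classifies every point correctly

for C in (0.1, 100.0):
    wide_cost = soft_margin_objective(*wide, points, labels, C)
    tight_cost = soft_margin_objective(*tight, points, labels, C)
    preferred = "wide" if wide_cost < tight_cost else "tight"
    print(f"C={C}: wide={wide_cost:.2f}, tight={tight_cost:.2f}, preferred={preferred}")
```

With the small C, the wide-margin classifier has the lower objective even though it misclassifies the noisy point; with the large C, the penalty on that point dominates and the narrow-margin classifier wins. A real solver searches over all (w, b) rather than comparing two fixed candidates.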

SVM for Regression

The standard SVM was designed for binary classification, but its principles extend to regression tasks under the name Support Vector Regression (SVR). In SVR, the goal is to find a function that predicts continuous output values while remaining within a specified tolerance band around the training targets. Data points outside the band contribute to the loss; those inside do not. This produces regression models that are insensitive to small errors, potentially offering better generalisation than least-squares regression in noisy settings.
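The tolerance band corresponds to what is usually called the epsilon-insensitive loss. The sketch below is an illustration only: the parameter name `epsilon`, its value of 0.5, and the sample targets and predictions are assumptions, not values from the article.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.5):
    """Zero inside the tolerance band, linear in the excess error outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Hypothetical targets and model predictions
targets = [2.0, 3.0, 5.0]
predictions = [2.3, 3.0, 6.2]

losses = [epsilon_insensitive_loss(t, p) for t, p in zip(targets, predictions)]
print(losses)
```

The first two predictions fall within 0.5 of their targets and incur no loss, while the third misses by 1.2 and is penalised only for the 0.7 that exceeds the band. A least-squares loss, by contrast, would penalise all non-zero errors.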

SVMs vs. Deep Learning

SVMs were the dominant algorithm for many classification tasks throughout the 1990s and 2000s, particularly for text categorisation, image recognition, and bioinformatics. With the rise of deep learning from 2012 onward, neural networks superseded SVMs on large-scale perceptual tasks involving raw images, audio, and text. However, SVMs retain advantages in settings with small or medium-sized tabular datasets, where deep learning models tend to overfit. SVMs have also proven valuable in high-stakes domains such as medical diagnosis, where the kernel trick provides a theoretically grounded way to handle feature spaces with hundreds of measured variables but only thousands of patient samples.

| Dimension | SVM | Deep Learning |
|---|---|---|
| Data requirements | Works well with limited data | Typically requires large datasets |
| Interpretability | Moderate (support vectors identifiable) | Low (black box) |
| Scalability | Slow on very large datasets | Scales well with data and compute |
| Feature engineering | Often requires manual features | Learns features automatically |
| Best use case | Tabular data, small datasets | Images, audio, text at scale |

Malaysian Context — SVM in Research and Industry

Support vector machines have been widely used in Malaysian academic and industry research across several domains. The Malaysian medical research community has applied SVMs to clinical prediction tasks including early detection of diseases such as diabetes and cardiovascular conditions in studies conducted at Hospital Kuala Lumpur and in collaboration with Universiti Malaya Medical Centre. These studies typically involve tabular patient data — blood test results, vital signs, and demographic variables — where SVM's ability to generalise from small, well-structured datasets is an advantage over deep learning.

In agriculture, Universiti Putra Malaysia (UPM) and the Malaysian Palm Oil Board (MPOB) have published research applying SVMs to palm oil quality grading, disease detection in oil palm plantations, and yield forecasting. These applications process hyperspectral imaging and sensor data, where SVMs with RBF kernels provide competitive accuracy with lower computational requirements than deep learning pipelines — important for deployment in rural plantation environments with limited compute infrastructure.

Malaysia's financial sector has used SVM-based models for credit scoring and fraud detection. CIMB and RHB have published research on hybrid SVM-ensemble models for retail credit risk assessment, where regulatory requirements for model explainability under BNM's risk-based guidelines favour algorithms with interpretable decision boundaries over opaque neural networks. The ability to identify and examine support vectors — the specific customer cases that define the classification boundary — provides a degree of auditability that satisfies internal model risk management standards.

The MDEC-supported AI talent pipeline through Malaysian universities includes SVM in foundational machine learning curricula across institutions including UTM, UKM, and UTAR, ensuring that Malaysian data scientists develop familiarity with classical algorithms before specialising in deep learning. HRD Corp has funded short courses covering SVM implementation using Scikit-learn, targeted at working professionals in manufacturing and financial services.

References

[^1]: Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.
[^2]: Burges, C. J. C. (1998). A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(2), 121–167.
[^3]: Schölkopf, B., & Smola, A. J. (2002). Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press.
[^4]: Hsu, C.-W., Chang, C.-C., & Lin, C.-J. (2016). A Practical Guide to Support Vector Classification. National Taiwan University Technical Report.

Tags: support-vector-machine svm classification supervised-learning

Type	Supervised learning algorithm
Tasks	Classification, regression, anomaly detection
Introduced	1992–1995 (Vapnik et al.)
Key concept	Maximum-margin hyperplane, kernel trick
Related	Gradient boosting, random forest, neural network, kernel methods