Gaussian Process

A non-parametric Bayesian model that defines a distribution over functions, widely used in regression, optimisation, and uncertainty quantification.

6 min read | Last updated June 2026 | Foundations

A Gaussian process (GP) is a stochastic process in which any finite collection of random variables has a joint multivariate Gaussian distribution. Treated as a prior over functions, a GP provides a principled non-parametric Bayesian framework for regression and classification in which predictions come equipped with calibrated uncertainty estimates. GPs are fully specified by a mean function, which encodes prior beliefs about the average behaviour of the unknown function, and a covariance function, or kernel, which encodes assumptions about smoothness, periodicity, and length-scale. GPs are foundational tools in geostatistics, where they appear under the name kriging, and in modern Bayesian machine learning.

Mathematical formulation

A function f(x) is said to be drawn from a Gaussian process if, for any finite set of input points x_1 through x_N, the random vector of values f(x_1) through f(x_N) follows a multivariate Gaussian distribution. The distribution is determined by a mean function m(x) and a covariance kernel k(x, x') that returns the covariance between f(x) and f(x'). Conditioning the prior on observed training data yields a posterior that is again Gaussian and that gives, in closed form, the predictive mean and variance at any new input. The posterior mean interpolates the observations smoothly, and the predictive variance shrinks near observed points and grows in regions where the model is uncertain.
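The closed form mentioned above can be written out explicitly. The following is the standard textbook statement (see Rasmussen and Williams, 2006) under one assumption that the paragraph does not spell out: the observations y_i are the function values corrupted by independent Gaussian noise with variance \sigma_n^2. The symbols K, \mathbf{k}_*, \mathbf{m}, and \sigma_n^2 are notation introduced here for illustration.

```latex
\begin{aligned}
y_i &= f(x_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma_n^2), \\
K_{ij} &= k(x_i, x_j), \qquad
\mathbf{k}_* = \big[k(x_1, x_*), \dots, k(x_N, x_*)\big]^\top, \qquad
\mathbf{m} = \big[m(x_1), \dots, m(x_N)\big]^\top, \\
\mu_*(x_*) &= m(x_*) + \mathbf{k}_*^\top \left(K + \sigma_n^2 I\right)^{-1} (\mathbf{y} - \mathbf{m}), \\
\sigma_*^2(x_*) &= k(x_*, x_*) - \mathbf{k}_*^\top \left(K + \sigma_n^2 I\right)^{-1} \mathbf{k}_* .
\end{aligned}
```

With \sigma_n^2 = 0 the predictive mean passes exactly through the observations and the predictive variance is zero at every training input. Far from the data \mathbf{k}_* tends to zero for stationary kernels such as the squared exponential, so the prediction reverts to the prior mean m(x_*) with prior variance k(x_*, x_*).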

Covariance kernels

The kernel is the central modelling choice in a GP. The most common kernel is the squared exponential, also called the radial basis function, which produces infinitely differentiable sample paths and is parameterised by a length-scale and a signal variance. The Matérn family generalises the squared exponential and allows control over function smoothness via a parameter that interpolates between exponential and infinitely smooth kernels. Periodic kernels capture repeating structure, linear kernels recover Bayesian linear regression, and composite kernels formed by sums and products allow rich functional forms. Automatic relevance determination assigns a separate length-scale to each input dimension, providing a soft form of feature selection.
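As a rough illustration of the kernels described above, the following NumPy sketch implements a squared exponential, a Matérn kernel with smoothness 3/2, a periodic kernel, and one composite built from a sum and a product. The function names, default parameter values, and the particular composite are hypothetical choices made for this example and do not come from any specific library.

```python
import numpy as np

def scaled_distance(X1, X2, lengthscale):
    """Pairwise Euclidean distance after dividing each dimension by its length-scale.

    X1 has shape (N, D), X2 has shape (M, D). A scalar lengthscale is shared by all
    dimensions; an array of shape (D,) gives automatic relevance determination.
    """
    A = np.asarray(X1, dtype=float) / lengthscale
    B = np.asarray(X2, dtype=float) / lengthscale
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negatives caused by round-off

def squared_exponential(X1, X2, lengthscale=1.0, variance=1.0):
    r = scaled_distance(X1, X2, lengthscale)
    return variance * np.exp(-0.5 * r**2)

def matern32(X1, X2, lengthscale=1.0, variance=1.0):
    """Matern kernel with smoothness 3/2 (once-differentiable sample paths)."""
    r = scaled_distance(X1, X2, lengthscale)
    return variance * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def periodic(X1, X2, period=1.0, lengthscale=1.0, variance=1.0):
    """Periodic kernel for one-dimensional inputs of shape (N, 1)."""
    x1 = np.asarray(X1, dtype=float)[:, 0]
    x2 = np.asarray(X2, dtype=float)[:, 0]
    d = np.abs(x1[:, None] - x2[None, :])
    return variance * np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / lengthscale**2)

def trend_plus_seasonal(X1, X2):
    """Hypothetical composite: smooth trend plus a periodic part whose shape drifts."""
    trend = squared_exponential(X1, X2, lengthscale=2.0)
    seasonal = periodic(X1, X2, period=1.0) * matern32(X1, X2, lengthscale=3.0)
    return trend + seasonal

def sample_prior(kernel, X, n_samples=3, jitter=1e-6, seed=0):
    """Draw zero-mean prior functions at the rows of X; columns are samples."""
    rng = np.random.default_rng(seed)
    K = kernel(X, X) + jitter * np.eye(len(X))  # jitter keeps the Cholesky stable
    L = np.linalg.cholesky(K)
    return L @ rng.standard_normal((len(X), n_samples))
```

Calling `sample_prior(matern32, np.linspace(0, 5, 30)[:, None])` returns three prior draws evaluated on a grid; swapping in `squared_exponential` gives visibly smoother curves. Passing an array such as `lengthscale=np.array([1.0, 1e6])` to the first two kernels makes the second input dimension almost irrelevant, which is the behaviour automatic relevance determination relies on.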

Inference and scalability

Exact GP inference requires solving linear systems with the N by N kernel matrix, where N is the number of training observations. This is usually done through a Cholesky factorisation and costs cubic time and quadratic memory in N, which becomes prohibitive beyond a few thousand to a few tens of thousands of points, depending on hardware. A large body of approximate methods has been developed to extend GPs to larger datasets, including sparse approximations based on inducing points (such as the FITC and VFE methods), variational inference, structured kernel interpolation, and stochastic variational GPs that admit minibatch training. Deep kernel learning combines GPs with neural network feature extractors, and libraries such as GPyTorch and GPJax provide GPU-accelerated implementations.
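To make the cost of exact inference concrete, here is a minimal NumPy sketch of GP regression using a Cholesky factorisation of the kernel matrix. It assumes a zero prior mean, a squared exponential kernel, and Gaussian observation noise; the function names `rbf`, `gp_fit`, and `gp_predict` are hypothetical and chosen only for this example.

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared exponential kernel between rows of X1 (N, D) and X2 (M, D)."""
    sq = (X1**2).sum(axis=1)[:, None] + (X2**2).sum(axis=1)[None, :] - 2.0 * X1 @ X2.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_fit(X, y, noise_var=1e-4, lengthscale=1.0, variance=1.0):
    """Factorise the noisy kernel matrix once; this is the O(N^3) step."""
    K = rbf(X, X, lengthscale, variance) + noise_var * np.eye(len(X))  # O(N^2) memory
    L = np.linalg.cholesky(K)
    # alpha = (K + noise_var * I)^{-1} y, obtained from two triangular systems
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return L, alpha

def gp_predict(X, L, alpha, X_new, lengthscale=1.0, variance=1.0):
    """Predictive mean and variance of the latent function at the rows of X_new."""
    K_s = rbf(X, X_new, lengthscale, variance)   # shape (N, M)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)                  # shape (N, M)
    prior_var = np.full(len(X_new), variance)    # k(x, x) for a stationary kernel
    var = prior_var - (v**2).sum(axis=0)
    return mean, np.maximum(var, 0.0)            # clip round-off negatives
```

Once `L` is cached, each new prediction avoids refactorising the matrix. The sketch calls `np.linalg.solve` to stay dependency-light; in practice a triangular solver such as `scipy.linalg.solve_triangular` or `scipy.linalg.cho_solve` would exploit the structure of `L`.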

Bayesian optimisation

One of the most influential applications of GPs is Bayesian optimisation, in which a GP surrogate of an expensive black-box objective function is updated as evaluations are observed, and an acquisition function, such as expected improvement or upper confidence bound, selects the next query point by trading exploration against exploitation. Bayesian optimisation has become a standard method for hyperparameter tuning of machine learning models and is also used in materials discovery, A/B test design, and experimental optimisation in chemistry and biology. Frameworks such as BoTorch, GPyOpt, and Vizier implement GP-based Bayesian optimisation at scale.
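The loop described above can be sketched in a few lines of NumPy. This is an illustrative toy rather than how BoTorch, GPyOpt, or Vizier work internally: it assumes a one-dimensional search space discretised on a grid, a zero-mean GP with a fixed squared exponential kernel, minimisation of the objective, and expected improvement as the acquisition function. The `objective` function is a hypothetical stand-in for an expensive black box.

```python
import math
import numpy as np

def rbf(a, b, lengthscale=0.3):
    """Unit-variance squared exponential kernel for 1-D arrays a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)

def gp_posterior(x_obs, y_obs, x_new, noise_var=1e-6):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    K = rbf(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_s = rbf(x_obs, x_new)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value, for minimisation."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return np.maximum((best - mean) * cdf + std * pdf, 0.0)

def objective(x):
    """Hypothetical stand-in for an expensive black-box function."""
    return np.sin(3.0 * x) + 0.5 * x

def bayes_opt(n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 3.0, 301)           # candidate query points
    x_obs = rng.uniform(0.0, 3.0, size=3)       # small random initial design
    y_obs = objective(x_obs)
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        ei = expected_improvement(mean, std, y_obs.min())
        x_next = grid[np.argmax(ei)]            # exploration vs exploitation
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    return x_obs, y_obs
```

Expected improvement is large where the surrogate mean is low (exploitation) or where the predictive standard deviation is high (exploration). Production systems typically also re-estimate kernel hyperparameters at each step and optimise the acquisition function continuously instead of on a fixed grid.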

Other applications

In geostatistics, GPs known as kriging models are used to interpolate spatial fields such as mineral concentrations, rainfall, and temperature. In robotics, GPs are used for inverse dynamics modelling and trajectory optimisation. In aerospace and engineering, GPs serve as surrogate models for expensive computer simulations. In epidemiology and finance, GPs provide flexible models for time series with calibrated uncertainty. GP classification, while less tractable than regression because of non-Gaussian likelihoods, is widely used in Bayesian deep learning research and in active learning settings.

Relationship to neural networks

Gaussian processes have a deep connection to neural networks. Radford Neal showed in the 1990s that an infinitely wide single-hidden-layer neural network with Gaussian-distributed weights converges to a GP, an observation later extended to infinitely wide deep networks under the neural network Gaussian process and neural tangent kernel theories. These results give GPs an important theoretical role in understanding the behaviour of large neural networks and in providing tractable analogues for analysis.
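The infinite-width limit can be probed numerically. The sketch below samples many random single-hidden-layer networks and measures how close their output at a fixed input is to a Gaussian as the width grows. The tanh activation, the standard normal weights and biases, and the 1/sqrt(width) output scaling are assumptions made for this illustration, not details taken from Neal's original construction.

```python
import numpy as np

def sample_network_outputs(x, width, n_nets, rng):
    """Outputs at scalar input x of n_nets random one-hidden-layer tanh networks."""
    w = rng.standard_normal((n_nets, width))   # input-to-hidden weights
    b = rng.standard_normal((n_nets, width))   # hidden biases
    v = rng.standard_normal((n_nets, width))   # hidden-to-output weights
    hidden = np.tanh(w * x + b)
    # Scaling by 1/sqrt(width) keeps the output variance fixed as width grows
    return (v * hidden).sum(axis=1) / np.sqrt(width)

def excess_kurtosis(samples):
    """Zero for a Gaussian; positive values indicate heavier tails."""
    z = (samples - samples.mean()) / samples.std()
    return float(np.mean(z**4) - 3.0)

rng = np.random.default_rng(0)
narrow = sample_network_outputs(1.0, width=1, n_nets=10_000, rng=rng)
wide = sample_network_outputs(1.0, width=128, n_nets=10_000, rng=rng)

print("excess kurtosis, width 1:  ", excess_kurtosis(narrow))
print("excess kurtosis, width 128:", excess_kurtosis(wide))
print("output variance, width 128:", wide.var())
```

Under these assumptions the output variance is the same at every width and equals the kernel value k(x, x) = E[tanh(w x + b)^2], while the excess kurtosis shrinks roughly in proportion to 1/width. That shrinkage is the central limit effect behind the convergence to a GP.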

Malaysian Context — GPs in Local Research and Applied Science

Gaussian processes feature in Malaysian research across geostatistics, environmental science, and engineering. The Malaysian Meteorological Department (MetMalaysia) and academic groups at Universiti Putra Malaysia (UPM), Universiti Sains Malaysia (USM), and Universiti Teknologi Malaysia (UTM) have used kriging and GP regression for spatial interpolation of rainfall, air quality, and haze prediction — particularly relevant in the South-East Asian transboundary haze context. Petronas and its research arm Petronas Research Sdn Bhd have applied GP-based surrogate modelling in reservoir engineering, drilling optimisation, and process simulation, where evaluating high-fidelity simulators is expensive.

In agriculture, GP regression and Bayesian optimisation have been used by oil palm and rubber researchers, including teams at the Malaysian Palm Oil Board (MPOB) and the Malaysian Rubber Board, for yield modelling and trial design under the constraints of slow biological feedback. Malaysian universities also use Bayesian optimisation for hyperparameter tuning in their machine learning curricula, supported by HRD Corp claimable training programmes and AI courses delivered through MDEC-recognised digital hubs in Cyberjaya and TechCity Kuala Lumpur.

In finance, Bank Negara Malaysia (BNM) and individual banks such as Maybank, CIMB, RHB, and Hong Leong Bank have explored Bayesian and GP-based risk models for stress testing and credit scoring, although these methods compete with gradient boosting and deep learning in production. The Securities Commission Malaysia (SC) treats GP-based models within its general guidance on model risk management, requiring banks and capital market institutions to document assumptions, validate outputs, and manage data governance under the Personal Data Protection Act and the BNM Risk Management in Technology framework.

References

Rasmussen, C. E., and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. MIT Press.
Neal, R. M. (1996). Bayesian Learning for Neural Networks. Springer.
Snoek, J., Larochelle, H., and Adams, R. P. (2012). Practical Bayesian Optimization of Machine Learning Algorithms. NeurIPS.
Hensman, J., Fusi, N., and Lawrence, N. D. (2013). Gaussian Processes for Big Data. UAI.

Tags: gaussian-process, bayesian-machine-learning, regression, uncertainty-quantification

Type	Non-parametric Bayesian model
Defined by	Mean function and covariance kernel
Complexity	O(N^3) exact training; per test point, O(N) predictive mean and O(N^2) predictive variance
Key use	Regression, Bayesian optimisation, geostatistics
Related	Bayesian Inference, Kernel methods, Bayesian Neural Networks
