Hyperparameter Tuning

The process of selecting optimal configuration values for a machine learning model's external parameters using methods such as grid search, random search, and Bayesian optimisation.

6 min read | Last updated May 2026 | Infrastructure

Hyperparameter tuning, also called hyperparameter optimisation (HPO), is the process of choosing the values of a machine learning model's external configuration (its hyperparameters) to maximise performance on a held-out validation set. Hyperparameters differ from model parameters in that they are set before training rather than learned from data. Examples include the learning rate of a neural network, the maximum depth and minimum leaf size of a decision tree, the number of attention heads in a transformer, and the regularisation strength of a logistic regression. Good hyperparameter choices often matter as much as the choice of model family itself, and HPO is now a routine step in most disciplined machine learning workflows.

Why tuning matters

A poorly tuned model can underperform a much simpler well-tuned model by a large margin. Conversely, the same architecture and dataset can produce widely different results depending on small changes in optimiser learning rate, batch size, or weight decay. Tuning provides a systematic, reproducible procedure for finding good configurations and quantifying how sensitive a model is to its hyperparameters, which is useful both for performance and for robustness assessment.

Main methods

Grid search

Grid search exhaustively evaluates every combination of values from a predefined finite set per hyperparameter. It is simple, embarrassingly parallel, and easy to reason about. Its cost is the product of the number of candidate values per hyperparameter, so it grows exponentially with the number of hyperparameters: five values for each of four hyperparameters already means 5^4 = 625 training runs. This makes it impractical beyond two or three dimensions. Grid search also wastes resources when only a few hyperparameters meaningfully affect performance, because it spends equal effort on irrelevant ones.
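The exhaustive enumeration described above can be sketched in a few lines of Python. This is a minimal illustration, not a production tuner: the search space, the hyperparameter names, and the toy validation_score function are all assumptions made up for the example, and the function stands in for "train the model and score it on the validation set".

```python
import itertools

# Hypothetical search space: each hyperparameter maps to a finite list of candidates.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "max_depth": [2, 4, 8],
    "l2_strength": [0.0, 0.1],
}


def validation_score(config):
    """Toy stand-in for training a model and scoring it on a validation set.

    The score peaks at learning_rate=0.01 and max_depth=4; l2_strength barely matters.
    """
    return (
        -abs(config["learning_rate"] - 0.01) * 10
        - abs(config["max_depth"] - 4) * 0.1
        - config["l2_strength"] * 0.01
    )


def grid_search(space, objective):
    """Evaluate every combination of candidate values and return the best one."""
    names = list(space)
    # Cartesian product: one trial per combination of candidate values.
    configs = [dict(zip(names, values)) for values in itertools.product(*space.values())]
    best = max(configs, key=objective)
    return best, len(configs)


best_config, n_trials = grid_search(search_space, validation_score)
print(n_trials)  # 3 * 3 * 2 = 18 trials
print(best_config)
```

Adding a fourth hyperparameter with three candidate values would triple the number of trials to 54, even if, like l2_strength here, it has almost no effect on the score.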

Random search

Random search samples configurations from specified distributions (uniform, log-uniform, categorical) rather than from a fixed grid. Bergstra and Bengio showed in 2012 that random search reaches comparable or better performance than grid search with far fewer evaluations when only a subset of hyperparameters truly matters, which is typically the case in deep learning. It is widely recommended as the baseline for any tuning study.
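A minimal sketch of random search, assuming a hypothetical three-dimensional search space and a toy objective in which only the learning rate matters. The distributions, hyperparameter names, and validation_score function are illustrative assumptions, not taken from any particular library.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from hypothetical per-hyperparameter distributions."""
    return {
        # Log-uniform on [1e-5, 1e-1]: sample the exponent uniformly.
        "learning_rate": 10 ** rng.uniform(-5, -1),
        # Uniform on [0, 0.5].
        "dropout": rng.uniform(0.0, 0.5),
        # Categorical choice.
        "optimiser": rng.choice(["sgd", "adam"]),
    }


def validation_score(config):
    """Toy objective: only the learning rate matters, and the best value is near 1e-3."""
    return -abs(math.log10(config["learning_rate"]) + 3)


def random_search(objective, n_trials, seed=0):
    """Sample n_trials independent configurations and return the best one."""
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(n_trials)]
    return max(trials, key=objective)


best = random_search(validation_score, n_trials=50)
print(best)
```

Because every trial draws a fresh learning rate, 50 trials test 50 distinct learning rates. A grid of comparable size would test only a handful, repeating each one across combinations of the hyperparameters that do not matter.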

Bayesian optimisation

Bayesian optimisation builds a probabilistic surrogate model of the objective function, usually a Gaussian process or a tree-structured Parzen estimator (TPE), based on past observations. An acquisition function such as expected improvement balances exploration of uncertain regions with exploitation of promising ones, and selects the next configuration to evaluate. Bayesian methods typically converge in far fewer iterations than grid or random search, at the cost of more complex implementation and reduced parallelism. Optuna uses TPE as its default sampler, Hyperopt provides a TPE implementation, and BoTorch provides Gaussian-process-based Bayesian optimisation.
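To make the acquisition step concrete, the sketch below computes the standard closed-form expected improvement for a maximisation problem, assuming the surrogate returns a Gaussian posterior (mean and standard deviation) for each candidate. The candidate names and their predicted values are made up for illustration, and fitting the surrogate itself is out of scope here.

```python
import math


def expected_improvement(mean, std, best_so_far, xi=0.0):
    """Expected improvement over best_so_far under a Gaussian posterior (maximisation).

    xi is an optional exploration margin; xi=0 gives the plain formula.
    """
    improvement = mean - best_so_far - xi
    if std <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + std * pdf


# Hypothetical surrogate predictions (mean, std) for three candidate configurations.
candidates = {
    "A: confident, slightly better": (0.91, 0.01),
    "B: uncertain, similar mean": (0.90, 0.05),
    "C: confident, worse": (0.85, 0.01),
}
best_so_far = 0.90

scores = {
    name: expected_improvement(mean, std, best_so_far)
    for name, (mean, std) in candidates.items()
}
next_candidate = max(scores, key=scores.get)
print(next_candidate)
```

In this example the uncertain candidate B scores highest even though A has the better predicted mean, which is the exploration side of the trade-off. Candidate C is confidently worse than the incumbent and scores close to zero.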

Hyperband and successive halving

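Hyperband, introduced by Li and colleagues in 2017, exploits the fact that many bad configurations can be ruled out after only a few epochs. It builds on successive halving, which allocates a small budget to many configurations, keeps the best fraction, and repeats with a larger budget per survivor (for example, halving the population while doubling the budget). Hyperband runs several such brackets with different trade-offs between the number of configurations and the starting budget, which hedges against configurations that only look good late in training. BOHB combines Bayesian optimisation with Hyperband to inherit the strengths of both.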
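The halving loop at the core of this family of methods can be sketched as follows. This shows a single successive-halving bracket only, not full Hyperband, and it assumes a toy evaluate function whose score approaches a configuration-specific ceiling as the budget grows. The configuration fields and the learning-curve shape are hypothetical.

```python
import random


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the top 1/eta configurations each round and multiply the budget by eta."""
    survivors = list(configs)
    budget = min_budget
    history = []  # (population size, budget per configuration) for each round
    while len(survivors) > 1:
        history.append((len(survivors), budget))
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = ranked[: max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0], history


def evaluate(config, budget):
    """Toy learning curve: the score rises towards config['ceiling'] with more budget."""
    return config["ceiling"] * (1.0 - 0.5 ** budget)


rng = random.Random(0)
configs = [{"id": i, "ceiling": rng.random()} for i in range(16)]

winner, history = successive_halving(configs, evaluate)
print(history)  # [(16, 1), (8, 2), (4, 4), (2, 8)]
print(winner["id"])
```

In a real training loop, evaluate would usually resume each surviving model from its checkpoint rather than retrain from scratch, so the budget values describe cumulative epochs per configuration.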

Evolutionary and population-based methods

Evolutionary algorithms maintain a population of configurations and apply mutation and selection. Population-Based Training (PBT), used at DeepMind for reinforcement learning, trains a population of models in parallel. Periodically, poorly performing members copy the weights and hyperparameters of better-performing members and then perturb the copied hyperparameters. These methods are well suited to long training runs where hyperparameters might benefit from being changed during training rather than fixed up front.
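One copy-and-perturb step of a PBT-style scheme might look like the sketch below. It assumes truncation selection (the bottom quarter copies from the top quarter) and multiplicative perturbation of a single hyperparameter. The member fields, the truncation fraction, and the perturbation factors are illustrative assumptions rather than values prescribed by any particular implementation.

```python
import random


def pbt_step(population, rng, truncation=0.25, perturb_factors=(0.8, 1.2)):
    """Apply one exploit-and-explore step in place.

    Each member is a dict with 'score', 'weights', and 'lr' keys.
    """
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    cutoff = max(1, int(len(ranked) * truncation))
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    for member in bottom:
        donor = rng.choice(top)
        # Exploit: copy the donor's weights.
        member["weights"] = list(donor["weights"])
        # Explore: copy the donor's learning rate and perturb it.
        member["lr"] = donor["lr"] * rng.choice(perturb_factors)
    return population


rng = random.Random(0)
# Hypothetical population of eight members; higher id means higher score here.
population = [
    {"id": i, "lr": 10 ** rng.uniform(-4, -2), "weights": [float(i)], "score": i / 10}
    for i in range(8)
]

pbt_step(population, rng)
for member in population:
    print(member["id"], member["weights"], round(member["lr"], 6))
```

After the step, members 0 and 1 carry the weights of member 6 or 7 and a perturbed copy of that member's learning rate. In a real run, every member would then continue training and be re-scored before the next step.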

Tooling

Open-source frameworks have matured significantly. Optuna and Ray Tune are widely used Python libraries that support distributed execution, pruning, and integration with common ML frameworks. Hyperopt remains popular for TPE-based search. KerasTuner is convenient for Keras users. Weights & Biases Sweeps and MLflow integrate tuning runs with broader experiment tracking. Cloud providers offer managed services including Vertex AI Vizier, Amazon SageMaker Automatic Model Tuning, and Azure ML's hyperparameter tuning.

Practical considerations

Tuning is bounded by compute budget. Practical workflows usually start with a small random search to understand sensitivity, then move to Bayesian methods or Hyperband within the promising region of the space. Search spaces should be specified on the appropriate scale — log-uniform for learning rates and regularisation strengths, integer for layer counts. Early stopping is essential: training every configuration to convergence is rarely affordable. Cross-validation should be used cautiously because it multiplies cost. For very large models such as foundation LLMs, full HPO is impractical and practitioners rely on well-documented community defaults, ablation on smaller proxies, and learning-rate range tests.
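As an illustration of early stopping during a search, the sketch below implements a simple median stopping rule: a running trial is stopped if its latest validation score falls below the median score that earlier trials had reached at the same step. The learning curves are invented, and the warm-up length is an arbitrary assumption.

```python
import statistics


def should_stop(current_curve, finished_curves, warmup_steps=2):
    """Return True if the running trial is below the median of its peers at this step."""
    step = len(current_curve) - 1
    if step < warmup_steps:
        return False  # too early to judge
    peers = [curve[step] for curve in finished_curves if len(curve) > step]
    if not peers:
        return False  # nothing to compare against
    return current_curve[-1] < statistics.median(peers)


# Hypothetical validation-accuracy curves from earlier trials, one value per epoch.
finished = [
    [0.50, 0.60, 0.70, 0.75],
    [0.40, 0.55, 0.65, 0.70],
    [0.45, 0.50, 0.55, 0.60],
]

print(should_stop([0.30, 0.35, 0.40], finished))  # True: 0.40 is below the median 0.65
print(should_stop([0.55, 0.66, 0.72], finished))  # False: 0.72 is above the median 0.65
print(should_stop([0.10], finished))              # False: still in the warm-up period
```

Rules of this kind trade a small risk of discarding slow starters for a large saving in compute, which is why a warm-up period is usually kept.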

Malaysian Context — HPO in local ML practice

Hyperparameter tuning is taught as a standard part of data science curricula at Universiti Malaya, Universiti Sains Malaysia, Universiti Teknologi Malaysia, Multimedia University, and the data science academies funded under MDEC's MyDigital Workforce programme. HRD Corp-claimable courses on applied machine learning consistently include modules on tuning with scikit-learn, Optuna, or KerasTuner.

In industry, Malaysian banks including Maybank, CIMB, and RHB rely on disciplined HPO for credit-risk and fraud models because Bank Negara Malaysia's RMiT policy and model risk expectations require documented evidence that production models have been properly validated. Telecommunications operators such as Telekom Malaysia, Maxis, and CelcomDigi apply HPO to churn, propensity, and network optimisation models, typically on Databricks or self-hosted MLflow installations.

Compute cost is a meaningful constraint for many Malaysian teams. Cloud regions in Singapore and Kuala Lumpur (AWS Asia Pacific, Microsoft Azure Southeast Asia, Google Cloud asia-southeast1, and Alibaba Cloud Kuala Lumpur) all offer managed HPO services, but data residency considerations under the Personal Data Protection Act 2010 (PDPA) sometimes push teams toward on-premises clusters. AITG Sdn Bhd and other AWS Partner Network members frequently advise Malaysian customers on configuring SageMaker Automatic Model Tuning while keeping training data inside compliant regions.

For research, Malaysian university groups participating in MOSTI-funded programmes and in National AI Office Malaysia initiatives use Optuna and Ray Tune extensively for medical imaging, palm oil yield prediction, and traffic optimisation projects. The CoE-AI and several MDEC-supported start-ups have published comparisons of HPO methods on tropical-domain datasets.

Pitfalls

Common pitfalls include tuning on the test set rather than a separate validation set, ignoring random seed variance, choosing search spaces that are too narrow or too wide, and failing to report the search budget alongside the reported results. Reporting only the best run obscures how sensitive a method is to hyperparameter choice; reporting the distribution across trials gives a much more honest picture of robustness.
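A small sketch of the reporting practice described above: summarise the whole distribution of trial scores rather than quoting only the best run. The scores are invented for illustration, and the choice of summary statistics is one reasonable option among several.

```python
import statistics


def summarise_trials(scores):
    """Summarise validation scores across all trials of a search."""
    q1, median, q3 = statistics.quantiles(scores, n=4)
    return {
        "n_trials": len(scores),  # the search budget, reported with the results
        "best": max(scores),
        "median": median,
        "interquartile_range": q3 - q1,
        "worst": min(scores),
    }


# Hypothetical validation accuracies from eight trials of one search.
trial_scores = [0.81, 0.84, 0.79, 0.90, 0.83, 0.82, 0.85, 0.80]

summary = summarise_trials(trial_scores)
for key, value in summary.items():
    print(key, round(value, 3))
```

Here the best run (0.90) sits well above the median (0.825), which suggests the headline number depends heavily on a lucky configuration or seed. The same summary can be applied to repeated runs of one configuration with different random seeds to measure seed variance.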

References

Bergstra, J., & Bengio, Y. (2012). Random Search for Hyper-Parameter Optimization. JMLR.
Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian Optimization of Machine Learning Algorithms. NeurIPS.
Li, L. et al. (2017). Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. JMLR.
Akiba, T. et al. (2019). Optuna: A Next-Generation Hyperparameter Optimization Framework. KDD.
Bank Negara Malaysia. (2023). Risk Management in Technology (RMiT) Policy Document. bnm.gov.my.

Tags: hyperparameter-tuning optimisation automl bayesian-optimisation machine-learning

Type	Model selection and optimisation
Synonyms	Hyperparameter optimisation, HPO
Main methods	Grid search, random search, Bayesian optimisation, evolutionary, Hyperband, BOHB
Common libraries	Optuna, Ray Tune, Hyperopt, scikit-learn, KerasTuner
Key trade-off	Search cost vs. final model performance
Related field	AutoML and neural architecture search