AIWiki
Malaysia

All Articles

29 articles in this section

Infrastructure

AutoML

AutoML (Automated Machine Learning) is the process of automating the selection, composition, and tuning of machine learning algorithms and pipelines, enabling practitioners to build effective models with reduced manual effort.

6 min read · Updated May 2026

Core ML

Core ML is Apple's on-device machine learning framework that enables iOS, macOS, watchOS, and tvOS applications to integrate pre-trained models for tasks including image classification, natural language processing, and sound analysis.

5 min read · Updated June 2026

CUDA

NVIDIA's parallel computing platform and programming model that lets developers use GPUs for general-purpose computation, underpinning most modern deep learning frameworks.

4 min read · Updated May 2026

Data Augmentation

A set of techniques that expand a training dataset by creating modified copies of existing examples, helping deep learning models generalise better and reducing overfitting.

4 min read · Updated May 2026

Data Labelling

Data labelling is the process of attaching meaningful tags, classes, or annotations to raw data so that supervised machine learning models can learn to predict those labels on unseen examples.

5 min read · Updated June 2026

Data Pipeline

A data pipeline is an automated sequence of processes that ingests, transforms, and delivers data from source systems to destination systems for analysis, machine learning, or operational use.

6 min read · Updated June 2026

DataOps

DataOps is an engineering methodology that applies agile, DevOps, and lean manufacturing principles to data pipelines, aiming for rapid, reliable, and repeatable delivery of analytics and machine learning data.

4 min read · Updated June 2026

Edge AI

Edge AI is the deployment of artificial intelligence algorithms and inference workloads directly on local devices or edge computing nodes rather than in centralised cloud data centres, enabling low-latency, privacy-preserving, and bandwidth-efficient AI applications.

7 min read · Updated May 2026

Feature Store

A centralised data platform for storing, serving, and managing machine learning features so that they can be reused consistently across training and online inference.

5 min read · Updated May 2026

Hyperparameter Tuning

The process of selecting well-performing values for a machine learning model's hyperparameters, the configuration settings fixed before training rather than learned from data, using methods such as grid search, random search, and Bayesian optimisation.

6 min read · Updated May 2026

Inference (Machine Learning)

Inference is the phase in which a trained machine learning model is used to generate predictions or outputs from new input data, distinct from the earlier training phase.

5 min read · Updated May 2026

Knowledge Distillation

Knowledge distillation is a model compression technique in which a smaller student neural network is trained to replicate the behaviour of a larger, more capable teacher model, enabling deployment of efficient models that approximate teacher-level performance.

6 min read · Updated May 2026

LangChain

LangChain is an open-source framework for building applications powered by large language models, providing composable abstractions for chaining LLM calls with tools, memory, and data retrieval in Python and JavaScript.

6 min read · Updated May 2026

Langfuse

Langfuse is an open-source LLM engineering platform that provides observability, tracing, prompt management, evaluation, and dataset tooling for teams building applications on top of large language models.

6 min read · Updated June 2026

MLflow

An open-source platform for managing the end-to-end machine learning lifecycle, including experiment tracking, model packaging, a model registry, and deployment.

5 min read · Updated May 2026

MLOps

A set of practices and tools that combine machine learning, DevOps, and data engineering to automate and operationalise the full lifecycle of ML models from development through production deployment and monitoring.

7 min read · Updated May 2026

Model Compression

Model compression is a set of techniques that reduce the size, memory footprint, and computational cost of machine learning models while largely preserving predictive accuracy, enabling deployment on resource-constrained hardware.

6 min read · Updated June 2026

Model Pruning

A model compression technique that removes redundant or low-importance parameters from a neural network to reduce size, memory footprint, and inference latency while largely preserving accuracy.

6 min read · Updated June 2026

Model Registry

A model registry is a centralised system that catalogues, versions, and governs trained machine learning models throughout their lifecycle, supporting reproducibility, deployment, and compliance.

5 min read · Updated June 2026

Model Serving

Model serving is the discipline of deploying trained machine learning models behind APIs or runtimes so that production applications can request predictions at scale with predictable latency, throughput, and reliability.

5 min read · Updated May 2026

Neural Architecture Search

Neural architecture search is the automated design of neural network architectures using search algorithms, reinforcement learning, or gradient-based methods to discover models that meet target accuracy, latency, and size constraints.

5 min read · Updated May 2026

ONNX (Open Neural Network Exchange)

An open standard format for representing machine learning models that enables interoperability between deep learning frameworks, runtimes, and hardware platforms.

5 min read · Updated May 2026

OpenVINO

OpenVINO is an open-source toolkit developed by Intel for optimising and deploying deep learning inference across Intel hardware, including CPUs, GPUs, Neural Processing Units, and FPGAs, with broad support for major AI frameworks and model formats.

6 min read · Updated June 2026

Parameter-Efficient Fine-Tuning

A family of techniques that adapts a pretrained language or vision model to a downstream task by training only a small fraction of its parameters, dramatically reducing compute, memory, and storage requirements compared to full fine-tuning.

5 min read · Updated May 2026

Quantisation

Quantisation is a model compression technique that reduces the numerical precision of a neural network's weights and activations from higher-precision floating-point formats, such as 32-bit floats, to lower-bit representations, such as 8-bit integers, decreasing memory usage and accelerating inference, usually with only a small loss of accuracy.

7 min read · Updated May 2026

Synthetic Data

Synthetic data is artificially generated data that mimics the statistical properties of real datasets, created using generative AI or simulations to train machine learning models without exposing sensitive personal information.

6 min read · Updated May 2026

Tensor Processing Unit

A tensor processing unit (TPU) is a custom application-specific integrated circuit developed by Google for accelerating machine learning workloads, particularly neural network training and inference.

4 min read · Updated May 2026

TensorFlow Lite

TensorFlow Lite is an open-source deep learning framework from Google for running optimised machine learning models on mobile phones, microcontrollers, and other edge devices.

5 min read · Updated June 2026

Vector Database

A specialised database system that stores data as high-dimensional numerical vectors and enables fast approximate nearest-neighbour search, forming the retrieval backbone of semantic search and retrieval-augmented generation (RAG) systems.

7 min read · Updated May 2026